Karlsruhe scraper is broken #40

Open
derhuerst opened this issue Nov 23, 2023 · 5 comments
Comments

@derhuerst

I have run ParkAPI (the upcoming v3, but that shouldn't matter, right?) with only the karlsruhe scraper. It crashes with the following log output. The lower part of the output is probably related to ParkAPI v3's handling of the error.

Unfortunately, I don't know the value that the num_occupied data validation fails with. (Would it be possible to add it to the error's attributes or message?)

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
2023-11-23 17:52:14.559312 requesting GET https://github.com/offenesdresden/ParkAPI/raw/master/park_api/cities/Karlsruhe.geojson 
2023-11-23 17:52:15.059527 requesting GET https://web1.karlsruhe.de/service/Parken/ 
2023-11-23 17:52:15.559918 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=K01 
2023-11-23 17:52:16.060491 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=K02 
2023-11-23 17:52:16.561037 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=K03 
2023-11-23 17:52:17.061621 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=K04 
2023-11-23 17:52:17.562188 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=N02 
2023-11-23 17:52:18.062794 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=N03 
2023-11-23 17:52:18.563311 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=N05 
2023-11-23 17:52:19.063877 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=N06 
2023-11-23 17:52:19.564441 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=N07 
2023-11-23 17:52:20.064992 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S01 
2023-11-23 17:52:20.565538 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S02 
2023-11-23 17:52:21.066095 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S03 
2023-11-23 17:52:21.566602 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S04 
2023-11-23 17:52:22.067094 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S05 
2023-11-23 17:52:22.567621 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S06 
2023-11-23 17:52:23.068201 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=S07 
2023-11-23 17:52:23.568675 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=W01 
2023-11-23 17:52:24.069244 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=W02 
2023-11-23 17:52:24.569735 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=W03 
2023-11-23 17:52:25.070297 requesting GET https://web1.karlsruhe.de/service/Parken/detail.php?id=W04 
2023-11-23 17:52:25.570690 requesting GET https://web1.karlsruhe.de/service/Parken/ 
Traceback (most recent call last):
  File "/app/webapp/services/import_service/generic/parking_site_generic_import_service.py", line 224, in update_parking_site_realtime
    lot_data_input = self.lot_data_validator.validate(lot_info.to_dict())
  File "/usr/local/lib/python3.10/dist-packages/validataclass/validators/dataclass_validator.py", line 196, in validate
    validated_dict = self._pre_validate(input_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/validataclass/validators/dataclass_validator.py", line 182, in _pre_validate
    validated_dict = super().validate(input_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/validataclass/validators/dict_validator.py", line 163, in validate
    raise DictFieldsValidationError(field_errors=field_errors)
validataclass.exceptions.dict_exceptions.DictFieldsValidationError: DictFieldsValidationError(code='field_errors', field_errors={'num_occupied': NumberRangeError(code='number_range_error', min_value=0, max_value=2147483647)})

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/webapp/services/import_service/generic/parking_site_generic_import_service.py", line 212, in update_source_realtime
    self.update_parking_site_realtime(source, lot_data)
  File "/app/webapp/services/import_service/generic/parking_site_generic_import_service.py", line 226, in update_parking_site_realtime
    raise ImportDatasetException(dataset=lot_info.to_dict(), exception=e) from e
webapp.services.import_service.exceptions.ImportDatasetException: dataset_import_error: dataset_validation_error (500)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/flask", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/flask/cli.py", line 1064, in main
    cli.main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/flask/cli.py", line 358, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/app/webapp/cli/source.py", line 26, in cli_source_init_converters
    parking_site_import_generic_service.update_sources_realtime()
  File "/app/webapp/services/import_service/generic/parking_site_generic_import_service.py", line 183, in update_sources_realtime
    self.update_source_realtime(source)
  File "/app/webapp/services/import_service/generic/parking_site_generic_import_service.py", line 216, in update_source_realtime
    f'source {source.id} {source.uid} dataset {e.dataset} failed to import because of {e.exception.message}',
AttributeError: 'DictFieldsValidationError' object has no attribute 'message'
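
The last traceback is a secondary failure: the log line reads `e.exception.message`, but the wrapped `DictFieldsValidationError` has no `message` attribute. A minimal sketch of a more defensive formatter follows. It assumes nothing about the validation library; `describe_exception`, `FakeFieldsError` and `FakeMessageError` are hypothetical names used only for illustration and are not part of ParkAPI or validataclass.

```python
def describe_exception(exc: BaseException) -> str:
    """Return a log-friendly description without assuming a `.message` attribute."""
    message = getattr(exc, "message", None)
    if message:
        return str(message)
    # Fall back to repr() so structured details (e.g. field errors) stay visible.
    return repr(exc)


class FakeFieldsError(Exception):
    """Hypothetical stand-in for a validation error that lacks `.message`."""

    def __init__(self, field_errors: dict):
        super().__init__(field_errors)
        self.field_errors = field_errors

    def __repr__(self) -> str:
        return f"FakeFieldsError(field_errors={self.field_errors!r})"


class FakeMessageError(Exception):
    """Hypothetical stand-in for an error that does carry `.message`."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


if __name__ == "__main__":
    print(describe_exception(FakeFieldsError({"num_occupied": "number_range_error"})))
    print(describe_exception(FakeMessageError("dataset_validation_error")))
```

With a fallback like this, the import loop could log the validation failure and move on to the next source instead of crashing while formatting the message.
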
@the-infinity
Contributor

Reason: their website is completely broken; it just shows ... nothing: https://web1.karlsruhe.de/service/Parken/

@hbruch
Contributor

hbruch commented Dec 9, 2023

Originally, their website did show parking facilities, but with more free spaces than capacity. Their page not showing any at all is another, new issue. I'll contact them.
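
To illustrate why "more free than capacity" would trip the `num_occupied` check: assuming the occupied count is derived as capacity minus free spaces (an assumption, not confirmed by the scraper code), the result goes negative and falls below the `min_value=0` bound shown in the log. The sketch below uses a hypothetical helper, `derive_num_occupied`, and made-up numbers; it also puts the offending values into the error message, which is what the original report asked for.

```python
MAX_INT32 = 2_147_483_647  # upper bound reported in the logged NumberRangeError


def derive_num_occupied(capacity: int, num_free: int) -> int:
    """Derive the occupied count and fail loudly, naming the offending values."""
    num_occupied = capacity - num_free
    if not 0 <= num_occupied <= MAX_INT32:
        raise ValueError(
            f"num_occupied={num_occupied} out of range [0, {MAX_INT32}] "
            f"(capacity={capacity}, num_free={num_free})"
        )
    return num_occupied


if __name__ == "__main__":
    print(derive_num_occupied(500, 120))  # plausible reading
    try:
        derive_num_occupied(500, 520)  # more free spaces than capacity
    except ValueError as exc:
        print(exc)
```

Whether such readings should be rejected, clamped to zero, or skipped is a design decision for the scraper; the sketch only shows the rejection path.
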

@the-infinity
Contributor

@hbruch
Contributor

hbruch commented Dec 9, 2023

I hope they'll bring it back online. Besides this, Karlsruhe plans to publish their parking data via the Mobilithek (though for now it's still labeled "Testangebot", i.e. a test offering). If we change the scraper, we should switch to the structured publication.

@the-infinity
Contributor

The official source seems to be https://transparenz.karlsruhe.de/dataset/parkhaeuser .
