Integration #41

Merged
merged 133 commits into from
Sep 11, 2024

Conversation

shsdev
Contributor

@shsdev shsdev commented Jun 7, 2024

Merge integration branch into main

carlwilson and others added 30 commits January 4, 2024 18:21
- created new module for API model types `eark_validator.model` that contains types for:
  - `Level` (requirements level);
  - `PackageDetails`;
  - `Severity` (of test result);
  - `StructResults` (for structural validation results);
  - `StructStatus` (for structural validation status);
  - `TestResult` (for validation test results); and
  - `ValidationReport` (for final validation report);
- refactored structural validation to use new types;
- small fix to schematron test for `CSIP11` in `mets_metsHdr_rules.xml`;
- new `eark_validator.packages` module for package validation;
- introduced a `PackageHandler` type to start abstraction of package parsing (replaces `STRUCT.ArchivePackageHandler`);
- introduced a `ValidationReport` type to handle final aggregation of validation results;
- added missing `__init__.py` files with appropriate commenting;
- removed some defunct types;
- removed unused imports;
- introduced light use of pydantic, better use to come; and
- unit test improvements and fixes.
- converted struct types to Pydantic; and
- fixed constructors where necessary.
- let the validator handle package unpacking;
- added empty list default for `Checksum`; and
- replaced `pprint` with `print` for now.
- added model types for `Checksum` and `ChecksumAlg`;
- removed/refactored defunct Checksum types; and
- refactored tests to use new Checksum types.
- refactored model code to group related types in modules;
- better inclusion of model code in `__init__.py`;
- completed addition of type hints to methods;
- output of CLI now mostly JSON;
- terminate processing of bad input files quickly; and
- removed unused imports.
- fixed use of `to_upper` and use of `model_validate()` to force upper case checksum values;
- catch `str` initialization of `Checksummer` so that a valid `ChecksumAlg` is always used; and
- added test for case insensitivity of checksum values.
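The checksum handling described in the three bullets above can be sketched as follows. This is a stdlib-only illustration using `dataclasses` and `hashlib`, not the project's Pydantic models: the names `ChecksumAlg`, `Checksum` and `Checksummer` come from the commit messages, but every field, method and the `from_string` helper here are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Union


class ChecksumAlg(Enum):
    """Supported algorithms; the values are hashlib constructor names."""
    MD5 = "md5"
    SHA1 = "sha1"
    SHA256 = "sha256"

    @classmethod
    def from_string(cls, name: str) -> "ChecksumAlg":
        # Hypothetical helper: accept "sha-256", "SHA256", "sha_256" and so on.
        key = name.replace("-", "").replace("_", "").upper()
        try:
            return cls[key]
        except KeyError:
            raise ValueError(f"Unsupported checksum algorithm: {name}") from None


@dataclass(frozen=True)
class Checksum:
    algorithm: ChecksumAlg
    value: str

    def __post_init__(self) -> None:
        # Force upper case so that comparing two checksums ignores case.
        object.__setattr__(self, "value", self.value.upper())


class Checksummer:
    def __init__(self, algorithm: Union[ChecksumAlg, str]) -> None:
        # Catch str initialisation so a valid ChecksumAlg is always stored.
        if isinstance(algorithm, str):
            algorithm = ChecksumAlg.from_string(algorithm)
        self.algorithm = algorithm

    def hash_bytes(self, data: bytes) -> Checksum:
        digest = hashlib.new(self.algorithm.value, data).hexdigest()
        return Checksum(self.algorithm, digest)
```

With this shape, `Checksum(ChecksumAlg.MD5, "abc")` and `Checksum(ChecksumAlg.MD5, "ABC")` compare equal, and `Checksummer("sha-256")` ends up holding `ChecksumAlg.SHA256` rather than a raw string.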
… the deprecated CSIP86 requirement. Replaced with text from CSIP88.
- `eark_validator/cli/app.py`:
  - validation now uses `eark_validator.model.ValidationReport` module;
  - improved `argparse` docs a little;
  - updated epilog date;
- `eark_validator/infopacks/manifest.py`:
  - `FileItem` and `Manifest` types moved to appropriate `model` modules;
  - `ManifestEntries` and `Manifests` classes now hold API/factory methods for above;
  - no more `Checksum`s from METS files;
  - uses `Path` rather than `str` for file paths;
- `eark_validator/mets.py`:
  - METS validation moved to `MetsFiles` class;
  - use `FileItem` rather than `FileEntry` types for file lists;
  - utility methods for `FileEntry` added;
- added file headers where missing;
- added a `SchematronRuleset.get_reports()` generator method;
- improved type hints here and there;
- better/refactored tests; and
- fixed imports post-refactoring.
CI: Python build matrix, 3.9 to 3.12
- added `pickle` based serialisation and deserialisation for `Manifest` class;
- manifests can now be validated against an alternative root;
- fixed faulty use of `|` in place of `or` in `MetsFiles.from_file()` method;
- model `MetsFile.default_ns` is now a dictionary of namespace prefixes and URIs;
- fixed handling of relative paths during Manifest creation/validation;
- improved testing of manifest classes; and
- type hints here and there.
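A minimal sketch of the manifest behaviour listed above: `pickle`-based round-tripping, relative paths, and validation against an alternative root. Only the class name `Manifest` comes from the commit message; the fields, the method names (`from_directory`, `to_bytes`, `from_bytes`, `validate`) and the choice of SHA-256 are assumptions for illustration, and the real class may be structured quite differently.

```python
import hashlib
import pickle
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional


def _sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest().upper()


@dataclass
class Manifest:
    root: Path
    # Keys are POSIX-style paths relative to root, values are SHA-256 digests.
    entries: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_directory(cls, root: Path) -> "Manifest":
        entries = {
            p.relative_to(root).as_posix(): _sha256(p)
            for p in sorted(root.rglob("*"))
            if p.is_file()
        }
        return cls(root=root, entries=entries)

    def to_bytes(self) -> bytes:
        # Pickle built-in types only, so the payload does not depend on
        # where this class is importable from.
        return pickle.dumps({"root": str(self.root), "entries": self.entries})

    @classmethod
    def from_bytes(cls, data: bytes) -> "Manifest":
        state = pickle.loads(data)  # only unpickle data from a trusted source
        return cls(root=Path(state["root"]), entries=state["entries"])

    def validate(self, alt_root: Optional[Path] = None) -> List[str]:
        """Return a list of problems; an empty list means the manifest holds."""
        base = alt_root if alt_root is not None else self.root
        problems = []
        for rel, expected in self.entries.items():
            target = base / rel
            if not target.is_file():
                problems.append(f"missing: {rel}")
            elif _sha256(target) != expected:
                problems.append(f"checksum mismatch: {rel}")
        return problems
```

Because the entries are stored relative to the root, a manifest built from one directory can be checked against a copy elsewhere with `manifest.validate(alt_root=Path("/some/copy"))`.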
- loose model types for InformationPackages, plus removed defunct types;
- JSON output from Pydantic types;
- changed application name from `ip-check` to `eark-validator`;
- streamline application/project versioning;
- updated `./README.md` to reflect name change and pre-requisites;
- removed Python 3.9 build and updated docs to Python 3.10; and
- updated lxml dependency.
- fixed workflow to get tags when available;
- reconfigured `setuptools-git-versioning` so no VERSION file; and
- removed VERSION file.
- added a `model_config` parameter to allow both `rule_id` and `ruleId` as keys in Pydantic model validation;
- fixes issue with deserialisation of `rule_id` in `Rule` model.
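The effect of accepting both `rule_id` and `ruleId` can be shown without Pydantic. The sketch below is a stdlib stand-in, not the project's `Rule` model: the commit does not show the actual `model_config` settings, and the `message` field, `RULE_ID_KEYS` and `rule_from_dict` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Key spellings accepted for the rule identifier (snake_case and camelCase).
RULE_ID_KEYS = ("rule_id", "ruleId")


@dataclass(frozen=True)
class Rule:
    rule_id: str
    message: str = ""  # assumed field, for illustration only


def rule_from_dict(data: Mapping[str, Any]) -> Rule:
    """Build a Rule from a dict that uses either key spelling."""
    found = {key: data[key] for key in RULE_ID_KEYS if key in data}
    if not found:
        raise ValueError(f"expected one of {RULE_ID_KEYS} in input")
    values = {str(value) for value in found.values()}
    if len(values) > 1:
        # Both spellings present but disagreeing: refuse to guess.
        raise ValueError("conflicting values for rule_id and ruleId")
    return Rule(rule_id=values.pop(), message=str(data.get("message", "")))
```

Both `rule_from_dict({"rule_id": "CSIP1"})` and `rule_from_dict({"ruleId": "CSIP1"})` then produce the same object, which is the behaviour the deserialisation fix is after.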
- added a `model_validator` to the `PackageDetails` class to convert incoming JSON dictionary to a `List` for now;
- added Pydantic config to `Result` to allow multiple names for `rule_id` during validation;
- reverted change to dictionary as it was not necessary;
- added tests and data for deserialisation of commons-ip types; and
- fixed minor compiler warnings.
- refactored `MetadataResults` to match `commons-ip`, it's probably better as well;
- moved `name` from `InformationPackage` to `PackageDetails` class;
- renamed `InformationPackage.package` to `InformationPackage.details`;
- renamed existing `ValidationReport.convert_dict` validator to `ValidationReport.convert_representations_dict` (more explicit);
- added a second validator, `ValidationReport.convert_checksum_ids`, to convert `commons-ip` checksum ids to `eark_validator` hyphenated form;
- added `is_valid` convenience property to `ValidationReport` class;
- added string constants for 'VALID' and 'INVALID'; and
- fixed tests to accommodate.
- added `convert_status` validator to `MetadataResults` class to convert commons-ip `NOTVALID` status to `INVALID`;
- moved checksum algorithm ID validation to the `PackageDetails` class;
- added test and test data for status conversion; and
- fixed type hinting for validation methods to `Any`.
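The normalisation of commons-ip values described in these commits can be sketched as plain functions. This is an illustration only: the real code uses Pydantic validators, and the concrete ids shown (`SHA256` becoming `SHA-256`) are an assumption about what "hyphenated form" means, since the commit messages do not list them. `MetadataStatus` is a hypothetical stand-in class.

```python
import re
from dataclasses import dataclass

VALID = "VALID"
INVALID = "INVALID"


def convert_status(status: str) -> str:
    """Map the commons-ip NOTVALID status onto INVALID; upper-case the rest."""
    normalised = status.strip().upper()
    return INVALID if normalised == "NOTVALID" else normalised


def convert_checksum_id(alg_id: str) -> str:
    """Convert ids such as SHA256 to an assumed hyphenated form, SHA-256."""
    normalised = alg_id.strip().upper()
    match = re.fullmatch(r"(SHA)-?(\d+)", normalised)
    if match:
        return f"{match.group(1)}-{match.group(2)}"
    return normalised  # ids without a numeric suffix pattern are left alone


@dataclass
class MetadataStatus:
    status: str

    def __post_init__(self) -> None:
        # Normalise on construction, as a validator would on deserialisation.
        self.status = convert_status(self.status)

    @property
    def is_valid(self) -> bool:
        return self.status == VALID
```

Running the conversion at construction time means downstream code only ever sees `VALID` or `INVALID`, so a convenience property like `is_valid` can be a single comparison.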
As suggested in this [review comment](#60 (comment)), thanks.
FEAT: Commons IP schema validation
FIX: Expand Pydantic dependency versions.
@carlwilson carlwilson merged commit c10ae9a into main Sep 11, 2024
8 checks passed
4 participants