Integration #41

Merged
merged 133 commits into from
Sep 11, 2024
Commits
282abaa
REFACT: Structure types and more
carlwilson Jan 4, 2024
c78cff5
Fixed casing for mdRef references in CSIP schematron tests
Sunday-Crunk Jan 10, 2024
7b274ca
FIX: Pylint errors.
carlwilson Jan 10, 2024
1af45b0
Merge pull request #12 from E-ARK-Software/fix/pylint
carlwilson Jan 10, 2024
707b168
FEAT: Pydantic types
carlwilson Jan 10, 2024
2620887
FIX: Structure validation with Pydantic
carlwilson Jan 10, 2024
3d05912
FEAT: Checksum Types
carlwilson Jan 11, 2024
b8ead67
REFACT: Model and code cleanup
carlwilson Jan 12, 2024
7a275f7
FIX: Checksum value validation
carlwilson Jan 12, 2024
776faf3
CSIP88 schematron test is correct, but the validation message is from…
Sunday-Crunk Jan 15, 2024
2d4b30f
added missing div (CSIP84) to context of rule containing CSIP94+96
Sunday-Crunk Jan 15, 2024
117935e
Added missing div container (CSIP84) to structmap metadata div rules …
Sunday-Crunk Jan 15, 2024
b7a3ec9
Added missing main structural div to remaining structMap contexts
Sunday-Crunk Jan 19, 2024
796a024
CSIP100 validation message erroneously referred to documentation inst…
Sunday-Crunk Jan 19, 2024
555c7ac
added missing mets NS to folder representation fptr elements in rule …
Sunday-Crunk Jan 19, 2024
fc68fbe
added mets NS to mptr refs in structMap rules
Sunday-Crunk Jan 29, 2024
210dd90
REFACT: Manifest classes
carlwilson Feb 2, 2024
04e1cca
CI: Python build matrix, 3.9 to 3.12
carlwilson Feb 2, 2024
accd400
Merge pull request #17 from E-ARK-Software/ci/python-versions
carlwilson Feb 2, 2024
998e868
Merge remote-tracking branch 'origin/integration' into feat/struct-types
carlwilson Feb 2, 2024
c1f5571
Merge branch 'feat/struct-types' into feat/pydantic-types
carlwilson Feb 2, 2024
69e69fb
Merge remote-tracking branch 'origin/integration' into refact/manifests
carlwilson Feb 2, 2024
a7cadae
CI: Disable fail-fast for matrix.
carlwilson Feb 2, 2024
ef73179
REFACT: Final Manifest tweaks plus minor fixes
carlwilson Feb 9, 2024
025d757
Including files as in setup.py
dockmd Feb 14, 2024
c8e2883
Reduced patterns of copied files
dockmd Feb 14, 2024
3533d6b
MAINT: Quick package publication
carlwilson Feb 15, 2024
5d8a86b
MAINT: Package publish
carlwilson Feb 15, 2024
913fcc8
FIX: Package Publication.
carlwilson Feb 15, 2024
cbe9c42
FIX: Bad test for virtenv.
carlwilson Feb 15, 2024
21610a9
Merge branch 'integration' into feat/fixing-setup-file
carlwilson Feb 16, 2024
cb67e98
Merge pull request #20 from E-ARK-Software/feat/fixing-setup-file
carlwilson Feb 16, 2024
eb3f43a
Merge branch 'integration' into refact/publication
carlwilson Feb 16, 2024
ccfc573
Merge branch 'integration' into fix/CSIP88-Deprecated-assertion-message
carlwilson Feb 16, 2024
24a3f8c
Merge branch 'integration' into fix/structMap-context-fix
carlwilson Feb 16, 2024
b4ad134
Merge pull request #15 from Sunday-Crunk/fix/CSIP88-Deprecated-assert…
carlwilson Feb 16, 2024
ec61378
Merge branch 'integration' into feat/pydantic-types
carlwilson Feb 16, 2024
eb7b1d1
MAINT: Removed Python 3.9 build.
carlwilson Feb 16, 2024
59cd0bd
Merge pull request #22 from E-ARK-Software/maint/python-310
carlwilson Feb 16, 2024
8d4b6ac
Merge branch 'integration' into feat/pydantic-types
carlwilson Feb 16, 2024
da7c45e
Merge pull request #13 from E-ARK-Software/feat/pydantic-types
carlwilson Feb 16, 2024
3272c43
Merge branch 'integration' into fix/structMap-context-fix
carlwilson Feb 16, 2024
c849891
Merge branch 'integration' into refact/publication
carlwilson Feb 16, 2024
3e7b310
Delete MANIFEST.in
carlwilson Feb 16, 2024
dfa7003
Merge pull request #21 from E-ARK-Software/refact/publication
shsdev Feb 16, 2024
81a97a4
Merge branch 'integration' into fix/structMap-context-fix
carlwilson Feb 19, 2024
dbf5a13
FIX: Valid METS example
carlwilson Feb 19, 2024
6f719f4
Merge pull request #23 from E-ARK-Software/merge/sunday-crunk
carlwilson Feb 19, 2024
c1dd237
REFACT: Specification types
carlwilson Feb 20, 2024
09bbfb5
FIX: Non-iterable computed_field.
carlwilson Feb 20, 2024
969dfe5
Merge pull request #24 from E-ARK-Software/refact/specifications
shsdev Feb 21, 2024
946027a
REFACT: Cleanup rulesets and requirements.
carlwilson Feb 21, 2024
89597c6
Merge branch 'integration' into refact/rulesets
carlwilson Feb 21, 2024
13dc86b
FIX: Dump rich for now.
carlwilson Feb 21, 2024
9713943
Merge branch 'refact/rulesets' of github.com:E-ARK-Software/eark-vali…
carlwilson Feb 21, 2024
94f4eff
Added schematron and vocabs XMLs
dockmd Feb 21, 2024
aedbec4
Merge pull request #26 from E-ARK-Software/fix/missing-xml-files
carlwilson Feb 21, 2024
e4d034f
Merge branch 'integration' into refact/rulesets
carlwilson Feb 21, 2024
f7f6ae9
MAINT: Structure rules test coverage
carlwilson Feb 22, 2024
0f7b167
Merge branch 'integration' into maint/struct-test-coverage
carlwilson Feb 22, 2024
ff9fe95
Merge pull request #25 from E-ARK-Software/refact/rulesets
carlwilson Feb 22, 2024
96e098c
Merge branch 'integration' into maint/struct-test-coverage
carlwilson Feb 22, 2024
d0b9994
Merge pull request #27 from E-ARK-Software/maint/struct-test-coverage
carlwilson Feb 22, 2024
7ca9aa7
MAINT: Address Pylint issues
carlwilson Feb 28, 2024
a392cb1
Merge pull request #28 from E-ARK-Software/maint/pylint
carlwilson Feb 28, 2024
ff01324
Renamed list of schematron results
dockmd Feb 28, 2024
4867344
REFACT: StructuralRequirment -> Requirement
carlwilson Feb 28, 2024
b2685c8
Merge pull request #29 from E-ARK-Software/fix/Missing-schematron-res…
carlwilson Feb 28, 2024
4372083
Merge branch 'integration' into refact/parser-api
carlwilson Feb 28, 2024
e09de74
Always validating CSIP schematron rules
dockmd Mar 20, 2024
c3fca2f
Renamed validation profile initialization method
dockmd Mar 20, 2024
6e6edf1
Fixed tests
dockmd Apr 3, 2024
5773667
Updated TESTING readme guide
dockmd Apr 3, 2024
0405661
Added tests for unimplemented specifications
dockmd Apr 3, 2024
deeffd2
Fixed whitespaces
dockmd Apr 3, 2024
f83d6c3
Merge pull request #31 from E-ARK-Software/feat/csip-schematron
carlwilson Apr 11, 2024
3a1c637
Versioning support
dockmd Apr 26, 2024
bc0f257
Merge branch 'integration' into refact/parser-api
carlwilson May 8, 2024
6fd4a44
FIX: Response to review comments
carlwilson May 8, 2024
957aded
Merge pull request #30 from E-ARK-Software/refact/parser-api
carlwilson May 8, 2024
831bb4e
2.1.0 schematron rules
dockmd May 22, 2024
79911b5
Updated tests
dockmd May 22, 2024
f274d78
Merge branch 'integration' into feat/V2.1.0-specification
carlwilson May 23, 2024
a60e805
Fixed pylint errors
dockmd May 29, 2024
79d91f3
Fixed whitespaces
dockmd May 29, 2024
21c6437
Merge pull request #37 from E-ARK-Software/feat/V2.1.0-specification
dockmd May 31, 2024
a1b52c4
Missing DIP rules
dockmd May 31, 2024
74812a7
Missing CSIP rules
dockmd May 31, 2024
5e8b46f
FIX DIP2 rule
dockmd May 31, 2024
cb57644
Fixed valid mets xml file
dockmd May 31, 2024
c7ca708
Merge branch 'integration' into feat/V2.0.4-specification
carlwilson Jun 5, 2024
f6c4b6e
FEAT: Ouptut report JSON schema
carlwilson Jun 5, 2024
aaafdce
Merge pull request #38 from E-ARK-Software/feat/V2.0.4-specification
dockmd Jun 5, 2024
4e47d3d
Merge branch 'integration' into feat/json-schema-output
carlwilson Jun 5, 2024
811668c
Fixed resources paths
dockmd Jun 5, 2024
529bbe5
Merge pull request #40 from E-ARK-Software/fix/missing-resource-files
carlwilson Jun 6, 2024
b9ab88b
Merge branch 'integration' into feat/json-schema-output
carlwilson Jun 6, 2024
c20a656
Merge pull request #39 from carlwilson/feat/json-schema-output
carlwilson Jun 6, 2024
94da850
CSIP1-10 requrement fix
dockmd Jul 9, 2024
36c77c3
Loading vocabulary tests
dockmd Jul 16, 2024
c740ea7
Updated schematron assertions
dockmd Jul 16, 2024
bdd2fd2
Updated tests and resources
dockmd Jul 16, 2024
cded31e
Fixed whitespaces
dockmd Jul 16, 2024
daa06b0
Fixed CSIP14 validation
dockmd Aug 7, 2024
a2e0709
Fixed CSIP15 validation
dockmd Aug 7, 2024
fc316ca
Fixed CSIP20 validation
dockmd Aug 7, 2024
ff249af
Fixed validation for 2.0.4
dockmd Aug 7, 2024
298a7ae
Fixed validation for CSIP24
dockmd Aug 7, 2024
1583cf0
Fixed CSIP26 validation
dockmd Aug 7, 2024
cc2b539
Fixed CSIP34 validation
dockmd Aug 7, 2024
c14a8dc
Fixed CSIp40 validation
dockmd Aug 7, 2024
ced3b75
Fixed 'Other' inconsistiency for CSIP2 3 and 4
dockmd Aug 7, 2024
30d9618
Precommit settings
dockmd Aug 7, 2024
e31cf76
Precommit settings
dockmd Aug 7, 2024
491f9af
FEAT: Commons IP schema validation
carlwilson Aug 11, 2024
0e01c0c
Fixed status warn and error behaviour
dockmd Aug 21, 2024
c872fdc
Fixed thwn rule_id is None
dockmd Aug 21, 2024
9115919
Removed wrong file
dockmd Aug 21, 2024
35604c3
Merge pull request #48 from E-ARK-Software/FIX/requrements
carlwilson Aug 27, 2024
c4f6da9
MAINT: Use setuptools 2.
carlwilson Aug 27, 2024
e9dacbf
Merge remote-tracking branch 'origin/integration' into fix/commons-ip
carlwilson Aug 27, 2024
3f8ad91
MAINT: Merge integration into current branch.
carlwilson Aug 27, 2024
c7b6eb4
Merge pull request #62 from E-ARK-Software/maint/update-git-versioning
carlwilson Aug 27, 2024
834fdfa
FIX: Deserialisation of rule_id
carlwilson Aug 27, 2024
9d7d6a4
Merge branch 'integration' into fix/commons-ip
carlwilson Aug 27, 2024
2935fdb
FEAT: Convert commons-ip representations
carlwilson Aug 27, 2024
ea44511
Merge branch 'fix/commons-ip' of github.com:E-ARK-Software/eark-valid…
carlwilson Aug 27, 2024
56deacc
FEAT: Final commons-ip compatibility tweaks
carlwilson Aug 28, 2024
3d3ce76
FIX: Support commons-ip use of NOTVALID
carlwilson Aug 29, 2024
1c65fe9
REV: Tidier use of ROOT in structure.py
carlwilson Aug 29, 2024
d1cea32
Merge pull request #60 from E-ARK-Software/fix/commons-ip
carlwilson Aug 29, 2024
f13e9f9
FIX: Expand Pydantic dependency versions.
carlwilson Sep 2, 2024
34343eb
Merge pull request #63 from E-ARK-Software/maint/pydantic-range
carlwilson Sep 2, 2024
20 changes: 20 additions & 0 deletions .github/workflows/python-publish.yml
@@ -20,13 +20,33 @@ jobs:

steps:
- uses: actions/[email protected]
with:
fetch-depth: 0
- name: Set up Python
uses: actions/[email protected]
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install ".[testing]"
- name: Install python package
run: |
pip install --editable ".[testing]"
- name: Static Pylint code QA
run: |
pylint --errors-only eark_validator
- name: Run pre-commit tests
run: pre-commit run --all-files --verbose
- name: Test with pytest
run: |
pytest
- name: Test setuptools-git-versioning versioning
run: |
python -m pip install setuptools_git_versioning
python -m setuptools_git_versioning
- name: Install build utils
run: |
pip install build
- name: Build package
run: python -m build
9 changes: 6 additions & 3 deletions .github/workflows/python-test.yml
@@ -20,13 +20,16 @@ jobs:
build:

runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["pypy3.10", "3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v3
- name: Set up Python 3.10
uses: actions/setup-python@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: "3.10"
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
14 changes: 7 additions & 7 deletions README.md
@@ -11,10 +11,10 @@ website of the Digital Information LifeCycle Interoperability Standards Board (D

### Pre-requisites

You must be running either a Debian/Ubuntu Linux distribution or Windows Subsystem for Linux on Windows to follow these commands.
Python 3.10 or later is required to run the E-ARK Python Information Package Validator.

You must be running either a Debian/Ubuntu Linux distribution or Windows Subsystem for Linux on Windows to follow these commands.
If you are running a different Linux distribution you must change the apt commands to your package manager.

For getting Windows Subsystem for Linux up and running, please follow the guide further down and then come back to this step.

### Getting up and running with the E-ARK Python Information Package Validator
@@ -88,7 +88,7 @@ pip install -U pip
pip install .
```

You are now able to run the application "ip-check". It will validate an Information Package for you.
You are now able to run the application "eark-validator". It will validate an Information Package for you.


#### Testing a valid package.
@@ -111,10 +111,10 @@ Delete the .zip-file you just downloaded:
rm mets-xml_metsHdr_agent_TYPE_exist.zip
```

Run the ip-check:
Run the eark-validator:

```shell
ip-check mets-xml_metsHdr_agent_TYPE_exist/
eark-validator mets-xml_metsHdr_agent_TYPE_exist/
```

Result:
@@ -146,7 +146,7 @@ user@machine:~$ tree input

If you do not have Linux and have not previously used WSL please perform the following steps. You must either be logged in as Administrator on the machine or as a user with Administrator rights on the machine.

Start er command prompt (cmd.exe) and then enter the following command:
Start a command prompt (cmd.exe) and then enter the following command:

```shell
wsl --install
@@ -199,4 +199,4 @@ pip install --editable ".[testing]"

### Running tests

You can run unit tests from the project root: `pytest ./tests/`, or generate test coverage figures by: `pytest --cov=ip_validation ./tests/`. If you want to see which parts of your code aren't tested then: `pytest --cov=ip_validation --cov-report=html ./tests/`. After this you can open the file [`<projectRoot>/htmlcov/index.html`](./htmlcov/index.html) in your browser and survey the gory details.
You can run unit tests from the project root: `pytest ./tests/`, or generate test coverage figures by: `pytest --cov=eark_validator ./tests/`. If you want to see which parts of your code aren't tested then: `pytest --cov=eark_validator --cov-report=html ./tests/`. After this you can open the file [`<projectRoot>/htmlcov/index.html`](./htmlcov/index.html) in your browser and survey the gory details.
1 change: 0 additions & 1 deletion VERSION

This file was deleted.

2 changes: 0 additions & 2 deletions eark_validator/__init__.py
@@ -26,5 +26,3 @@
E-ARK : Python information package validation

"""

__version__ = '1.1.1'
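
Note on the version change: with the hard-coded `__version__` and the `VERSION` file removed, `eark_validator/cli/app.py` in this PR reads the version via `importlib.metadata.version('eark_validator')`. That call raises `PackageNotFoundError` when the distribution is not installed, for example when running from a bare source checkout. The following is a minimal sketch of a guarded lookup, not code from this PR; the helper name `resolve_version` and the fallback string are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str, fallback: str = '0.0.0+unknown') -> str:
    """Return the installed distribution's version, or `fallback` if it is not installed.

    `resolve_version` and `fallback` are illustrative names, not part of eark_validator.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Raised when no installed distribution matches dist_name,
        # e.g. the code is run from a checkout without `pip install`.
        return fallback
```

Under this sketch, `__version__ = resolve_version('eark_validator')` would keep the module importable without an install, at the cost of reporting a placeholder version.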
97 changes: 67 additions & 30 deletions eark_validator/cli/app.py
@@ -26,49 +26,75 @@
E-ARK : Information package validation
Command line validation application
"""
import argparse
from pprint import pprint
import json
import os.path
from pathlib import Path
import sys
from typing import Optional, Tuple
import importlib.metadata

import argparse

import eark_validator.structure as STRUCT
from eark_validator.model import ValidationReport
import eark_validator.packages as PACKAGES
from eark_validator.infopacks.package_handler import PackageHandler
from eark_validator.specifications.specification import SpecificationVersion

__version__ = '0.1.0'
__version__ = importlib.metadata.version('eark_validator')

defaults = {
'description': """E-ARK Information Package validation (ip-check).
ip-check is a command-line tool to analyse and validate the structure and
'description': """E-ARK Information Package validation (eark-validator).
eark-validator is a command-line tool to analyse and validate the structure and
metadata against the E-ARK Information Package specifications.
It is designed for simple integration into automated work-flows.""",
'epilog': """
DILCIS Board (http://dilcis.eu)
See LICENSE for license information.
GitHub: https://github.com/E-ARK-Software/py-rest-ip-validator
Author: Carl Wilson (OPF), 2020-2023
Maintainer: Carl Wilson (OPF), 2020-2023"""
GitHub: https://github.com/E-ARK-Software/eark-validator
Author: Carl Wilson (OPF), 2020-2024
Maintainer: Carl Wilson (OPF), 2020-2024"""
}

# Create PARSER
PARSER = argparse.ArgumentParser(description=defaults['description'], epilog=defaults['epilog'])
PARSER = argparse.ArgumentParser(prog='eark-validator',
description=defaults['description'],
epilog=defaults['epilog'])

def parse_command_line():
"""Parse command line arguments."""
# Add arguments
PARSER.add_argument('-r', '--recurse',
action='store_true',
dest='inputRecursiveFlag',
default=True,
default=False,
help='When analysing an information package recurse into representations.')
PARSER.add_argument('-c', '--checksum',
action='store_true',
dest='inputChecksumFlag',
default=False,
help='Calculate and verify file checksums in packages.')
help='Calculate and verify package checksums.')
PARSER.add_argument('-m', '--manifest',
action='store_true',
dest='inputManifestFlag',
default=False,
help='Display package manifest information.')
PARSER.add_argument('-v', '--verbose',
action='store_true',
dest='outputVerboseFlag',
default=False,
help='report results in verbose format')
help='Verbose reporting for selected output options.')
PARSER.add_argument('--schema',
action='store_true',
dest='output_schema',
default=False,
help='Request display of the JSON schema of the output report.')
PARSER.add_argument('-s', '--specification_version',
nargs='?',
dest='specification_version',
default=SpecificationVersion.V2_1_0,
type=SpecificationVersion,
choices=list(SpecificationVersion),
help='Specification version used for validation. Default is %(default)s.')
PARSER.add_argument('--version',
action='version',
version=__version__)
@@ -89,37 +115,48 @@ def main():
# Get input from command line
args = parse_command_line()
# If no target files or folders specified then print usage and exit
if not args.files:
if _is_show_help(args):
PARSER.print_help()

if args.output_schema:
print(json.dumps(ValidationReport.model_json_schema(), indent=2))
sys.exit(0)

# Iterate the file arguments
for file_arg in args.files:
_loop_exit, _ = _validate_ip(file_arg)
_loop_exit, _ = _validate_ip(file_arg, args.specification_version)
_exit = _loop_exit if (_loop_exit > 0) else _exit
sys.exit(_exit)

def _validate_ip(info_pack):
ret_stat = _check_path(info_pack)
struct_details = STRUCT.validate_package_structure(info_pack)
pprint('Path {}, struct result is: {}'.format(info_pack,
struct_details.status))
for error in struct_details.errors:
pprint(error.to_json())
def _validate_ip(path: str, version: SpecificationVersion) -> Tuple[int, Optional[ValidationReport]]:
ret_stat, checked_path = _check_path(path)
if ret_stat > 0:
return ret_stat, None
report = PACKAGES.PackageValidator(checked_path, version).validation_report
print(f'Path {checked_path}, struct result is: {report.structure.status.value}')
# for message in report.structure.messages:
print(report.model_dump_json())

return ret_stat, struct_details
return ret_stat, report

def _check_path(path):
def _check_path(path: str) -> Tuple[int, Optional[Path]]:
if not os.path.exists(path):
# Skip files that don't exist
pprint('Path {} does not exist'.format(path))
return 1
print(_format_check_path_message(path, 'does not exist'))
return 1, None
if os.path.isfile(path):
# Check if file is a archive format
if not STRUCT.ArchivePackageHandler.is_archive(path):
if not PackageHandler.is_archive(path):
# If not we can't process so report and iterate
pprint('Path {} is not a file we can process.'.format(path))
return 2
return 0
print(_format_check_path_message(path, 'is not an archive file or directory'))
return 2, None
return 0, Path(path)

def _format_check_path_message(path: Path, message: str) -> str:
return f'Processing terminated, path: {path} {message}.'

def _is_show_help(args) -> bool:
return not args.files and not args.output_schema

# def _test_case_schema_checks():
if __name__ == '__main__':
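
Note on the new `--specification_version` option: it passes an enum as both `type` and `choices`, so argparse converts the command-line string to an enum member by value and then checks membership. The sketch below illustrates that pattern in isolation. It is an assumption-laden example, not the project's code: `SpecVersion` is a hypothetical stand-in for `SpecificationVersion`, and its member values are guesses, since the real enum is not shown in this diff. The sketch also omits the `nargs='?'` used in the PR; with `nargs='?'`, a bare `-s` with no value would yield `None` instead of the default.

```python
import argparse
from enum import Enum, unique


@unique
class SpecVersion(str, Enum):
    """Hypothetical stand-in for SpecificationVersion; the values are assumed."""
    V2_0_4 = 'V2.0.4'
    V2_1_0 = 'V2.1.0'

    def __str__(self) -> str:
        # argparse calls str() when rendering %(default)s in help text
        return self.value


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser that mirrors the enum-typed option."""
    parser = argparse.ArgumentParser(prog='demo-validator')
    parser.add_argument('-s', '--specification_version',
                        dest='specification_version',
                        default=SpecVersion.V2_1_0,
                        type=SpecVersion,           # converts the string by enum value
                        choices=list(SpecVersion),  # checked after conversion
                        help='Specification version. Default is %(default)s.')
    parser.add_argument('files', nargs='*', default=[])
    return parser
```

For example, `build_parser().parse_args(['-s', 'V2.0.4', 'pkg'])` yields `SpecVersion.V2_0_4` for `specification_version`, while an unrecognised value fails the enum conversion and argparse exits with a usage error.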
2 changes: 1 addition & 1 deletion eark_validator/const.py
@@ -28,7 +28,7 @@
E-ARK (https://e-ark4all.eu/)
Open Preservation Foundation (http://www.openpreservation.org)
See LICENSE for license information.
Author: Carl Wilson (OPF), 2016-17
Author: Carl Wilson (OPF), 2016-24
This work was funded by the European commission project funded
as grant number LC-01390244 CEF-TC-2019-3 E-ARK3 under
CONNECTING EUROPE FACILITY (CEF) - TELECOMMUNICATIONS SECTOR