Deploy OGC-API process using CWL payload #443

Merged
merged 18 commits into from
Jul 5, 2022
27 changes: 26 additions & 1 deletion CHANGES.rst
Expand Up @@ -12,10 +12,34 @@ Changes

Changes:
--------
- No change.
- Add support of official `CWL` IANA types to allow `Process` deployment with the relevant ``Content-Type`` header
for the submitted payload (see `common-workflow-language/common-workflow-language#421 (comment)
<https://github.com/common-workflow-language/common-workflow-language/issues/421#issuecomment-1122010820>`_,
relates to `opengeospatial/NamingAuthority#169 <https://github.com/opengeospatial/NamingAuthority/issues/169>`_,
resolves `#434 <https://github.com/crim-ca/weaver/issues/434>`_).
- Support `Process` deployment using only `CWL` content provided it contains an ``id`` field representing the target
`Process` ID as per recommendation in `OGC Best Practice for Earth Observation Application Package, CWL Document
<https://docs.ogc.org/bp/20-089r1.html#toc26>`_ (resolves `#434 <https://github.com/crim-ca/weaver/issues/434>`_).
- Support `Process` deployment with a payload using ``YAML`` content instead of ``JSON``. This ``YAML`` content
**MUST** be submitted in the request with a ``Content-Type`` header either equal to ``application/x-yaml`` or
``application/ogcapppkg+yaml`` for the |ogc-app-pkg|_ schema, or using ``application/cwl+yaml`` for
a `CWL`-only definition. The definition will be loaded and converted to ``JSON`` for schema validation. Otherwise,
``JSON`` content is assumed to be directly provided in the request payload for validation, as was previously done.
- Add partial support of `CWL` with ``$graph`` representation for the special case where the graph is composed of a list
of exactly one `Application Package`. Multi/nested-`CWL` definitions are **NOT** supported
(relates to `#56 <https://github.com/crim-ca/weaver/issues/56>`_).
- Add ``weaver.cwl_processes_dir`` configuration setting for preloading, registering or updating a set of
known `Process` definitions from `CWL` files stored in a nested directory structure. This allows a service provider
that uses `Weaver` to offer their `Processes` to directly maintain their definitions from the set of `CWL` files and
upload changes in the web application at startup without the need to manually undeploy and redeploy each `Process`.
- Add ``weaver.cwl_processes_register_error`` to fail fast on any `Process` registration error from `CWL` when loading
files at startup.
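
The deployment rules listed above can be sketched as follows. This is a minimal illustration, not the implementation from this pull request: the names ``YAML_DEPLOY_TYPES``, ``is_yaml_payload`` and ``unwrap_cwl`` are hypothetical, and merging top-level fields such as ``cwlVersion`` into the single ``$graph`` item is an assumption made for the example.

```python
from typing import Any, Dict

# Content-Type values that the changelog lists as announcing YAML content.
YAML_DEPLOY_TYPES = {
    "application/x-yaml",
    "application/ogcapppkg+yaml",
    "application/cwl+yaml",
}


def is_yaml_payload(content_type: str) -> bool:
    """Return True when the Content-Type header announces YAML content."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in YAML_DEPLOY_TYPES


def unwrap_cwl(cwl: Dict[str, Any]) -> Dict[str, Any]:
    """Return the single package of a CWL document and check that it has an id."""
    if "$graph" in cwl:
        graph = cwl["$graph"]
        # Only the special case of a list holding exactly one package is accepted.
        if not isinstance(graph, list) or len(graph) != 1 or not isinstance(graph[0], dict):
            raise ValueError("only a '$graph' list of exactly one package is supported")
        # Assumption: keep top-level fields (e.g. 'cwlVersion') next to the package.
        package = {key: value for key, value in cwl.items() if key != "$graph"}
        package.update(graph[0])
    else:
        package = dict(cwl)
    process_id = package.get("id")
    if not isinstance(process_id, str) or not process_id:
        raise ValueError("CWL must provide an 'id' to identify the target process")
    return package
```

Any other ``Content-Type`` would fall through to the ``JSON`` handling described above, and a ``$graph`` with several items is rejected rather than partially deployed.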

Fixes:
------
- Fix `Process` deployment using a `WPS-1/2` URL reference defining a ``GetCapabilities`` request to resolve
the corresponding ``DescribeProcess`` request if the `Process` ID can be inferred from other known locations
(relates to `#11 <https://github.com/crim-ca/weaver/issues/11>`_).
- Move ``WpsPackage`` properties to instance level to avoid attributes potentially being shared across instances of
the same class used by distinct running `Process` instances.

Expand Down Expand Up @@ -43,6 +67,7 @@ Changes:

Fixes:
------
- Fix ``Process.payload`` being improperly encoded when it contains allowed special characters, such as in a `CWL` definition.
- Fix `CLI` operations assuming valid JSON response to instead return error response content and status code.
- Fix `CLI` rendering of various optional arguments and groups when displaying help messages.
- Fix invalid handling of ``Constants`` definitions mixed with ``classproperty`` such as in ``OutputFormat`` causing
10 changes: 6 additions & 4 deletions Makefile
Expand Up @@ -512,8 +512,12 @@ check-security-only: check-security-code-only check-security-deps-only ## run s
# FIXME: safety ignore file (https://github.com/pyupio/safety/issues/351)
# ignored codes:
# 42194: https://github.com/kvesteri/sqlalchemy-utils/issues/166 # not fixed since 2015
# 42498: celery<5.2.0 bumps kombu>=5.2.1 with security fixes to {redis,sqs} # mongo is used by default in Weaver
# 42498: celery<5.2.0 bumps kombu>=5.2.1 with security fixes to {redis,sqs} # mongo is used by default in Weaver
# 43738: celery<5.2.2 CVE-2021-23727: trusts the messages and metadata stored in backends
# 45185: pylint<2.13.0: unrelated doc extension (https://github.com/PyCQA/pylint/issues/5322)
SAFETY_IGNORE := 42194 42498 43738 45185
SAFETY_IGNORE := $(addprefix "-i ",$(SAFETY_IGNORE))

.PHONY: check-security-deps-only
check-security-deps-only: mkdir-reports ## run security checks on package dependencies
@echo "Running security checks of dependencies..."
Expand All @@ -525,9 +529,7 @@ check-security-deps-only: mkdir-reports ## run security checks on package depen
-r "$(APP_ROOT)/requirements-dev.txt" \
-r "$(APP_ROOT)/requirements-doc.txt" \
-r "$(APP_ROOT)/requirements-sys.txt" \
-i 42194 \
-i 42498 \
-i 43738 \
$(SAFETY_IGNORE) \
1> >(tee "$(REPORTS_DIR)/check-security-deps.txt")'

.PHONY: check-security-code-only
5 changes: 5 additions & 0 deletions config/weaver.ini.example
Expand Up @@ -81,6 +81,11 @@ weaver.quote_sync_max_wait = 20
# (default: use cwltool auto-resolution according to running machine and current user/group)
weaver.cwl_euid =
weaver.cwl_egid =
# directory where to load predefined process definitions defined with CWL files
# default configuration directory is used if this entry is removed
# only CWL files are considered, lookup in directory is recursive
weaver.cwl_processes_dir =
weaver.cwl_processes_register_error = false

# --- Weaver WPS settings ---
weaver.wps = true
3 changes: 2 additions & 1 deletion docs/source/appendix.rst
Expand Up @@ -169,7 +169,8 @@ Glossary
later retrieval using an access token.

.. seealso::
:ref:`vault_upload`
- :ref:`vault_upload`
- :ref:`file_vault_inputs`

WKT
Well-Known Text geometry representation.
133 changes: 119 additions & 14 deletions docs/source/configuration.rst
@@ -1,6 +1,9 @@
.. include:: references.rst
.. _configuration:

.. default location to quickly reference items without the explicit and long prefix
.. py:currentmodule:: weaver.config

******************
Configuration
******************
Expand Down Expand Up @@ -76,8 +79,21 @@ they are optional and which default value or operation is applied in each situat

.. versionadded:: 4.0.0

- | ``weaver.cwl_euid = <int>`` [:class:`int`, *experimental*]
| (default: ``None``, auto-resolved by :term:`CWL` with effective machine user)
|
| Define the effective machine user ID to be used for running the :term:`Application Package`.

.. versionadded:: 1.9.0

- | ``weaver.cwl_egid = <int>`` [:class:`int`, *experimental*]
| (default: ``None``, auto-resolved by :term:`CWL` with the group of the effective user)
|
| Define the effective machine group ID to be used for running the :term:`Application Package`.

.. versionadded:: 1.9.0

- | ``weaver.wps = true|false``
- | ``weaver.wps = true|false`` [:class:`bool`-like]
| (default: ``true``)
|
| Enables the WPS-1/2 endpoint.
Expand Down Expand Up @@ -177,7 +193,7 @@ they are optional and which default value or operation is applied in each situat
|
| Prefix where process :term:`Job` worker should execute the :term:`Process` from.

- | ``weaver.wps_restapi = true|false``
- | ``weaver.wps_restapi = true|false`` [:class:`bool`-like]
| (default: ``true``)
|
| Enable the WPS-REST endpoint.
Expand Down Expand Up @@ -215,8 +231,8 @@ they are optional and which default value or operation is applied in each situat

.. versionadded:: 4.15.0

- | ``weaver.exec_sync_max_wait``
| (default: ``20``, :class:`int`, seconds)
- | ``weaver.exec_sync_max_wait = <int>`` [:class:`int`, seconds]
| (default: ``20``)
|
| Defines the maximum duration allowed for running a :term:`Job` execution in `synchronous` mode.
|
Expand All @@ -225,8 +241,8 @@ they are optional and which default value or operation is applied in each situat

.. versionadded:: 4.15.0

- | ``weaver.quote_sync_max_wait``
| (default: ``20``, :class:`int`, seconds)
- | ``weaver.quote_sync_max_wait = <int>`` [:class:`int`, seconds]
| (default: ``20``)
|
| Defines the maximum duration allowed for running a :term:`Quote` estimation in `synchronous` mode.
|
Expand Down Expand Up @@ -321,6 +337,8 @@ using the ``Weaver.data_sources`` configuration setting.
.. seealso::
More details about the implication of :term:`Data Source` are provided in :ref:`data-source`.

.. _conf_wps_processes:

Configuration of WPS Processes
=======================================

Expand All @@ -347,19 +365,77 @@ Please refer to `wps_processes.yml.example`_ for explicit format, keywords suppo
Using this registration method, the processes will always reflect the latest modification from the
remote WPS provider.

To specify a custom YAML file, you can define the setting named ``weaver.wps_processes_file`` with the appropriate path
within the employed ``weaver.ini`` file that starts your application. By default, this setting will look for the
provided path as absolute location, then will attempt to resolve relative path (corresponding to where the application
is started from), and will also look within the |weaver-config|_ directory. If none of the files can be found, the
operation is skipped.

To ensure that this feature is disabled and to avoid any unexpected auto-deployment provided by this functionality,
simply set the ``weaver.wps_processes_file`` setting as *undefined* (i.e.: nothing after ``=`` in ``weaver.ini``).
- | ``weaver.wps_processes_file = <file-path>``
| (default: :py:data:`WEAVER_DEFAULT_WPS_PROCESSES_CONFIG` located in :py:data:`WEAVER_CONFIG_DIR`)
|
| Defines a custom :term:`YAML` file corresponding to `wps_processes.yml.example`_ schema to pre-load :term:`WPS`
processes and/or providers for registration at application startup.
|
| The value defined by this setting will look for the provided path as absolute location, then will attempt to
resolve relative path (corresponding to where the application is started from), and will also look within
the |weaver-config|_ directory. If none of the files can be found, the operation is skipped.
|
| To ensure that this feature is disabled and to avoid any unexpected auto-deployment provided by this functionality,
simply set the ``weaver.wps_processes_file`` setting as *undefined* (i.e.: nothing after ``=`` in ``weaver.ini``).
The default value is employed if the setting is not defined at all.

.. seealso::
- `weaver.ini.example`_
- `wps_processes.yml.example`_
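
The lookup order described for ``weaver.wps_processes_file`` can be sketched as below. This is only an illustration under stated assumptions: ``resolve_config_file`` and its parameters are hypothetical names, and joining the relative value onto the configuration directory is a guess at how the last lookup step behaves.

```python
import os
from typing import Optional


def resolve_config_file(value: str, config_dir: str, cwd: Optional[str] = None) -> Optional[str]:
    """Resolve a configured file path, or return None when the lookup is skipped."""
    # An empty value (nothing after '=' in weaver.ini) disables the feature.
    if not value or not value.strip():
        return None
    value = value.strip()
    if os.path.isabs(value):
        # Absolute location: use it as provided.
        candidates = [value]
    else:
        # Relative location: first against where the application was started,
        # then (assumption) against the configuration directory.
        candidates = [
            os.path.join(cwd or os.getcwd(), value),
            os.path.join(config_dir, value),
        ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    # None of the files could be found: the operation is skipped.
    return None
```

Returning ``None`` for both the disabled and the not-found cases keeps the caller simple, since the documented outcome is the same in both situations: nothing is registered.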

.. _conf_cwl_processes:

Configuration of CWL Processes
=======================================

.. versionadded:: 4.19.0

Although `Weaver` supports :ref:`Deployment <proc_op_deploy>` and dynamic management of :term:`Process` definitions
while the web application is running, it is sometimes more convenient for service providers to offer a set of predefined
:ref:`application-package` definitions. In order to automatically register such definitions (or update them if changed),
without having to repeat any deployment requests after the application was started, it is possible to employ the
configuration setting ``weaver.cwl_processes_dir``. Registration of a :term:`Process` using this approach will result
in an identical definition as if it was :ref:`Deployed <proc_op_deploy>` using :term:`API` requests or using the
:ref:`cli` interfaces.

- | ``weaver.cwl_processes_dir = <dir-path>``
| (default: :py:data:`WEAVER_CONFIG_DIR`)
|
| Defines the root directory where to *recursively* and *alphabetically* load any :term:`CWL` file
to deploy the corresponding :term:`Process` definitions. Files at higher levels are loaded first before moving
down into lower directories of the structure.
|
| Any failed deployment from a seemingly valid :term:`CWL` will be logged with the corresponding error message.
Loading will proceed by ignoring failing cases according to ``weaver.cwl_processes_register_error`` setting.
The number of successful :term:`Process` deployments will also be reported if any should occur.
|
| The value defined by this setting will look for the provided path as absolute location, then will attempt to
resolve relative path (corresponding to where the application is started from). If no :term:`CWL` file could be
found, the operation is skipped.
|
| To ensure that this feature is disabled and to avoid any unexpected auto-deployment provided by this functionality,
simply set the ``weaver.cwl_processes_dir`` setting as *undefined* (i.e.: nothing after ``=`` in ``weaver.ini``).
The default value is employed if the setting is not defined at all.

.. note::
When registering processes using :term:`CWL`, it is mandatory for those definitions to provide an ``id`` within
the file along with other :term:`CWL` details to let `Weaver` know which :term:`Process` reference to use for deployment.

.. warning::
If a :term:`Process` depends on another definition, such as in the case of a :ref:`proc_workflow` definition, all
dependencies must be registered prior to this :term:`Process`. Consider naming your :term:`CWL` files to take
advantage of loading order to resolve such situations.

- | ``weaver.cwl_processes_register_error = true|false`` [:class:`bool`]
| (default: ``false``, *ignore failures*)
|
| Indicate if `Weaver` should ignore failing :term:`Process` deployments (when ``false``), due to unsuccessful
registration of :term:`CWL` files found within any sub-directory of ``weaver.cwl_processes_dir`` path, or
immediately fail (when ``true``) when an issue is raised during :term:`Process` deployment.

.. seealso::
- `weaver.ini.example`_
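
The loading order and the effect of ``weaver.cwl_processes_register_error`` can be sketched as follows. This is a simplified illustration, not the code added by this pull request. The function names, the ``deploy`` callback and the ``.cwl`` extension filter are assumptions. The traversal also assumes that "higher levels first" means each directory's own files are handled before its sub-directories.

```python
import logging
import os
from typing import Callable, Iterator

LOGGER = logging.getLogger(__name__)


def iter_cwl_files(root: str) -> Iterator[str]:
    """Yield CWL files under root, alphabetically, parent files before sub-directories."""
    for current, dirs, files in os.walk(root):
        # Sorting in place controls the order in which os.walk descends.
        dirs.sort()
        for name in sorted(files):
            if name.endswith(".cwl"):  # assumption: CWL files use this extension
                yield os.path.join(current, name)


def register_cwl_processes(root: str, deploy: Callable[[str], None], fail_fast: bool = False) -> int:
    """Deploy every CWL file found and return the number of successful deployments."""
    registered = 0
    for path in iter_cwl_files(root):
        try:
            deploy(path)
        except Exception as exc:
            if fail_fast:
                # Equivalent of the setting being 'true': stop at the first failure.
                raise
            LOGGER.warning("Skipping CWL file [%s]: %s", path, exc)
            continue
        registered += 1
    return registered
```

With this ordering, a workflow stored as ``workflows/chain.cwl`` would be loaded after a dependency stored as ``echo.cwl`` at the root, which is the kind of file naming strategy the warning above suggests.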

.. _conf_request_options:

Expand Down Expand Up @@ -391,7 +467,7 @@ etc. on a per-request basis, leave other requests unaffected and generally more
| Path of the :term:`Request Options` definitions to employ.


- | ``weaver.ssl_verify = true|false``
- | ``weaver.ssl_verify = true|false`` [:class:`bool`-like]
| (default: ``true``)
|
| Toggle the SSL certificate verification across all requests.
Expand All @@ -404,6 +480,35 @@ etc. on a per-request basis, leave other requests unaffected and generally more
basis using :term:`Request Options` for acceptable cases.


.. _conf_vault:

Configuration of File Vault
=======================================

.. versionadded:: 4.9.0

Configuration of the :term:`Vault` is required in order to obtain access to its functionalities
and to enable its :term:`API` endpoints. This feature is notably employed to push local files to a remote `Weaver`
instance when using the :ref:`cli` utilities, in order to use them for the :term:`Job` execution. Please refer to
the references below for more details.

.. seealso::
- :ref:`vault_upload`
- :ref:`file_vault_inputs`

- | ``weaver.vault = true|false`` [:class:`bool`-like]
| (default: ``true``)
|
| Toggles the :term:`Vault` feature.

- | ``weaver.vault_dir = <dir-path>``
| (default: ``/tmp/vault``)
|
| Defines the default location where to write :ref:`files uploaded to the Vault <vault_upload>`.
|
| If the directory does not exist, it is created on demand by the feature making use of it.


Starting the Application
=======================================

9 changes: 6 additions & 3 deletions docs/source/processes.rst
Expand Up @@ -1325,12 +1325,15 @@ Note again that the more the :term:`Process` is verbose, the more tracking will
Uploading File to the Vault
-----------------------------

The :term:`Vault` is available as secured storage for uploading files to be employed later for :term:`Process`
execution (see also :ref:`file_vault_inputs`).

.. note::
The :term:`Vault` is a specific feature of `Weaver`. Other :term:`ADES`, :term:`EMS` and :term:`OGC API - Processes`
servers are not expected to provide this endpoint nor support the |vault_ref| reference format.

The :term:`Vault` is available as secured storage for uploading files to be employed later for :term:`Process`
execution (see also :ref:`file_vault_inputs`).
.. seealso::
Refer to :ref:`conf_vault` for applicable settings for this feature.

When upload succeeds, the response will return a :term:`Vault` UUID and an ``access_token`` to access the file.
Uploaded files cannot be accessed unless the proper credentials are provided. Requests toward the :term:`Vault` should
Expand All @@ -1346,7 +1349,7 @@ the file from the :term:`Vault`. For both HTTP methods, the ``X-Auth-Vault`` hea

.. note::
The :term:`Vault` acts only as temporary file storage. For this reason, once the file has been downloaded, it is
immediately deleted. Download can only occur once. It is assumed that the resource that must employ it will have
*immediately deleted*. Download can only occur once. It is assumed that the resource that must employ it will have
created a local copy from the download and the :term:`Vault` no longer needs to preserve it. This behaviour
intends to limit the duration for which potentially sensitive data remains available in the :term:`Vault`, as well
as to perform cleanup to limit storage space.
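
The single-download behaviour can be pictured with a small in-memory sketch. It is purely illustrative: ``OneTimeVault`` and its methods are hypothetical and do not reflect the `Weaver` endpoints, their header format or their storage backend.

```python
import secrets
import uuid
from typing import Dict, Tuple


class OneTimeVault:
    """In-memory stand-in for a vault whose files can be downloaded only once."""

    def __init__(self) -> None:
        # Maps a file identifier to its (access token, content) pair.
        self._files: Dict[str, Tuple[str, bytes]] = {}

    def upload(self, data: bytes) -> Tuple[str, str]:
        """Store the content and return its identifier with an access token."""
        file_id = str(uuid.uuid4())
        token = secrets.token_hex(16)
        self._files[file_id] = (token, data)
        return file_id, token

    def download(self, file_id: str, token: str) -> bytes:
        """Return the content once, then forget it."""
        entry = self._files.get(file_id)
        if entry is None:
            raise KeyError("unknown or already downloaded file")
        expected, data = entry
        # Constant-time comparison avoids leaking token contents through timing.
        if not secrets.compare_digest(expected, token):
            raise PermissionError("invalid access token")
        # Deleting right after a successful download enforces single use.
        del self._files[file_id]
        return data
```

A failed attempt with a wrong token leaves the file in place, while the first successful download removes it, so a second request for the same identifier fails.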