earkweb is a repository for archiving digital objects. It offers basic functions for ingest, management and dissemination of information packages.
earkweb is a web application with a task execution backend based on Celery, which supports parallel processing of information.
Celery is an open-source, distributed task queue that allows you to run time-consuming or periodic tasks in the background.
Celery Beat is a scheduler that runs alongside Celery workers to execute periodic tasks at specified intervals.
Flower is a real-time web-based monitoring tool for Celery that provides detailed insights into task progress, worker status, and system performance.
Apache Solr is an open-source search platform built on Apache Lucene, designed for full-text search and indexing capabilities.
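As a rough illustration of the periodic tasks that Celery Beat handles, the following is a minimal sketch of a schedule definition. The entry name, the task path `tasks.reindex_packages` and the one-hour interval are hypothetical and not taken from earkweb; in a Celery project, a dictionary of this shape would typically be assigned to `app.conf.beat_schedule`.

```python
# Sketch of a Celery Beat schedule. All names and values are illustrative
# assumptions, not earkweb's actual configuration.
beat_schedule = {
    "reindex-packages-hourly": {           # hypothetical entry name
        "task": "tasks.reindex_packages",  # hypothetical task path
        "schedule": 3600.0,                # interval in seconds (one hour)
        "args": (),                        # positional arguments for the task
    },
}

# With a Celery app instance, this would usually be applied as:
#   app.conf.beat_schedule = beat_schedule
```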
The following diagram illustrates the component architecture.
The user interface, represented by the box at the top of the diagram, is a Python/Django-based web application that supports the creation, management and exploration of information packages. Tasks can be assigned to Celery workers (green boxes with a "C"), which share the same storage area. The result of a package transformation is stored as files in the information package’s working directory. Full-text content included in information packages is indexed by Solr. A ResourceSync interface exposes the change list of information packages managed by the repository.
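The file-based hand-off described above can be sketched as follows. This is not earkweb's code: to stay dependency-free, it uses a standard-library thread pool as a stand-in for Celery workers, and the function names, the `result.json` file name and the directory layout are all assumptions made for illustration. In a Celery setup, a function like `transform_package` would presumably be registered as a task instead.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

RESULT_NAME = "result.json"  # hypothetical result file name


def transform_package(working_dir: Path) -> Path:
    """Hypothetical transformation step for one information package.

    Lists the files in the package's working directory and writes the
    result back into that same directory as a JSON file.
    """
    files = sorted(
        p.name for p in working_dir.iterdir()
        if p.is_file() and p.name != RESULT_NAME
    )
    result_file = working_dir / RESULT_NAME
    result_file.write_text(
        json.dumps({"package": working_dir.name, "files": files})
    )
    return result_file


def process_all(storage_root: Path, max_workers: int = 2) -> list[Path]:
    """Process every package directory under the shared storage root.

    The thread pool stands in for Celery workers that share one storage area.
    """
    package_dirs = sorted(p for p in storage_root.iterdir() if p.is_dir())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transform_package, package_dirs))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("pkg-a", "pkg-b"):
            (root / name).mkdir()
            (root / name / "metadata.xml").write_text("<metadata/>")
        for result in process_all(root):
            print(result.read_text())
```

Because each worker writes only into the working directory of the package it was given, parallel runs do not need to coordinate beyond having access to the same storage area.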