A docker project for:
- building a TermSuite docker image including its 3rd-party dependencies,
- running TermSuite tools.
- Clone this docker project:
$ git clone https://github.com/termsuite/termsuite-docker.git
- Build the docker image:
$ cd termsuite-docker
$ bin/build
This will build the docker image termsuite:X.Y.Y
where X.Y.Z
is the version of TermSuite. The version of TermSuite (and of the docker image to build) can be set or changed in the Dockerfile
, within the instruction ENV
:
ENV \
TT_VERSION=3.2.1 \
TERMSUITE_VERSION=3.0.10 \
Once the image is built you can run TermSuite tools with bin/termsuite
.
$ bin/termsuite --help
There are currently three TermSuite tools available with the docker images:
preprocess
: runsfr.univnantes.termsuite.tools.PreprocessorCLI
(applies TermSuite preprocessing to documents),extract
: runsfr.univnantes.termsuite.tools.TerminologyExtractorCLI
(extracts terminologies from a domain-specific corpus),align
: runsfr.univnantes.termsuite.tools.AlignerCLI
(runs bilingual aligners).
Run one of these programs with:
$ bin/termsuite {preprocess,extract,align} OPTIONS
Where OPTIONS
are the options you need to run the given TermSuite tool,
except the tagger home option -t
, which is no more required here because
the tagger is installed inside the docker image.
See TermSuite Command Line API documentation to get details on how to run each of these programs.
You can also get each tool's instructions with option --help
:
$ bin/termsuite preprocess --help
$ bin/termsuite extract --help
$ bin/termsuite align --help
The command line below runs terminology extraction on the local corpus /path/to/my/corpus/
and language en
, and outputs the terminology to my-termino.tsv
.
bin/termsuite extract --tsv my-termino.tsv -c /path/to/my/corpus/ -l en