This release marks a major change to `finalfusion-python`: the entire package has been rewritten in Python and is no longer a wrapper around `finalfusion-rust`.

The API is now almost on par with `finalfusion-rust` and in some places even goes beyond that.
- `Vocab`, `Storage`, `Metadata` and `Norms` are now accessible as properties on `Embeddings`
- Any of the chunks above can be loaded by themselves from a finalfusion file
- All chunks can be constructed from within Python
- It's possible to add, remove or change embeddings
- `Storage` types integrate directly with `numpy` arrays
- Reading and writing to all common embedding formats (word2vec, GloVe, fastText) is supported
- The API for vocabularies and subword indexers has been made more ergonomic:
  - vocab words and the word -> index mapping are accessible as properties
  - `SubwordVocab`s expose the subword indexer through `vocab.subword_indexer`
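
The property-based access described in the list above can be sketched roughly as follows. This is a minimal illustration, not an excerpt from the package: `summarize` and `load_and_summarize` are hypothetical helper names, the `demo` object is a stand-in so the snippet runs without an embeddings file, and the path passed to `load_and_summarize` is up to the caller. The attribute names (`vocab`, `storage`, `metadata`, `norms`, `vocab.words`, `vocab.word_index`, `vocab.subword_indexer`) and `finalfusion.load_finalfusion` are the ones named in these notes and the package README.

```python
from types import SimpleNamespace


def summarize(embeds):
    """Collect a few facts from the chunk properties of an embeddings object."""
    vocab = embeds.vocab
    storage = embeds.storage
    return {
        "n_words": len(vocab.words),
        # word -> index mapping, looked up for the first word in the vocab
        "index_of_first": vocab.word_index[vocab.words[0]],
        # storage exposes a numpy-style (rows, dims) shape
        "storage_shape": tuple(storage.shape),
        "has_norms": embeds.norms is not None,
        "has_metadata": embeds.metadata is not None,
        # only subword vocabularies are expected to carry an indexer
        "has_subwords": hasattr(vocab, "subword_indexer"),
    }


def load_and_summarize(path):
    """Load a finalfusion file and summarize it (requires finalfusion)."""
    # Imported lazily so that summarize() works without the package installed.
    import finalfusion

    return summarize(finalfusion.load_finalfusion(path))


# Stand-in object (an assumption for illustration only); a real object
# would come from finalfusion.load_finalfusion(path).
demo = SimpleNamespace(
    vocab=SimpleNamespace(
        words=["the", "of", "and"],
        word_index={"the": 0, "of": 1, "and": 2},
    ),
    storage=SimpleNamespace(shape=(3, 4)),
    metadata=None,
    norms=[1.0, 1.0, 1.0],
)
summary = summarize(demo)
```

With a real file, `load_and_summarize("embeddings.fifu")` would return the same kind of dictionary; the file name here is only a placeholder.
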
In addition to the overhauled API, `finalfusion-python` now comes with executables:
- `ffp-convert` to convert between embedding formats
- `ffp-similar` and `ffp-analogy` for similarity and analogy queries
- `ffp-bucket-to-explicit` to convert from bucket subword to explicit subword embeddings
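
For readers unfamiliar with the two query types, the sketch below shows what a similarity query and an analogy query commonly compute. It is a pure-Python illustration under stated assumptions, not the implementation behind the executables: the function names are hypothetical, the toy vectors are made up, and the analogy uses the usual offset formulation ("a is to b as c is to ?" becomes `b - a + c`) with the query words excluded from the results.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_similar(vectors, query, exclude=(), k=1):
    """Return the k words whose vectors are closest to `query` by cosine."""
    scored = [
        (cosine(vec, query), word)
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def analogy(vectors, a, b, c, k=1):
    """Answer 'a is to b as c is to ?' with the offset query b - a + c."""
    query = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # The three query words are excluded so they cannot answer themselves.
    return most_similar(vectors, query, exclude={a, b, c}, k=k)


# Made-up 3-dimensional vectors, purely for illustration.
toy = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}
result = analogy(toy, "man", "woman", "king")
```

On these toy vectors the offset query lands closest to `queen`; with real embeddings the quality of the answer depends on the model.
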
Check out the documentation at https://finalfusion-python.readthedocs.io for more information!