
Adds support for stella_en_v5 embedding model -400M variant #2608

Merged (16 commits) on Nov 29, 2024

Conversation

@iskng (Contributor) commented Nov 9, 2024

Stella_en_400m_v5 is #6 on MTEB as of 9th Nov 2024.

Model Card

This PR adds support for the model along with some examples.

License: the model is licensed under MIT.

The authors' example from the model card has been added and its results reproduced.

@AnubhabB (Contributor)

@iskng let's try and figure out if we can have one single stella_en_v5 module instead of stella_en_v5 and stella_en_v5_400m. Allow me some time to go through this and discuss possible ways of merging this.

I guess that way, it'll be easier for end users and maintainers.

It would be great if you could mark this as a draft PR for the time being, until we sort this out.

Thanks

@LaurentMazare (Collaborator)

> @iskng let's try and figure out if we can have one single stella_en_v5 module instead of stella_en_v5 and stella_en_v5_400m. Allow me some time to go through this and discuss possible ways of merging this.

+1 to this. If it's easy to add support for the 400m model in the existing one, that would make it simpler to maintain over time (though there is already a lot of duplication among models, so if it's a significant effort to merge the two, I'm happy with the separate file).

@iskng marked this pull request as draft on November 10, 2024 at 18:40
@iskng (Contributor, Author) commented Nov 12, 2024

I should have mentioned that I only really tested this for inference on Metal and CPU, so I'm not sure whether the CUDA implementation is right. I had to disable use_efficient_memory because getting xformers working on a Mac was rough.

Also curious whether it's just my implementation, but it's about 3 times slower than sentence-transformers for the same model. I'd love to learn how to make this faster; if you know of any resources, I'm just starting to dig around candle. Thanks.
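
For context, a minimal sketch of how a device might be selected in candle so the same code path can be exercised on CPU, Metal, or CUDA; the helper below is illustrative and not part of this PR:

```rust
// Illustrative only: pick whichever accelerator candle was compiled with,
// falling back to the CPU. Not code from this PR.
use candle_core::{Device, Result};

fn pick_device(force_cpu: bool) -> Result<Device> {
    if force_cpu {
        Ok(Device::Cpu)
    } else if candle_core::utils::cuda_is_available() {
        Device::new_cuda(0)
    } else if candle_core::utils::metal_is_available() {
        Device::new_metal(0)
    } else {
        Ok(Device::Cpu)
    }
}

fn main() -> Result<()> {
    let device = pick_device(false)?;
    println!("running on {device:?}");
    Ok(())
}
```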

@AnubhabB (Contributor) commented Nov 18, 2024

@iskng Sorry about the delay, I've been on the move and didn't get a chance to work on this.

I'm not sure about the correct way of proceeding with this PR (and my changes to it), so I just created a PR to your feature branch here.

This incorporates Stella 1.5B and Stella 400M under a single entry point and flow. Basically, it's a single model now, with minor divergence in implementation where required. Take a look; I've been able to reproduce the author's (and your) results.

> Also curious whether it's just my implementation, but it's about 3 times slower than sentence-transformers for the same model. I'd love to learn how to make this faster; if you know of any resources, I'm just starting to dig around candle. Thanks.

It might not be strictly your implementation; Candle is still very young and a bit rough around the edges when it comes to head-on performance comparisons with PyTorch-based implementations. Having said that, I've attempted to simplify some of your current implementation (basically removing anything that isn't directly composed out of Config, and removing a bunch of redundant transpose, reshape, etc. ops). Let me know if you see any performance improvements; I'll be curious.
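
To illustrate the kind of cleanup meant here (a sketch only, not the actual diff from this PR), splitting attention heads can usually be done with one reshape and one transpose instead of several round trips:

```rust
// Illustrative sketch, not the PR diff: split a (batch, seq, hidden) tensor
// into attention heads with a single reshape + transpose, avoiding extra
// reshape/transpose round trips that force needless copies.
use candle_core::{DType, Device, Result, Tensor};

fn split_heads(x: &Tensor, heads: usize) -> Result<Tensor> {
    let (b, n, hidden) = x.dims3()?;
    // (b, n, hidden) -> (b, n, heads, head_dim) -> (b, heads, n, head_dim)
    x.reshape((b, n, heads, hidden / heads))?.transpose(1, 2)
}

fn main() -> Result<()> {
    let x = Tensor::zeros((2, 16, 64), DType::F32, &Device::Cpu)?;
    let h = split_heads(&x, 8)?;
    assert_eq!(h.dims(), &[2, 8, 16, 8]);
    Ok(())
}
```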

Also, I've updated the stella_en_v5 example to support both variants, --which 400m and --which 1.5b, and reported both sets of results. It should be safe to remove those files/dir as well.
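
For readers following along, the variant flag could be wired up roughly as below; this is a sketch only, and the type and flag names are illustrative rather than necessarily the exact ones in the candle example:

```rust
// Sketch of a `--which` selector for the two Stella variants. Names are
// illustrative; assumes clap with the `derive` feature.
use clap::{Parser, ValueEnum};

#[derive(Clone, Copy, Debug, ValueEnum)]
enum Which {
    /// stella_en_1.5B_v5
    #[value(name = "1.5b")]
    Large,
    /// stella_en_400M_v5
    #[value(name = "400m")]
    Small,
}

#[derive(Parser)]
struct Args {
    /// Which Stella variant to load.
    #[arg(long, default_value = "1.5b")]
    which: Which,
}

fn main() {
    let args = Args::parse();
    match args.which {
        // In the real example these arms would pick the matching config
        // and weight files for each variant.
        Which::Large => println!("stella_en_1.5B_v5 selected"),
        Which::Small => println!("stella_en_400M_v5 selected"),
    }
}
```

With something like this in place, the example can then be invoked along the lines of cargo run --example stella_en_v5 -- --which 400m (or --which 1.5b).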

AnubhabB and others added 3 commits November 19, 2024 02:18
Your implementation of `Stella 400M` and the previous `Stella 1.5B` now supported in a single file and entry point
@iskng (Contributor, Author) commented Nov 23, 2024

It's about 15% faster. Learned a lot from this, thanks!

Removed the now-redundant 400m files.

@AnubhabB (Contributor)

@iskng great, if you are satisfied, un-draft this PR and we should be ready to merge into Candle.

@LaurentMazare (Collaborator)

Looks pretty good, you may want to run rustfmt though. Anything else we should be waiting for before merging this one?

@AnubhabB (Contributor)

I believe this is generally complete.

@iskng, waiting on you to finalize this PR. As @LaurentMazare highlighted, please run cargo fmt --all and push when you un-mark this PR as WIP, and we should be good to go.

@iskng marked this pull request as ready for review on November 26, 2024 at 17:43
@AnubhabB (Contributor)

@LaurentMazare I think this is ready to merge. Whenever you have some time.

@LaurentMazare (Collaborator)

Please apply rustfmt and fix the clippy lints. Also, please remove the commented-out code.

@AnubhabB (Contributor)

There are too many lints; from what I can tell it's all over the place (not just related to this PR). I'm guessing it has to do with changes Rust 1.83 introduced (for example manual_div_ceil), but I can't find anything in the Clippy changelog to explain all the complaints about needless_lifetimes.
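
For reference, manual_div_ceil flags the hand-rolled rounding-up division pattern; a minimal illustration (not code from this PR):

```rust
// Minimal illustration of the `manual_div_ceil` lint mentioned above; not
// code from this PR.
fn blocks(len: usize, block: usize) -> usize {
    // Clippy in Rust 1.83 flags the manual form `(len + block - 1) / block`
    // and suggests the standard library method instead:
    len.div_ceil(block)
}

fn main() {
    assert_eq!(blocks(10, 4), 3);
}
```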

I think that should be a separate PR by itself.

Parking this PR for now, will first work on the lints in a separate PR and come back to this!

@LaurentMazare merged commit 4f59ed3 into huggingface:main on Nov 29, 2024
10 checks passed
@LaurentMazare (Collaborator)

Looks like all the clippy lints are gone post-merge, thanks!
