diff --git a/README.md b/README.md index 5c4fb9261..0b1601d2d 100644 --- a/README.md +++ b/README.md @@ -288,73 +288,123 @@ Key: + original (float32) indexes: cached queries (πŸ«™), ONNX (πŸ…ΎοΈ) + quantized (int8) indexes: cached queries (πŸ«™), ONNX (πŸ…ΎοΈ) -See instructions below the table for how to reproduce results for a model on all BEIR corpora "in one go". - -| Corpus | F1 | F2 | MF | U1 | S1 | BGE (flat) | BGE (HNSW) | -|-------------------------|:------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| TREC-COVID | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-covid.flat.md) | 
[πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-covid.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-covid.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-covid.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| BioASQ | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-bioasq.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-bioasq.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-bioasq.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-bioasq.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| NFCorpus | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nfcorpus.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nfcorpus.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nfcorpus.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nfcorpus.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| NQ | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nq.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nq.flat-wp.md) | 
[πŸ”‘](docs/regressions/regressions-beir-v1.0.0-nq.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nq.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| HotpotQA | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-hotpotqa.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-hotpotqa.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-hotpotqa.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-hotpotqa.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat-int8.onnx.md) | 
full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| FiQA-2018 | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-fiqa.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-fiqa.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-fiqa.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fiqa.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| Signal-1M(RT) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-signal1m.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-signal1m.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-signal1m.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.unicoil-noexp.cached.md) | 
[πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-signal1m.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| TREC-NEWS | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-news.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-news.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-trec-news.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-news.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| Robust04 | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-robust04.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-robust04.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-robust04.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-robust04.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| ArguAna | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-arguana.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-arguana.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-arguana.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-arguana.splade-pp-ed.onnx.md) | 
full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| Touche2020 | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Android | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-English | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.unicoil-noexp.cached.md) | 
[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Gaming | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Gis | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| 
CQADupStack-Mathematica | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Physics | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.unicoil-noexp.cached.md) | 
[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Programmers | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Stats | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Tex | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Unix | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.unicoil-noexp.cached.md) | 
[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Webmasters | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| CQADupStack-Wordpress | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.flat.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.flat-wp.md) | [πŸ”‘](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.multifield.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.unicoil-noexp.cached.md) | [πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.splade-pp-ed.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.splade-pp-ed.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| Quora | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-quora.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-quora.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| DBPedia | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.splade-pp-ed.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| SCIDOCS | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw.onnx.md)
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| FEVER | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fever.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fever.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| Climate-FEVER | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.splade-pp-ed.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.onnx.md) | -| SciFact | [🔑](docs/regressions/regressions-beir-v1.0.0-scifact.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scifact.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scifact.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scifact.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scifact.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw.onnx.md)
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.onnx.md) | - -To reproduce the SPLADE++ CoCondenser-EnsembleDistil results, start by downloading the collection: +See instructions below the table for how to reproduce results programmatically. + +| Corpus | F1 | F2 | MF | U1 | S1 | BGE (flat) | BGE (HNSW) | +|-------------------------|:------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| TREC-COVID | 
[🔑](docs/regressions/regressions-beir-v1.0.0-trec-covid.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-trec-covid.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-trec-covid.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| BioASQ | [🔑](docs/regressions/regressions-beir-v1.0.0-bioasq.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-bioasq.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-bioasq.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.md)
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| NFCorpus | [🔑](docs/regressions/regressions-beir-v1.0.0-nfcorpus.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-nfcorpus.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-nfcorpus.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| NQ | [🔑](docs/regressions/regressions-beir-v1.0.0-nq.flat.md) |
[🔑](docs/regressions/regressions-beir-v1.0.0-nq.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-nq.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-nq.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-nq.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| HotpotQA | [🔑](docs/regressions/regressions-beir-v1.0.0-hotpotqa.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-hotpotqa.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-hotpotqa.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.md)
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| FiQA-2018 | [🔑](docs/regressions/regressions-beir-v1.0.0-fiqa.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fiqa.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fiqa.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| Signal-1M(RT) | [🔑](docs/regressions/regressions-beir-v1.0.0-signal1m.flat.md) |
[🔑](docs/regressions/regressions-beir-v1.0.0-signal1m.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-signal1m.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| TREC-NEWS | [🔑](docs/regressions/regressions-beir-v1.0.0-trec-news.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-trec-news.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-trec-news.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.md)
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| Robust04 | [🔑](docs/regressions/regressions-beir-v1.0.0-robust04.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-robust04.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-robust04.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-robust04.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-robust04.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| ArguAna |
[🔑](docs/regressions/regressions-beir-v1.0.0-arguana.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-arguana.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-arguana.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-arguana.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-arguana.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| Touche2020 | [🔑](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.splade-pp-ed.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Android | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-English | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Gaming |
[🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Gis | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.splade-pp-ed.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Mathematica | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.md) |
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Physics | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) | +| CQADupStack-Programmers |
[🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| CQADupStack-Stats | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.unicoil-noexp.cached.md) | 
[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| CQADupStack-Tex | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| CQADupStack-Unix | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| CQADupStack-Webmasters | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| CQADupStack-Wordpress | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.flat-wp.md) | 
[🔑](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| Quora | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-quora.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-quora.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-quora.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| DBPedia | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| 
SCIDOCS | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scidocs.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| FEVER | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-fever.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fever.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-fever.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| Climate-FEVER | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-climate-fever.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+| SciFact | 
[🔑](docs/regressions/regressions-beir-v1.0.0-scifact.flat.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scifact.flat-wp.md) | [🔑](docs/regressions/regressions-beir-v1.0.0-scifact.multifield.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scifact.unicoil-noexp.cached.md) | [🫙](docs/regressions/regressions-beir-v1.0.0-scifact.splade-pp-ed.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.splade-pp-ed.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md) |
+
+
+Deprecated BGE instructions using corpora in jsonl format
+
+| Corpus | BGE (flat) | BGE (HNSW) |
+|-------------------------|:---|:---|
+| TREC-COVID | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| BioASQ | full:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| NFCorpus | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| NQ | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| HotpotQA | full:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| FiQA-2018 | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| Signal-1M(RT) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| TREC-NEWS | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| Robust04 | full:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| ArguAna | full:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| Touche2020 | full:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Android | 
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-English | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Gaming | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat.onnx.md) 
int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Gis | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Mathematica | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.flat-int8.onnx.md) | 
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Physics | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Programmers | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.flat-int8.onnx.md) | 
full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Stats | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.hnsw-int8.onnx.md) |
+| CQADupStack-Tex | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat.onnx.md) int8:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat-int8.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.flat-int8.onnx.md) | full:[🫙](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw.cached.md)[🅾️](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| CQADupStack-Unix | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| CQADupStack-Webmasters | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| CQADupStack-Wordpress | 
full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| Quora | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| DBPedia | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| SCIDOCS | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| FEVER | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw.onnx.md) 
int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| Climate-FEVER | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.onnx.md) | +| SciFact | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.flat-int8.onnx.md) | full:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw.onnx.md) int8:[πŸ«™](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.cached.md)[πŸ…ΎοΈ](docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.onnx.md) | + +
+
+To reproduce the above results programmatically, use the following commands to download and unpack the data:
 
 ```bash
-wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-splade-pp-ed.tar -P collections/
-tar xvf collections/beir-v1.0.0-splade-pp-ed.tar -C collections/
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/$COLLECTION -P collections/
+tar xvf collections/$COLLECTION -C collections/
 ```
 
-The tarball is 42 GB and has MD5 checksum `9c7de5b444a788c9e74c340bf833173b`.
+Substitute the appropriate `$COLLECTION` from the table below.
+
+| `$COLLECTION` | Size | Checksum |
+|:-------------------------------------------------------|-------:|:-----------------------------------|
+| `beir-v1.0.0-corpus.tar` | 14 GB | `faefd5281b662c72ce03d22021e4ff6b` |
+| `beir-v1.0.0-corpus-wp.tar` | 13 GB | `3cf8f3dcdcadd49362965dd4466e6ff2` |
+| `beir-v1.0.0-unicoil-noexp.tar` | 30 GB | `4fd04d2af816a6637fc12922cccc8a83` |
+| `beir-v1.0.0-splade-pp-ed.tar` | 43 GB | `9c7de5b444a788c9e74c340bf833173b` |
+| `beir-v1.0.0-bge-base-en-v1.5.parquet.tar` | 194 GB | `c279f9fc2464574b482ec53efcc1c487` |
+| `beir-v1.0.0-bge-base-en-v1.5.tar` (jsonl, deprecated) | 294 GB | `e4e8324ba3da3b46e715297407a24f00` |
+
 Once you've unpacked the data, the following commands will loop over all BEIR corpora and run the regressions:
 
 ```bash
-MODEL="splade-pp-ed"; CORPORA=(trec-covid bioasq nfcorpus nq hotpotqa fiqa signal1m trec-news robust04 arguana webis-touche2020 cqadupstack-android cqadupstack-english cqadupstack-gaming cqadupstack-gis cqadupstack-mathematica cqadupstack-physics cqadupstack-programmers cqadupstack-stats cqadupstack-tex cqadupstack-unix cqadupstack-webmasters cqadupstack-wordpress quora dbpedia-entity scidocs fever climate-fever scifact); for c in "${CORPORA[@]}"
+MODEL="$MODEL"; CORPORA=(trec-covid bioasq nfcorpus nq hotpotqa fiqa signal1m trec-news robust04 arguana webis-touche2020 cqadupstack-android cqadupstack-english cqadupstack-gaming cqadupstack-gis cqadupstack-mathematica cqadupstack-physics cqadupstack-programmers cqadupstack-stats cqadupstack-tex cqadupstack-unix cqadupstack-webmasters cqadupstack-wordpress quora dbpedia-entity scidocs fever climate-fever scifact); for c in "${CORPORA[@]}"
 do
     echo "Running $c..."
-    python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-${c}.${MODEL}.onnx > logs/log.beir-v1.0.0-${c}-${MODEL}.onnx 2>&1
+    python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-${c}.${MODEL} > logs/log.beir-v1.0.0-${c}-${MODEL} 2>&1
 done
 ```
 
-You can verify the results by examining the log files in `logs/`.
-
-For the other models, modify the above commands as follows:
-
-| Key | Corpus | Checksum | `MODEL` |
-|:----|:-------------------|:-----------------------------------|:------------------------|
-| F1 | `corpus` | `faefd5281b662c72ce03d22021e4ff6b` | `flat` |
-| F2 | `corpus-wp` | `3cf8f3dcdcadd49362965dd4466e6ff2` | `flat-wp` |
-| MF | `corpus` | `faefd5281b662c72ce03d22021e4ff6b` | `multifield` |
-| U1 | `unicoil-noexp` | `4fd04d2af816a6637fc12922cccc8a83` | `unicoil-noexp` |
-| S1 | `splade-pp-ed` | `9c7de5b444a788c9e74c340bf833173b` | `splade-pp-ed` |
-| BGE | `bge-base-en-v1.5` | `e4e8324ba3da3b46e715297407a24f00` | `bge-base-en-v1.5-hnsw` |
-
-The "Corpus" above should be substituted into the full file name `beir-v1.0.0-${corpus}.tar`, e.g., `beir-v1.0.0-bge-base-en-v1.5.tar`.
-The above commands should work with some minor modifications: you'll need to tweak the `--regression` parameter to match the schema of the YAML config files in `src/main/resources/regression/`.
+Substitute the appropriate `$MODEL` from the table below.
+
+| Key | `$MODEL` |
+|:-------------------------|:--------------------------------------------|
+| F1 | `flat` |
+| F2 | `flat-wp` |
+| MF | `multifield` |
+| U1 (cached) | `unicoil-noexp.cached` |
+| S1 (cached) | `splade-pp-ed.cached` |
+| S1 (ONNX) | `splade-pp-ed.onnx` |
+| BGE (flat, full; cached) | `bge-base-en-v1.5.parquet.flat.cached` |
+| BGE (flat, int8; cached) | `bge-base-en-v1.5.parquet.flat-int8.cached` |
+| BGE (HNSW, full; cached) | `bge-base-en-v1.5.parquet.hnsw.cached` |
+| BGE (HNSW, int8; cached) | `bge-base-en-v1.5.parquet.hnsw-int8.cached` |
+| BGE (flat, full; ONNX) | `bge-base-en-v1.5.parquet.flat.onnx` |
+| BGE (flat, int8; ONNX) | `bge-base-en-v1.5.parquet.flat-int8.onnx` |
+| BGE (HNSW, full; ONNX) | `bge-base-en-v1.5.parquet.hnsw.onnx` |
+| BGE (HNSW, int8; ONNX) | `bge-base-en-v1.5.parquet.hnsw-int8.onnx` |
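The MD5 checksums listed in the `$COLLECTION` table can be checked before unpacking a tarball. The following is a minimal sketch, not part of Anserini: `verify_md5` is a hypothetical helper name, and it assumes GNU `md5sum` is on the `PATH` (on macOS, `md5 -q` would be the rough equivalent).

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not an Anserini script):
# compare a file's MD5 digest against an expected value.
verify_md5() {
  local file="$1" expected="$2" actual
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(md5sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Example usage, assuming the SPLADE++ ED tarball was downloaded into collections/:
# verify_md5 collections/beir-v1.0.0-splade-pp-ed.tar 9c7de5b444a788c9e74c340bf833173b
```

Running the check between the `wget` and `tar` steps avoids unpacking a truncated download, which matters for the larger tarballs.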
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.md
new file mode 100644
index 000000000..87af45d09
--- /dev/null
+++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.md
@@ -0,0 +1,84 @@
+# Anserini Regressions: BEIR (v1.0.0) — ArguAna
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+bin/run.sh io.anserini.index.IndexFlatDenseVectors \
+  -threads 16 \
+  -collection ParquetDenseVectorCollection \
+  -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \
+  -generator ParquetDenseVectorDocumentGenerator \
+  -index indexes/lucene-flat-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -quantize.int8 \
+  >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 &
+```
+
+The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+bin/run.sh io.anserini.search.SearchFlatDenseVectors \
+  -index indexes/lucene-flat-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.gz \
+  -topicReader JsonStringVector \
+  -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt \
+  -hits 1000 -removeQuery -threads 16 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@10** | **BGE-base-en-v1.5**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): ArguAna | 0.6361 |
+| **R@100** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9915 |
+| **R@1000** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9964 |
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.md
new file mode 100644
index 000000000..9a5f978c6
--- /dev/null
+++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.md
@@ -0,0 +1,84 @@
+# Anserini Regressions: BEIR (v1.0.0) — ArguAna
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+bin/run.sh io.anserini.index.IndexFlatDenseVectors \
+  -threads 16 \
+  -collection ParquetDenseVectorCollection \
+  -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \
+  -generator ParquetDenseVectorDocumentGenerator \
+  -index indexes/lucene-flat-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -quantize.int8 \
+  >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 &
+```
+
+The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+bin/run.sh io.anserini.search.SearchFlatDenseVectors \
+  -index indexes/lucene-flat-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \
+  -topicReader TsvString \
+  -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-arguana.test.txt \
+  -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-arguana.test.txt
+bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-arguana.test.txt
+bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-arguana.test.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@10** | **BGE-base-en-v1.5**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): ArguAna | 0.6361 |
+| **R@100** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9915 |
+| **R@1000** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9964 |
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.md
new file mode 100644
index 000000000..157e497d4
--- /dev/null
+++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.md
@@ -0,0 +1,81 @@
+# Anserini Regressions: BEIR (v1.0.0) — ArguAna
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+bin/run.sh io.anserini.index.IndexFlatDenseVectors \
+  -threads 16 \
+  -collection ParquetDenseVectorCollection \
+  -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \
+  -generator ParquetDenseVectorDocumentGenerator \
+  -index indexes/lucene-flat.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 &
+```
+
+The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+bin/run.sh io.anserini.search.SearchFlatDenseVectors \
+  -index indexes/lucene-flat.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.gz \
+  -topicReader JsonStringVector \
+  -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt \
+  -hits 1000 -removeQuery -threads 16 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@10** | **BGE-base-en-v1.5**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): ArguAna | 0.6361 |
+| **R@100** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9915 |
+| **R@1000** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9964 |
+
+Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.md
new file mode 100644
index 000000000..8e3206741
--- /dev/null
+++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.md
@@ -0,0 +1,82 @@
+# Anserini Regressions: BEIR (v1.0.0) — ArguAna
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+bin/run.sh io.anserini.index.IndexFlatDenseVectors \
+  -threads 16 \
+  -collection ParquetDenseVectorCollection \
+  -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \
+  -generator ParquetDenseVectorDocumentGenerator \
+  -index indexes/lucene-flat.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 &
+```
+
+The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+bin/run.sh io.anserini.search.SearchFlatDenseVectors \
+  -index indexes/lucene-flat.beir-v1.0.0-arguana.bge-base-en-v1.5/ \
+  -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \
+  -topicReader TsvString \
+  -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-arguana.test.txt \
+  -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-arguana.test.txt
+bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-arguana.test.txt
+bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-arguana.test.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@10** | **BGE-base-en-v1.5**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): ArguAna | 0.6361 |
+| **R@100** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9915 |
+| **R@1000** | **BGE-base-en-v1.5**|
+| BEIR (v1.0.0): ArguAna | 0.9964 |
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..824afae17 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): ArguAna | 0.636 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.996 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..17e376afa --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-arguana.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-arguana.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-arguana.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-arguana.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): ArguAna | 0.636 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.996 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..5ca0dd68f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-arguana.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): ArguAna | 0.636 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.996 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..cf377630b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-arguana.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-arguana.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-arguana.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-arguana.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-arguana.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-arguana.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-arguana.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): ArguAna | 0.636 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): ArguAna | 0.996 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..454311f9f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.4149 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.6317 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.8059 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..1fa461552 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.4149 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.6317 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.8059 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..bed3d2ab8 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.4149 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.6317 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.8059 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..5fc4d6690 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-bioasq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-bioasq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.4149 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.6317 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.8059 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..43f2014f6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 2000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.415 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.632 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.806 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..bd9e1f3af --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 2000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-bioasq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.415 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.632 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.806 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..70025ec42 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 2000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-bioasq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.415 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.632 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.806 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..76821c481 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-bioasq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-bioasq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-bioasq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-bioasq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 2000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-bioasq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-bioasq.test.txt runs/run.beir-v1.0.0-bioasq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-bioasq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): BioASQ | 0.415 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.632 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): BioASQ | 0.806 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..98d88ab30 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.3119 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.6362 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.8307 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..a6bcb929e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.3119 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.6362 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.8307 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..53fcece4c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.3119 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.6362 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.8307 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
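Because this configuration (flat, non-quantized, cached queries) is expected to reproduce exactly, the comparison against the table above can be scripted. The following is a minimal sketch and not part of Anserini: it assumes `trec_eval` prints whitespace-separated `measure all value` rows, with `-m ndcg_cut.10` reported under the name `ndcg_cut_10`, and the names `EXPECTED`, `parse_trec_eval`, and `mismatches` are hypothetical. The sample string is illustrative and built from the reference scores, not captured from a real run.

```python
# Hypothetical helper: not part of Anserini or trec_eval.
# Reference scores copied from the table above, kept as strings so the
# comparison is exact and does not depend on floating-point equality.
EXPECTED = {"ndcg_cut_10": "0.3119", "recall_100": "0.6362", "recall_1000": "0.8307"}


def parse_trec_eval(output: str) -> dict:
    """Parse 'measure  all  value' rows into {measure: value-as-string}."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only aggregate rows; skip blank or malformed lines.
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = fields[2]
    return scores


def mismatches(output: str, expected: dict = EXPECTED) -> dict:
    """Return {measure: (expected, observed)} for every score that differs."""
    observed = parse_trec_eval(output)
    return {
        measure: (value, observed.get(measure))
        for measure, value in expected.items()
        if observed.get(measure) != value
    }


# Illustrative output in the assumed format; in practice, capture the
# combined output of the three bin/trec_eval commands instead.
sample = "ndcg_cut_10 \tall\t0.3119\nrecall_100 \tall\t0.6362\nrecall_1000 \tall\t0.8307\n"
print(mismatches(sample))
```

An empty dictionary means every score matches the table; a missing measure shows up with `None` as its observed value.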
diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..0b5cf7d90 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-climate-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-climate-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.3119 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.6362 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.8307 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
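The tolerance statement above can be turned into a small automated check. This is a minimal sketch under stated assumptions, not part of Anserini: the reference scores and the 0.001 bound are taken from this page, while the helper name `out_of_tolerance` and the observed scores are hypothetical and made up for illustration.

```python
# Hypothetical helper: not part of Anserini.
# Reference scores (flat index, cached queries) copied from the table above.
REFERENCE = {"nDCG@10": 0.3119, "R@100": 0.6362, "R@1000": 0.8307}
TOLERANCE = 0.001  # bound stated above for ONNX query encoding on flat indexes


def out_of_tolerance(observed, reference=REFERENCE, tolerance=TOLERANCE):
    """Return {metric: difference} for scores further than `tolerance` from the reference."""
    eps = 1e-9  # slack so a score exactly at the bound is not flagged by rounding error
    return {
        name: round(observed[name] - ref, 4)
        for name, ref in reference.items()
        if abs(observed[name] - ref) > tolerance + eps
    }


# Made-up observed scores, for illustration only.
observed = {"nDCG@10": 0.3117, "R@100": 0.6362, "R@1000": 0.8290}
print(out_of_tolerance(observed))
```

Since the page allows for "some outliers", a flagged metric is a prompt to look closer rather than proof of a regression.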
diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..bdc0e8fe5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.636 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.831 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..77ca2640a --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-climate-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.636 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.831 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..3bbd75682 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-climate-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.636 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.831 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..3bec20c39 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-climate-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-climate-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-climate-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-climate-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-climate-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-climate-fever.test.txt runs/run.beir-v1.0.0-climate-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-climate-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Climate-FEVER | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.636 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Climate-FEVER | 0.831 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..bd949ad58 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.5075 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.8454 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.9611 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..8b836541e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.5075 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.8454 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.9611 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..6ea3f7c71 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.5075 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.8454 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.9611 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be 
reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..456e74f03 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.5075 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.8454 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.9611 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..d32c98c88 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.507 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.845 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.961 | + +The above figures are from running brute-force search with cached queries on non-quantized 
**flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..e93a5db47 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.507 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.845 | +| 
**R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.961 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..1610a8f3e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-android.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.507 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.845 | +| **R@1000** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.961 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..320c8c9bc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-android.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-android.test.txt runs/run.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-android.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-android | 0.507 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.845 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-android | 0.961 | + +The above figures are from running 
brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..f1dc65c6e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.4857 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.7587 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.8839 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..e6a484499 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.4857 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.7587 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.8839 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..c5f662e0a --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.4857 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.7587 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.8839 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be 
reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..dff71cac5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.4857 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.7587 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.8839 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..e5992e48e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.486 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.759 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.884 | + +The above figures are from running brute-force search with cached queries on non-quantized 
**flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..0ab069fa3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.486 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.759 | +| 
**R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.884 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..3c3c2a15f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set.
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-english.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.486 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.759 | +| **R@1000** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.884 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..e35847ce2 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set.
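
If indexing fails partway or the resulting index looks incomplete, one thing worth ruling out is a corrupted or truncated download. The following is a minimal sketch in Python for checking the tarball against the MD5 checksum quoted above. It assumes the tarball was saved under `collections/` as in the `wget` command; the function name `md5_of_file` and the chunk size are illustrative choices and not part of Anserini.

```python
import hashlib
from pathlib import Path

# Checksum and location as quoted in the download instructions above.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"
TARBALL = Path("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if not TARBALL.exists():
        print(f"{TARBALL} not found; adjust TARBALL to where the file was saved")
    else:
        actual = md5_of_file(TARBALL)
        print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: got {actual}")
```

Reading in chunks matters here because the tarball is 194 GB and would not fit in memory; hashing a file of that size will still take a while.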
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-english.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-english.test.txt runs/run.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-english.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-english | 0.486 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.759 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-english | 0.884 | + +The above figures are from running 
brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..bd09f15e5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.5965 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9036 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9719 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..87e588263 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.5965 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9036 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9719 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..28b583fa9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.5965 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9036 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9719 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
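The effectiveness tables in these pages pair reference scores with a tolerance (exact for cached queries on non-quantized flat indexes, and for example within 0.004 for quantized flat indexes). A minimal sketch of that comparison follows. It is not part of Anserini: the helper name `check_scores`, the dictionary layout, and the metric keys are hypothetical labels chosen for illustration, and the reference values are the CQADupStack-gaming figures reported in the table above.

```python
# Hypothetical helper (not part of Anserini): compare observed scores
# against the reference scores from an effectiveness table.

# Reference values copied from the CQADupStack-gaming flat-index table.
# The metric keys are illustrative labels, chosen for this sketch.
REFERENCE = {
    "ndcg_cut_10": 0.5965,
    "recall_100": 0.9036,
    "recall_1000": 0.9719,
}


def check_scores(observed, reference=REFERENCE, tolerance=0.0):
    """Return {metric: bool}; True when |observed - reference| <= tolerance.

    A tolerance of 0.0 corresponds to exact reproduction at the precision
    shown in the table; a tiny epsilon absorbs floating-point rounding.
    """
    eps = 1e-9
    return {
        metric: abs(observed[metric] - ref) <= tolerance + eps
        for metric, ref in reference.items()
    }


if __name__ == "__main__":
    # Made-up observed scores, for illustration only.
    observed = {"ndcg_cut_10": 0.5950, "recall_100": 0.9036, "recall_1000": 0.9700}
    print(check_scores(observed, tolerance=0.004))
```

Passing `tolerance=0.0` models the exact-reproduction case, while a page that states a looser bound would pass that bound instead. The observed scores still have to be read from the `trec_eval` output by hand or by a separate parser.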
diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..fb0eee551 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.5965 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9036 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.9719 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..4efb81b63 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.596 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.904 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.972 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..dcd2f1096 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.596 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.904 | +| **R@1000** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.972 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..d59a3ec30 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
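Before indexing, it may be worth confirming that the downloaded tarball matches the MD5 checksum stated above. The following is a minimal sketch, not part of Anserini: the helper name `md5_of_file` and the chunk size are illustrative assumptions, while the tarball path and the expected digest are taken from the text above. The file is read in chunks because the tarball is far too large to load into memory.

```python
import hashlib
from pathlib import Path

# Expected digest and tarball location, as stated in the documentation above.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"
TARBALL = Path("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Hypothetical helper for illustration; chunk_size (1 MiB) is an arbitrary choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_is_intact(path=TARBALL, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected
```

Calling `tarball_is_intact()` from the repository root should return `True` for an uncorrupted download; expect it to take a while on a 194 GB file.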
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gaming.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.596 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.904 | +| **R@1000** | **BGE-base-en-v1.5**| 
+| BEIR (v1.0.0): CQADupStack-gaming | 0.972 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..d00c8659c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gaming.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gaming.test.txt runs/run.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gaming.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gaming | 0.596 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.904 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gaming | 0.972 | + +The above figures are from running brute-force search 
with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..4f757e4f4 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.4127 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.7682 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.9117 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..51a46011b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.4127 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.7682 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.9117 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..09d0743f2 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.4127 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.7682 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.9117 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
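The comparison against the reference scores can also be scripted. The sketch below is illustrative only and is not how Anserini's `run_regression.py` performs its `--verify` step. It assumes that `trec_eval` prints one `metric all value` line per measure and reports the measures as `ndcg_cut_10`, `recall_100`, and `recall_1000`; the function names are hypothetical. The reference values are copied from the table above. A tolerance of `0.0` corresponds to the exact reproducibility expected here, and a non-zero tolerance (e.g., the 0.004 quoted for the quantized flat indexes) can be passed for the other conditions.

```python
# Reference scores copied from the table above (CQADupStack-gis, BGE-base-en-v1.5).
EXPECTED = {"ndcg_cut_10": 0.4127, "recall_100": 0.7682, "recall_1000": 0.9117}


def parse_trec_eval(text):
    """Parse `metric  all  value` lines (assumed trec_eval output format) into a dict."""
    scores = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            scores[parts[0]] = float(parts[2])
    return scores


def out_of_tolerance(observed, expected=EXPECTED, tol=0.0):
    """Return {metric: (observed, expected)} for scores missing or off by more than `tol`.

    Differences are rounded to four decimals, matching the precision of the table.
    """
    failures = {}
    for metric, ref in expected.items():
        got = observed.get(metric)
        if got is None or round(abs(got - ref), 4) > tol:
            failures[metric] = (got, ref)
    return failures
```

An empty dictionary from `out_of_tolerance(parse_trec_eval(output))` means every score matches the table at four decimal places.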
diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..68fa1ffbe --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.4127 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.7682 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.9117 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..c112d4667 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.768 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.912 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..8bab1db22 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
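
The indexing command assumes the corpus tarball arrived intact. Before indexing, the download can be checked against the MD5 checksum stated on this page. The following is a minimal sketch, not part of Anserini: the helper names `md5_of_file` and `verify_tarball` are hypothetical, and the file is read in chunks because the tarball is 194 GB.

```python
import hashlib

# MD5 checksum stated on this page for the Parquet tarball.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Assuming the `wget` command on this page was used unchanged, a call such as `verify_tarball("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")` should return `True` for an intact download.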
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.768 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.912 | + +The above figures are from running brute-force search with cached queries on 
non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..1a8ea33a7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-gis.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.768 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.912 | + 
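
Comparing a run against the table above can be scripted. The following is a minimal sketch under stated assumptions: `trec_eval` is assumed to print one whitespace-separated `metric all value` line per measure (its usual summary format), the reference scores are copied from the table above, and the 0.003 tolerance is the one this page quotes for non-quantized HNSW indexes. The names `parse_trec_eval` and `out_of_tolerance` are hypothetical and not part of Anserini's regression framework.

```python
# Reference scores copied from the table above (CQADupStack-gis, HNSW, cached queries).
REFERENCE = {"ndcg_cut_10": 0.413, "recall_100": 0.768, "recall_1000": 0.912}

# Tolerance quoted on this page for non-quantized HNSW indexes.
TOLERANCE = 0.003


def parse_trec_eval(output):
    """Parse 'metric all value' summary lines into a {metric: float} dict."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[1] != "all":
            continue
        try:
            scores[fields[0]] = float(fields[2])
        except ValueError:
            # Skip non-numeric summary lines such as 'runid all <tag>'.
            continue
    return scores


def out_of_tolerance(scores, reference=REFERENCE, tolerance=TOLERANCE):
    """Return {metric: observed} for metrics that are missing or too far from the reference."""
    failures = {}
    for metric, expected in reference.items():
        observed = scores.get(metric)
        # A tiny epsilon keeps floating-point noise from failing a borderline score.
        if observed is None or abs(observed - expected) > tolerance + 1e-9:
            failures[metric] = observed
    return failures
```

An empty result from `out_of_tolerance(parse_trec_eval(text))` means every listed metric is within the tolerance; since the page allows for "some outliers", a non-empty result is a prompt to look closer, not proof of a regression.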
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..47e2b252d --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-gis.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-gis.test.txt runs/run.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-gis.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-gis | 0.413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.768 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-gis | 0.912 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** 
indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..e0e757a23 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.3163 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.6922 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.8810 | + +The above figures are from running 
brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..437e98aee --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
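
With ONNX query encoding, queries are read as raw text and encoded at search time, so the topics file can be inspected directly. The following is a minimal sketch, assuming the topics file is a gzip-compressed TSV with one query per line in the form `id<TAB>text`; that layout is an assumption, not something this page specifies, and the helper `read_tsv_topics` is hypothetical.

```python
import gzip


def read_tsv_topics(path):
    """Read a gzip-compressed TSV of 'id<TAB>text' lines into a {id: text} dict."""
    topics = {}
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # Ignore blank lines.
            # Split on the first tab only, so tabs inside the query text survive.
            qid, _, text = line.partition("\t")
            topics[qid] = text
    return topics
```

For example, `len(read_tsv_topics("tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.tsv.gz"))` would give the number of test queries, provided the submodule has been checked out.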
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.3163 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.6922 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.8810 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..1a4fce0bc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.3163 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.6922 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.8810 | + +Note that since we're running brute-force search with cached 
queries on non-quantized flat indexes, the results should be reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..1acddcc32 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.3163 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.6922 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.8810 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..f7b1978bd --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.316 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.692 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.881 | + +The above figures are from 
running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..203d35b6c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.316 | +| **R@100** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.692 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.881 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..7c88f8c80 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-mathematica.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| 
BEIR (v1.0.0): CQADupStack-mathematica | 0.316 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.692 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.881 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..6ae1bd115 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. 
+ +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/` should point to the corpus downloaded above. 
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-mathematica.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-mathematica.test.txt runs/run.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-mathematica.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| 
+|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.316 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.692 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-mathematica | 0.881 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..8c94e8e64 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). 
+ +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.4722 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.8081 | +| 
**R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.9406 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..8b5f09263 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.4722 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.8081 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.9406 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..db5b489cb --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.4722 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.8081 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.9406 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be 
reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..df8cbbdd7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.4722 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.8081 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.9406 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..dbd13863f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.472 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.808 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.941 | + +The above figures are from running brute-force search with cached queries on non-quantized 
**flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..1f54bf8a0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.472 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.808 | +| 
**R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.941 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..d633d983a --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-physics.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.472 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.808 | +| **R@1000** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.941 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..645222db9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-physics.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-physics.test.txt runs/run.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-physics.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-physics | 0.472 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.808 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-physics | 0.941 | + +The above figures are from running 
brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..e443f2600 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.4242 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.7856 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.9348 | + +The above figures are from running 
brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..04175a25e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.4242 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.7856 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.9348 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..7fb119976 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.4242 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.7856 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.9348 | + +Note that since we're running brute-force search with cached 
queries on non-quantized flat indexes, the results should be reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..f1bfb146a --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.4242 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.7856 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.9348 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..273832b4c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.424 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.786 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.935 | + +The above figures are from 
running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..c59ff7179 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.424 | +| **R@100** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.786 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.935 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..c7001d0f5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-programmers.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| 
BEIR (v1.0.0): CQADupStack-programmers | 0.424 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.786 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.935 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..ae0b42d7b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. 
+ +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-programmers.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-programmers.test.txt runs/run.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-programmers.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| 
+|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-programmers | 0.424 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.786 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-programmers | 0.935 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..cdfe16a92 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). 
+ +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.3732 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.6727 | +| **R@1000** | **BGE-base-en-v1.5**| +| 
BEIR (v1.0.0): CQADupStack-stats | 0.8445 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..12da24869 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.3732 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.6727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.8445 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..de927e886 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.3732 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.6727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.8445 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
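The three `trec_eval` calls above each print one summary line per metric, so the comparison against the table can be scripted. What follows is a minimal sketch, not part of Anserini: the helper names `parse_trec_eval` and `check_scores` and the `tolerance` parameter are hypothetical, and it assumes the usual three-column `trec_eval` summary output (`measure`, `all`, `value`). A tolerance of 0 corresponds to the exact-reproducibility claim for flat indexes with cached queries; the quantized, HNSW, and ONNX pages state their own tolerances (e.g., 0.004).

```python
# Sketch only: parse_trec_eval and check_scores are illustrative helpers,
# not part of Anserini or trec_eval.

# Reference values copied from the table above (flat index, cached queries).
EXPECTED = {
    "ndcg_cut_10": 0.3732,
    "recall_100": 0.6727,
    "recall_1000": 0.8445,
}


def parse_trec_eval(output):
    """Parse trec_eval summary lines of the assumed form '<measure> all <value>'."""
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            try:
                scores[parts[0]] = float(parts[2])
            except ValueError:
                # Skip non-numeric summary fields.
                continue
    return scores


def check_scores(scores, expected=EXPECTED, tolerance=0.0):
    """Return the measures that are missing or differ by more than tolerance."""
    failures = []
    for measure, reference in expected.items():
        observed = scores.get(measure)
        # The small epsilon absorbs floating-point rounding in the subtraction.
        if observed is None or abs(observed - reference) > tolerance + 1e-9:
            failures.append(measure)
    return failures


if __name__ == "__main__":
    # Stand-in for the captured output of the three bin/trec_eval commands.
    sample = (
        "ndcg_cut_10\tall\t0.3732\n"
        "recall_100\tall\t0.6727\n"
        "recall_1000\tall\t0.8445\n"
    )
    print(check_scores(parse_trec_eval(sample)))
```

In practice the `sample` string would be replaced by the captured output of the `bin/trec_eval` commands, for instance via `subprocess.run(..., capture_output=True, text=True).stdout`. An empty list from `check_scores` means every metric is within the chosen tolerance.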
diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..08a7ab0b6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.3732 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.6727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.8445 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..ac204f47b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.673 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.845 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..016db3302 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.673 | +| **R@1000** | 
**BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.845 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..c9b61e8d2 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-stats.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.673 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR 
(v1.0.0): CQADupStack-stats | 0.845 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..c951c35a9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-stats.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-stats.test.txt runs/run.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-stats.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-stats | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.673 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-stats | 0.845 | + +The above figures are from running brute-force search with cached 
queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..331200031 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.3115 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.6486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.8537 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..1d8026bbe --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.3115 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.6486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.8537 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..da635c8bb --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.3115 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.6486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.8537 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
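
The exact-match expectation stated above can be checked mechanically rather than by eye. What follows is a minimal sketch, not part of Anserini's regression framework: the helper names `parse_trec_eval` and `check_scores` are hypothetical, and the sketch assumes that `trec_eval` prints one whitespace-separated `<measure> <query-id> <value>` line per metric, with `all` as the query id for the aggregate score and measure names such as `ndcg_cut_10`. The sample output string is built from the table above for illustration; it is not a captured log.

```python
def parse_trec_eval(output):
    """Collect aggregate scores from trec_eval-style output.

    Assumption: each line looks like '<measure> <query-id> <value>' and the
    aggregate row uses 'all' as the query id.
    """
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            scores[parts[0]] = float(parts[2])
    return scores


def check_scores(observed, expected, tolerance=0.0):
    """Return the measures that are missing or deviate by more than `tolerance`."""
    failures = []
    for measure, target in expected.items():
        value = observed.get(measure)
        # A tiny epsilon absorbs floating-point rounding in the comparison.
        if value is None or abs(value - target) > tolerance + 1e-9:
            failures.append(measure)
    return failures


# Expected scores copied from the table above (CQADupStack-tex, flat, cached).
# The measure names are an assumption about how trec_eval labels them.
EXPECTED = {"ndcg_cut_10": 0.3115, "recall_100": 0.6486, "recall_1000": 0.8537}

# Illustrative output only, constructed from the table rather than a real run.
sample_output = (
    "ndcg_cut_10\tall\t0.3115\n"
    "recall_100\tall\t0.6486\n"
    "recall_1000\tall\t0.8537\n"
)

print(check_scores(parse_trec_eval(sample_output), EXPECTED))
```

For the flat index with cached queries, the default `tolerance=0.0` matches the claim that results reproduce exactly. For the quantized or HNSW variants, passing the tolerance quoted on the corresponding page (for example `tolerance=0.004`) would be the natural adjustment, bearing in mind that those pages also allow for outliers.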
diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..2bddba384 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.3115 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.6486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.8537 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..542737177 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.649 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.854 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..bf5a9e373 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.649 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.854 | + +The above figures are from running brute-force search with cached queries on 
non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..0cd1fe7f7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-tex.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.649 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.854 | + 
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..886dd28c4 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-tex.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-tex.test.txt runs/run.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-tex.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-tex | 0.312 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.649 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-tex | 0.854 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** 
indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..584d271b3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.4219 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.7797 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.9237 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..7651a8bd0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.4219 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.7797 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.9237 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..6f425ee20 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
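Because the topics and qrels come from a submodule, the `tools/topics-and-qrels` path used by the retrieval and evaluation commands will not be populated in a checkout that was cloned without submodules. The following is a minimal sketch of a check, assuming it is run from the root of the Anserini checkout; `dir_is_populated` is a hypothetical helper name used only for illustration and is not part of Anserini.

```bash
# Hypothetical helper: succeeds when the given directory exists and is non-empty.
dir_is_populated() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Assumption: the current directory is the root of the Anserini checkout.
if dir_is_populated tools/topics-and-qrels; then
  echo "topics and qrels are present"
else
  echo "topics and qrels are missing; try: git submodule update --init --recursive"
fi
```

The sketch only reports the state of the directory; it leaves running the `git submodule` command to the reader.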
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.4219 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.7797 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.9237 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
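Because these results are expected to match exactly, a scripted comparison can stand in for checking the numbers by eye. The following is a minimal sketch, not part of Anserini: `score_matches` is a hypothetical helper, and it assumes that `trec_eval` prints one line per metric with the score in the third whitespace-separated column.

```bash
# Hypothetical helper: reads trec_eval output on stdin and succeeds when every
# score in the third column equals the expected value (compared numerically).
score_matches() {
  awk -v expected="$1" '
    NF >= 3 { found = 1; if ($3 + 0 != expected + 0) bad = 1 }
    END { exit !(found && !bad) }
  '
}

# Example using a literal line in the assumed trec_eval output format.
printf 'ndcg_cut_10\tall\t0.4219\n' | score_matches 0.4219 && echo "nDCG@10 matches"
```

In practice the input would come from piping one of the `bin/trec_eval` commands above into `score_matches`, with the expected value taken from the table.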
diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..70f1f11d3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
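Separately, if results do not reproduce, one thing worth ruling out is a corrupted corpus download, since this page lists an MD5 checksum for the tarball. The following is a minimal sketch, assuming GNU coreutils `md5sum` is available and that the tarball was saved under `collections/` as in the `wget` command; `md5_matches` is a hypothetical helper and not part of Anserini.

```bash
# Hypothetical helper: succeeds when the file's MD5 digest equals the expected value.
md5_matches() {
  actual=$(md5sum "$1" | awk '{ print $1 }')
  [ "$actual" = "$2" ]
}

# Assumption: the tarball is still present at the location used by the wget command.
tarball=collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar
if [ -f "$tarball" ]; then
  if md5_matches "$tarball" c279f9fc2464574b482ec53efcc1c487; then
    echo "checksum OK"
  else
    echo "checksum mismatch"
  fi
fi
```

Hashing a 194 GB file takes a while, so this is a check to run once after the download rather than before every regression.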
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.4219 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.7797 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.9237 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..9ef1745a8 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.422 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.780 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.924 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..43d5198ff --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.422 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.780 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.924 | + +The above figures are from running brute-force search with 
cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..41e169ce9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-unix.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.422 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.780 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): 
CQADupStack-unix | 0.924 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..8d15a42f1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-unix.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-unix.test.txt runs/run.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-unix.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-unix | 0.422 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.780 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-unix | 0.924 | + +The above figures are from running brute-force search with cached queries on 
non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..df6867e13 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.7774 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.9380 | + +The above figures are from running brute-force search 
with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..aa802d755 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.7774 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.9380 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..4228a6658 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.7774 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.9380 | + +Note that since we're running brute-force search with cached queries on 
non-quantized flat indexes, the results should be reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..cd5bd0627 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.7774 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.9380 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..135338142 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
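Since the topics and qrels live in a submodule, retrieval and evaluation will fail if that submodule was never checked out. The following is a minimal sketch, not part of Anserini, that checks for the two files this page relies on; the helper name `missing_files` is hypothetical, and the file names are taken from the commands on this page.

```python
from pathlib import Path

# Directory and file names as they appear in the commands on this page.
TOPICS_DIR = Path("tools/topics-and-qrels")
REQUIRED = [
    "topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.gz",
    "qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt",
]


def missing_files(directory, names):
    """Return the names that are not regular files inside `directory`."""
    directory = Path(directory)
    return [name for name in names if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(TOPICS_DIR, REQUIRED)
    if missing:
        print("Missing from", TOPICS_DIR)
        for name in missing:
            print("  ", name)
        # Assumption: the submodule has not been initialized in this checkout.
        print("Try: git submodule update --init --recursive")
    else:
        print("Topics and qrels found.")
```

If anything is reported missing, initializing the submodule is the likely fix, although that is an assumption about how the repository was cloned.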
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.777 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.938 | + +The above figures are from running 
brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..c642228d4 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR 
(v1.0.0): CQADupStack-webmasters | 0.777 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.938 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..9ab22eeff --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set.
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-webmasters.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): 
CQADupStack-webmasters | 0.777 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.938 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..873f875a0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set.
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-webmasters.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-webmasters.test.txt runs/run.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-webmasters.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 0.777 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-webmasters | 
0.938 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..a68212b12 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
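If `tools/topics-and-qrels/` turns out to be empty, the submodule has probably not been initialized. The following is a minimal sketch, assuming a standard git clone of the Anserini repo with the submodule mounted under `tools/` (as the paths in the commands on this page suggest); the `require_nonempty_dir` helper is hypothetical and only illustrates the check.

```bash
# Hypothetical helper (not part of Anserini): succeed only if the given
# directory exists and contains at least one entry.
require_nonempty_dir() {
  local dir="$1"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]
}

# From the root of the Anserini checkout, fetch the submodule contents if missing:
# if ! require_nonempty_dir tools/topics-and-qrels; then
#   git submodule update --init --recursive
# fi
```

Running the check before retrieval gives a clearer failure than a missing-topics error from the search command.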
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.3547 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.7065 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.8861 | + +The above figures are from running brute-force search with cached 
queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..f30260996 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.3547 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.7065 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.8861 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..8a8936d36 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.3547 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.7065 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.8861 | + +Note that since we're running brute-force search with cached queries on non-quantized flat 
indexes, the results should be reproducible _exactly_. diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..4a06d7282 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.3547 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.7065 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.8861 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..20da877a5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.355 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.706 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.886 | + +The above figures are from running brute-force search with 
cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..2001a1fc1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. 
+This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.355 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): 
CQADupStack-wordpress | 0.706 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.886 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..ab31974e9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-cqadupstack-wordpress.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.355 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 
0.706 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.886 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..5ca1b18a1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-cqadupstack-wordpress.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-cqadupstack-wordpress.test.txt runs/run.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-cqadupstack-wordpress.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.355 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.706 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): CQADupStack-wordpress | 0.886 | + +The 
above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..9bd270590 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.4074 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.5303 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.7833 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..7fae58cc0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.4074 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.5303 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.7833 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..5d1b66af6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.4074 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.5303 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.7833 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
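The exact-match claim above can be checked programmatically by parsing the `trec_eval` output and comparing it against the table. The following is a minimal sketch and not part of Anserini: it assumes `trec_eval` prints one whitespace-separated `<metric> all <value>` line per measure (e.g., `ndcg_cut_10 all 0.4074`), and the names `REFERENCE`, `parse_trec_eval`, and `mismatches` are hypothetical helpers introduced only for illustration.

```python
# Reference scores copied from the table above (flat index, cached queries).
# The metric keys assume trec_eval's output naming (an assumption, see lead-in).
REFERENCE = {"ndcg_cut_10": 0.4074, "recall_100": 0.5303, "recall_1000": 0.7833}


def parse_trec_eval(output: str) -> dict[str, float]:
    """Parse trec_eval summary lines into {metric: score}.

    Assumes each summary line looks like "<metric> all <value>";
    lines that do not match this shape are skipped.
    """
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "all":
            try:
                scores[fields[0]] = float(fields[2])
            except ValueError:
                continue  # non-numeric summary value; skip it
    return scores


def mismatches(scores: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return {metric: (observed, expected)} for every metric that differs."""
    return {
        metric: (scores.get(metric, float("nan")), expected)
        for metric, expected in REFERENCE.items()
        if scores.get(metric) != expected
    }


if __name__ == "__main__":
    # Stand-in for the concatenated output of the three trec_eval commands.
    sample = (
        "ndcg_cut_10\tall\t0.4074\n"
        "recall_100\tall\t0.5303\n"
        "recall_1000\tall\t0.7833\n"
    )
    print(mismatches(parse_trec_eval(sample)))
```

An empty result means every metric matched the table. Exact comparison is only appropriate for this configuration (cached queries on non-quantized flat indexes); the other configurations call for a tolerance instead.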
diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..73659ad2d --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.4074 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.5303 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.7833 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..ed4dfe935 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.530 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.783 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..2bacd5175 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.530 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.783 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..eb3640660 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-dbpedia-entity.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.530 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.783 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..2b11b6bed --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-dbpedia-entity.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-dbpedia-entity.test.txt runs/run.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-dbpedia-entity.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): DBPedia | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.530 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): DBPedia | 0.783 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..c43fdd3c3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.8630 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9719 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9855 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..a7e2799f3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.8630 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9719 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9855 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..ae338d156 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.8630 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9719 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9855 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..8891f41d3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.8630 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9719 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.9855 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..cde6a1bde --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.863 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.972 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.985 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..000103759 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.863 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.972 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.985 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..8f185d8ff --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fever.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.863 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.972 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.985 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..5cf0a02ab --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fever.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-fever.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fever.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-fever.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fever.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fever.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fever.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fever.test.txt runs/run.beir-v1.0.0-fever.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fever.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FEVER | 0.863 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.972 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FEVER | 0.985 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..23028ab11 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.7415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.9083 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..0f2fb5096 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.7415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.9083 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..667c5d2cc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.7415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.9083 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..343761882 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fiqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-fiqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.4065 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.7415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.9083 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
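The tolerance statement above can be checked mechanically. The following is a minimal sketch, not part of Anserini's regression framework: it assumes `trec_eval` prints one whitespace-separated summary line per metric in the form `metric  all  value`, and the helper names `parse_trec_eval_line` and `within_tolerance` are hypothetical. The expected scores and the 0.001 tolerance are copied from the text above.

```python
def parse_trec_eval_line(line):
    """Parse one summary line, assumed to look like 'ndcg_cut_10<TAB>all<TAB>0.4065'."""
    metric, _query_id, value = line.split()
    return metric, float(value)


def within_tolerance(observed, expected, tolerance):
    """Return True if the observed score is within `tolerance` of the expected score."""
    # A small epsilon guards against floating-point rounding at the boundary.
    return abs(observed - expected) <= tolerance + 1e-9


# Expected scores copied from the table above (FiQA-2018).
EXPECTED = {"ndcg_cut_10": 0.4065, "recall_100": 0.7415, "recall_1000": 0.9083}
TOLERANCE = 0.001  # tolerance stated above for ONNX encoding on flat indexes

# Hypothetical trec_eval output line, used only to illustrate the check.
metric, score = parse_trec_eval_line("ndcg_cut_10\tall\t0.4060")
print(metric, within_tolerance(score, EXPECTED[metric], TOLERANCE))
```

For the other index types, the same check would apply with the tolerance stated on the corresponding page (for example, 0.004 for quantized flat indexes).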
diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..03a8fbde1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.742 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.908 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..03037cefc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-fiqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.742 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.908 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..6a184b35e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-fiqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.742 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.908 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..16c3b06a3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-fiqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-fiqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fiqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fiqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-fiqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): FiQA-2018 | 0.407 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.742 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): FiQA-2018 | 0.908 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). 
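Because HNSW results vary slightly between trials, it can help to collect the `trec_eval` scores programmatically instead of reading them by eye. The following is a minimal sketch, assuming the usual three-column summary output (metric name, `all`, value); the function name `parse_trec_eval` and the `sample` text are illustrative, with values copied from the table above rather than from an actual run.

```python
def parse_trec_eval(output):
    """Map each metric name in trec_eval summary output to its float score."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Summary lines have three whitespace-separated fields: metric, "all", value.
        if len(fields) == 3 and fields[1] == "all":
            try:
                scores[fields[0]] = float(fields[2])
            except ValueError:
                continue  # skip non-numeric entries such as a run identifier
    return scores


# Illustrative text in the assumed output shape; values copied from the table above.
sample = (
    "ndcg_cut_10           \tall\t0.407\n"
    "recall_100            \tall\t0.742\n"
    "recall_1000           \tall\t0.908\n"
)
print(parse_trec_eval(sample))
```

Scores collected this way from several trials can then be compared against the 0.003 tolerance stated above.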
diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..28fd26293 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.7259 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.8727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.9424 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..0b2b32c9c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.7259 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.8727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.9424 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..b3bb926bc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.7259 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.8727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.9424 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
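
Before indexing, it may be worth confirming that the 194 GB tarball downloaded intact by comparing it against the MD5 checksum quoted in the download step on this page. The following is a minimal sketch, not part of Anserini: the function names `md5_of_file` and `verify_tarball` are hypothetical, and the file is read in chunks only so that the whole tarball never has to fit in memory.

```python
import hashlib

# Checksum quoted on this page for beir-v1.0.0-bge-base-en-v1.5.parquet.tar.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected` (hypothetical helper)."""
    return md5_of_file(path) == expected
```

Assuming the `wget` command was run as shown, `verify_tarball("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")` should return `True` for an intact download; a `False` result suggests the download should be repeated before unpacking.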
diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..47b226c08 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-hotpotqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.7259 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.8727 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.9424 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
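
The tolerance described above can be checked mechanically rather than by eye. The sketch below is an illustration under stated assumptions, not part of Anserini's regression framework: it assumes `trec_eval` prints one `metric  all  value` row per measure with the metric names `ndcg_cut_10`, `recall_100`, and `recall_1000`, and the helper names `parse_trec_eval` and `within_tolerance` are hypothetical.

```python
# Reference scores copied from the table on this page
# (cached queries on non-quantized flat indexes).
REFERENCE = {"ndcg_cut_10": 0.7259, "recall_100": 0.8727, "recall_1000": 0.9424}


def parse_trec_eval(text):
    """Parse rows of the assumed form 'metric  all  value' into {metric: float}."""
    scores = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only summary rows (query id 'all'); skip anything else.
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = float(fields[2])
    return scores


def within_tolerance(observed, reference=REFERENCE, tolerance=0.001):
    """Return {metric: bool}, True where |observed - reference| <= tolerance."""
    epsilon = 1e-9  # small slack so floating-point noise does not flip a borderline case
    return {
        metric: abs(observed[metric] - expected) <= tolerance + epsilon
        for metric, expected in reference.items()
    }
```

The default `tolerance=0.001` matches the bound stated for ONNX query encoding on non-quantized flat indexes; since the text allows for some outliers, a `False` entry is a prompt to look more closely rather than proof of a regression.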
diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..7e6fcc574 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.726 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.873 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.942 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..a3810e8e4 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.726 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.873 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.942 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..e201a4ef1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-hotpotqa.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.726 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.873 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.942 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..3b180e5f6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-hotpotqa.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-hotpotqa.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-hotpotqa.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-hotpotqa.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-hotpotqa.test.txt runs/run.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-hotpotqa.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): HotpotQA | 0.726 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.873 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): HotpotQA | 0.942 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..05f043618 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.3735 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.3368 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.6622 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..fee589f20 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.3735 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.3368 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.6622 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..0aa15c372 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.3735 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.3368 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.6622 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
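For readers less familiar with the `recall.100` and `recall.1000` measures reported above, here is a minimal sketch of recall at a cutoff for a single query. It illustrates the usual definition (relevant documents found in the top k, divided by all relevant documents) and is not `trec_eval`'s implementation. The function name and the document ids in the usage note are hypothetical, and returning 0.0 for a query with no relevant documents is a convention of this sketch only.

```python
def recall_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Fraction of the relevant documents that appear in the top-k ranked results."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        # Convention of this sketch: a query with no relevant documents scores 0.0.
        return 0.0
    # Deduplicate the top-k slice so a repeated id is not counted twice.
    top_k = set(ranked_doc_ids[:k])
    return len(top_k & relevant) / len(relevant)
```

For example, with a ranking `["d3", "d7", "d1", "d9"]` and relevant set `{"d1", "d3", "d5"}`, a cutoff of 2 finds one of the three relevant documents, while a cutoff of 3 finds two of them.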
diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..675ed9a93 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nfcorpus.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.3735 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.3368 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.6622 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
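The tolerance statement above ("within 0.001 of the results reported above") can be checked mechanically. The sketch below is an illustration rather than Anserini's own verification logic. The function names are hypothetical, the reference scores are copied from the table above, and the parser assumes `trec_eval` summary lines have three whitespace-separated fields: metric name, the literal `all`, and the value.

```python
# Reference scores copied from the table above (NFCorpus, BGE-base-en-v1.5).
REFERENCE = {"ndcg_cut_10": 0.3735, "recall_100": 0.3368, "recall_1000": 0.6622}


def parse_trec_eval(output):
    """Parse 'metric  all  value' summary lines into a {metric: value} dict."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Assumed layout: metric name, the literal 'all', then the score.
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = float(fields[2])
    return scores


def out_of_tolerance(observed, reference=REFERENCE, tolerance=0.001):
    """Return {metric: (observed, reference)} for missing or out-of-tolerance scores."""
    problems = {}
    for metric, expected in reference.items():
        value = observed.get(metric)
        # A small epsilon avoids flagging differences that equal the tolerance
        # up to floating-point rounding.
        if value is None or abs(value - expected) > tolerance + 1e-9:
            problems[metric] = (value, expected)
    return problems
```

An empty result from `out_of_tolerance(parse_trec_eval(text))` means every reference metric was present and within the stated tolerance. The other configurations on these pages quote different tolerances (for example 0.004 or 0.005), which can be passed as the `tolerance` argument.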
diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..2005d7fb7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.337 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.662 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..9913f9f6f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.337 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.662 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..9afad2c11 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nfcorpus.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.337 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.662 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..0b480fc79 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nfcorpus.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-nfcorpus.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nfcorpus.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nfcorpus.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nfcorpus.test.txt runs/run.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nfcorpus.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NFCorpus | 0.373 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.337 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NFCorpus | 0.662 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..c4dd6e789 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.5413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9859 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..be318186f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-nq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.5413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9859 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
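The tolerance statement above can be checked mechanically. The following is a minimal sketch, not part of Anserini's regression framework: the reference scores and the 0.004 tolerance are copied from this page, while the function name `check_scores` and the observed values in the usage example are hypothetical placeholders to be replaced with the numbers printed by `trec_eval`.

```python
# Reference scores copied from the table above
# (brute-force search, cached queries, non-quantized flat indexes).
REFERENCE = {"nDCG@10": 0.5413, "R@100": 0.9415, "R@1000": 0.9859}

# Tolerance quoted on this page for quantized flat indexes.
TOLERANCE = 0.004


def check_scores(observed, reference=REFERENCE, tolerance=TOLERANCE):
    """Return {metric: (delta, ok)}, where ok means |observed - reference| <= tolerance."""
    report = {}
    for metric, expected in reference.items():
        # Round to four decimals to avoid floating-point noise at the boundary.
        delta = round(observed[metric] - expected, 4)
        report[metric] = (delta, abs(delta) <= tolerance)
    return report


if __name__ == "__main__":
    # Placeholder values for illustration only; they are NOT measured results.
    observed = {"nDCG@10": 0.5400, "R@100": 0.9410, "R@1000": 0.9850}
    for metric, (delta, ok) in check_scores(observed).items():
        status = "within" if ok else "outside"
        print(f"{metric}: delta={delta:+.4f} ({status} tolerance)")
```

Because quantization is non-deterministic, a metric that falls just outside the tolerance on one trial may be one of the outliers mentioned above rather than a regression, so rerunning before drawing conclusions seems reasonable.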
diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..2d7976992 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.5413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9859 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
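Since these results are expected to match exactly, the comparison can be scripted. The sketch below is an illustration under stated assumptions, not an Anserini utility: it assumes each `trec_eval` summary line consists of three whitespace-separated fields (metric name, the query id `all`, and the value) and that the metrics are reported as `ndcg_cut_10`, `recall_100`, and `recall_1000`. The helper names `parse_trec_eval` and `matches_exactly` are hypothetical.

```python
# Expected scores copied from the table above, keyed by the metric names
# that trec_eval is assumed to print for the three commands on this page.
EXPECTED = {"ndcg_cut_10": 0.5413, "recall_100": 0.9415, "recall_1000": 0.9859}


def parse_trec_eval(output):
    """Parse summary lines of the assumed form '<metric> <qid> <value>'."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only aggregate rows: exactly three fields with query id 'all'.
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = float(fields[2])
    return scores


def matches_exactly(scores, expected=EXPECTED):
    """True if every expected metric is present and equal at four decimal places."""
    return all(
        metric in scores and round(scores[metric], 4) == value
        for metric, value in expected.items()
    )
```

One way to use this is to concatenate the output of the three `trec_eval` commands into a single string, pass it to `parse_trec_eval`, and then call `matches_exactly` on the result.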
diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..4f137dbbe --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-nq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.5413 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9415 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.9859 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..3205974e1 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.541 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.942 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.986 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..1434b80c9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-nq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.541 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.942 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.986 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..d88868b6f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-nq.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.541 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.942 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.986 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..e1fe23e02 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-nq.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-nq.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-nq.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-nq.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-nq.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nq.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nq.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-nq.test.txt runs/run.beir-v1.0.0-nq.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-nq.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): NQ | 0.541 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.942 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): NQ | 0.986 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..3902069e2 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.8890 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9998 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..9643ab425 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-quora.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-quora.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.8890 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9998 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..fb9a048f8 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.8890 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9998 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
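The Effectiveness sections of these pages state how far observed scores may drift from the flat-index reference: exact reproduction for flat indexes, and a small tolerance for quantized or HNSW indexes. The sketch below shows one way to compare `trec_eval` output against those figures. It is an illustration under stated assumptions: `check_scores` is a hypothetical helper, the reference values and tolerances are copied only from the NQ and Quora pages in this part of the diff (other corpora may use different values), and the use of an absolute difference is an assumption, since the pages only say results are typically lower.

```python
# Reference scores copied from the Effectiveness tables
# (brute-force search, cached queries, non-quantized flat indexes).
REFERENCE = {
    "nq": {"ndcg_cut_10": 0.541, "recall_100": 0.942, "recall_1000": 0.986},
    "quora": {"ndcg_cut_10": 0.8890, "recall_100": 0.9967, "recall_1000": 0.9998},
}

# Tolerances as quoted on the pages in this part of the diff.
# 0.0 means the page expects exact reproduction.
TOLERANCE = {
    ("nq", "hnsw"): 0.003,
    ("nq", "hnsw-int8"): 0.005,
    ("quora", "flat"): 0.0,
    ("quora", "flat-int8"): 0.004,
}


def check_scores(corpus, index_type, observed, eps=1e-9):
    """Map each metric to True if the observed score is within the quoted tolerance.

    `observed` maps trec_eval metric names (e.g. "ndcg_cut_10") to scores.
    `eps` absorbs floating-point rounding in the subtraction.
    """
    tolerance = TOLERANCE[(corpus, index_type)]
    report = {}
    for metric, reference in REFERENCE[corpus].items():
        difference = abs(observed[metric] - reference)
        report[metric] = difference <= tolerance + eps
    return report


if __name__ == "__main__":
    # Made-up observed scores, for illustration only.
    example = {"ndcg_cut_10": 0.538, "recall_100": 0.942, "recall_1000": 0.980}
    for metric, ok in check_scores("nq", "hnsw-int8", example).items():
        print(metric, "within tolerance" if ok else "outside tolerance")
```

Because the pages explicitly allow "some outliers" and describe HNSW indexing and quantization as non-deterministic, a metric falling outside the tolerance here is a prompt to rerun or inspect, not proof of a regression.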
diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..e9f5e1ca3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-quora.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-quora.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.8890 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.9998 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
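The "within 0.001" guidance above can be turned into a small check. This is a sketch under stated assumptions: the reference values are copied from the table above, the observed scores are supplied by the caller (for example, read off the `trec_eval` output), and the function name and the small epsilon guard are illustrative choices, not part of Anserini's regression framework.

```python
# Reference scores from the table above (cached queries, flat indexes).
REFERENCE = {"nDCG@10": 0.8890, "R@100": 0.9967, "R@1000": 0.9998}

# Tolerance stated on this page for ONNX query encoding on flat indexes.
TOLERANCE = 0.001


def outside_tolerance(observed, reference=REFERENCE, tolerance=TOLERANCE):
    """Return the sorted metric names whose observed score deviates from
    the reference by more than `tolerance`.

    A tiny epsilon absorbs floating-point representation error, so a score
    exactly `tolerance` away is still treated as within bounds.
    """
    eps = 1e-9
    return sorted(
        metric
        for metric, ref in reference.items()
        if abs(observed[metric] - ref) > tolerance + eps
    )
```

An empty list means every metric is within the stated tolerance; since the page allows for "some outliers", a non-empty list is a prompt to look closer, not necessarily a failure.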
diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..d90a75596 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.889 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.997 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 1.000 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..caf07b221 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-quora.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-quora.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.889 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.997 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 1.000 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..1381c83f6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-quora.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.889 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.997 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 1.000 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..4618aafe0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-quora.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-quora.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-quora.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-quora.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-quora.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-quora.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-quora.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-quora.test.txt runs/run.beir-v1.0.0-quora.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-quora.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Quora | 0.889 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 0.997 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Quora | 1.000 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). 
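Since HNSW indexing is non-deterministic, it can help to look at the spread of a metric over several independent index builds instead of a single run. The sketch below is illustrative only: `trial_spread` is a hypothetical helper, not part of Anserini, and the scores for each trial must come from your own runs.

```python
def trial_spread(trial_scores, reference):
    """Summarize repeated measurements of one metric.

    `trial_scores` holds one score per independent index build and search;
    `reference` is the flat-index score from the table above. Returns the
    lowest and highest observed scores and the largest absolute deviation
    from the reference.
    """
    if not trial_scores:
        raise ValueError("at least one trial score is required")
    deviations = [abs(score - reference) for score in trial_scores]
    return {
        "min": min(trial_scores),
        "max": max(trial_scores),
        "max_deviation": max(deviations),
    }
```

Comparing `max_deviation` with the 0.003 figure quoted above gives a rough indication of whether a set of trials behaves as this page describes.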
diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..58e53f761 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.4465 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.3507 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.5981 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..38a163df7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
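The retrieval command for this configuration reads its queries with `-topicReader TsvString`. As a rough way to inspect such a topics file before searching, the sketch below assumes (this page does not confirm it) that each line holds a query id and the query text separated by a single tab. `read_tsv_topics` is a hypothetical helper written for illustration, not part of Anserini.

```python
import gzip
from pathlib import Path


def read_tsv_topics(path):
    """Return {query_id: query_text} from a gzipped, tab-separated topics file.

    Assumes one query per line in the form <id><TAB><text>; this layout is an
    assumption for illustration, not something the regression page specifies.
    """
    topics = {}
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split on the first tab only, so tabs inside the text survive.
            query_id, _, text = line.partition("\t")
            topics[query_id] = text
    return topics


if __name__ == "__main__":
    topics_path = Path("tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.tsv.gz")
    if topics_path.exists():
        topics = read_tsv_topics(topics_path)
        print(f"{len(topics)} queries loaded")
```

If the file turns out to use a different layout, only the `partition` line should need to change.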
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-robust04.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.4465 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.3507 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.5981 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..8ef5e6214 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
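For intuition about what searching a flat index involves, brute-force retrieval amounts to scoring every document vector against the query vector and keeping the best hits. The following is a minimal pure-Python sketch under the assumption of inner-product scoring; the function name and the toy vectors are hypothetical, and this is not how Anserini or Lucene implement it.

```python
import heapq


def brute_force_search(query, doc_vectors, hits=3):
    """Score every document by inner product with the query; return top hits.

    doc_vectors maps docid -> list of floats with the same length as query.
    Returns a list of (docid, score) pairs sorted by descending score.
    """
    scored = (
        (sum(q * d for q, d in zip(query, vec)), docid)
        for docid, vec in doc_vectors.items()
    )
    # nlargest keeps only `hits` candidates in memory at a time.
    top = heapq.nlargest(hits, scored)
    return [(docid, score) for score, docid in top]


if __name__ == "__main__":
    # Toy two-dimensional vectors, made up for illustration.
    docs = {"d1": [1.0, 0.0], "d2": [0.6, 0.8], "d3": [0.0, 1.0]}
    print(brute_force_search([1.0, 0.0], docs, hits=2))
```

Because every document is scored, the ranking is deterministic for a fixed set of vectors, which is consistent with the exact reproducibility claimed for non-quantized flat indexes with cached queries.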
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.4465 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.3507 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.5981 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
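For readers less familiar with the nDCG@10 figure reported in the table, the sketch below shows one common formulation for a single query: linear gains discounted by `log2(rank + 1)`, normalized by the ideal ordering of the judged documents. It is an illustration under those assumptions; `trec_eval` averages over all queries and may differ in details, and the function names here are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (rank 1 first)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_docids, qrels, k=10):
    """nDCG@k for one query.

    ranked_docids: docids in retrieved order.
    qrels: dict docid -> graded relevance (docids not listed count as 0).
    """
    gains = [qrels.get(docid, 0) for docid in ranked_docids]
    ideal = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Made-up judgments and ranking for a single query.
    qrels = {"a": 2, "b": 1}
    print(ndcg_at_k(["b", "a", "c"], qrels, k=10))
```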
diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..607dbb167 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
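Before indexing or searching, it can be worth confirming that the downloaded tarball matches the MD5 checksum quoted earlier on this page. The sketch below is one way to do that with the Python standard library; `md5_of_file` is a hypothetical helper, and the tarball path assumes the `wget ... -P collections/` command was run from the repository root.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for a very large tarball.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    expected = "c279f9fc2464574b482ec53efcc1c487"  # checksum stated on this page
    tarball = "collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar"  # assumed location
    if os.path.exists(tarball):
        actual = md5_of_file(tarball)
        print("OK" if actual == expected else f"MISMATCH: {actual}")
```

Hashing a file of this size takes a while, so it is probably best done once, right after the download.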
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-robust04.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-robust04.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.4465 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.3507 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.5981 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
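The tolerance stated above can be checked mechanically. The sketch below parses `trec_eval` summary lines (metric name, the literal `all`, and a value) and flags which metrics fall within a given distance of the reference scores from the table. The helper names are hypothetical, and the "observed" numbers in the usage example are made up purely to exercise the code.

```python
def parse_trec_eval(output):
    """Parse trec_eval summary lines of the form '<metric> all <value>'."""
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            try:
                scores[parts[0]] = float(parts[2])
            except ValueError:
                continue  # ignore non-numeric summary fields
    return scores


def within_tolerance(observed, reference, tolerance):
    """Return {metric: bool} for every reference metric present in observed."""
    return {
        metric: abs(observed[metric] - ref) <= tolerance
        for metric, ref in reference.items()
        if metric in observed
    }


if __name__ == "__main__":
    # Reference scores and tolerance are taken from this page.
    reference = {"ndcg_cut_10": 0.4465, "recall_100": 0.3507, "recall_1000": 0.5981}
    # Made-up trec_eval output, for illustration only.
    sample = "ndcg_cut_10\tall\t0.4460\nrecall_100\tall\t0.3501\n"
    print(within_tolerance(parse_trec_eval(sample), reference, tolerance=0.001))
```

Since the page allows for some outliers, a failed check here is a prompt to look more closely rather than proof of a regression.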
diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..03e9068d7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.447 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.351 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.598 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..68e9b560c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-robust04.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-robust04.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.447 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.351 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.598 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..c0591f10d --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-robust04.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.447 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.351 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.598 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..98e30d6e6 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Robust04 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-robust04.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-robust04.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-robust04.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-robust04.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-robust04.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-robust04.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-robust04.test.txt runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-robust04.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Robust04 | 0.447 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.351 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Robust04 | 0.598 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..4e2931577 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.2170 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.4959 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.7824 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..d05b5a0aa --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.2170 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.4959 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.7824 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..c0b1fcd1f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.2170 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.4959 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.7824 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
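Because these scores are expected to match exactly, the comparison can be automated. The following is a minimal sketch, not part of Anserini's regression framework: it assumes `trec_eval` prints one result per line as three whitespace-separated fields (metric name, the literal `all`, and the score) and that the metric names appear as `ndcg_cut_10`, `recall_100`, and `recall_1000`. The helper names `parse_trec_eval` and `matches_expected` are hypothetical.

```python
# Expected scores copied from the table above, keyed by the metric names
# that trec_eval is assumed to print (an assumption, not taken from this page).
EXPECTED = {
    "ndcg_cut_10": "0.2170",
    "recall_100": "0.4959",
    "recall_1000": "0.7824",
}


def parse_trec_eval(output):
    """Extract {metric: score_string} from summary lines of trec_eval output."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only summary lines of the form: <metric> all <score>
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = fields[2]
    return scores


def matches_expected(outputs, expected=EXPECTED):
    """Return True if every expected metric appears with exactly the expected score."""
    observed = {}
    for output in outputs:
        observed.update(parse_trec_eval(output))
    return all(observed.get(metric) == score for metric, score in expected.items())
```

Scores are compared as strings so that the check is an exact match on the printed digits rather than a floating-point comparison.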
diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..edb65059a --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scidocs.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scidocs.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.2170 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.4959 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.7824 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
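The tolerance stated above can be checked mechanically. The following is a minimal sketch, not part of Anserini's regression framework: the reference scores are copied from the table, the 0.001 tolerance comes from the sentence above, and the function name `within_tolerance` and the observed values in the usage note are hypothetical.

```python
import math

# Reference scores from the table above (flat indexes, cached queries).
REFERENCE = {"nDCG@10": 0.2170, "R@100": 0.4959, "R@1000": 0.7824}


def within_tolerance(observed, reference=REFERENCE, tol=0.001):
    """Return {metric: bool}: True where the observed score is within tol of the reference."""
    return {
        metric: math.isclose(observed[metric], ref, abs_tol=tol, rel_tol=0.0)
        for metric, ref in reference.items()
    }
```

For example, calling `within_tolerance({"nDCG@10": 0.2165, "R@100": 0.4959, "R@1000": 0.7800})` with these made-up observations flags only `R@1000` as falling outside the tolerance; such a score would be one of the outliers the text mentions.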
diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..1cb3dd499 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.217 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.496 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.782 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..a021f0566 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scidocs.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.217 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.496 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.782 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..21bfeb2e3 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scidocs.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.217 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.496 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.782 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..9830e6a87 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scidocs.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-scidocs.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scidocs.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scidocs.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scidocs.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scidocs.test.txt runs/run.beir-v1.0.0-scidocs.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scidocs.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SCIDOCS | 0.217 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.496 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SCIDOCS | 0.782 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..87db3a698 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.7408 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9667 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9967 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..4617aeab7 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scifact.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.7408 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9667 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9967 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..fa19d7bcc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.7408 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9667 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9967 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
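
The exact-reproducibility claim above can be checked mechanically. The following is a minimal sketch, not part of Anserini's regression framework: it assumes the summary lines printed by the three `trec_eval` calls have been collected into one string, and the helper names (`parse_trec_eval`, `mismatches`) and the sample input are hypothetical. The reference values are copied from the table above.

```python
# Reference scores copied from the table above (SciFact, flat index, cached queries).
EXPECTED = {"ndcg_cut_10": 0.7408, "recall_100": 0.9667, "recall_1000": 0.9967}


def parse_trec_eval(output: str) -> dict:
    """Parse trec_eval summary lines of the form '<measure> all <value>'."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "all":
            scores[fields[0]] = float(fields[2])
    return scores


def mismatches(observed: dict, expected: dict) -> dict:
    """Return the measures that are missing or differ from the reference at 4 decimals."""
    return {
        name: (observed.get(name), ref)
        for name, ref in expected.items()
        if observed.get(name) is None or round(observed[name], 4) != ref
    }


# Illustrative input only (assumption): summary lines gathered from the three trec_eval runs.
sample_output = """\
ndcg_cut_10           \tall\t0.7408
recall_100            \tall\t0.9667
recall_1000           \tall\t0.9967
"""

print(mismatches(parse_trec_eval(sample_output), EXPECTED))
```

An empty dictionary means every measure matched the table; any entry lists the observed value next to the reference.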
diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..edbbd40fc --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scifact.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-scifact.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.7408 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9667 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.9967 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
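
The tolerance stated above (scores generally within 0.001 of the flat, cached-query reference) can be applied as a simple comparison. This is a minimal sketch under stated assumptions: the `observed` values are hypothetical placeholders for scores from your own ONNX run, and `outside_tolerance` is an illustrative helper, not an Anserini function. The reference values and the 0.001 bound are taken from this page.

```python
# Reference scores from the table above; tolerance from the accompanying note.
REFERENCE = {"nDCG@10": 0.7408, "R@100": 0.9667, "R@1000": 0.9967}
TOLERANCE = 0.001


def outside_tolerance(observed, reference=REFERENCE, tolerance=TOLERANCE):
    """List the measures whose observed score deviates by more than the tolerance."""
    return [
        name
        for name, ref in reference.items()
        if abs(observed[name] - ref) > tolerance
    ]


# Hypothetical scores from an ONNX run (assumption, not measured results).
observed = {"nDCG@10": 0.7405, "R@100": 0.9667, "R@1000": 0.9967}

print(outside_tolerance(observed))
```

Since the note allows for some outliers, a flagged measure is a prompt to look closer, not necessarily a failed regression.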
diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..0b245f928 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.741 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.997 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..f2c5b62f9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scifact.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-scifact.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.741 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.997 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..f80a440e0 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-scifact.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.741 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.997 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..51034c957 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-scifact.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-scifact.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-scifact.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-scifact.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scifact.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scifact.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-scifact.test.txt runs/run.beir-v1.0.0-scifact.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-scifact.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): SciFact | 0.741 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.967 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): SciFact | 0.997 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..e8e30364e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.2886 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.3112 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.5331 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..55c5ab64c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.2886 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.3112 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.5331 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..099061d7b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.2886 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.3112 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.5331 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
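
Because the scores on this page are expected to match exactly, the comparison can be scripted. The following is a minimal sketch, not part of Anserini's regression framework: it assumes `trec_eval` prints one whitespace-separated summary line per measure (`<measure> all <value>`), and the helper names `parse_trec_eval` and `mismatches` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Reference scores copied from the table above (flat index, cached queries).
EXPECTED = {"ndcg_cut_10": 0.2886, "recall_100": 0.3112, "recall_1000": 0.5331}


def parse_trec_eval(output: str) -> Dict[str, float]:
    """Collect summary scores from trec_eval output.

    Assumes each summary line has three whitespace-separated fields:
    measure name, the literal query id 'all', and the score.
    """
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            scores[parts[0]] = float(parts[2])
    return scores


def mismatches(
    observed: Dict[str, float], expected: Dict[str, float] = EXPECTED
) -> Dict[str, Tuple[Optional[float], float]]:
    """Return {measure: (observed, expected)} for every score that differs or is missing."""
    return {
        measure: (observed.get(measure), value)
        for measure, value in expected.items()
        if observed.get(measure) != value
    }
```

To use it, capture the output of the three `bin/trec_eval` commands above into one string and pass it through `parse_trec_eval`; an empty result from `mismatches` would indicate that all three scores were reproduced.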
diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..3d3110c08 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
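
Before running retrieval, it can be useful to confirm that the topics file from the submodule is present and readable. The sketch below assumes the `.tsv.gz` topics file holds two tab-separated columns (query id, then query text); that layout is an assumption, and the helper name `read_tsv_topics` is hypothetical rather than part of Anserini.

```python
import csv
import gzip
from typing import Dict


def read_tsv_topics(path: str) -> Dict[str, str]:
    """Read a gzipped TSV topics file into {query_id: query_text}.

    Assumes two tab-separated columns per line; rows with fewer
    than two columns are skipped.
    """
    topics = {}
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) >= 2:
                topics[row[0]] = row[1]
    return topics
```

For example, calling `read_tsv_topics("tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.tsv.gz")` from the repository root and inspecting the length of the result gives a quick check that the submodule has been checked out.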
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-signal1m.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-signal1m.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.2886 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.3112 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.5331 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
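
The 0.001 tolerance described above can be applied mechanically. The following is a minimal sketch under stated assumptions: the reference values are copied from the table above, the measure names follow `trec_eval`'s output naming, and the function `outside_tolerance` is a hypothetical helper, not part of Anserini. Since the page allows for some outliers, a flagged score is a prompt to look closer rather than a hard failure.

```python
from typing import Dict, Optional, Tuple

# Reference scores from the table above (flat index, cached queries).
REFERENCE = {"ndcg_cut_10": 0.2886, "recall_100": 0.3112, "recall_1000": 0.5331}

# Tolerance stated on this page for ONNX query encoding on non-quantized flat indexes.
TOLERANCE = 0.001


def outside_tolerance(
    observed: Dict[str, float],
    reference: Dict[str, float] = REFERENCE,
    tol: float = TOLERANCE,
) -> Dict[str, Tuple[Optional[float], float]]:
    """Return {measure: (observed, reference)} for scores missing or farther than tol from the reference."""
    flagged = {}
    for measure, ref in reference.items():
        obs = observed.get(measure)
        # The small epsilon keeps floating-point rounding from flagging a
        # difference that is exactly equal to the tolerance.
        if obs is None or abs(obs - ref) > tol + 1e-9:
            flagged[measure] = (obs, ref)
    return flagged
```

The same function could be reused for the other index types by passing the tolerance quoted on the corresponding page as `tol`.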
diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..9ecbece3c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.289 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.311 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.533 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..8fa0a3043 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-signal1m.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.289 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.311 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.533 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..9ec1b9e6d --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-signal1m.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.289 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.311 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.533 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..daffa2ed4 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-signal1m.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-signal1m.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-signal1m.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-signal1m.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-signal1m.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-signal1m.test.txt runs/run.beir-v1.0.0-signal1m.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-signal1m.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Signal-1M | 0.289 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.311 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Signal-1M | 0.533 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..c76028e40 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
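If `tools/topics-and-qrels/` turns out to be empty, the submodule has probably not been checked out yet. The following is a minimal sketch of a pre-flight check, assuming it is run from the repository root; `require_file` is a hypothetical helper for illustration, not part of Anserini.

```bash
# Hypothetical helper (not part of Anserini): fail early if a required file is missing,
# and point at the usual remedy for an uninitialized submodule.
require_file() {
  if [ ! -f "$1" ]; then
    echo "missing: $1 (try: git submodule update --init --recursive)" >&2
    return 1
  fi
}

# Example usage, run from the repository root:
# require_file tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt
```

Checking for the qrels file before retrieval avoids discovering a missing submodule only at evaluation time.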
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.7814 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.1406 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.4768 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..a90628573 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.7814 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.1406 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.4768 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..dd06eb8d5 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.7814 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.1406 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.4768 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
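Since these results are expected to match exactly, the comparison can be scripted rather than read off by eye. The following is a minimal sketch, assuming `trec_eval` prints one whitespace-separated line per measure with the score in the third column; `metric_value`, `$QRELS`, and `$RUN` are hypothetical names standing in for the helper and for the qrels and run paths used in the evaluation commands above.

```bash
# Hypothetical helper: print the score column (third field) of trec_eval output read from stdin.
# Assumes one line per measure in the form: <measure> <query id> <value>.
metric_value() {
  awk '{ print $3 }'
}

# Example usage (QRELS and RUN are placeholders for the paths shown above):
# score=$(bin/trec_eval -c -m ndcg_cut.10 "$QRELS" "$RUN" | metric_value)
# [ "$score" = "0.7814" ] && echo "nDCG@10 reproduced exactly"
```

A string comparison is used here on purpose: for an exact-reproduction claim, the printed four-decimal score should be identical to the table entry.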
diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..f764ec176 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-covid.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-covid.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.7814 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.1406 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.4768 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
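The tolerance described above can be checked mechanically. The following is a minimal sketch, assuming the reference and observed scores have already been extracted as plain decimal numbers; `within_tolerance` is a hypothetical helper for illustration, not part of Anserini or `trec_eval`.

```bash
# Hypothetical helper: succeed (exit status 0) if |reference - observed| <= tolerance.
# Arguments: reference score, observed score, tolerance.
within_tolerance() {
  awk -v ref="$1" -v obs="$2" -v tol="$3" \
    'BEGIN { d = ref - obs; if (d < 0) d = -d; exit !(d <= tol) }'
}

# Example usage with the nDCG@10 reference from the table and the 0.001 tolerance;
# "$observed" is a placeholder for the score from your own run:
# within_tolerance 0.7814 "$observed" 0.001 && echo "within tolerance"
```

Because the comparison uses floating-point arithmetic, a score that sits exactly on the tolerance boundary may be classified either way; the documented bound is itself approximate ("with some outliers"), so this should rarely matter.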
diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..3969d6a75 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.781 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.141 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.477 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..90ff92c0b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-covid.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.781 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.141 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.477 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..fab0ddb37 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-covid.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.781 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.141 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.477 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..f491f064c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-covid.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-covid.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-covid.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-covid.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-covid.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-covid.test.txt runs/run.beir-v1.0.0-trec-covid.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-covid.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-COVID | 0.781 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.141 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-COVID | 0.477 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..639e5f34b --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.4425 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.4992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.7875 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..786087c37 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.4425 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.4992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.7875 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). 
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..08c414355 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.4425 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.4992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.7875 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
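
Because this configuration is expected to reproduce the table exactly, the comparison can be automated. The following is a minimal sketch, not part of Anserini's regression framework: the helper names `parse_trec_eval` and `matches_exactly` are hypothetical, and it assumes `trec_eval` prints one whitespace-separated `<measure> all <value>` row per requested measure, with four decimal places. The expected values are copied from the table above.

```python
# Expected scores copied from the table above (TREC-NEWS, flat index, cached queries).
EXPECTED = {"ndcg_cut_10": 0.4425, "recall_100": 0.4992, "recall_1000": 0.7875}


def parse_trec_eval(output: str) -> dict:
    """Parse summary rows of the (assumed) form '<measure> all <value>'."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only aggregate rows; skip blank or malformed lines.
        if len(fields) == 3 and fields[1] == "all":
            try:
                scores[fields[0]] = float(fields[2])
            except ValueError:
                continue  # ignore rows whose value is not numeric
    return scores


def matches_exactly(scores: dict, expected: dict = EXPECTED) -> bool:
    """True if every expected measure is present and equal at four decimal places."""
    # A missing measure maps to -1.0, which can never equal a valid score.
    return all(round(scores.get(m, -1.0), 4) == v for m, v in expected.items())
```

To use it, concatenate the output of the three `bin/trec_eval` invocations above into one string and pass it through `parse_trec_eval` before calling `matches_exactly`.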
diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..87f07b210 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-news.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-trec-news.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.4425 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.4992 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.7875 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
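
The tolerance stated above can be checked mechanically. The following is a minimal sketch under stated assumptions, not Anserini's own verification logic: `check_scores` is a hypothetical helper, the dictionary keys are assumed to mirror the measure names that `trec_eval` prints, and the reference values and the 0.001 tolerance are copied from this page.

```python
# Reference scores from the table above (brute-force search, cached queries, flat index).
REFERENCE = {"ndcg_cut_10": 0.4425, "recall_100": 0.4992, "recall_1000": 0.7875}
TOLERANCE = 0.001  # tolerance quoted above for ONNX query encoding on flat indexes


def check_scores(observed: dict, reference: dict = REFERENCE,
                 tolerance: float = TOLERANCE) -> dict:
    """Return {measure: bool}, True where the observed score is within tolerance."""
    # The small epsilon absorbs floating-point rounding at the tolerance boundary.
    return {
        measure: abs(observed[measure] - ref) <= tolerance + 1e-9
        for measure, ref in reference.items()
    }
```

A measure that maps to `False` is not necessarily a failure, since the text above allows for some outliers; it only flags a score worth inspecting by hand.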
diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..5f07bdafb --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.442 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.499 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.788 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..ac988b277 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-trec-news.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.442 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.499 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.788 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..6ad519c3c --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-trec-news.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.442 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.499 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.788 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..2690ff20e --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-trec-news.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-trec-news.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-trec-news.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-news.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-news.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-trec-news.test.txt runs/run.beir-v1.0.0-trec-news.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-trec-news.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): TREC-NEWS | 0.442 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.499 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): TREC-NEWS | 0.788 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.md new file mode 100644 index 000000000..a95fa066f --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.2570 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.4857 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.8298 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.md new file mode 100644 index 000000000..9ed6ef6ee --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.md @@ -0,0 +1,84 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -quantize.int8 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.2570 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.4857 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.8298 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.md new file mode 100644 index 000000000..75ad5a252 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.md @@ -0,0 +1,81 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.2570 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.4857 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.8298 | + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.md new file mode 100644 index 000000000..6bbfcb230 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.md @@ -0,0 +1,82 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +bin/run.sh io.anserini.index.IndexFlatDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-flat.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchFlatDenseVectors \ + -index indexes/lucene-flat.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.2570 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.4857 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.8298 | + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.md new file mode 100644 index 000000000..a373df051 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml). +Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. 
+ +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.257 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.830 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. 
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md new file mode 100644 index 000000000..e3af27baa --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -M 16 -efC 100 -quantize.int8 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw-int8.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-int8-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.257 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.830 | + +The above figures are from running brute-force search with 
cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). +Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.md new file mode 100644 index 000000000..dae72b2e9 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.gz \ + -topicReader JsonStringVector \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt \ + -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-cached.topics.beir-v1.0.0-webis-touche2020.test.bge-base-en-v1.5.jsonl.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.257 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): 
Webis-Touche2020 | 0.830 | + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.md b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.md new file mode 100644 index 000000000..69d065b22 --- /dev/null +++ b/docs/regressions/regressions-beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.md @@ -0,0 +1,86 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.yaml). 
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +bin/run.sh io.anserini.index.IndexHnswDenseVectors \ + -threads 16 \ + -collection ParquetDenseVectorCollection \ + -input /path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 \ + -generator ParquetDenseVectorDocumentGenerator \ + -index indexes/lucene-hnsw.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -M 16 -efC 100 \ + >& logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5 & +``` + +The path `/path/to/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. 
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +bin/run.sh io.anserini.search.SearchHnswDenseVectors \ + -index indexes/lucene-hnsw.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5/ \ + -topics tools/topics-and-qrels/topics.beir-v1.0.0-webis-touche2020.test.tsv.gz \ + -topicReader TsvString \ + -output runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt \ + -encoder BgeBaseEn15 -hits 1000 -efSearch 1000 -removeQuery -threads 16 & +``` + +Evaluation can be performed using `trec_eval`: + +``` +bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-webis-touche2020.test.txt runs/run.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.bge-hnsw-onnx.topics.beir-v1.0.0-webis-touche2020.test.txt +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +| **nDCG@10** | **BGE-base-en-v1.5**| +|:-------------------------------------------------------------------------------------------------------------|-----------| +| BEIR (v1.0.0): Webis-Touche2020 | 0.257 | +| **R@100** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.486 | +| **R@1000** | **BGE-base-en-v1.5**| +| BEIR (v1.0.0): Webis-Touche2020 | 0.830 | + +The above figures are from running brute-force search with cached queries on 
non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). +Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/python/regressions-batch04.txt b/src/main/python/regressions-batch04.txt index 921ff7858..bc302192f 100644 --- a/src/main/python/regressions-batch04.txt +++ b/src/main/python/regressions-batch04.txt @@ -1,3 +1,280 @@ +# BEIR w/ parquet +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index 
--verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx > 
logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python 
src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.txt 2>&1 + +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py 
--index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx > 
logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.txt 2>&1 + +python src/main/python/run_regression.py --index --verify --search --regression 
beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python 
src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx > 
logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python 
src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.txt 2>&1 + +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx > 
logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx > 
logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx > 
logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.txt 2>&1 + +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached > 
logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search 
--regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression 
beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.txt 2>&1 + +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached > 
logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.txt 
2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached > 
logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached > 
logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.txt 2>&1 + +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression 
beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.txt 
2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached > 
logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.txt 2>&1 + +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression 
beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached > 
logs/log.beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached > 
logs/log.beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 +python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached > logs/log.beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.txt 2>&1 + +# MIRACL +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ar > logs/log.miracl-v1.0-ar.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-bn > logs/log.miracl-v1.0-bn.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-en > logs/log.miracl-v1.0-en.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-es > logs/log.miracl-v1.0-es.txt 2>&1 +python src/main/python/run_regression.py --index 
--verify --search --regression miracl-v1.0-fa > logs/log.miracl-v1.0-fa.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fi > logs/log.miracl-v1.0-fi.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fr > logs/log.miracl-v1.0-fr.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-hi > logs/log.miracl-v1.0-hi.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-id > logs/log.miracl-v1.0-id.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ja > logs/log.miracl-v1.0-ja.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ko > logs/log.miracl-v1.0-ko.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ru > logs/log.miracl-v1.0-ru.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-sw > logs/log.miracl-v1.0-sw.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-te > logs/log.miracl-v1.0-te.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-th > logs/log.miracl-v1.0-th.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-zh > logs/log.miracl-v1.0-zh.txt 2>&1 + +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ar-aca > logs/log.miracl-v1.0-ar-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-bn-aca > logs/log.miracl-v1.0-bn-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-en-aca > logs/log.miracl-v1.0-en-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify 
--search --regression miracl-v1.0-es-aca > logs/log.miracl-v1.0-es-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fa-aca > logs/log.miracl-v1.0-fa-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fi-aca > logs/log.miracl-v1.0-fi-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fr-aca > logs/log.miracl-v1.0-fr-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-hi-aca > logs/log.miracl-v1.0-hi-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-id-aca > logs/log.miracl-v1.0-id-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ja-aca > logs/log.miracl-v1.0-ja-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ko-aca > logs/log.miracl-v1.0-ko-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ru-aca > logs/log.miracl-v1.0-ru-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-sw-aca > logs/log.miracl-v1.0-sw-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-te-aca > logs/log.miracl-v1.0-te-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-th-aca > logs/log.miracl-v1.0-th-aca.txt 2>&1 +python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-zh-aca > logs/log.miracl-v1.0-zh-aca.txt 2>&1 + +# BEIR w/ jsonl python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana.bge-base-en-v1.5.flat.onnx > logs/log.beir-v1.0.0-arguana.bge-base-en-v1.5.flat.onnx.txt 2>&1 python 
src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.onnx > logs/log.beir-v1.0.0-bioasq.bge-base-en-v1.5.flat.onnx.txt 2>&1 python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.onnx > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.flat.onnx.txt 2>&1 @@ -418,41 +695,6 @@ python src/main/python/run_regression.py --verify --search --regression beir-v1. python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.cached > logs/log.beir-v1.0.0-climate-fever.bge-base-en-v1.5.hnsw-int8.cached.txt 2>&1 python src/main/python/run_regression.py --verify --search --regression beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.cached > logs/log.beir-v1.0.0-scifact.bge-base-en-v1.5.hnsw-int8.cached.txt 2>&1 -# MIRACL -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ar > logs/log.miracl-v1.0-ar.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-bn > logs/log.miracl-v1.0-bn.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-en > logs/log.miracl-v1.0-en.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-es > logs/log.miracl-v1.0-es.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fa > logs/log.miracl-v1.0-fa.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fi > logs/log.miracl-v1.0-fi.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fr > logs/log.miracl-v1.0-fr.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-hi > logs/log.miracl-v1.0-hi.txt 2>&1 -python 
src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-id > logs/log.miracl-v1.0-id.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ja > logs/log.miracl-v1.0-ja.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ko > logs/log.miracl-v1.0-ko.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ru > logs/log.miracl-v1.0-ru.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-sw > logs/log.miracl-v1.0-sw.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-te > logs/log.miracl-v1.0-te.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-th > logs/log.miracl-v1.0-th.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-zh > logs/log.miracl-v1.0-zh.txt 2>&1 - -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ar-aca > logs/log.miracl-v1.0-ar-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-bn-aca > logs/log.miracl-v1.0-bn-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-en-aca > logs/log.miracl-v1.0-en-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-es-aca > logs/log.miracl-v1.0-es-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fa-aca > logs/log.miracl-v1.0-fa-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fi-aca > logs/log.miracl-v1.0-fi-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-fr-aca > 
logs/log.miracl-v1.0-fr-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-hi-aca > logs/log.miracl-v1.0-hi-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-id-aca > logs/log.miracl-v1.0-id-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ja-aca > logs/log.miracl-v1.0-ja-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ko-aca > logs/log.miracl-v1.0-ko-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-ru-aca > logs/log.miracl-v1.0-ru-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-sw-aca > logs/log.miracl-v1.0-sw-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-te-aca > logs/log.miracl-v1.0-te-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-th-aca > logs/log.miracl-v1.0-th-aca.txt 2>&1 -python src/main/python/run_regression.py --index --verify --search --regression miracl-v1.0-zh-aca > logs/log.miracl-v1.0-zh-aca.txt 2>&1 - # Mr.TyDi python src/main/python/run_regression.py --index --verify --search --regression mrtydi-v1.1-ar > logs/log.mrtydi-v1.1-ar.txt 2>&1 python src/main/python/run_regression.py --index --verify --search --regression mrtydi-v1.1-bn > logs/log.mrtydi-v1.1-bn.txt 2>&1 diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..bf20c6eec --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini 
Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. 
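Since the template quotes an MD5 checksum for the 194 GB tarball, it can be worth verifying the download before unpacking. The following is a minimal sketch, not part of Anserini: the helper name `md5_of_file` and the chunk size are illustrative assumptions, while the tarball path and expected checksum are taken from the text above.

```python
import hashlib
from pathlib import Path

# Checksum and location as quoted in the template; everything else is illustrative.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"
TARBALL = Path("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if not TARBALL.exists():
        print(f"{TARBALL} not found; download it first")
    else:
        actual = md5_of_file(TARBALL)
        print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: got {actual}")
```

Reading in fixed-size chunks matters here because the tarball is far larger than typical memory; hashing a file of this size will still take a while.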
+ +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..ad85f6509 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. 
+ +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. + +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. 
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..af889ca4c --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..7df7e557c --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..12f6a86a8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..69dd22fff --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..2580b6696 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..4dbe4820e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-arguana.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — ArguAna + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — ArguAna](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..978dca1d3 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..af504ccce --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..73c1c884f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..d95b9dbe5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..c8becd906 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..b893840cf --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..b92fe4489 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..1b21d4585 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — BioASQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — BioASQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..d80e56fa3 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..dae192e00 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..8d39d23ba --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..dfd173832 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..5242596b7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..b071a91b5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..e2351fd5d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..31b1e7b16 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-climate-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Climate-FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Climate-FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..3000dc339 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..0815bf7c0 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..278ed7096 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..096868a81 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..29bba2c01 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..22fbecee7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..dca85fdf6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..084226d12 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-android.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-android + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-android](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..b28650bcf --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..37ab4b2b4 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..d545029af --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..03586514b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..0167601b0 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..a96f1fdc2 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..32d849c8a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..633e1169b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-english.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-english + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-english](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..acc08b38f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..8dd3d2c62 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..99356bb67 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..c000a407f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..a87d8720a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..9f41f50da --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..aa222cb34 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..697ab2551 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gaming.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gaming + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gaming](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..e95cb6083 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..4ece82a8e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..6ecd80bed --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..9bfb2df21 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..4c7029ebc --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..af1aa345a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..316c84dda --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..bdd8082ea --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-gis.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-gis + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-gis](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..90916b9f0 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..f88cdc557 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..f3655ed7e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..94675c589 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..64a747323 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..856add95e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..24afdc730 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..4e24e2e4d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-mathematica.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-mathematica + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-mathematica](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..c9b9730be --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..1d81575a1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..2a2b2cfce --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..bcd313a1b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..805f321c5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..8307ac9c1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..eee371275 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..ad4468b1a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-physics.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-physics + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-physics](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..096db3e67 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..3c1a40d91 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..d72c6a474 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..2547bff64 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..c52814944 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..2ac2447c8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..db81559fc --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.template
new file mode 100644
index 000000000..7f99cbed9
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-programmers.bge-base-en-v1.5.parquet.hnsw.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-programmers
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-programmers](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.template
new file mode 100644
index 000000000..9be69485a
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.cached.template
@@ -0,0 +1,64 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.template
new file mode 100644
index 000000000..71a9e77f6
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat-int8.onnx.template
@@ -0,0 +1,64 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.template
new file mode 100644
index 000000000..1debbd437
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.cached.template
@@ -0,0 +1,62 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.template
new file mode 100644
index 000000000..e3d3db381
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.flat.onnx.template
@@ -0,0 +1,63 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.template
new file mode 100644
index 000000000..37291b7cd
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.cached.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template
new file mode 100644
index 000000000..e5766809b
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.template
new file mode 100644
index 000000000..6626fbb72
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.cached.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.template
new file mode 100644
index 000000000..6c534c0ac
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-stats.bge-base-en-v1.5.parquet.hnsw.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-stats
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-stats](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.template
new file mode 100644
index 000000000..0f923e066
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.cached.template
@@ -0,0 +1,64 @@
+# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..5d8a1aee5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..60cda4785 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..22b3ad713 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..ee423f7b2 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..9c582f6b9 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..ce01d2342 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..4a3169a61 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-tex.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-tex + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-tex](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..5631d90b1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..c01594a65 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..3056880c6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..c447ee16a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..af4998954 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..5b95735dc --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..d5aef804d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..17393bc57 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-unix.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-unix + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-unix](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..29bd9555f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..fb87a3bbb --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..d70ac777d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..4f31df65e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
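
The tolerance statement above ("within 0.001 of the results reported above") can be checked mechanically once scores have been read off the `trec_eval` output. The sketch below is one possible way to do that. The function names, metric labels, and numbers are hypothetical placeholders, not Anserini code or real regression results.

```python
def within_tolerance(observed, reference, tolerance=0.001):
    """Return True if two scores differ by at most `tolerance`."""
    # The small epsilon absorbs floating-point rounding when the gap
    # is exactly equal to the tolerance.
    return abs(observed - reference) <= tolerance + 1e-9


def compare_scores(observed, reference, tolerance=0.001):
    """Map each reference metric to (observed, reference, ok)."""
    return {
        metric: (observed[metric], ref, within_tolerance(observed[metric], ref, tolerance))
        for metric, ref in reference.items()
    }


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute the figures from
    # the generated effectiveness table and from your own run.
    reference = {"metric_a": 0.4075, "metric_b": 0.7050}
    observed = {"metric_a": 0.4070, "metric_b": 0.7000}
    for metric, (obs, ref, ok) in compare_scores(observed, reference).items():
        status = "ok" if ok else "outside tolerance"
        print(f"{metric}: observed={obs:.4f} reference={ref:.4f} {status}")
```

The templates state a different tolerance per configuration (0.001 for ONNX on flat indexes, 0.003 to 0.005 for the HNSW and quantized variants), so `tolerance` is left as a parameter rather than hard-coded.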
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..7abf6fd7a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..f4a89049e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..cee9ccb39 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..6196fb1d7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-webmasters.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-webmasters + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-webmasters](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..6771a8083 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..055e760e7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..483470743 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..e7dd3c5ed --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..d1bfb602a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..a0498eda4 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..fd13e23bb --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..a54a28e62 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-cqadupstack-wordpress.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — CQADupStack-wordpress + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — CQADupStack-wordpress](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..2cd490a6b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..775f0594f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..f482dc7c7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..28924af3e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..940a769af --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..354984040 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..6edaab631 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..0b90c0f0e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-dbpedia-entity.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — DBPedia + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — DBPedia](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..b87640616 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..d85414b36 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..86e17c72f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..95668fedf --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..81126b308 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..a74af8c81 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..21014f2a8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..b6d5ddc77 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fever.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FEVER + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FEVER](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..75c3ef042 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..fe6f2d8f8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..4c0dadf13 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..3bdfba72d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..2c433d3d3 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..a76b7d1ed --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..94aeed5a6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..ba952d22d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — FiQA-2018 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — FiQA-2018](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..99d3ef5f8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..8ac67d3db --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..9e4a244c7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..1521af52a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..10e6b9d7a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..958e4d0d6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..dfd8773d5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..18fdf1fa8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-hotpotqa.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — HotpotQA + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — HotpotQA](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..5934aa689 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..25bb5e229 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..1a0633c50 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..9b464d1d9 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
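The ONNX template above states that scores should generally fall within 0.001 of the cached-query, flat-index reference figures. A rough sketch of such a comparison follows. The function name `compare_scores` is hypothetical, and the metric values are made-up placeholders rather than results from these regressions. This is not how `run_regression.py` performs its own verification.

```python
def compare_scores(reference, observed, tolerance):
    """Compare observed scores against reference scores, metric by metric.

    Returns a dict mapping each metric name to (delta, ok), where delta is
    observed minus reference and ok is True when |delta| <= tolerance.
    """
    report = {}
    for metric, ref_value in reference.items():
        delta = observed[metric] - ref_value
        report[metric] = (delta, abs(delta) <= tolerance)
    return report


if __name__ == "__main__":
    # Placeholder numbers for illustration only; substitute trec_eval output.
    reference = {"nDCG@10": 0.3735, "R@100": 0.3368}
    observed = {"nDCG@10": 0.3730, "R@100": 0.3300}
    for metric, (delta, ok) in compare_scores(reference, observed, 0.001).items():
        status = "within tolerance" if ok else "outlier"
        print(f"{metric}: delta={delta:+.4f} ({status})")
```

The same helper could be called with the other tolerances quoted in these templates (0.003 for HNSW, 0.004 for quantized flat, 0.005 for quantized HNSW). Metrics flagged as outliers are expected occasionally, as the templates note.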
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..f3150bf1c --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..4e23a97f6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..95554889d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..68d755f06 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nfcorpus.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NFCorpus + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NFCorpus](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..5824e0f70 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..60fb09930 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..864bfd12a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..28da0fb3d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..56c67e782 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..91fed1950 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..874ca1b81 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..5ae86b04a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-nq.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — NQ + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — NQ](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..3ed7f476d --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
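The 0.004 tolerance stated in the template above can be checked mechanically once scores are in hand. The following is a minimal sketch, not part of Anserini: the helper names `within_tolerance` and `find_outliers` are hypothetical, and the metric names and scores passed in are assumed to come from the reader's own `trec_eval` output.

```python
import math


def within_tolerance(reference: float, observed: float, tolerance: float = 0.004) -> bool:
    """Return True if the observed score is within an absolute tolerance of the reference."""
    # rel_tol=0.0 makes this a purely absolute comparison.
    return math.isclose(reference, observed, rel_tol=0.0, abs_tol=tolerance)


def find_outliers(reference: dict, observed: dict, tolerance: float = 0.004) -> list:
    """Return the sorted names of metrics whose scores differ by more than the tolerance."""
    return sorted(
        name
        for name, ref_score in reference.items()
        if name in observed and not within_tolerance(ref_score, observed[name], tolerance)
    )
```

With made-up scores, `find_outliers({"nDCG@10": 0.500}, {"nDCG@10": 0.497})` returns an empty list, since the difference of 0.003 is inside the tolerance. The other templates state different tolerances (0.001, 0.003, 0.005), which can be passed as the `tolerance` argument.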
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..4e35e8961 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..7f7662bd2 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
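Given the 194 GB size of the tarball named in this template, it may be worth verifying its MD5 checksum before unpacking. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_expected` and the 1 MiB chunk size are illustrative choices, not part of Anserini.

```python
import hashlib

# Checksum stated in the template for beir-v1.0.0-bge-base-en-v1.5.parquet.tar.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Assuming the `wget` command was run as shown, the file to check would be `collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar`. Reading in chunks keeps memory use constant regardless of the file size.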
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..48fd04dce --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..5366975aa --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..902f4bee2 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..ca82e1db7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Quora + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.template
new file mode 100644
index 000000000..ffbec1438
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-quora.bge-base-en-v1.5.parquet.hnsw.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — Quora
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Quora](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.template
new file mode 100644
index 000000000..ef83f5c1d
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.cached.template
@@ -0,0 +1,64 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.template
new file mode 100644
index 000000000..b31f44f83
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat-int8.onnx.template
@@ -0,0 +1,64 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers).
+Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.template
new file mode 100644
index 000000000..8c5263347
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.cached.template
@@ -0,0 +1,62 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.template
new file mode 100644
index 000000000..30f66c08c
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.flat.onnx.template
@@ -0,0 +1,63 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building flat indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized flat indexes.
+With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.template
new file mode 100644
index 000000000..0d69a84b0
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.cached.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template
new file mode 100644
index 000000000..46977479c
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building quantized HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.template
new file mode 100644
index 000000000..cc9182019
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.cached.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using cached queries (i.e., cached results of query encoding).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.template
new file mode 100644
index 000000000..86f95778a
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-robust04.bge-base-en-v1.5.parquet.hnsw.onnx.template
@@ -0,0 +1,66 @@
+# Anserini Regressions: BEIR (v1.0.0) — Robust04
+
+**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Robust04](http://beir.ai/), as described in the following paper:
+
+> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/
+tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/
+```
+
+The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`.
+After download and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command, building HNSW indexes:
+
+```
+${index_cmds}
+```
+
+The path `/path/to/${corpus}/` should point to the corpus downloaded above.
+Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments.
+This is because merging index segments is a costly operation and not worthwhile given our query set.
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
+
+The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes.
+With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..2b4b3cc48 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..070838e50 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..acaae3df7 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..595258fc1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..44b20d191 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..65f1ad814 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..348a1569b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..e1ae055fb --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scidocs.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SCIDOCS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SCIDOCS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..9a87611ae --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..5d0f051fe --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
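The download step in the template above quotes an MD5 checksum for the tarball. One way to verify it after `wget` finishes is sketched below, assuming the tarball was saved under `collections/` as in the template; the helper name `md5_of_file` is hypothetical. The file is read in chunks because a 194 GB tarball should not be loaded into memory at once.

```python
import hashlib
import os

# Checksum quoted in the template text.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Path assumed from the wget command in the template.
tarball = "collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar"
if os.path.exists(tarball):
    actual = md5_of_file(tarball)
    print("checksum OK" if actual == EXPECTED_MD5 else f"checksum MISMATCH: {actual}")
```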
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..543810549 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
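Since the template above says these results should be reproducible exactly, an equality comparison is enough for this configuration. The sketch below assumes the usual `trec_eval` summary layout of three whitespace-separated fields per line (metric, the literal `all`, value). The function name `parse_trec_eval` and the sample scores are hypothetical.

```python
def parse_trec_eval(output):
    """Parse summary lines of the assumed form '<metric> all <value>' into a dict."""
    scores = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "all":
            try:
                scores[fields[0]] = float(fields[2])
            except ValueError:
                continue  # skip non-numeric entries such as a run identifier
    return scores


# Hypothetical output and reference scores, for illustration only.
sample_output = "ndcg_cut_10           \tall\t0.7400\nrecall_100            \tall\t0.9600\n"
expected = {"ndcg_cut_10": 0.7400, "recall_100": 0.9600}

print(parse_trec_eval(sample_output) == expected)
```

Both sides come from the same four-decimal strings, so comparing the parsed floats directly is safe here; for the quantized or HNSW configurations a tolerance-based comparison would be the appropriate choice instead.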
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..c862043b5 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..b634fd2cf --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..808af590c --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..5a990396b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..635748427 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-scifact.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — SciFact + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — SciFact](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..7ef594ba8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..31842c38f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..8f8a6817f --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..7cbb96833 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..06e956671 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..10c332c3e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..79bf2dc69 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..92a8ef06a --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-signal1m.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Signal-1M + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Signal-1M](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers).
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..caaff8bf4 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..9721e55ee --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..bae61ce5e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_.
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..67182e401 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers).
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..3275b84d8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers).
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..b4a6fe2b3 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..cc60affe4 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..04da38e36 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-covid.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-COVID + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-COVID](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..d4d884bef --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
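The template above says that scores on quantized flat indexes should generally fall within 0.004 of the flat-index reference figures. A minimal sketch of such a tolerance check follows. The helper name `within_tolerance` and the scores are made up for illustration; this is not the verification logic in Anserini's `run_regression.py`.

```python
def within_tolerance(reference: float, observed: float, tolerance: float) -> bool:
    """Return True if observed lies within tolerance of reference (inclusive)."""
    # Round the difference to avoid floating-point noise right at the boundary.
    return round(abs(reference - observed), 6) <= tolerance


# Made-up scores for illustration only; real values come from trec_eval output.
reference_scores = {"metric_a": 0.5000, "metric_b": 0.5000}
observed_scores = {"metric_a": 0.4970, "metric_b": 0.4940}

for metric, reference in reference_scores.items():
    ok = within_tolerance(reference, observed_scores[metric], tolerance=0.004)
    print(f"{metric}: {'ok' if ok else 'outside tolerance'}")
```

The tolerance is a parameter because the templates quote different margins depending on the index type (0.001, 0.003, 0.004, or 0.005).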
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..279af01d8 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..0ddb513d1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
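The template above gives an MD5 checksum for the 194 GB tarball but no command for checking it. One possible way to verify the download before unpacking is sketched below using only Python's standard library. The helper name `md5_of_file` is hypothetical, and the tarball path assumes the `wget ... -P collections/` command from the template was run from the repository root.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat even for a very large tarball.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # File name and checksum as stated in the template; the location is an assumption.
    tarball = Path("collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar")
    expected = "c279f9fc2464574b482ec53efcc1c487"
    if tarball.exists():
        actual = md5_of_file(tarball)
        print("checksum ok" if actual == expected else f"checksum mismatch: {actual}")
    else:
        print(f"{tarball} not found; download it first")
```

Hashing a file of this size takes a while, so it may be worth running the check once right after the download completes.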
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..f5c118a6b --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..953b8128e --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..69b156314 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..aef306add --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..224e45183 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — TREC-NEWS + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — TREC-NEWS](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.template new file mode 100644 index 000000000..62d5db9e6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.cached.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With cached queries on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
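Note on the template above: it says that scores on quantized flat indexes should generally land within 0.004 of the reference figures. A minimal Python sketch of that kind of check is shown here. The function name `within_tolerance` and the sample scores are hypothetical; this is not taken from `run_regression.py`, whose actual verification logic may differ.

```python
def within_tolerance(observed: float, expected: float, tolerance: float) -> bool:
    """Return True if observed is within an absolute tolerance of expected."""
    # A tiny epsilon absorbs floating-point rounding right at the boundary.
    return abs(observed - expected) <= tolerance + 1e-9


# Hypothetical scores, for illustration only (not real regression results).
expected_ndcg = 0.2570
print(within_tolerance(0.2541, expected_ndcg, 0.004))  # difference is 0.0029
print(within_tolerance(0.2500, expected_ndcg, 0.004))  # difference is 0.0070
```

The check is symmetric, so a score that comes out slightly higher than the reference is also accepted as long as it stays inside the band.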
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.template new file mode 100644 index 000000000..24fea3342 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat-int8.onnx.template @@ -0,0 +1,64 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.004 of the results reported above (with some outliers). +Note that quantization is non-deterministic due to sampling (i.e., results may differ slightly between trials). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.template new file mode 100644 index 000000000..be65c1ba1 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.cached.template @@ -0,0 +1,62 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +Note that since we're running brute-force search with cached queries on non-quantized flat indexes, the results should be reproducible _exactly_. 
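Note on the download step in the template above: it quotes an MD5 checksum for the 194 GB tarball but does not show how to check it. One way to verify the download before unpacking is sketched here, assuming the tarball was saved under `collections/` as in the `wget` command. The helper name `md5_of_file` is hypothetical and not part of Anserini.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so a very large tarball never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksum and path as quoted in the templates.
EXPECTED_MD5 = "c279f9fc2464574b482ec53efcc1c487"
TARBALL = "collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar"
```

After the download finishes, `md5_of_file(TARBALL) == EXPECTED_MD5` should evaluate to `True`; a mismatch suggests a truncated or corrupted download.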
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.template new file mode 100644 index 000000000..ad62caac6 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.flat.onnx.template @@ -0,0 +1,63 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with flat indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building flat indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized flat indexes. +With ONNX query encoding on non-quantized flat indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.001 of the results reported above (with some outliers). 
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.template new file mode 100644 index 000000000..80714d118 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template new file mode 100644 index 000000000..0cb3b786c --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw-int8.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with quantized HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building quantized HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.005 of the results reported above (with some outliers). 
+Note that both HNSW indexing and quantization are non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.template new file mode 100644 index 000000000..fc597d529 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.cached.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using cached queries) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using cached queries (i.e., cached results of query encoding). + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With cached queries on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials). diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.template b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.template new file mode 100644 index 000000000..d516f9b62 --- /dev/null +++ b/src/main/resources/docgen/templates/beir-v1.0.0-webis-touche2020.bge-base-en-v1.5.parquet.hnsw.onnx.template @@ -0,0 +1,66 @@ +# Anserini Regressions: BEIR (v1.0.0) — Webis-Touche2020 + +**Model**: [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) with HNSW indexes (using ONNX for on-the-fly query encoding) + +This page describes regression experiments, integrated into Anserini's regression testing framework, using the [BGE-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model on [BEIR (v1.0.0) — Webis-Touche2020](http://beir.ai/), as described in the following paper: + +> Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. [C-Pack: Packaged Resources To Advance General Chinese Embedding.](https://arxiv.org/abs/2309.07597) _arXiv:2309.07597_, 2023. + +In these experiments, we are using ONNX to perform query encoding on the fly. + +The exact configurations for these regressions are stored in [this YAML file](${yaml}). +Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation. 
+ +From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end: + +``` +python src/main/python/run_regression.py --index --verify --search --regression ${test_name} +``` + +All the BEIR corpora, encoded by the BGE-base-en-v1.5 model and stored in Parquet format, are available for download: + +```bash +wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -P collections/ +tar xvf collections/beir-v1.0.0-bge-base-en-v1.5.parquet.tar -C collections/ +``` + +The tarball is 194 GB and has MD5 checksum `c279f9fc2464574b482ec53efcc1c487`. +After download and unpacking the corpora, the `run_regression.py` command above should work without any issue. + +## Indexing + +Sample indexing command, building HNSW indexes: + +``` +${index_cmds} +``` + +The path `/path/to/${corpus}/` should point to the corpus downloaded above. +Note that here we are explicitly using Lucene's `NoMergePolicy` merge policy, which suppresses any merging of index segments. +This is because merging index segments is a costly operation and not worthwhile given our query set. + +## Retrieval + +Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule. + +After indexing has completed, you should be able to perform retrieval as follows: + +``` +${ranking_cmds} +``` + +Evaluation can be performed using `trec_eval`: + +``` +${eval_cmds} +``` + +## Effectiveness + +With the above commands, you should be able to reproduce the following results: + +${effectiveness} + +The above figures are from running brute-force search with cached queries on non-quantized **flat** indexes. +With ONNX query encoding on non-quantized HNSW indexes, observed results may differ slightly (typically, lower), but scores should generally be within 0.003 of the results reported above (with some outliers). 
+Note that HNSW indexing is non-deterministic (i.e., results may differ slightly between trials).
diff --git a/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.yaml b/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.yaml
index 3696cfacf..6a774ca3e 100644
--- a/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.yaml
+++ b/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.hnsw-int8.cached.yaml
@@ -55,6 +55,6 @@ models:
       nDCG@10:
         - 0.03
       R@100:
-        - 0.035
+        - 0.04
       R@1000:
         - 0.06
diff --git a/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml b/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml
index e721ae266..5583623d4 100644
--- a/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml
+++ b/src/main/resources/regression/beir-v1.0.0-bioasq.bge-base-en-v1.5.parquet.hnsw-int8.cached.yaml
@@ -55,6 +55,6 @@ models:
       nDCG@10:
         - 0.03
       R@100:
-        - 0.035
+        - 0.04
       R@1000:
         - 0.06
diff --git a/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.yaml b/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.yaml
index 38408467b..b1f674df1 100644
--- a/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.yaml
+++ b/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.hnsw.cached.yaml
@@ -53,7 +53,7 @@ models:
         - 0.7875
     tolerance:
       nDCG@10:
-        - 0.001
+        - 0.002
       R@100:
         - 0.01
       R@1000:
diff --git a/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.yaml b/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.yaml
index 78b2d2e26..189b213f1 100644
--- a/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.yaml
+++ b/src/main/resources/regression/beir-v1.0.0-trec-news.bge-base-en-v1.5.parquet.hnsw.cached.yaml
@@ -53,7 +53,7 @@ models:
         - 0.7875
     tolerance:
       nDCG@10:
-        - 0.001
+        - 0.002
       R@100:
         - 0.01
       R@1000: