
# Benchmark

The benchmark results are available in `benchmark.csv`. You can visualize them in the notebook.
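If you want to explore `benchmark.csv` outside the notebook, here is a minimal pandas sketch. The column names used for the pivot (`dataset`, `model`, `acc1`) are assumptions about the CSV schema, so inspect the header first:

```python
# Minimal sketch for exploring benchmark.csv with pandas.
# NOTE: the pivot columns ("dataset", "model", "acc1") are assumed names;
# print df.columns first and adjust to the real schema.
import pandas as pd

df = pd.read_csv("benchmark.csv")
print(df.columns.tolist())  # inspect the actual schema

# Hypothetical summary: one row per dataset, one column per model.
table = df.pivot_table(index="dataset", columns="model", values="acc1")
print(table.round(3))
```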

## How to reproduce the CLIP benchmark results

### Webdataset evaluation: VTAB+ and retrieval datasets (MSCOCO, Flickr8k, Flickr30k)

```bash
clip_benchmark eval --pretrained_model openai openclip_base \
    --dataset "webdatasets.txt" \
    --dataset_root "https://huggingface.co/datasets/clip-benchmark/wds_{dataset_cleaned}/tree/main" \
    --output "benchmark_{dataset}_{pretrained}_{model}_{language}_{task}.json"
```

Once the evaluation finishes, you can construct a CSV with all the results:

```bash
clip_benchmark build benchmark_*.json --output benchmark.csv
```
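If you prefer to aggregate the JSON files yourself, a rough equivalent in Python is sketched below. The assumed per-file layout (top-level keys such as `dataset` and `model` plus a `metrics` dict) may not match the real output exactly, so inspect one file before relying on it:

```python
# Rough sketch: collect all benchmark_*.json result files into one DataFrame.
# NOTE: the assumed layout (top-level "dataset"/"model" keys plus a "metrics"
# dict) may differ from the real output; print one file to verify.
import glob
import json
import pandas as pd

rows = []
for path in glob.glob("benchmark_*.json"):
    with open(path) as f:
        result = json.load(f)
    row = {k: v for k, v in result.items() if k != "metrics"}
    row.update(result.get("metrics", {}))  # flatten metric values into columns
    rows.append(row)

pd.DataFrame(rows).to_csv("benchmark_manual.csv", index=False)
```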

Notes: Pascal VOC 2007 multilabel is not yet included in the webdataset test suite. Multilingual support with webdataset is in progress.

### Alternative: Local download

```bash
clip_benchmark eval --pretrained_model openai openclip_base --dataset vtab+ retrieval \
    --dataset_root "clip_benchmark_datasets/{dataset}" \
    --output "benchmark_{dataset}_{pretrained}_{model}_{language}_{task}.json"
```

(Adjust `--dataset_root` to point at your local copy of the datasets.)
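With the template above, the `{dataset}` placeholder is filled in per dataset, so the data is expected to live under a layout roughly like this (a hypothetical layout; the per-dataset contents depend on the dataset):

```
clip_benchmark_datasets/
├── imagenet1k/
├── mscoco_captions/
└── flickr30k/
```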

### Multilingual ImageNet benchmark

To run the multilingual ImageNet benchmark, use:

```bash
clip_benchmark eval --pretrained_model openclip_multilingual openclip_base openai --dataset imagenet1k --language cn it jp en ar \
    --dataset_root "clip_benchmark_datasets/{dataset}" \
    --output "multilingual_{dataset}_{pretrained}_{model}_{language}_{task}.json"
```

(Adjust `--dataset_root` to point at your local copy of the datasets.)

### Multilingual MS-COCO benchmark

To run the multilingual MS-COCO benchmark, use:

```bash
clip_benchmark eval --pretrained_model openclip_multilingual openclip_base openai --dataset multilingual_mscoco_captions --language es it ko pl ru tr zh en \
    --dataset_root "clip_benchmark_datasets/{dataset}" \
    --output "multilingual_{dataset}_{pretrained}_{model}_{language}_{task}.json"
```

(Adjust `--dataset_root` to point at your local copy of the datasets.)
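After either multilingual run, you can compare languages side by side with a sketch like the following (same caveat as above: the JSON layout with top-level `model` and `language` keys and a `metrics` dict is an assumption):

```python
# Sketch: tabulate multilingual results by (language, model).
# Assumes each multilingual_*.json has top-level "model" and "language"
# keys plus a "metrics" dict; verify against a real output file.
import glob
import json
import pandas as pd

rows = []
for path in glob.glob("multilingual_*.json"):
    with open(path) as f:
        r = json.load(f)
    rows.append({"model": r.get("model"), "language": r.get("language"),
                 **r.get("metrics", {})})

df = pd.DataFrame(rows)
print(df.pivot_table(index="language", columns="model"))  # one column per (metric, model) pair
```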