Cannot reproduce distillgpt2 LM Numbers using --knn #14

Open

HossamAmer12 opened this issue Oct 17, 2024 · 5 comments

Comments

@HossamAmer12 commented Oct 17, 2024

I am trying to build on your knn-transformers repo.

When I run distilgpt2 with the setup given in the repo, but with the --knn flag, I get a perplexity of around 21.xx. This number differs from the one reported in the repository.

MODEL=neulab/distilgpt2-finetuned-wikitext103
python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --output_dir checkpoints/${MODEL}_knn \
  --do_eval --eval_subset validation \
  --dstore_dir /tmp/distillgpt2/ --dstore_size 116988150 \
  --knn

I am able to reproduce the other numbers (baseline + RetoMaton) for distilgpt2.

Could you please let me know if you have any idea what might be going on here?

@urialon (Collaborator) commented Oct 17, 2024 via email

@HossamAmer12 (Author)

Thanks @urialon for getting back.

The model I was using above (sorry, I edited my post) is the one given in the repo. Even so, the scores are different.

Based on your suggestion, I tried building the dstore myself. Every time, I hit this error:
UserScriptFilledDisk: User script filled the disk. Consider using Virtual Machine SKU with larger disk size.

This is the command I used to build the datastore:

MODEL=neulab/distilgpt2-finetuned-wikitext103
path_to=""

CUDA_VISIBLE_DEVICES=0 python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --do_eval --eval_subset train \
  --output_dir $path_to/checkpoints/${MODEL} \
  --dstore_dir $path_to/checkpoints/${MODEL} \
  --save_knnlm_dstore --dstore_size 116988150

Does it really require that much disk space?

Question: do I have to specify the dstore size here? What does the dstore size indicate? The number of contexts?

Another question: when running kNN-LM with the given distilgpt2 model, should I use a specific temperature or lambda? I saw you post about this elsewhere so that the scores can be reproduced.
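
For context, here is a rough back-of-the-envelope estimate of the datastore footprint. It assumes keys are stored as fp16 memmaps of distilgpt2's 768-dimensional hidden states and values as 32-bit token ids, which may not match the repo's exact layout:

dstore_size = 116_988_150                  # one (key, value) pair per training token
hidden_dim = 768                           # distilgpt2 hidden size (assumption)
key_bytes = dstore_size * hidden_dim * 2   # fp16 keys
val_bytes = dstore_size * 4                # int32 token ids
print(f"keys ~ {key_bytes / 1e9:.0f} GB, values ~ {val_bytes / 1e9:.1f} GB")
# keys ~ 180 GB, values ~ 0.5 GB, before any FAISS index is built

Under that assumption, the UserScriptFilledDisk error is consistent with simply running out of local disk while writing the keys.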

@HossamAmer12 (Author) commented Oct 18, 2024

Just wanted to give an update on the issue: using the following did not run into the disk-size problem:

MODEL=neulab/distilgpt2-finetuned-wikitext103
CUDA_VISIBLE_DEVICES=0 python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --do_eval --eval_subset validation \
  --output_dir ${path}/checkpoints/${MODEL}_SAVE0 \
  --dstore_dir ${path}/checkpoints/${MODEL}_SAVE0 \
  --save_knnlm_dstore --dstore_size 116988150

I guess that's because of the small size of the validation split (I know that's not a realistic setup). Do you know how large the training-set datastore is and what the disk requirements are?
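
As a rough check of how many entries a validation-only datastore needs, the sketch below counts BPE tokens in the split. It is only an approximation; run_clm.py's exact tokenization and grouping may differ slightly:

from datasets import load_dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("neulab/distilgpt2-finetuned-wikitext103")
val = load_dataset("wikitext", "wikitext-103-raw-v1", split="validation")
# One datastore entry per token, so the BPE token count of the split is
# roughly the --dstore_size needed for it.
n_tokens = sum(len(tok(t)["input_ids"]) for t in val["text"] if t.strip())
print(n_tokens)

This comes out far smaller than the ~117M entries of the training split, hence the much smaller disk footprint.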

@HossamAmer12 (Author)

Hi Uri,

I constructed the datastore from the wikitext validation set with the given distilgpt2 model, and then ran kNN evaluation on that same set. The final perplexity is not good relative to the baseline.

What could be the problem?

Even though the setup is not practical, I expected the perplexity to be much better, given that the datastore and the evaluation set are identical.

That is on top of not being able to use the training set for the kNN datastore because of the disk-space problem, which I have not yet figured out.
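
For reference, the mixture I have in mind is the standard kNN-LM interpolation (a minimal sketch only; the repo's defaults for lambda and the neighbor-distance temperature may differ, and 0.25 is just the value commonly reported for WikiText-103):

import torch

def knn_lm_log_probs(lm_log_probs, knn_log_probs, lmbda=0.25):
    # Standard kNN-LM mixture, computed in log space for stability:
    # p(w|x) = lmbda * p_knn(w|x) + (1 - lmbda) * p_lm(w|x)
    lmbda = torch.tensor(lmbda)
    return torch.logsumexp(
        torch.stack([knn_log_probs + torch.log(lmbda),
                     lm_log_probs + torch.log1p(-lmbda)]),
        dim=0,
    )

If the datastore is built from the evaluation set itself, p_knn should be very sharp around the true next token, so I would expect perplexity to drop noticeably unless lambda or the temperature is off.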

I would appreciate your advice.

Thanks,
Hossam

@HossamAmer12 changed the title from "Cannot reproduce distillgpt2 LM Numbers" to "Cannot reproduce distillgpt2 LM Numbers using --knn" on Oct 21, 2024
@urialon (Collaborator) commented Oct 22, 2024 via email
