Cannot reproduce distillgpt2 LM Numbers using --knn #14
Comments
Hi Hossam,
Thank you for your interest in our work.
I believe that you need to rebuild the kNN datastore specifically for distill-GPT.
Have you done that?
Best,
Uri
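For concreteness, rebuilding the datastore with the distilled model would look roughly like the two steps below. This is only a sketch: the checkpoint name and the flags (`--save_knnlm_dstore`, `--build_index`, `--dstore_dir`, `--eval_subset`) are taken from memory of the repo's CLI and should be verified against the README.

```bash
# Sketch only -- verify flag names and the checkpoint name against the knn-transformers README.
MODEL=neulab/distilgpt2-finetuned-wikitext103

# 1) Save the (key, value) datastore using the distilled model's hidden states.
python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --do_eval --eval_subset train \
  --output_dir checkpoints/${MODEL} \
  --dstore_dir checkpoints/${MODEL} \
  --save_knnlm_dstore

# 2) Build the FAISS index over the saved keys.
python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --output_dir checkpoints/${MODEL} \
  --dstore_dir checkpoints/${MODEL} \
  --build_index
```

The important point is that the saved keys must come from the same model that is later evaluated with `--knn`.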
Thanks @urialon for getting back. The model I was using in my previous comment (sorry, I edited my post above) is the one given in the repo, yet the scores are different. Based on your suggestion, I tried building the dstore myself, but every time I run into this error:
This is the command I used for building the datastore:
Does it require too much space? Question: do I have to specify the dstore size here? What does the dstore size indicate? The number of contexts?
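As a rough take on the size question, assuming the datastore stores one (key, value) pair per target token, with fp16 keys of the model's hidden dimension (768 for distilgpt2), the dstore size would simply be the number of stored token entries, and the footprint can be estimated like this (the dtypes and token count below are assumptions, not values confirmed in this thread):

```bash
# Rough footprint estimate under assumed fp16 keys and int32 values.
TOKENS=100000000   # hypothetical: roughly 1e8 tokens if the full WikiText-103 train split were used
DIM=768            # distilgpt2 hidden size
echo "keys:   $(( TOKENS * DIM * 2 / 1024**3 )) GiB"
echo "values: $(( TOKENS * 4 / 1024**2 )) MiB"
```

Under those assumptions a full train-split datastore lands in the 100+ GiB range, which would be consistent with the memory problems mentioned later in the thread.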
Just want to update on the issue. Using the following did not result in the size issue:
I guess that's due to the small size of the validation split (I know that's not a realistic setup). Do you know the size of the training set and what our limits are?
Hi Uri,
I tried to construct the datastore with the WikiText validation set and the given distill-GPT model, then ran kNN using the same set. The final perplexity scores are not good relative to the baseline. What could be the problem?
Even though the setup is not practical, I expected the perplexity to be a lot better given that the datastore set and the eval set are the same. (I could not use the training set for the kNN datastore due to memory problems.) I have not yet figured out the reason. I kindly ask for your helpful advice.
Thanks,
Hossam
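One consideration when interpreting this (a general note about kNN-LM, which this repo implements, rather than about this specific run): the final probability is an interpolation of the retrieval distribution and the base LM, so even a datastore built from the evaluation set itself only shifts perplexity in proportion to the interpolation weight, and the result is sensitive to how λ, k, and the softmax temperature are set.

$$
p(w \mid c) = \lambda \, p_{\mathrm{kNN}}(w \mid c) + (1 - \lambda) \, p_{\mathrm{LM}}(w \mid c)
$$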
I just replied to you in a different thread, let me know if anything is still unclear.
I am trying to build on your knn-transformers repo (https://github.com/neulab/knn-transformers).
When I run distill GPT with the given setup in the repo but with the --knn flag, I get around 21.xx perplexity. This number is different from the one reported in the repository.
I am able to reproduce the other numbers (baseline + RetoMaton) for distill GPT.
Could you please let me know if you have any clue here?
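For reference, the kNN-augmented evaluation being described is roughly of this shape; only the `--knn` flag is confirmed in this thread, and the rest of the invocation is an assumption modeled on the repo's baseline command, to be checked against the README.

```bash
# Sketch of the evaluation run with retrieval enabled -- verify exact flags against the README.
MODEL=neulab/distilgpt2-finetuned-wikitext103
python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --do_eval --eval_subset validation \
  --output_dir checkpoints/${MODEL} \
  --dstore_dir checkpoints/${MODEL} \
  --knn
```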