Issues: bigcode-project/bigcode-evaluation-harness
Issues list
HumanEval-X generation appears to not time out called subprocesses
#290
opened Nov 24, 2024 by
nielstron
Could you share a completed file of generations_mbppplus.json
#287
opened Nov 13, 2024 by
marybloodyzz
Continuing / Extending Previous Results from Generating and Evaluating?
#284
opened Nov 1, 2024 by
RylanSchaeffer
ValueError: Infilling not yet supported for:/Meta-Llama-3.1-8B
#275
opened Sep 23, 2024 by
kbmlcoding
Evaluation result of bigcode/starcoder2-3b on gsm8k_pal does not match the paper
#272
opened Sep 13, 2024 by
nongfang55
Evaluating a Model with a Local Dataset in an Offline Environment
#271
opened Sep 12, 2024 by
ankush13r
[Possibly system specific] Wild (12% vs 20%) run-to-run swings in multiple-cpp reported scores
#258
opened Jul 18, 2024 by
alat-rights
Using humanevalpack to test the ChatGLM3 model results in an abnormal score
#251
opened Jul 5, 2024 by
burger-pb