Adding models/methods/datasets #27

Open
dirkgr opened this issue May 16, 2022 · 0 comments

Comments

Member

dirkgr commented May 16, 2022

Motivation: Various people have already asked for various additions to Catwalk. Adding them is risky because nobody is using Catwalk yet, but several people have said they want to (Pradeep, Matt/Hamish, Iz?, Ludwig).

Here are the sub-projects in order of importance:

  • Promptsource: This is the most requested one. The task would be to add a promptsource instance format to as many tasks as possible, and then evaluate various models with that format.
  • Make sure we have all the tasks in P3. This might be a no-op after the first item.
  • Few-shot prompting (for in-context learning). Nobody has explicitly asked for this, probably because it seems obvious that Catwalk would have it.
  • Crossfit: Pradeep wants to use Crossfit. Those tasks should be fairly easy to add.
  • T-Few seems like a good baseline for a lot of our work, so it might become a good benchmark set for a while.
  • BigBench: Pradeep asked about those too, but backed off from it later. Could be a nice addition, but is at the bottom of this list on purpose.
  • Prompt format that uses the "channel method" for decoder-only models. Nobody has asked for it, but it came up in our reading group. I thought I could verify it with a quick experiment, but I could not.
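
For reference, the difference between ordinary few-shot prompting and the "channel method" can be sketched as below. This is a rough illustration, not Catwalk code: the `Input:`/`Label:` template, the function names, and the `score(context, continuation)` callback (assumed to return a model's log-probability of the continuation given the context) are all hypothetical. The direct format asks the model to score each label given the input, while the channel format puts the label first and scores the input given each label.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (input text, label text)
# Hypothetical scorer: log-probability of `continuation` given `context`.
Scorer = Callable[[str, str], float]


def few_shot_prompt(shots: Sequence[Example], query: str) -> str:
    """Direct format: each demonstration is input then label; the query's label is left open."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in shots]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def channel_prompt(shots: Sequence[Example], label: str) -> str:
    """Channel format: each demonstration is label then input; the query's input is left open."""
    blocks = [f"Label: {y}\nInput: {x}" for x, y in shots]
    blocks.append(f"Label: {label}\nInput:")
    return "\n\n".join(blocks)


def predict_direct(score: Scorer, shots: Sequence[Example], query: str, labels: Sequence[str]) -> str:
    """Pick the label whose text the model finds most likely after the prompt."""
    prompt = few_shot_prompt(shots, query)
    return max(labels, key=lambda y: score(prompt, " " + y))


def predict_channel(score: Scorer, shots: Sequence[Example], query: str, labels: Sequence[str]) -> str:
    """Pick the label under which the model finds the query input most likely."""
    return max(labels, key=lambda y: score(channel_prompt(shots, y), " " + query))
```

With a real decoder-only model, `score` would sum the token log-probabilities of the continuation. Both predictors take the same callback, so the two formats can be compared on the same task and the same demonstrations.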
@dirkgr dirkgr closed this as completed May 23, 2022
@dirkgr dirkgr reopened this May 23, 2022