Provide Letta LLM Leaderboards #2013

Open
distributev opened this issue Nov 7, 2024 · 2 comments

distributev commented Nov 7, 2024

So that people will clearly see which LLMs work well with Letta and which do not.

This is similar to what aider does:

https://aider.chat/docs/leaderboards/

In the meantime, the Berkeley Function-Calling Leaderboard is a good enough proxy, because it ranks LLMs
by their function-calling ability (which is fundamental for Letta to work well):

https://gorilla.cs.berkeley.edu/leaderboard.html

There you can see that GPT-4o-mini-2024-07-18 (FC) is a great choice for Letta because it ranks 4th and
is very cheap.

P.S. I lost a huge amount of time trying MemGPT/Letta with lots of random LLMs,
including the "star" LLMs, only to get stack traces back, to the point of almost giving up.

Then, by luck, I tried GPT-4o-mini and it worked great, and when I saw the price I knew this was it.

sarahwooders (Collaborator) commented Nov 10, 2024

We just started working on this recently, so we will have updates soon. Anecdotally, we've also seen gpt-4o-mini (but not gpt-4o) have great cost/performance, so we generally recommend it.

distributev (Author) commented

Another good idea that is not difficult to put into practice is, again, something aider does (even though they also have the leaderboard). The third sentence on their home page is:

'Aider works best with GPT-4o & Claude 3.5 Sonnet and can connect to almost any LLM.'

They tell people upfront what works well and then, with the leaderboards, they provide more details for people
who need them.
