Add support for Ollama, Palm, Claude-2, Cohere, Replicate Llama2 CodeLlama (100+LLMs) - using LiteLLM #53
base: main
Conversation
@@ -66,7 +67,7 @@ def run(self, *args, **kwargs) -> Dict[str, Any]:
num_max_token = num_max_token_map[self.model_type.value]
num_max_completion_tokens = num_max_token - num_prompt_tokens
self.model_config_dict['max_tokens'] = num_max_completion_tokens
we also expose a util called get_max_tokens()
happy to expose this in this PR too
Thank you. Does litellm support more personalized parameters, such as temperature, top_n, etc.?
Yes, the following
@@ -15,6 +15,7 @@
from typing import Any, Dict

import openai
this import isn't used anymore it seems
Stoked to see this PR get merged!
bump @ishaan-jaff
@qianc62 yes we support all params OpenAI supports + we allow you to pass provider-specific params if necessary. More info here: https://docs.litellm.ai/docs/completion/input
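As a concrete illustration of the passthrough described above, a hedged sketch: the parameter names are the OpenAI-style ones the linked docs say LiteLLM mirrors, and the model string and values are illustrative, not from this PR.

```python
# Sketch only: OpenAI-style sampling knobs pass through litellm.completion
# unchanged (per the input docs linked above). Model string is illustrative.
completion_kwargs = {
    "model": "ollama/llama2",  # any LiteLLM-routed model string
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,        # same knobs ChatDev already sets for OpenAI
    "top_p": 0.9,
    "max_tokens": 256,
}
# response = litellm.completion(**completion_kwargs)  # requires `pip install litellm`
```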
@qianc62 any blockers to merging? Anything you need from me?
A couple of things to update here.

First problem: I needed to ignore OPEN_AI_API_KEY by setting it to some arbitrary value.

Second problem: ChatDev was sending too many arguments to Ollama, which I handled with:

Third problem: As I don't know how to create a real model class for the LiteLLM models with all the required information, I just used GPT_3_5_TURBO as my model, but then in model_backend.py I replaced the response with:

Fourth (bigger) problem I encountered: LiteLLM's OpenAI API seems to be a newer version than ChatDev's, which causes the response (completion) to return "logprobs" inside the "choices" list back to ChatDev, which then causes multiple errors as ChatDev doesn't support logprobs. With a crude hack (removing the "logprobs" from the response) I managed to get past this error.

Anyway, here is the early chat with my Mistral 7B (Chief Product Officer) writing some crude code for my request.
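The "crude hack" for the fourth problem can be sketched as a small response filter. This is an assumed reconstruction, not the commenter's actual code; the response shape follows the OpenAI chat-completion dict format.

```python
def strip_logprobs(response: dict) -> dict:
    """Drop the 'logprobs' field from each choice so an older OpenAI-format
    consumer (ChatDev here) doesn't trip over it. Assumed reconstruction of
    the workaround described above."""
    for choice in response.get("choices", []):
        choice.pop("logprobs", None)
    return response
```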
Hey @venim1103 did the proxy not work for you?
I am extremely interested in this PR.
Hey @venim1103 I've filed your issue re: logprobs. I'll make sure we have a fix for this on our (litellm) end. Extremely sorry for the frustration that must've caused.
Wait, we don't need to change openai_api_base to a local url?
@krrishdholakia Thank you! Anyway, my initial testing with the Mistral 7B model has some issues (the model itself doesn't really understand the "<INFO" context and is mostly too chatty or starts changing the subject too early, thus not moving through the process).
@cielonet I'm the maintainer of litellm. I can't see the exact issue you faced. Is this because we raise errors for unmapped params?
Some context would be helpful - I'd like to get this fixed on our end ASAP.
@krrishdholakia No prob. I'm currently out of town and will be back on Monday. I'll repost the error msg I was getting. It looked to me like the msg "expected string or buffer" was generated by litellm because a value (I think it was part of the logging key) in the API call was not correctly formatted. When I ran it with raiseExceptions=False the API calls never sent that particular field and the system started working again. I did use the logging http copy/paste, so if you have access to the logs/feedback people submit you should see mine from Thursday when I was working on this (e.g. focus on looking for "expected string or buffer").

Anyway, like I said, I will be back Monday and will provide more feedback. I suggest adding a timeout to your telemetry as well in case internet is not available, because otherwise it freezes the system, and it was a pain to figure out that the telemetry was causing everything to pause until it finds an internet connection. :-/ Thanks again.
How about the PR?
I've tried those changes locally, and running the code with the Azure OpenAI service doesn't seem to work. I'll let you know if I get it to function.
@OhNotWilliam we don't log any of the responses - it's all client-side (even the web url you saw was just an encoded url string). If you have the traceback, please let me know - happy to help debug.
We've also had people running this via the local OpenAI proxy - https://docs.litellm.ai/docs/proxy_server
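The proxy route above also answers the earlier openai_api_base question: ChatDev's code stays unmodified and only the OpenAI client's base URL is repointed. A hedged config sketch, with port, model string, and CLI shape assumed from the linked proxy docs rather than taken from this PR:

```shell
# Sketch, assuming the LiteLLM proxy CLI from the docs linked above.
# Start a local OpenAI-compatible server in front of any backend model:
litellm --model ollama/llama2

# Then point ChatDev's unmodified OpenAI client at it:
export OPENAI_API_BASE="http://0.0.0.0:8000"   # port per the proxy docs
export OPENAI_API_KEY="anything"               # dummy value; the proxy ignores it
```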
@OhNotWilliam: Check my PR #192, which gets Azure working.
Love it, with several little fixes to go.
Any movement on getting this PR merged?
Where do we stand on this? What is still outstanding/how can I help?
Hi, is this still open? Very confused.
Any update on when this will be implemented?
Ollama announced OpenAI compatibility, making LiteLLM irrelevant.
@TGM thank you for the heads-up.
If anyone has documentation clearly explaining how to implement what is described in the title of this issue/PR, please share it. Thanks a lot. Happy coding.
litellm is great. I'm looking forward to seeing this get introduced to ChatDev. Not sure if you're looking to add support directly right now, but if so you may want to either add or generalize entries in ModelTypes, num_max_token_map, etc.
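A hedged sketch of the generalization suggested here. The names ModelType and num_max_token_map come from the diff context in this PR; the new entry, its value string, and its context-window size are assumptions for illustration only.

```python
from enum import Enum

class ModelType(Enum):
    # existing ChatDev-style entry (value is the OpenAI model name)
    GPT_3_5_TURBO = "gpt-3.5-turbo"
    # a generalized, LiteLLM-routed entry might look like this (illustrative)
    OLLAMA_LLAMA2 = "ollama/llama2"

num_max_token_map = {
    ModelType.GPT_3_5_TURBO.value: 4096,
    ModelType.OLLAMA_LLAMA2.value: 4096,  # assumed context window for the local model
}
```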
@@ -66,7 +67,7 @@ def run(self, *args, **kwargs) -> Dict[str, Any]:
num_max_token = num_max_token_map[self.model_type.value]
num_max_completion_tokens = num_max_token - num_prompt_tokens
self.model_config_dict['max_tokens'] = num_max_completion_tokens
- response = openai.ChatCompletion.create(*args, **kwargs,
+ response = litellm.completion(*args, **kwargs,
When testing this for Claude, line 78 below ("if not isinstance(response, Dict)") fails because the response is an instance of ModelResponse. Similar thing happening in chat_agent.py line 192.
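One way to reconcile that type check is to normalize the response to a plain dict before ChatDev's guard runs. Sketched here with a stand-in class: LiteLLM's real ModelResponse lives in the litellm package, and both the `dict()` conversion method and the `to_dict` helper name are assumptions for illustration.

```python
class ModelResponse:
    """Stand-in for litellm's ModelResponse, used here only for illustration."""
    def __init__(self, data):
        self._data = data
    def dict(self):  # assumed conversion method on the real class
        return self._data

def to_dict(response):
    """Normalize a completion response to a plain dict before ChatDev's
    `isinstance(response, Dict)` check runs (hypothetical helper)."""
    if isinstance(response, dict):
        return response
    return response.dict()
```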
If this feature is still desired, I'd be happy to help facilitate the merge.
all is good
What do you mean? Merged or not?
So you need to update to jiter 0.6.1 and openai 1.52.1 in requirements.txt; without that, ChatDev doesn't start. Tested with LLMStudio: it's still trying to connect to OpenAI, not to the local server on LLMStudio. I set OPENAI_API_KEY, OPENAI_API_BASE and MODEL. What else needs to be done?
This PR adds support for the above mentioned LLMs using LiteLLM https://github.com/BerriAI/litellm/
LiteLLM is a lightweight package to simplify LLM API calls - use any LLM as a drop-in replacement for gpt-3.5-turbo.
Example
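(The original example snippet did not survive the page scrape; the following is a hedged reconstruction of the drop-in pattern the description names. The model strings are illustrative, and `pip install litellm` plus provider credentials would be needed to run the real call.)

```python
# Hedged reconstruction of the drop-in pattern described above, not the PR's
# original example. The call shape stays the same across providers; only the
# model string changes.
# from litellm import completion  # requires `pip install litellm`

messages = [{"role": "user", "content": "Write a hello-world in Python"}]

for model in ["gpt-3.5-turbo", "claude-2", "command-nightly", "ollama/llama2"]:
    pass  # response = completion(model=model, messages=messages)
```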