Weird tool call arguments, resulting in UnexpectedModelBehaviour / validation error #81
@samuelcolvin commented:
weird, not sure what's going on, I ran your code, and it worked first time. I ran it directly as a script, and it worked fine:

```python
from enum import Enum
from textwrap import dedent
from typing import List

from pydantic import BaseModel, Field
from pydantic_ai import Agent, CallContext


class Question(BaseModel):
    reflection: str = Field(..., description='Considering the questions and answers so far, what are things we can ask next?')
    question: str = Field(..., description='The question to ask the other player')


asking_agent = Agent('openai:gpt-4o', result_type=Question)


@asking_agent.system_prompt
async def asking_agent_system_prompt(ctx: CallContext[List]) -> str:
    turns = ctx.deps
    prompt = dedent("""
        You are playing a game of 20 questions.
        You are trying to guess the object the other player is thinking of.
        In each turn, you can ask a yes or no question.
        The other player will answer with "yes", "no".
    """).strip()
    if len(turns) > 0:
        prompt += '\nHere are the questions you have asked so far and the answers you have received:\n'
        prompt += '\n'.join([' * ' + turn for turn in turns])
    return prompt


class Answer(str, Enum):
    YES = 'yes'
    NO = 'no'
    YOU_WIN = 'you win'


class AnswerResponse(BaseModel):
    reflection: str = Field(..., description=(
        'Considering the question, what is the answer? '
        'Is it "yes" or "no"? Or did they guess the '
        'object and the answer is "you win"?'))
    answer: Answer = Field(..., description='The answer to the question - "yes", "no", or "you win"')


answering_agent = Agent('openai:gpt-4o', result_type=AnswerResponse)


@answering_agent.system_prompt
async def answering_agent_system_prompt(ctx: CallContext[str]) -> str:
    prompt = dedent(f"""
        You are playing a game of 20 questions.
        The other player is trying to guess the object you are thinking of.
        The object you are thinking of is: {ctx.deps}.
        Answer with "yes" or "no", or "you win" if the other player has guessed the object.
    """).strip()
    return prompt


def twenty_questions(mystery_object):
    turns = []
    while True:
        question = asking_agent.run_sync('Ask the next question', deps=turns).data.question
        answer = answering_agent.run_sync(question, deps=mystery_object).data.answer.value
        if answer == Answer.YOU_WIN:
            print('You Win!')
            break
        elif len(turns) >= 20:
            print('You Lose!')
            break
        else:
            turns.append(f'{question} - {answer}')
            print(f'{len(turns)}. QUESTION: {question}\nANSWER: {answer}\n')


twenty_questions('a cat')
```

output:
Yes, it also works most of the time for me. Just not all the time. The LLMs
are non-deterministic and sometimes they do weird things. My point is that
it's good to be defensive, and strict mode is one way to do this.
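To illustrate the failure mode being discussed, here is a minimal sketch in plain Pydantic (model and field names mirror the example above, the payloads are made up for illustration) of what happens when a tool-call payload comes back with a hallucinated "_" prefix on an argument name: the required field is then missing and validation fails, which is the defensive behaviour you want.

```python
from pydantic import BaseModel, ValidationError


class Question(BaseModel):
    reflection: str
    question: str


# A well-formed tool-call payload validates cleanly:
good = Question.model_validate(
    {'reflection': 'We know it lives indoors.', 'question': 'Is it a cat?'}
)

# A payload where the model prefixed a key with "_" fails validation,
# because the required field `question` is now missing (the unknown
# `_question` key is ignored by default):
try:
    Question.model_validate(
        {'reflection': 'We know it lives indoors.', '_question': 'Is it a cat?'}
    )
except ValidationError as exc:
    print(exc.error_count(), 'validation error')
```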
…On Thu, 21 Nov 2024 at 19:28, Samuel Colvin ***@***.***> wrote:
output:
1. QUESTION: Is it something commonly found indoors?
ANSWER: yes
2. QUESTION: Does it use electricity?
ANSWER: no
3. QUESTION: Is it used for storage?
ANSWER: no
4. QUESTION: Is it used for entertainment purposes?
ANSWER: no
5. QUESTION: Is it used for cleaning?
ANSWER: no
6. QUESTION: Is it a piece of furniture?
ANSWER: no
7. QUESTION: Is it used for writing or drawing?
ANSWER: no
8. QUESTION: Is it used for personal grooming or hygiene?
ANSWER: no
9. QUESTION: Is it used in the kitchen?
ANSWER: no
10. QUESTION: Is it related to health or safety?
ANSWER: no
11. QUESTION: Is it used for decoration?
ANSWER: no
12. QUESTION: Is it used for organizing?
ANSWER: no
13. QUESTION: Is it used for communication?
ANSWER: no
14. QUESTION: Is it used for comfort or relaxation?
ANSWER: yes
15. QUESTION: Is it something you can wear indoors?
ANSWER: no
16. QUESTION: Is it something you can sit or lie on?
ANSWER: no
17. QUESTION: Is it something you can hold or carry?
ANSWER: yes
18. QUESTION: Is it a textile item like a pillow or a blanket?
ANSWER: no
19. QUESTION: Is it used to provide warmth?
ANSWER: no
20. QUESTION: Is it something you use to hold or support things?
ANSWER: no
You Lose!
Thanks, yup I'll look into it.
See https://github.com/intellectronica/pydantic-ai-experiments/blob/main/scratch.ipynb
See the prefixed "_" in the tool call arguments. It's not there on earlier calls. Possibly a hallucination. I think this can be avoided with strict mode. Would be great to have it as an option for OpenAI calls.
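For context, strict mode is set on the tool definition sent to the OpenAI API. A minimal sketch of what such a definition could look like for the `Question` result type above (the tool name and hand-written schema here are illustrative assumptions, not what pydantic-ai actually sends):

```python
import json

# Hypothetical OpenAI function-tool definition with strict mode enabled.
# With "strict": True the API constrains generation to exactly this JSON
# schema: every listed property must appear, and no additional (or renamed,
# "_"-prefixed) properties are allowed.
tool = {
    'type': 'function',
    'function': {
        'name': 'final_result',  # illustrative tool name
        'description': 'The next question to ask the other player',
        'strict': True,
        'parameters': {
            'type': 'object',
            'properties': {
                'reflection': {'type': 'string'},
                'question': {'type': 'string'},
            },
            # Strict mode requires every property to be listed as required
            # and additionalProperties to be False:
            'required': ['reflection', 'question'],
            'additionalProperties': False,
        },
    },
}

print(json.dumps(tool, indent=2))
```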