Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization Issues with PydanticObjectId and Link in Beanie 1.27.0 [BUG] #1043

Open
martincpt opened this issue Oct 8, 2024 · 7 comments

Comments

@martincpt
Copy link

Describe the bug

Since upgrading to Beanie 1.27.0, we've encountered several serialization problems with PydanticObjectId and Link objects.

It seems that PydanticObjectId no longer automatically casts itself to a string primitive during Pydantic serialization.

To Reproduce

Example 1:

class MyDoc(Document):
    related_id: str

MyDoc(related_id=PydanticObjectId())

Ends up with pydantic_core._pydantic_core.ValidationError.

Example 2:

# When testing with FastAPI's TestClient - converting dict to JSON
client.post(endpoint, json={"related_id": PydanticObjectId()})

Ends up with TypeError: Object of type PydanticObjectId is not JSON serializable.

The above snippets worked fine in version 1.26.0 but now fails in 1.27.0.

Expected behavior
The expected behavior is that PydanticObjectId should automatically cast itself to a string during Pydantic serialization, as it did in version 1.26.0.

Additional context
We traced the issue to the introduction of a new when_used="json" attribute in the __get_pydantic_core_schema__ methods, which seems to be the core cause of the problem.

We acknowledge that our type casting may not have been very strict, but the previous behavior was convenient and worked seamlessly. Without this attribute, all serialization appears to function as expected.

Questions:

  1. What was the rationale behind introducing when_used="json"?
  2. Was this change—specifically the loss of automatic string serialization—intentional, or can we expect a fix in future releases?

Thank you for your hard work and for maintaining such an awesome package!

@danielxpander
Copy link

Same for me. Rollbacked to 1.26.0

@adeelsohailahmed adeelsohailahmed added the bug Something isn't working label Oct 10, 2024
@07pepa
Copy link
Member

07pepa commented Oct 10, 2024

hi @danielxpander @martincpt

  1. rationale begind when_used="json" is it should convert only in json serialization cases otherwise it should conserve its native type for naive types. PydanticObjectId is de facto naive type for mongo
  2. imho No , Mongo can save PydanticObjectId in non _id field i consider this an anipatern (since you should store stuff as close as original of its form as possible. I strogly recomending to run pymongo or compas and fix type of related_id in your database (it can bring speed improvement since it is stored natively)

BTW secondly this makes valid MyDoc(related_id="🔥") i am guessing this will not work and by looks of things i am guessing this is referencing another table and that wont work

lets recapitulate
Imho

class MyDoc(Document):
    related_id: str

MyDoc(related_id=PydanticObjectId())

Should expected to fail in same way putting arbitrary object into another object that does not inherit 1st without conversion.
It was bug from start.

also Here is workaround i am proposing for you code if you do not want to convert your database should work as well

class MyDoc(Document):
    related_id: PydanticObjectId

MyDoc(related_id=PydanticObjectId())

this is probably what you wanted in 1st place but i would double check and convert data

@07pepa 07pepa removed the bug Something isn't working label Oct 10, 2024
@07pepa
Copy link
Member

07pepa commented Oct 14, 2024

@danielxpander @martincpt please share your thoughts... but there is definite rationale and i consider this was serialization bug and carry over from pydantic v1

@bkis
Copy link

bkis commented Oct 23, 2024

Rationale or not, this is still a breaking change in a minor release.

While I can see how it's debatable whether type(model.model_dump()["id"]) should be <class 'beanie.odm.fields.PydanticObjectId'> or str, it is a breaking change that is causing problems.

I have a really big codebase relying on PydanticObjectId to be "python-serialized" as str and this is now a problem as this change is not backwards compatible.

@martincpt
Copy link
Author

@07pepa

In this specific case, we have a logging protocol where we chose to store the value as a string because it's not always a PydanticObjectId. Of course, in other cases, we prefer to store data in its original form, specifically using a Link type (a.k.a DBRef).

I understand your perspective, but I feel it might be a bit strict. I don't think the previous, more lenient serialization was a significant issue. As long as it's not introducing a security risk, I would be in favor of keeping it as is—or at least giving users the option to choose whether the package should allow serialization from a string or not.

@marwan-alloreview
Copy link
Contributor

Just want to make sure that the 1.27.0 behaviour will not change ?

I agree that this was a breaking change. But in my opinion, when using model_dump with mode "python" it should not serialize ObjectIds to string. So the new behaviour makes more sense (at least to me).

I am considering relying on the 1.27.0 behaviour but I want to make sure it won't be changed back to the 1.26.0 behaviour in future versions ?

@martincpt
Copy link
Author

I have to admit, I hadn’t noticed that model_dump was converting ObjectId to a string, which is indeed not ideal—I’d also prefer the original format.

However, this raises a question for me: could there be a solution where model_dump retains the ObjectId, but in the opposite case—if the string format is valid—it could convert a string back to an ObjectId?

I believe this wouldn’t necessarily conflict with fundamental principles. What do you think, @07pepa?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants