Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to determine whether field name interning is enabled for JsonParser #1211

Open
chokoswitch opened this issue Feb 9, 2024 · 3 comments
Labels
pr-needed Feature request for which PR likely needed (no active development but idea is workable)

Comments

@chokoswitch
Copy link

I have custom deserialization logic of protocol buffers and I took advantage of field names being interned to improve performance.

https://github.com/curioswitch/protobuf-jackson/blob/main/src/main/java/org/curioswitch/common/protobuf/json/ParseSupport.java#L458

Actually, at the time I didn't know it is possible to disable interning. I was able to create a failing test for it

https://github.com/curioswitch/protobuf-jackson/blob/main/src/test/java/org/curioswitch/common/protobuf/json/JacksonInteropTest.java#L20

It would be great to be able to know if the parser is interning to be able to handle this case correctly while also skipping non-reference equality check when it's fine to do so. Currently JsonParser only exposes parser features:

https://fasterxml.github.io/jackson-core/javadoc/2.8/com/fasterxml/jackson/core/JsonParser.html#getFeatureMask()

But there is no access to the factory features. Naturally JsonParser is not the right level for that but I wonder would it be possible to expose factory features from ParserBase? Or it could just be the three standard parser implementations individually exposing it. If we could check the factory features, or more specifically whether the symbol finder is interning strings, then it would be possible to skip .equals for name comparisons.

I looked at databind, and while I'm not completely sure I understand the code, I think this optimization could be applied there too

https://github.com/FasterXML/jackson-databind/blob/2.17/src/main/java/com/fasterxml/jackson/databind/deser/impl/BeanPropertyMap.java#L575

The .equals could be skipped when it's known the fields are interned.

@cowtowncoder
Copy link
Member

Right, factory features are expected to only be accessible via factory.
But I can see why it might be useful to know it from parser instance.

I am open to adding a mechanism but probably not by passing factory features -- rather it'd probably be by exposing symbol table in use since that I think ultimately has that information. And yes, that'd need to be at level where this can be accessed.

@chokoswitch
Copy link
Author

Thanks @cowtowncoder so I guess it could be an accessor to the symbol table on the three main parser implementations. I'm a bit unsure what the exact form of the appropriate API would be, if you have a suggestion I could try to send a PR for it.

@cowtowncoder
Copy link
Member

@chokoswitch I haven't looked into this but... ok, there doesn't seem to be any really good places to do this since in 2.x there is no common JSON-specific intermediate parser base class (3.0 adds one). So specific symbol table is separately passed and used by all 4 backends (ReaderBasedJsonParser etc).
Further, ByteQuadsCanonicalizer and CharsToNameCanonicalizer do not implement common interfaces.

... and of course you originally mentioned this in context of Protobuf, not JSON. So it'd need to be in JsonParser.

So perhaps there'd be need for separate new introspection method (capability) similar to, say

public boolean canParseAsync();

but more like

public boolean doesInternFieldNames();

and that'd be implemented by backends (but also requires default implementation that just returns false, for backwards compatibility.

@cowtowncoder cowtowncoder added the pr-needed Feature request for which PR likely needed (no active development but idea is workable) label Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-needed Feature request for which PR likely needed (no active development but idea is workable)
Projects
None yet
Development

No branches or pull requests

2 participants