use codegen to generate document validation function. #139
base: master
…antly faster than old code.
Thanks @colprog! I like the idea of splitting validate into smaller chunks, though I was thinking of splitting it a different way (grouping the checks for undefined/null together). That being said, I have a few questions:
|
In my current project, thinky is a major bottleneck: it easily takes 70% of my CPU time, and validation is one of the hot spots both when parsing and when saving. I'm using generate-function to dynamically generate a validation function for each model specifically, so for a simple model like this:
it generates a tailored validation function:
thus making it fast. In my micro-benchmark of parsing a User document, the validation function runs 10x faster, making parsing 30% faster. This performance gain should be larger with a more complex schema. |
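As a rough illustration of the codegen idea described above (not the code from this pull request), a per-model validator can be built as source text and compiled once. This sketch uses the built-in `Function` constructor instead of generate-function so that it stays self-contained. The schema format, `compileValidator`, and the error message wording are all hypothetical.

```javascript
// Hypothetical schema format: field name -> expected `typeof` result.
// This is NOT thinky's schema language, only a stand-in for illustration.
const userSchema = { name: 'string', age: 'number' };

// Build the source of a validator specialised for one schema, then compile
// it once. The schema is walked here, at compile time, not on every call.
function compileValidator(schema) {
  const lines = [];
  for (const [field, type] of Object.entries(schema)) {
    const key = JSON.stringify(field);
    const expected = JSON.stringify(type);
    const message = JSON.stringify('Value for [' + field + '] must be a ' + type + '.');
    lines.push(
      'if (typeof doc[' + key + '] !== ' + expected + ') {',
      '  throw new Error(' + message + ');',
      '}'
    );
  }
  // The built-in Function constructor stands in for generate-function here.
  return new Function('doc', lines.join('\n'));
}

const validateUser = compileValidator(userSchema);

// A valid document passes silently; an invalid one throws.
validateUser({ name: 'Ann', age: 3 });
```

The generated body contains only the two `typeof` checks for this schema, which is the property the comment above credits for the speedup.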
Hum, I don't get why generating a function would make things faster. Do you have any hint on that? From what I can see, the reason it's faster is because the function I did spend quite some time optimizing rethinkdbdash for V8, but didn't do it for thinky because it was some work and no one had this problem before. |
That's great. I enjoy using thinky, but right now it is way too slow for production use. The generated version is not only more JIT-friendly, it also avoids the overhead of traversing the schema every time; it contains only what really needs to be done. I would think the latter is more important here: since thinky's schema language is flexible, __validate contains a lot of type-checking code, all of which is avoided in the generated version. This is very measurable: this single change increased my application throughput by 10%. FYI, I know RethinkDB has gotten rid of protobuf, but you should check out https://github.com/mafintosh/protocol-buffers. It is pure JS, but it outperforms node-protobuf by at least 5x. |
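To make the point about schema traversal concrete, here is a small sketch contrasting an interpretive validator with a specialised one. The two declaration forms, the `_type` key, and both function names are assumptions made for illustration, not thinky's actual internals.

```javascript
// Hypothetical, simplified schema: a field may be declared as a type name
// ("string") or as an object ({ _type: "number" }).
const schema = { name: 'string', age: { _type: 'number' } };

// Interpretive validation: every call re-inspects the schema itself.
function validateGeneric(doc, schema) {
  for (const field of Object.keys(schema)) {
    const decl = schema[field];
    // Work repeated on each call: decide which declaration form was used.
    const expected = typeof decl === 'string' ? decl : decl._type;
    if (typeof doc[field] !== expected) {
      throw new Error('Value for [' + field + '] must be a ' + expected + '.');
    }
  }
}

// Specialised equivalent: the schema decisions are already resolved,
// so only the checks on the document remain.
function validateSpecialised(doc) {
  if (typeof doc.name !== 'string') {
    throw new Error('Value for [name] must be a string.');
  }
  if (typeof doc.age !== 'number') {
    throw new Error('Value for [age] must be a number.');
  }
}
```

Both functions accept and reject the same documents. The difference is that the generic one pays for inspecting the schema on every document, while the specialised one does not.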
Just a heads up:
I'm almost done, but for a little script that creates 40k documents and validates them, I get the same performance without using generated functions. There is also one more thing that we could maybe do. Currently, creating a new document copies the value given, so you can do something like
But it's not the most common case (I think?), and for big documents, doing deep copies can become expensive. |
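The trade-off between copying and sharing the input can be sketched as follows. `deepCopy`, `createWithCopy`, and `createWithoutCopy` are hypothetical helpers, and the JSON round trip is only a stand-in for whatever copy routine the library uses.

```javascript
// Stand-in deep copy for plain JSON-like data (an assumption for this sketch).
function deepCopy(value) {
  return JSON.parse(JSON.stringify(value));
}

// Copying: the document is isolated from later changes to the input,
// at the cost of walking the whole value.
function createWithCopy(values) {
  return deepCopy(values);
}

// Sharing: no copy cost, but the document and the input are the same object.
function createWithoutCopy(values) {
  return values;
}

const input = { name: 'Michel', address: { city: 'Paris' } };
const copied = createWithCopy(input);
const shared = createWithoutCopy(input);

input.address.city = 'Lyon';
// copied.address.city is still 'Paris'; shared.address.city is now 'Lyon'.
```

Skipping the copy saves work that grows with document size, but callers can then no longer reuse or mutate the object they passed in without affecting the document.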
That is great news. |
I got to test the new validator in 1.15.1. It is about 10-15% slower than the codegen version, but I think this is still a nice speedup! Good work, neumino! |
Thanks @colprog! |
If you have some time @colprog, could you test 1.15.2 -- or master?
|
I just refactored my test a few days ago. Now your version is about 10% faster than mine, but I still think codegen is the way to go for validation. I will try to integrate your recent changes into the codegen version and see how it goes. |
this is significantly faster than old code for non-trivial schemas.