Should cargo-criterion support baselines? #10
We're currently improving our benchmarking suite at https://github.com/timberio/vector, and we figured we'd like a way to compare benchmarking results in the long run. This is tricky, and we're looking into correct ways to implement it. We're currently thinking about storing the bench data output from the PR in question.

I also find myself in the position where I need to compare seemingly arbitrary benchmarks against each other. For instance, I have a suite where the same benchmarking code is run in different "environments" (i.e. tracing on and tracing off), and sometimes we iterate on those layers. I need to compare both two benchmarks against each other within a run, and two versions of the code against each other across runs. I.e. the flow looks like this:

$ git checkout master
$ cargo bench
# produces:
# - env1/bench1
# - env1/bench2
# - env2/bench1
# - env2/bench2
# (to be used as base)
$ git checkout mychange
$ cargo bench
# produces:
# - env1/bench1
# - env1/bench2
# - env2/bench1
# - env2/bench2
# (to be used as new)
$ critcmp (or similar)

It would make sense to compare the base results against the new ones. Does this make sense? I hope this feedback will be helpful. If you'd like to chat, we're at http://chat.vector.dev/ cc @jszwedko
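For reference, Criterion.rs itself (though not yet cargo-criterion) can approximate this flow today with its --save-baseline and --baseline flags, and critcmp can compare two saved baselines by name. A sketch, assuming the env*/bench* names above:

$ git checkout master
$ cargo bench -- --save-baseline base    # records env*/bench* under the name "base"
$ git checkout mychange
$ cargo bench -- --save-baseline new     # records the same benchmarks under "new"
$ critcmp base new                       # compares every benchmark present in both baselines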
So I thought I might just explain my workflow here. Seeing this issue now made me remember an email I got from you that I never responded to. :-( Sorry about that. It slipped down my inbox and I ended up forgetting about it. I'll do my best to explain my workflow; I've been using this kind of flow for a long time.

So basically, I start off by running all the benchmarks and saving their output under a name to serve as the baseline. But there are other workflows too. Only being able to compare benchmarks with the same name across distinct runs is incredibly limiting. I also want to be able to compare benchmarks within runs. For example, I might want to compare something like env1/bench1 against env2/bench1 from a single run (see the sketch below). And there's also the presentation aspect to consider.
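A sketch of the within-run case, assuming critcmp's -g/--group flag, which takes a regex whose capture group defines the key used to line benchmarks up against each other (treat the exact invocation as an assumption):

$ cargo bench -- --save-baseline base
$ critcmp base -g '(?:env1|env2)/(.*)'   # capture group matches "bench1"/"bench2",
                                         # so env1/bench1 is compared against env2/bench1, etc.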
Happy to answer any questions about my workflow. It's a little hard to describe, so I'd be happy to elaborate on any unclear points, or on why I didn't use X feature in Criterion. (It is plausible that I didn't know about X, whatever it is. :-))
I had this idea kicking around that baselines could be replaced with alternate "timelines" (since cargo-criterion will hopefully soon support a full-history chart).
I was never very satisfied with the workflow of Criterion.rs' baselines (and others seem to agree on that, e.g. critcmp largely exists to make up for deficiencies in the workflow supported by baselines). The thing is, I have no idea what sort of workflow would work better.
This will require some design work. Probably won't be available in 1.0.0.
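Purely as a strawman for that design work (nothing below exists yet; every command and flag is hypothetical), a timeline-based interface might look something like:

$ git checkout master
$ cargo criterion --timeline master        # hypothetical: record results on the "master" timeline
$ git checkout mychange
$ cargo criterion --timeline mychange      # hypothetical: record results on a separate timeline
$ cargo criterion compare master mychange  # hypothetical: report deltas between the two timelines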