Benchmark polling system powered Ember and friends #3692
Comments
CI is green in both http4s and Skunk. |
I wrote the laziest server 😅

    //> using dep co.fs2::fs2-io::3.8-1af22dd
    //> using dep org.http4s::http4s-ember-server::0.23.20

    import cats.effect.*
    import org.http4s.ember.server.EmberServerBuilder

    object App extends IOApp.Simple:
      def run = EmberServerBuilder.default[IO].build.useForever

Then I benchmarked it with
FS2 3.7.0
FS2 3.8-1af22dd
|
Thanks to @ChristopherDavenport for doing some additional benchmarking! tl;dr Ember has now surpassed Blaze 🔥 https://gist.github.com/ChristopherDavenport/2e5ad15cd293aa0816090f8677b8cc3b |
w00t! 🔥
On throughput but not latency, right? Also, has something changed in Ember's latency? cc @ChristopherDavenport Or maybe I'm just misreading the results...
Ember Before
Ember After
|
Nothing that I'm aware of on the server, which is what this tested. This was just a couple of one-off tests on my local machine. It's very possible I screwed something up by actually using the machine for something during the period (although I tried not to). To get more concrete answers we'd need to actually spin something up to test properly, with something like Gatling, and for latency I'd need to switch from wrk to wrk2. I'll see if I can run the latter later. |
I've been spending quite a bit of time on this today, and I'm not seeing anything like a 20% improvement. It's possible that results are much better at low levels of concurrency, as the TFB results are quite similar, but it looks like there's a small improvement when enabling polling.

There's some variance, perhaps more so than usual because it's a very warm day here in Oslo, a bit over 30 degrees Celsius, so I ran it several times. I was wondering if maybe there was some important fix in 0.23 that hadn't been merged into main yet, so I also ran it with 0.23.20, but the overall picture is similar.

I then ran the Ember benchmark ("simplest") by @ChristopherDavenport using wrk with 300 connections and otherwise the same settings TFB uses. I first ran it with Eclipse Temurin JDK 17 (just starting RealApp from within IntelliJ, using [...]

They seem to be close enough that I'd be hesitant to call it anything other than a draw. Enabling pipelining (as TFB does) improves the numbers and seems to make them more consistent, but there's still not much of a difference when using polling.

Results get better, but variance increases, when running with [...]

I think the results are promising, as nothing seems to have gotten worse and it quite likely is better, but I'd love to see more benchmark results. 20% may be possible, but only in some circumstances, it seems? |
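The "close enough to call it a draw" judgment above can be made a little more mechanical. What follows is a minimal sketch, not anything used in this thread: the requests/sec figures are made-up placeholders, the names `Summary`, `summarize` and `looksLikeDraw` are hypothetical, and the rule (gap between means no larger than the sum of the two standard deviations) is a crude rule of thumb rather than a proper significance test.

```scala
// Sketch only: every number below is a placeholder, NOT a measurement from this thread.
final case class Summary(mean: Double, stdDev: Double)

// Mean and sample standard deviation (n - 1 denominator) of repeated runs.
def summarize(runs: Seq[Double]): Summary =
  require(runs.size >= 2, "need at least two runs to estimate spread")
  val mean = runs.sum / runs.size
  val variance = runs.map(r => math.pow(r - mean, 2)).sum / (runs.size - 1)
  Summary(mean, math.sqrt(variance))

// Rule of thumb (an assumption, not a statistical test): treat two configurations
// as a draw when their means differ by no more than their combined run-to-run spread.
def looksLikeDraw(a: Summary, b: Summary): Boolean =
  math.abs(a.mean - b.mean) <= a.stdDev + b.stdDev

@main def compareRuns(): Unit =
  val withoutPolling = summarize(Seq(100000.0, 104000.0, 98000.0)) // hypothetical req/s
  val withPolling    = summarize(Seq(103000.0, 101000.0, 106000.0)) // hypothetical req/s
  println(s"without polling: $withoutPolling")
  println(s"with polling:    $withPolling")
  println(s"draw? ${looksLikeDraw(withoutPolling, withPolling)}")
```

With only a handful of runs per configuration the spread estimate is itself noisy, so more runs on a quiet (and cool) machine would make the comparison more trustworthy.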
Thank you for putting all the work into that! That's some really good information! |
For folks following along here: we've also published a snapshot of @antoniojimeneznieto's work implementing a JVM polling system based on io_uring. The initial prototype piggy-backs on Netty's internal io_uring APIs. Please give it a try! The linked PR demonstrates how to create a JVM Ember server using the fs2-io_uring snapshot. |
I ran the TFB benchmarks with my branch using your latest and greatest snapshots last weekend, and it wasn't worth writing about: slightly worse results than before. Well, turns out I didn't notice that [...]

Baseline: https://www.techempower.com/benchmarks/#section=test&shareid=9b70928b-24e8-4a39-a5dc-7832d8b02cd6&test=plaintext

On "meaningless" benchmarks you've made Ember approximately 340% faster, and I'd bet real money that it translates into real-world improvements as well. Whereas before my CPU cores (or vCPU cores) were at most 30% loaded when running those benchmarks, they're now blazing along at 100%, as they should. Fantastic. Amazing. Well done. |
What the holy crap. I expected io_uring to make things faster but I didn't expect it to be that much faster. Long way to go to productionalize this and I'm sure there's loads of stuff that'll get faster and slower along the way, but WOW. The fact that this is happening while we're still going through Netty's indirection is pretty impressive. |
@wjoel I had an interesting thought: these results almost certainly say something about our syscall overhead. If we're reaping this magnitude of performance improvement just from swapping out the polling system, then it's basically saying that Ember is almost entirely bounded by the overhead of NIO. We won't know for sure until we get some more polling systems into the mix (this is a three variable algebra problem and we only have two equations so far), but this is really fascinating. |
I've published a snapshot of FS2 I/O that integrates with the new polling system (based on CE 3.6-e9aeb8c). You can drop this dependency into any application using the latest stable versions of http4s Ember, Skunk, etc. Please try it out and report back!! We would really appreciate it 😊
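For sbt builds, a minimal sketch of the same drop-in might look like the fragment below. It reuses the versions from the scala-cli example in the comments (`3.8-1af22dd` and `0.23.20`) and assumes the snapshot resolves from the repositories the build already uses; the thread does not say where it is published, so an extra resolver may be needed.

```scala
// build.sbt (sketch; versions copied from the scala-cli example in this thread)
libraryDependencies ++= Seq(
  // Any stable Ember release; it pulls in a stable fs2-io transitively.
  "org.http4s" %% "http4s-ember-server" % "0.23.20",
  // Declaring the snapshot directly lets sbt's default "latest wins" conflict
  // resolution pick it over the older transitive fs2-io.
  "co.fs2" %% "fs2-io" % "3.8-1af22dd"
)
```

Running `sbt evicted` afterwards is a quick way to check which versions of fs2-io and Cats Effect actually end up on the classpath.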
Follow-up to: