
[DI] Batch outgoing http requests #5007

Open

watson wants to merge 1 commit into master
Conversation

watson (Collaborator) commented Dec 12, 2024

What does this PR do?

Adds a queue in front of the outgoing HTTP requests from the Dynamic Instrumentation devtools worker thread.

Two queues are added:

  • One for requests to /debugger/v1/diagnostics
  • One for requests to /debugger/v1/input

Both queues are configured in the same manner.

When a payload is added to either queue, it's buffered until either a 1 second timer fires or the size of the queue exceeds 5 MB, at which point the queue is flushed. On flush, all payloads waiting in the queue are sent to the HTTP endpoint in a single HTTP request.
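A minimal sketch of such a size-and-time-bounded buffer (hypothetical names and simplified logic, not the PR's exact code): payloads are concatenated into one JSON array string, and a flush fires on whichever comes first, the timeout or the size limit.

```javascript
'use strict'

// Sketch of a buffer that batches JSON payloads into a single JSON array,
// flushing when a timer fires or the buffered size exceeds a limit.
// Names (`JSONBuffer`, `write`) are illustrative assumptions.
class JSONBuffer {
  constructor ({ size = 5 * 1024 * 1024, timeout = 1000, onFlush }) {
    this._maxSize = size
    this._timeout = timeout
    this._onFlush = onFlush
    this._reset()
  }

  _reset () {
    clearTimeout(this._timer)
    this._timer = null
    this._partialJson = null
    this._size = 0
  }

  _flush () {
    const json = `${this._partialJson}]` // close the JSON array
    this._reset()
    this._onFlush(json)
  }

  write (str, size = Buffer.byteLength(str)) {
    if (this._timer === null) {
      // First payload: open the array and start the flush timer
      this._partialJson = `[${str}`
      this._size = size + 2 // account for the enclosing brackets
      this._timer = setTimeout(() => this._flush(), this._timeout)
    } else if (this._size + size + 1 > this._maxSize) {
      // Adding this payload would exceed the limit: flush, then retry
      this._flush()
      this.write(str, size)
    } else {
      this._partialJson += `,${str}`
      this._size += size + 1 // payload plus separating comma
    }
  }
}
```

The flush callback would be where the single batched HTTP request is made.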

Motivation

Without this queue, the tracer could produce up to 5,000 requests per second. This would overload the tracer.

Note to reviewers

Hide whitespace changes for a better diff

watson (Collaborator, Author) commented Dec 12, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

@watson force-pushed the watson/DEBUG-3228/batch-requests branch from 21bb3e1 to 6862b7c on December 12, 2024 at 17:55
github-actions bot commented Dec 12, 2024

Overall package size

Self size: 8.25 MB
Deduped: 94.82 MB
No deduping: 95.38 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.2.2 | 29.27 MB | 29.27 MB |
| @datadog/native-appsec | 8.3.0 | 19.37 MB | 19.38 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.6.0 | 2.58 MB | 2.72 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| source-map | 0.7.4 | 226 kB | 226 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe


pr-commenter bot commented Dec 12, 2024

Benchmarks

Benchmark execution time: 2024-12-12 18:26:48

Comparing candidate commit c8456c4 in PR branch watson/DEBUG-3228/batch-requests with baseline commit c6defbc in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 261 metrics, 5 unstable metrics.

@watson force-pushed the watson/DEBUG-3228/batch-requests branch from 6862b7c to c8456c4 on December 12, 2024 at 18:19
watson (Collaborator, Author) commented Dec 13, 2024

The system tests are now failing because we're sending an array instead of an object to the endpoints. Previously we were ignoring anything that wasn't an array, but now we've started validating the actual content, which means we're seeing errors we weren't seeing before.

We see two types of errors, both indicating a missing required field in the diagnostics messages: probeVersion and message. I believe these are actually caused by issues with the system-tests themselves and not this repo, so I've created a couple of PRs to fix the errors there:

@shatzi left a comment:


This is great, thank you.

Currently we only create snapshot probes, but once we have log probes we should take another pass at this to handle chatty probes. It would be great if we could unify the spec for that before we implement it.

this._onFlush(json)
}

add (str, size = Buffer.byteLength(str)) {
shatzi commented:

I think it's better to pass just the JSON and remove the size param.
If the caller gives a wrong value, it breaks the logic of the queue.

watson (Collaborator, Author) replied:

This is purely a performance optimization. In the snapshot code, the size is already calculated since we need to know if it's larger than 1 MB. So to avoid calculating it twice, we allow it to be passed in. It's not user-generated input.
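The caller pattern being described can be sketched roughly as follows (hypothetical names, `queue` and `payload` are illustrative stand-ins, not the PR's actual identifiers): the snapshot code computes the byte length once to enforce its own limit, then forwards it so `add` skips the default `Buffer.byteLength` call.

```javascript
'use strict'

// Stand-in queue: `add` recomputes the size only when it isn't provided.
const queue = {
  _entries: [],
  add (str, size = Buffer.byteLength(str)) {
    this._entries.push({ str, size })
  }
}

const payload = { probeVersion: 1, message: 'hello' } // illustrative payload
const str = JSON.stringify(payload)
const size = Buffer.byteLength(str) // computed once for the 1 MB check

if (size > 1024 * 1024) {
  // oversized snapshot: would be handled before queuing
}

queue.add(str, size) // reuse the precomputed size; no second byteLength call
```

Since the size originates inside the tracer rather than from user input, the trade-off is a trusted fast path rather than a validation gap.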

@@ -0,0 +1,36 @@
'use strict'

class JSONQueue {
shatzi commented:

nit: maybe JSONBuffer is better as this is not really a queue API

watson (Collaborator, Author) replied:

Yeah, I couldn't decide whether to call it a buffer or a queue; I was considering both names. In Node.js there's a Buffer class which is different, so the term is a bit overloaded. But I'll consider changing it. You're probably right.
