Simplify A/B-Test orchestration #4879

roypat · 2024-10-29T12:23:15Z

Changes

Make the A/B-Test script take two directories with firecracker/jailer binaries as arguments, instead of two git revisions

Reason

Since we extracted the build step in our buildkite infra, we refuse to compile firecracker inside of pytest. This made running A/B-Tests outside the buildkite infra essentially impossible: the two revisions were resolved to the directory paths that the buildkite infra would create, and the script failed if those directories did not exist. Additionally, the few git operations that the script performed on the revisions were very brittle and did not match how the extracted build step resolves git objects (so the build step could pass, but ab_test.py would then fail to understand the git objects).
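
As a rough illustration of the new calling convention (not the actual interface of tools/ab_test.py), a script that accepts two directories might validate them up front as sketched below. The binary names `firecracker` and `jailer` follow the description above; the function names and argument names are hypothetical.

```python
import argparse
from pathlib import Path

# Assumption: each directory holds binaries with exactly these file names.
REQUIRED_BINARIES = ("firecracker", "jailer")


def missing_binaries(directory: Path) -> list[str]:
    """Return the names of required binaries not found in `directory`."""
    return [name for name in REQUIRED_BINARIES if not (directory / name).is_file()]


def parse_args(argv=None) -> argparse.Namespace:
    """Parse two positional directories, A and B (hypothetical argument names)."""
    parser = argparse.ArgumentParser(description="A/B-test two sets of binaries")
    parser.add_argument("a_dir", type=Path, help="directory with the A binaries")
    parser.add_argument("b_dir", type=Path, help="directory with the B binaries")
    args = parser.parse_args(argv)
    for directory in (args.a_dir, args.b_dir):
        missing = missing_binaries(directory)
        if missing:
            # parser.error prints a usage message and exits with status 2.
            parser.error(f"{directory} is missing: {', '.join(missing)}")
    return args
```

Failing early with a clear message, rather than deep inside a test run, is the point of the sketch; how the real script reports missing binaries may differ.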

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

If a specific issue led to this PR, this PR closes the issue.
The description of changes is clear and encompassing.
Any required documentation changes (code and docs) are included in this PR.
API changes follow the Runbook for Firecracker API changes.
User-facing changes are mentioned in CHANGELOG.md.
All added/changed functionality is tested.
New TODOs link to an issue.
Commits meet contribution quality standards.

This functionality cannot be added in rust-vmm.

codecov · 2024-10-29T12:27:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.10%. Comparing base (3d0421f) to head (3d78dcc).
Report is 12 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4879   +/-   ##
=======================================
  Coverage   84.10%   84.10%           
=======================================
  Files         251      251           
  Lines       28080    28080           
=======================================
  Hits        23616    23616           
  Misses       4464     4464

Flag	Coverage Δ
5.10-c5n.metal	`84.67% <ø> (ø)`
5.10-m5n.metal	`84.65% <ø> (ø)`
5.10-m6a.metal	`83.96% <ø> (ø)`
5.10-m6g.metal	`80.78% <ø> (ø)`
5.10-m6i.metal	`84.65% <ø> (ø)`
5.10-m7g.metal	`80.78% <ø> (ø)`
6.1-c5n.metal	`84.67% <ø> (ø)`
6.1-m5n.metal	`84.65% <ø> (-0.01%)`	⬇️
6.1-m6a.metal	`83.96% <ø> (ø)`
6.1-m6g.metal	`80.78% <ø> (-0.01%)`	⬇️
6.1-m6i.metal	`84.65% <ø> (ø)`
6.1-m7g.metal	`80.78% <ø> (ø)`

Flags with carried forward coverage won't be shown.

The output of the spectre/meltdown checker and the vulnerability files on the host are not influenced by the local checkout of the firecracker repository, so these A/B-tests were no-ops. Fix this by only doing the non-PR assertion in the nightly pipeline. Signed-off-by: Patrick Roy <[email protected]>

Currently, we have two types of pre-PR A/B-Tests: those that depend on the repository as a whole (e.g. `cargo audit`, which checks properties of Cargo.toml), and those that depend on the pre-compiled firecracker binaries. At the moment, these tests can use the same infrastructure, because our buildkite shared build step pre-compiles binaries and then uploads the entire repository as an artifact, so both types of A/B-tests find what they're looking for in this artifact. However, this model makes it very difficult to run A/B-tests manually on a local machine (one needs to set up clones of the git repository at specific locations and then pre-compile firecracker with specific flags inside them). This patch series aims to simplify these manual A/B-tests, at the cost that the two types of pre-PR A/B-tests will require different infrastructure. Thus, prepare by providing different functions for each of these types of A/B-test to use. Signed-off-by: Patrick Roy <[email protected]>

The tools/ab_test.py script relies on precompiled binaries existing in well-defined locations, and refuses to compile firecracker itself if they're missing. It derives these locations from commit SHAs, so conceptually it makes more sense to cut out this middle step and pass in directories directly instead of SHAs (with the expectation that the directories contain firecracker and jailer binaries). However, the commit SHAs were still used to print the commit ranges over which A/B-tests were done. It turns out that this did not work in the vast majority of cases, as the commit log printing logic did not contain all the resolution logic that goes into the compilation step (e.g. for parsing revisions). So just remove that part. In the EMF metrics, we now tag each report with the directory path instead of the commit SHA. For the post-merge runs, this makes no difference, as the SHA is part of the path. Signed-off-by: Patrick Roy <[email protected]>

Passing `--rev` to `tools/devtool build` will do a checkout of the specified revision into a temporary directory, compile firecracker into it, and then copy the final binaries into build/$revision. This has advantages over our current approach for compiling A/B revisions: by only copying the final binaries into build/$revision, we save a lot of bandwidth when uploading artifacts for transfer between different buildkite steps (as we no longer include gigabytes of compilation artifacts). This means that all revisions are now compiled in the same docker container. Due to the rust-toolchain.toml, this should not have an impact on cross-toolchain testability, and it has the advantage that the cargo cache is shared (meaning we hit crates.io for downloads less often). Signed-off-by: Patrick Roy <[email protected]>

Update the documentation to reflect that the script now takes directories with binaries. The old documentation was wrong anyway, since ab_test.py would fail if it couldn't find binaries in directories named after the passed revisions according to the very specific naming scheme that our buildkite infra uses. Signed-off-by: Patrick Roy <[email protected]>

In the pre-PR A/B-tests we were compiling Firecracker. Instead, explicitly rely on the precompiled binaries that we set up in the shared buildkite build step. This has the slight downside that it makes it harder to run these tests locally, as you now need to explicitly pre-compile the binaries, but arguably running these vulnerability A/B-tests locally doesn't make sense anyway, because they specifically test the security configuration on our supported .metals. While we're at it, rename the ab_* utility functions to make it obvious that they expect pre-compiled binaries. Signed-off-by: Patrick Roy <[email protected]>

Otherwise, they clash with the directories used for the pre-compiled binaries (as the `build`-subdirectories no longer contain checked out repositories, but only binaries). Since each pipeline only uses git_ab_test in precisely one test, losing the re-use is not a big deal. Signed-off-by: Patrick Roy <[email protected]>

roypat requested review from xmarcalx, kalyazin, pb8o and Manciukic as code owners October 29, 2024 12:23

roypat force-pushed the ab-simplification branch 3 times, most recently from 8afc130 to 8381766 October 29, 2024 14:44

roypat added the Status: Awaiting review label Oct 29, 2024

pb8o reviewed Oct 29, 2024

.buildkite/pipeline_perf.py (outdated, resolved)

tools/ab_test.py (outdated, resolved)

tests/framework/ab_test.py (outdated, resolved)

tests/framework/ab_test.py (resolved)

roypat force-pushed the ab-simplification branch 5 times, most recently from 8e5e6cf to f154498 October 29, 2024 17:44

roypat requested a review from pb8o October 30, 2024 10:04

roypat force-pushed the ab-simplification branch from d3c6591 to 8e7bd10 October 30, 2024 14:48

roypat added 12 commits October 30, 2024 17:15

test: refactor: Simplify CpuMap._cpus

100c477

Eliminate the local variable and the for-break-else construct by simply returning early. Signed-off-by: Patrick Roy <[email protected]>
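
The refactoring pattern named here can be sketched generically. The functions below are hypothetical and are not the code of CpuMap._cpus; they only contrast the for-break-else form with an early return.

```python
def first_even_for_else(numbers):
    """Find the first even number using a local variable and for-break-else."""
    result = None
    for n in numbers:
        if n % 2 == 0:
            result = n
            break
    else:
        # Runs only if the loop finished without hitting `break`.
        raise ValueError("no even number found")
    return result


def first_even_early_return(numbers):
    """Same behaviour, but returning as soon as a match is found."""
    for n in numbers:
        if n % 2 == 0:
            return n
    raise ValueError("no even number found")
```

Both versions behave identically; the second needs neither the temporary variable nor the `else` clause.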

test: replace vendored chdir context manager with contextlib

ee302c4

Our docker container uses Python 3.12 these days, so follow the comment that says "once we upgrade past 3.11, use the stdlib". Signed-off-by: Patrick Roy <[email protected]>
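
For context, `contextlib.chdir` is available in the standard library from Python 3.11 onwards. The sketch below shows what a hand-rolled equivalent typically looks like; `vendored_chdir` and `cwd_inside` are hypothetical names and are not the helper that this commit removes.

```python
import contextlib
import os


@contextlib.contextmanager
def vendored_chdir(path):
    """Hypothetical hand-rolled helper: enter `path`, restore the old cwd on exit."""
    old_cwd = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_cwd)


def cwd_inside(chdir_cm, path):
    """Return the working directory observed while inside `chdir_cm(path)`."""
    with chdir_cm(path):
        return os.getcwd()
```

On Python 3.11 or newer, `cwd_inside(contextlib.chdir, path)` and `cwd_inside(vendored_chdir, path)` should observe the same directory, which is why the hand-rolled version becomes redundant.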

test: do not set host_os dimension to None

6b0587c

`None` is an invalid value, and this caused A/B-Tests to be impossible to run locally. Signed-off-by: Patrick Roy <[email protected]>
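
One generic way to avoid emitting an invalid `None` dimension is to leave out any dimension whose value is unknown. The helper below is a hypothetical sketch of that idea, not the change made in this commit, and the dimension names in the usage note are illustrative only.

```python
def build_dimensions(**candidates):
    """Keep only dimensions with a known value and convert values to strings."""
    return {
        name: str(value)
        for name, value in candidates.items()
        if value is not None
    }
```

For example, `build_dimensions(instance="m5n.metal", host_os=None)` yields a dictionary without a `host_os` key at all.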

test: remove @tag parsing from record_props fixture

00a54ec

These tags were removed in 07e564c, so the logic serves no purpose anymore (and it makes pytest print very unhelpful error messages if a test function has no doc string). Signed-off-by: Patrick Roy <[email protected]>

test: use pytest.raises in test_empty_jailer_id

1836edf

It's shorter and gives nicer error messages than `assert False` in a `try` block. Signed-off-by: Patrick Roy <[email protected]>
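
The difference between the two styles can be sketched as follows. `validate_jailer_id` is a hypothetical stand-in that exists only so the tests have something that raises; it is not Firecracker code, and the real test_empty_jailer_id may check a different exception.

```python
import pytest


def validate_jailer_id(jailer_id: str) -> str:
    """Hypothetical validator: reject an empty id."""
    if not jailer_id:
        raise ValueError("jailer id must not be empty")
    return jailer_id


def test_empty_id_try_style():
    """Older style: `assert False` inside a `try` block."""
    try:
        validate_jailer_id("")
        assert False, "expected ValueError"
    except ValueError as err:
        assert "empty" in str(err)


def test_empty_id_raises_style():
    """Shorter style: pytest.raises checks the type and, via `match`, the message."""
    with pytest.raises(ValueError, match="empty"):
        validate_jailer_id("")
```

When no exception is raised, `pytest.raises` fails the test with a message naming the expected exception type, whereas the `try` version only reports whatever text was attached to `assert False`.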

roypat force-pushed the ab-simplification branch from a8e998a to 1836edf October 30, 2024 17:15

pb8o reviewed Oct 31, 2024

.buildkite/common.py (resolved)

pb8o approved these changes Oct 31, 2024

Merge branch 'main' into ab-simplification

3d78dcc

zulinx86 approved these changes Nov 4, 2024

roypat merged commit 3e629eb into firecracker-microvm:main Nov 4, 2024
7 checks passed
