Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle kernel-triggered page fault for 6.1 kernels #4086

Merged
merged 5 commits into from
Sep 26, 2023

Conversation

bchalios
Copy link
Contributor

@bchalios bchalios commented Sep 1, 2023

Reason

In kernels >= 5.11 UFFD behaviour changes in that it distinguishes between page faults triggered in kernel and user space.
In order to create UFFDs that can handle kernel-triggered page faults, the process needs to have the CAP_SYS_PTRACE capability or, starting from kernel 6.1, use the /dev/userfaultfd device to create the file descriptor.

Right now, Firecracker opts-out from handling kernel triggered page faults. This does not affects us for kernels 4.14 and 5.10 but it will on 6.1.

Changes

  • Modify our example UFFD handler to serve a single page per time, so we exercise the problematic behaviour on 6.1.
  • Use a new version of userfaultfd-rs which, if present, uses /dev/userfaultfd to create userfault file descriptors.
  • Also, modify jailer to create /dev/userfaultd inside the jail if the device exists in the host

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@codecov
Copy link

codecov bot commented Sep 1, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (5b70cf9) 83.13% compared to head (6d924f9) 83.12%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4086      +/-   ##
==========================================
- Coverage   83.13%   83.12%   -0.01%     
==========================================
  Files         225      225              
  Lines       28559    28581      +22     
==========================================
+ Hits        23742    23758      +16     
- Misses       4817     4823       +6     
Flag Coverage Δ
4.14-c7g.metal 78.68% <80.00%> (-0.01%) ⬇️
4.14-m5d.metal 80.47% <80.00%> (-0.03%) ⬇️
4.14-m6a.metal 79.62% <80.00%> (-0.02%) ⬇️
4.14-m6g.metal 78.68% <80.00%> (-0.01%) ⬇️
4.14-m6i.metal 80.46% <80.00%> (?)
5.10-c7g.metal 81.58% <80.00%> (-0.02%) ⬇️
5.10-m5d.metal 83.14% <80.00%> (-0.01%) ⬇️
5.10-m6a.metal 82.38% <80.00%> (-0.02%) ⬇️
5.10-m6g.metal 81.58% <80.00%> (-0.02%) ⬇️
5.10-m6i.metal 83.12% <80.00%> (-0.02%) ⬇️
6.1-c7g.metal 81.58% <80.00%> (-0.02%) ⬇️
6.1-m5d.metal 83.13% <80.00%> (-0.03%) ⬇️
6.1-m6a.metal 82.38% <80.00%> (-0.02%) ⬇️
6.1-m6g.metal 81.58% <80.00%> (-0.02%) ⬇️
6.1-m6i.metal 83.12% <80.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/jailer/src/main.rs 79.00% <ø> (ø)
src/vmm/src/persist.rs 53.46% <0.00%> (-0.15%) ⬇️
src/jailer/src/env.rs 67.33% <85.29%> (+0.43%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch 5 times, most recently from 76ce6b6 to d535cc7 Compare September 5, 2023 13:19
@bchalios bchalios changed the title Test kernel triggered page faults Handle kernel-triggered page fault for 6.1 kernels Sep 6, 2023
tests/framework/utils.py Outdated Show resolved Hide resolved
tests/integration_tests/functional/test_uffd.py Outdated Show resolved Hide resolved
src/jailer/src/env.rs Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from d535cc7 to 21094fa Compare September 7, 2023 11:07
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch 6 times, most recently from 1abdaf5 to 118934d Compare September 21, 2023 10:02
@bchalios bchalios marked this pull request as ready for review September 21, 2023 10:04
@bchalios bchalios self-assigned this Sep 21, 2023
@bchalios bchalios added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Sep 21, 2023
@bchalios bchalios marked this pull request as draft September 21, 2023 10:30
@bchalios
Copy link
Contributor Author

Converting back to Draft, so that I can add some Documentation and a CHANGELOG entry.

@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from 118934d to 59df5de Compare September 21, 2023 12:15
@bchalios bchalios marked this pull request as ready for review September 21, 2023 12:15
CHANGELOG.md Outdated Show resolved Hide resolved
tests/framework/utils.py Outdated Show resolved Hide resolved
tests/framework/utils.py Outdated Show resolved Hide resolved
tests/framework/utils.py Outdated Show resolved Hide resolved
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from 59df5de to d4a9113 Compare September 21, 2023 16:24
ShadowCurse
ShadowCurse previously approved these changes Sep 22, 2023
src/jailer/src/env.rs Show resolved Hide resolved
src/jailer/src/env.rs Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
src/jailer/src/env.rs Outdated Show resolved Hide resolved
src/jailer/src/env.rs Show resolved Hide resolved
tests/framework/utils.py Outdated Show resolved Hide resolved
tests/framework/utils.py Outdated Show resolved Hide resolved
With kernels >= 5.11, Linux allows a process that creates a userfaultfd
file descriptor to opt out from handling kernel triggered page faults.
`userfaultfd-rs` by defaults does this, so in 6.1 we will not be able to
handle page faults until we fix this option.

Our current valid page fault handler, will not exercise the issue
because it faults in the entire memory the first time a page fault
happens. In order to allow catching such errors, we change the page
fault handler to, instead, serve a single page at a time. This is slow,
but we are just testing here.

Signed-off-by: Babis Chalios <[email protected]>
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from 530bdec to c69e91c Compare September 26, 2023 10:48
@bchalios bchalios marked this pull request as draft September 26, 2023 13:40
@bchalios bchalios marked this pull request as ready for review September 26, 2023 13:41
The way we were checking if a UFFD handler was alive in tests involved
us sending some bytes to it over stdin and waiting to read something
over stdout with success. To achieve that we need to "pipe" the
handler's stdin/stdout, which means that we can't any more peep in its
output.

This commit changes the mechanism to use `Popen.poll()` to ensure that
the process is still alive and keep track of its stdout and stderr in a
file.

Signed-off-by: Babis Chalios <[email protected]>
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from 54376f7 to ec922c9 Compare September 26, 2023 14:03
CHANGELOG.md Outdated Show resolved Hide resolved
Since 5.11 Linux allows a process to choose if it wants to handle page
faults that were triggered in kernel space. userfaultfd-rs by default
chooses not to. Even though, this option does not affect kernel 5.10, it
means that in 6.1 we will not be able to handle such page faults.

This commit explicitly asks to handle kernel-triggered page faults.
Moreover, it updates the userfaultfd-rs version to a newer one that
uses the /dev/userfaultfd API to create file descriptors. This API does
not require a process to have `CAP_SYS_PTRACE` capability in order to
create a file descriptor capable of handling kernel-triggered page
faults.

Signed-off-by: Babis Chalios <[email protected]>
jailer keeps track of various paths, e.g. device files under /dev. It
represents these paths as C-like NULL terminated strings, because we
use these paths while calling directly system calls. This requires us
to do conversions between C-like and Rust-like strings quite often.

This commit reverses the logic to store the paths as Rust strings and
only convert them when we need to perform a system call, using the
CString type. This is much safer (in terms of Rust-safety), it allows
for more Rust-idiomatic code and requires less conversions along the
way.

Signed-off-by: Babis Chalios <[email protected]>
The new functionality of userfaultfd-rs is to use /dev/userfaultfd, when
present, to create userfault file descriptors. This commit adds logic to
look if the device is present on the host and, if it is, find the minor
device number at runtime (this is a misc device with a dynamic minor
number) and create the device in the jail.

Signed-off-by: Babis Chalios <[email protected]>
@bchalios bchalios force-pushed the fix_uffd_kernel_faults branch from ec922c9 to 6d924f9 Compare September 26, 2023 14:42
@bchalios bchalios merged commit 6600000 into firecracker-microvm:main Sep 26, 2023
3 of 5 checks passed
@bchalios bchalios deleted the fix_uffd_kernel_faults branch September 26, 2023 15:17
@roypat roypat mentioned this pull request Dec 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants