Handle kernel-triggered page fault for 6.1 kernels #4086
Merged
bchalios
merged 5 commits into
firecracker-microvm:main
from
bchalios:fix_uffd_kernel_faults
Sep 26, 2023
Codecov Report
@@            Coverage Diff             @@
##             main    #4086      +/-   ##
==========================================
- Coverage   83.13%   83.12%   -0.01%
==========================================
  Files         225      225
  Lines       28559    28581      +22
==========================================
+ Hits        23742    23758      +16
- Misses       4817     4823       +6
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
5 times, most recently
from
September 5, 2023 13:19
76ce6b6
to
d535cc7
Compare
bchalios
changed the title
Test kernel triggered page faults
Handle kernel-triggered page fault for 6.1 kernels
Sep 6, 2023
pb8o
reviewed
Sep 6, 2023
ShadowCurse
reviewed
Sep 7, 2023
ShadowCurse
reviewed
Sep 7, 2023
ShadowCurse
reviewed
Sep 7, 2023
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 7, 2023 11:07
d535cc7
to
21094fa
Compare
This was referenced Sep 14, 2023
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
6 times, most recently
from
September 21, 2023 10:02
1abdaf5
to
118934d
Compare
bchalios
added
the
Status: Awaiting review
Indicates that a pull request is ready to be reviewed
label
Sep 21, 2023
Converting back to Draft so that I can add some documentation and a CHANGELOG entry.
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 21, 2023 12:15
118934d
to
59df5de
Compare
bchalios
requested review from
xmarcalx,
kalyazin and
wearyzen
as code owners
September 21, 2023 12:15
kalyazin
reviewed
Sep 21, 2023
pb8o
reviewed
Sep 21, 2023
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 21, 2023 16:24
59df5de
to
d4a9113
Compare
ShadowCurse
previously approved these changes
Sep 22, 2023
kalyazin
reviewed
Sep 22, 2023
roypat
reviewed
Sep 22, 2023
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 26, 2023 10:20
aa04ac6
to
930ac03
Compare
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 26, 2023 10:27
930ac03
to
530bdec
Compare
With kernels >= 5.11, Linux allows a process that creates a userfaultfd file descriptor to opt out of handling kernel-triggered page faults. `userfaultfd-rs` does this by default, so on 6.1 we will not be able to handle such page faults until we change this option. Our current valid page fault handler does not exercise the issue, because it faults in the entire memory the first time a page fault happens. To allow catching such errors, we change the page fault handler to serve a single page at a time instead. This is slow, but we are just testing here.
Signed-off-by: Babis Chalios <[email protected]>
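To make the "single page at a time" behaviour concrete, here is a minimal sketch of the address arithmetic such a handler needs. It is not the handler from this PR: `PageCopy` and `single_page_copy` are hypothetical names, and the sketch assumes the faulting region is page-aligned and backed by a snapshot file laid out in the same order as guest memory.

```rust
/// Describes the one-page copy needed to resolve a single fault.
/// Hypothetical type, used only for illustration.
#[derive(Debug, PartialEq, Eq)]
pub struct PageCopy {
    /// Page-aligned address that the copy should populate.
    pub dst: u64,
    /// Offset of that page inside the backing snapshot of the region.
    pub src_offset: u64,
    /// Number of bytes to copy: exactly one page.
    pub len: u64,
}

/// Returns the copy that serves only the page containing `fault_addr`,
/// or `None` if the address lies outside the region.
///
/// Assumptions: `page_size` is a power of two and `region_base` is
/// page-aligned.
pub fn single_page_copy(
    region_base: u64,
    region_len: u64,
    fault_addr: u64,
    page_size: u64,
) -> Option<PageCopy> {
    assert!(page_size.is_power_of_two());
    assert_eq!(region_base % page_size, 0);

    // Reject faults that do not belong to this region.
    if fault_addr < region_base || fault_addr - region_base >= region_len {
        return None;
    }

    // Round the faulting address down to the start of its page.
    let dst = fault_addr & !(page_size - 1);
    Some(PageCopy {
        dst,
        src_offset: dst - region_base,
        len: page_size,
    })
}
```

Because only one page is populated per fault, later accesses to neighbouring pages keep generating faults, including the ones triggered from kernel space that this test is meant to catch.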
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 26, 2023 10:48
530bdec
to
c69e91c
Compare
To check whether a UFFD handler was alive in tests, we used to send some bytes to it over stdin and wait to successfully read something back over stdout. To achieve that, we need to "pipe" the handler's stdin/stdout, which means that we can no longer inspect its output. This commit changes the mechanism to use `Popen.poll()` to ensure that the process is still alive, and keeps track of its stdout and stderr in a file.
Signed-off-by: Babis Chalios <[email protected]>
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 26, 2023 14:03
54376f7
to
ec922c9
Compare
kalyazin
reviewed
Sep 26, 2023
Since 5.11, Linux allows a process to choose whether it wants to handle page faults that were triggered in kernel space. userfaultfd-rs by default chooses not to. Even though this option does not affect kernel 5.10, it means that on 6.1 we will not be able to handle such page faults. This commit explicitly asks to handle kernel-triggered page faults. Moreover, it updates userfaultfd-rs to a newer version that uses the /dev/userfaultfd API to create file descriptors. This API does not require a process to have the `CAP_SYS_PTRACE` capability in order to create a file descriptor capable of handling kernel-triggered page faults.
Signed-off-by: Babis Chalios <[email protected]>
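The opt-out described above comes down to a single flag at creation time. The sketch below only shows how that flag would be composed; it does not create a file descriptor. `uffd_flags` is a hypothetical helper, and the constant values are copied from the Linux UAPI headers for x86_64 and aarch64, so treat them as assumptions on other architectures. In userfaultfd-rs the same choice is, as far as I can tell, exposed through the builder's `user_mode_only` option.

```rust
/// Values as defined in the Linux UAPI headers on x86_64 and aarch64
/// (assumption: other architectures may use different O_* values).
pub const O_CLOEXEC: i32 = 0o2000000;
pub const O_NONBLOCK: i32 = 0o4000;
/// Introduced in Linux 5.11: the descriptor only receives faults
/// triggered from user space.
pub const UFFD_USER_MODE_ONLY: i32 = 1;

/// Builds the creation flags for a userfault file descriptor.
///
/// When `handle_kernel_faults` is true, `UFFD_USER_MODE_ONLY` is left
/// unset, which is what asks the kernel to also deliver faults that
/// were triggered in kernel space.
pub fn uffd_flags(handle_kernel_faults: bool) -> i32 {
    let mut flags = O_CLOEXEC | O_NONBLOCK;
    if !handle_kernel_faults {
        flags |= UFFD_USER_MODE_ONLY;
    }
    flags
}
```

Leaving the flag unset is the part that needs either `CAP_SYS_PTRACE` or the /dev/userfaultfd device mentioned in the commit message.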
jailer keeps track of various paths, e.g. device files under /dev. It represents these paths as C-like NUL-terminated strings, because we use them when calling system calls directly. This requires us to convert between C-like and Rust-like strings quite often. This commit reverses the logic: it stores the paths as Rust strings and only converts them, using the CString type, when we need to perform a system call. This is much safer (in terms of Rust safety), allows for more idiomatic Rust code, and requires fewer conversions along the way.
Signed-off-by: Babis Chalios <[email protected]>
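A minimal sketch of the pattern this commit describes, keeping the path as a Rust string and converting only at the system call boundary. `DevicePath` is a hypothetical type for illustration and is not taken from the jailer sources.

```rust
use std::ffi::{CString, NulError};

/// A path stored as an ordinary Rust string (hypothetical type).
pub struct DevicePath {
    path: String,
}

impl DevicePath {
    pub fn new(path: &str) -> Self {
        DevicePath { path: path.to_string() }
    }

    /// Rust-side view, usable for logging, joining, comparisons, etc.
    pub fn as_str(&self) -> &str {
        &self.path
    }

    /// Converts to a NUL-terminated string right before a system call.
    /// Fails if the path contains an interior NUL byte, which a raw
    /// C-style buffer would silently truncate at.
    pub fn to_cstring(&self) -> Result<CString, NulError> {
        CString::new(self.path.as_str())
    }
}
```

The resulting `CString` can then be passed to a system call via `as_ptr()`, as long as it outlives the call.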
The new functionality of userfaultfd-rs is to use /dev/userfaultfd, when present, to create userfault file descriptors. This commit adds logic to check whether the device is present on the host and, if it is, to find its minor device number at runtime (this is a misc device with a dynamic minor number) and create the device in the jail.
Signed-off-by: Babis Chalios <[email protected]>
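One way to discover the device numbers at runtime is to stat the host device and decode its `st_rdev`. This is a sketch under that assumption and may differ from what the jailer actually does. `host_device_numbers`, `dev_major` and `dev_minor` are hypothetical helpers; the bit layout follows the encoding used by glibc's `major()`/`minor()` macros on Linux.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;

/// Extracts the major number from a Linux `dev_t` (glibc encoding).
pub fn dev_major(rdev: u64) -> u32 {
    (((rdev >> 32) & 0xffff_f000) | ((rdev >> 8) & 0x0000_0fff)) as u32
}

/// Extracts the minor number from a Linux `dev_t` (glibc encoding).
pub fn dev_minor(rdev: u64) -> u32 {
    (((rdev >> 12) & 0xffff_ff00) | (rdev & 0x0000_00ff)) as u32
}

/// Returns `(major, minor)` of the device node at `path`, or `None`
/// if the node does not exist on the host.
pub fn host_device_numbers(path: &str) -> io::Result<Option<(u32, u32)>> {
    match fs::metadata(path) {
        Ok(meta) => {
            let rdev = meta.rdev();
            Ok(Some((dev_major(rdev), dev_minor(rdev))))
        }
        // A missing device is not an error: older kernels simply do
        // not provide it.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Calling `host_device_numbers("/dev/userfaultfd")` on the host would yield the numbers to use when creating the character device inside the jail, and `None` would mean the step can be skipped.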
bchalios
force-pushed
the
fix_uffd_kernel_faults
branch
from
September 26, 2023 14:42
ec922c9
to
6d924f9
Compare
ShadowCurse
approved these changes
Sep 26, 2023
kalyazin
approved these changes
Sep 26, 2023
Reason
In kernels >= 5.11, UFFD behaviour changes in that it distinguishes between page faults triggered in kernel space and in user space.
In order to create UFFDs that can handle kernel-triggered page faults, the process needs to have the `CAP_SYS_PTRACE` capability or, starting from kernel 6.1, use the `/dev/userfaultfd` device to create the file descriptor.
Right now, Firecracker opts out of handling kernel-triggered page faults. This does not affect us on kernels 4.14 and 5.10, but it will on 6.1.
Changes
- Update `userfaultfd-rs` to a version which, if present, uses `/dev/userfaultfd` to create userfault file descriptors.
- Create `/dev/userfaultfd` inside the jail if the device exists in the host.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check `CONTRIBUTING.md`.
PR Checklist
- … `CHANGELOG.md`.
- … `TODO`s link to an issue.
- … `rust-vmm`.