Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support serving page faults in the kernel via UFFD #4025

Closed

Conversation

maggie-lou
Copy link

@maggie-lou maggie-lou commented Aug 10, 2023

Changes

Add an option to configure user_mode_only when creating a UFFD object.

Reason

Page faults can occur in user-space or in the kernel itself. For security reasons, in Linux 5.11 UFFD was restricted to only handle page faults that were triggered in user-space and not those in kernel-space by default (more context in this post).

This broke our firecracker UFFD integration, because kernel-space page faults were no longer being handled. LoadSnapshot would fail with errors like Failure during vcpu run: Bad address (os error 14).

To allow support for kernel-space page faults, add an option to set user_mode_only=false.

To recreate this error, run this modified version of the UFFD handler on Linux 5.11+. Rather than loading the entire memory region at once, it loads a single page for each page fault. There will eventually be a page fault in kernel-space that the UFFD handler will fail to handle, causing the Bad address error described above.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@JonathanWoollett-Light JonathanWoollett-Light added the Priority: High Indicates than an issue or pull request should be resolved ahead of issues or pull requests labelled label Aug 14, 2023
@bchalios bchalios self-assigned this Aug 14, 2023
@bchalios
Copy link
Contributor

Hi @maggie-lou, thanks for reporting this issue and thanks for the contribution as well.

I would like to go through the possible alternatives that we have for solving this. It seems like the suggestion for introducing the /dev/userfaultfd file has been merged. Do you maybe know from which kernel on-wards is it available for?

Alternatively, I'm thinking that the extra type in the MemBackend has a two main issues:

  1. It is an API breaking change. I would like to avoid it, if possible.
  2. It leaves the Uffd variant in a bit of limbo state. I really don't know if it is useful any more. We need to be able to handle kernel page faults at all cases. So, maybe the Uffd case should just be changed to not be user_mode_only by default. WDYT?

Also, would you be willing to write an integration test for this as well? I am thinking that we could do this in a test-driven way:

  1. Write the integration test that causes the UFFD handler to fail.
  2. Then apply whatever fix we will come up with and make sure this test passes.

As a side-note, our officially supported kernels currently are, 4.14 and 5.10. We intend to add support for 6.1 with Firecracker 1.5 release. So, whatever solution we converge to, it needs to take into account that. IOW, we need to solve this in the best way possible for 6.1 onwards.

Looking forward for your thoughts on this.

@bchalios
Copy link
Contributor

I double-checked in the meantime. Support for /dev/userfaultfd is introduced with kernel 6.1

@maggie-lou
Copy link
Author

Using /dev/userfaultfd to create the userfaultfd object sounds like the "right" solution, because you don't need to grant your firecracker process additional permissions that may compromise the security of your system. One downside is that it creates a gap where UFFD support is unusable for people on Linux 5.11 - 6.1.

Otherwise defaulting user_mode_only: false sounds good to me.

I'll let you decide which option is best. We are using a kernel in that unsupported range, and are using a patched version of Firecracker to get around this. Happy to write an integration test when we've committed to a solution

@bchalios
Copy link
Contributor

bchalios commented Aug 23, 2023

Hey @maggie-lou,

So, we definitely want to fix this for 6.1 for the next release. For the intermediate kernel versions, I agree that we can just opt in by doing user_mode_only: false.

This essentially doesn't change anything for kernels up to 5.10 and it is actually cleaner because, conceptually we want to be able to be generic and handle all the cases. Then it's up to the Firecracker user to allow it or not, via the CAP_SYS_PTRACE or the sysctl knob.

So a plan for achieving this would be:

  1. Create an integration test that actually replicates the problem
  2. Fix the issue for 6.1. Ideally, this needs to happen upstream within userfaultfd-rs and then we can consume the new version in Firecracker.
  3. Switch the default to user_mode_only: false to make the intention clear.

Please feel free to address any of these items you want. WDYT?

src/api_server/swagger/firecracker.yaml Outdated Show resolved Hide resolved
Comment on lines 33 to 35
/// Starting from Linux 5.11, this mode will only serve page
/// faults that occur in user-space, not those that occur in the kernel.
/// Use UffdPrivileged if you wish to serve page faults in both.
/// Before Linux 5.11, there is no difference between Uffd and UffdPrivileged.
/// Both will serve page faults in both user-space and the kernel.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perhaps a naive question but some basic Googling and Kernel Doc skimming didn't help me. Do you have any docs that would explain what happens if I opt for Uffd (non-priviledged) while running on a Linux Kernel > 5.11 and the kernel originates a page fault? The uffd man page says that would trigger a SIGBUS, which seems like it would lead to process death/exit. So, I'm either missing something or such Kernel originated page faults are exceedingly rare in normal operation or Uffd mode will rarely (never) be used post 5.11 Kernel?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right @avirtuos. You're not missing anything. We want to handle the case of kernel-triggered page faults at all times, otherwise we will get SIGBUS and die. That's what we plan to do is to use the new interface for creating UFFDs, which always gives us file descriptors that are able to handle those.

@maggie-lou maggie-lou marked this pull request as draft August 29, 2023 18:57
@maggie-lou
Copy link
Author

Just an update:

  1. I'm working on a unit test. Still debugging some things.
  2. Realistically I might not have time to work on this in the next couple weeks unfortunately.
  3. I reverted the UffdPrivileged changes and set user_mode_only=false in this PR.

@bchalios
Copy link
Contributor

Hey @maggie-lou, just a heads-up. Seeing that you said you didn't have time to work on this, I opened a PR at userfaultfd-rs that adds support for /dev/userfaultfd.

I was able to change our UFFD test to reproduce the error on 6.1 here: #4086 and also consumed the userfaultfd-rs changes to ensure they solve the problem.

@maggie-lou maggie-lou closed this Sep 18, 2023
@maggie-lou
Copy link
Author

Hey @maggie-lou, just a heads-up. Seeing that you said you didn't have time to work on this, I opened a PR at userfaultfd-rs that adds support for /dev/userfaultfd.

I was able to change our UFFD test to reproduce the error on 6.1 here: #4086 and also consumed the userfaultfd-rs changes to ensure they solve the problem.

Great! Sorry for the delay in working on this. Excited to see this get merged!

@bchalios
Copy link
Contributor

Thanks @maggie-lou. Would you mind taking a look at the userfaultfd-rs PR. The maintainer is looking for some feedback before merging it. A thumbs up or some comment there might help doing this faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: High Indicates than an issue or pull request should be resolved ahead of issues or pull requests labelled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants