KfRaiseIrql usersim API performance renders it hard to use in stress #184

Open
Alan-Jowett opened this issue Apr 17, 2024 · 6 comments

Comments

@Alan-Jowett
Member

In kernel mode, this is a relatively lightweight API. The usersim version is very heavy, and its cost causes tests to time out.

@Alan-Jowett
Member Author

image

@mtfriesen
Collaborator

Nit on the title: the performance might not match, but it is poor partly because the usersim version tries very hard to match the observable functional behavior of kernel mode. Do you know what the bottleneck is?

@Alan-Jowett
Member Author

Looks like SetThreadPriority and SetThreadGroupAffinity are super slow. I will try profiling to narrow down why.

@Alan-Jowett changed the title from "KfRaiseIrql usersim API doesn't accurately match kernel mode" to "KfRaiseIrql usersim API performance renders it hard to use in stress" Apr 17, 2024

@mtfriesen
Collaborator

Darn. If we're hitting the SetThreadGroupAffinity function, then we must not have already been at dispatch level when the raise IRQL routine is called, so we can't trivially optimize away this function.
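
The reasoning above can be sketched in portable C++. This is only an illustrative model, not the usersim implementation: `SimIrql`, `SimThreadState`, `pin_and_boost`, `sim_raise_irql`, and `sim_lower_irql` are hypothetical names, and the counter merely stands in for the costly `SetThreadPriority` / `SetThreadGroupAffinity` calls.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical IRQL values mirroring PASSIVE_LEVEL and DISPATCH_LEVEL.
enum class SimIrql : std::uint8_t { Passive = 0, Dispatch = 2 };

// Hypothetical per-thread state; not the actual usersim data structure.
struct SimThreadState {
    SimIrql irql = SimIrql::Passive;
    int expensive_transitions = 0; // counts simulated priority/affinity changes
};

// Stand-in for the costly OS work (priority and affinity changes).
inline void pin_and_boost(SimThreadState& t) { ++t.expensive_transitions; }

// Raise to new_irql and return the previous level.
// The expensive path runs only when crossing from below dispatch to dispatch.
inline SimIrql sim_raise_irql(SimThreadState& t, SimIrql new_irql) {
    SimIrql old = t.irql;
    if (old < SimIrql::Dispatch && new_irql >= SimIrql::Dispatch) {
        pin_and_boost(t);
    }
    t.irql = new_irql;
    return old;
}

// Lower back to old_irql; the restore is also costly when leaving dispatch.
inline void sim_lower_irql(SimThreadState& t, SimIrql old_irql) {
    if (t.irql >= SimIrql::Dispatch && old_irql < SimIrql::Dispatch) {
        pin_and_boost(t);
    }
    t.irql = old_irql;
}
```

In this model, a raise issued while already at dispatch level leaves `expensive_transitions` unchanged, so observing the expensive path implies the caller started below dispatch level.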

@Alan-Jowett
Member Author

Correct. The test case that breaks is running BPF programs at passive level, which requires raising and lowering IRQL on every epoch enter / exit.

@mtfriesen
Collaborator

mtfriesen commented Apr 17, 2024

For context, I added a considerable amount of overhead in PR #53. Previously this routine was somewhat lighter weight, but also allowed a variety of observable misbehavior, including passive threads and other DPCs preempting dispatch level code on the same processor.

Perhaps we could add a config knob to relax some of those constraints for tests that are known not to depend on dispatch level code being properly serialized with respect to other dispatch level code and/or passive threads.
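
A minimal sketch of what such a knob might look like, written as portable C++ so it can be read without the Windows headers. Everything here is an assumption for illustration: `SimIrqlMode`, `SimIrqlConfig`, `SimCpuState`, and `raise_irql_with_knob` are hypothetical names, no such setting is confirmed to exist in usersim, and the `os_calls` counter stands in for the real `SetThreadPriority` and `SetThreadGroupAffinity` calls.

```cpp
#include <cassert>

// Hypothetical knob: Strict keeps the serialization guarantees,
// Relaxed skips the costly OS calls and only tracks the IRQL value.
enum class SimIrqlMode { Strict, Relaxed };

struct SimIrqlConfig {
    SimIrqlMode mode = SimIrqlMode::Strict; // strict by default
};

// Hypothetical per-thread bookkeeping for the simulated processor.
struct SimCpuState {
    int irql = 0;     // 0 = passive level, 2 = dispatch level
    int os_calls = 0; // counts simulated priority/affinity calls
};

// Raise the simulated IRQL and return the previous value.
inline int raise_irql_with_knob(const SimIrqlConfig& cfg, SimCpuState& s, int new_irql) {
    const int old = s.irql;
    const bool crossing = old < 2 && new_irql >= 2;
    if (crossing && cfg.mode == SimIrqlMode::Strict) {
        // Stands in for one priority change plus one affinity change.
        s.os_calls += 2;
    }
    s.irql = new_irql;
    return old;
}
```

With the default `Strict` mode, a raise from passive to dispatch records two simulated OS calls; with `Relaxed`, the same raise records none. The trade-off is the one described above: relaxed mode would again allow passive threads or other DPCs to preempt dispatch level code on the same processor.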
