Profile subprocesses #124
Comments
I was thinking about a universal approach for profiling single-process and multiprocess applications, based on the approach of https://github.com/nylas/nylas-perftools. The idea is:
I think this is the simplest way to solve it that would be useful for profiling the projects I'm working on in a production environment. What do you think? I'd love to contribute to this. |
I think there are a couple of ways we can get the child processes here - but it's pretty OS dependent:
On Linux, we could try to use PTRACE_O_TRACEFORK to get notified when the process forks. The author of pyflame tried this route in uber-archive/pyflame#67, and it sounds like he got pretty far with it, but hit some limitations of sigtimedwait that blocked him.

Alternatively, we could periodically poll for subprocesses using procfs. The only annoyance here is that I think we can only easily get the parent of a process, so we'd have to scan each /proc/PID/stat file to get the parent, and then filter down to the processes that have the target program as a parent, grandparent, and so on. This is almost certainly the easiest thing to get going, but isn't necessarily the most efficient method. This is also what rbspy does to profile subprocesses.
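The procfs polling idea above can be sketched roughly as follows. This is only an illustration in Python, not what py-spy or rbspy actually do internally; the function names `parse_stat`, `descendants`, and `scan_procfs` are hypothetical, and `scan_procfs` assumes a Linux-style `/proc` layout.

```python
import os
from collections import defaultdict


def parse_stat(text):
    """Return (pid, ppid) from the contents of a /proc/PID/stat file."""
    pid = int(text.split(" ", 1)[0])
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")" characters, so split on the LAST ")" in the line.
    fields = text[text.rindex(")") + 1:].split()
    # fields[0] is the process state, fields[1] is the parent pid.
    return pid, int(fields[1])


def descendants(parents, root):
    """Return every pid whose chain of parents leads back to root."""
    children = defaultdict(list)
    for pid, ppid in parents.items():
        children[ppid].append(pid)
    found, queue = [], [root]
    while queue:
        for child in children[queue.pop()]:
            found.append(child)
            queue.append(child)
    return sorted(found)


def scan_procfs(proc_root="/proc"):
    """Build a {pid: ppid} map by reading each /proc/PID/stat (Linux only)."""
    parents = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, ppid = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited (or was unreadable) mid-scan
        parents[pid] = ppid
    return parents
```

Calling `descendants(scan_procfs(), target_pid)` on each polling interval would give the current set of children, grandchildren, and so on. Every poll rescans all of `/proc`, which is the inefficiency mentioned above.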
On Windows, the officially supported functions for getting subprocess information are in the tlhelp32 (Tool Help) library (Process32First and Process32Next) - but they are much too slow to be practical. I'm thinking that the best way on Windows is to use the undocumented NtGetNextProcess function from ntdll.dll. There isn't much information about it online, and it does limit us to Windows Vista or above, but it's the best way I've found so far to get this going. We're also already using the analogous NtGetNextThread to get all the threads of a process.
I think we can use the proc_listchildpids function from libproc for OSX, and this should be relatively straightforward.

Anyway, once we have all the child processes, I was thinking we'd just connect to each one (a new PythonSpy object per process), and then continue profiling as we do right now.

@ygormutti Why do you think we need to write output to a file for each pid and then provide ways to convert these files back to a flamegraph? I think we could just collect all the stack traces for all the subprocesses internally, and then write out a merged flamegraph directly (this is basically what rbspy does right now to profile subprocesses). |
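As a rough illustration of merging stack traces from several processes into one flamegraph input, here is a minimal Python sketch. It is not py-spy's or rbspy's actual implementation: the names `merge_samples` and `to_folded` and the `process PID` prefix frame are assumptions made for the example. The output uses the folded-stack text format ("frame;frame;frame count") that tools such as flamegraph.pl read.

```python
from collections import Counter


def merge_samples(samples_by_pid):
    """Fold per-process stack samples into one Counter of stacks.

    samples_by_pid maps a pid to a list of stacks, each stack being a list
    of frame names ordered from outermost to innermost.
    """
    merged = Counter()
    for pid, stacks in samples_by_pid.items():
        for stack in stacks:
            # Prefix each stack with a synthetic per-process frame (an
            # assumption of this sketch) so the processes stay
            # distinguishable in the merged graph.
            merged[(f"process {pid}", *stack)] += 1
    return merged


def to_folded(merged):
    """Render the merged counts as 'frame;frame;frame count' lines."""
    return [f"{';'.join(stack)} {count}" for stack, count in sorted(merged.items())]
```

Because everything is accumulated in memory under a single counter, there is no need for one output file per pid or for a separate conversion step afterwards.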
My two cents.
We can use The function returns an array of py-spy/remoteprocess/src/freebsd/kinfo_proc.rs Lines 1582 to 1583 in a25648b
Luckily, we already have all the needed bindings.
rbspy is getting an NtGetNextProcess-based implementation soon, please see |
Potentially can be useful that |
A first draft of this is here: #186 - this will add support for profiling subprocesses on Linux and OSX |
I've added Windows and FreeBSD support to that PR, and merged it into master. You should be able to add the |
Feature is in v0.3.0 |
It would be nice to be able to profile all the subprocesses of a Python process. This would let us profile programs that use multiprocessing or gunicorn worker pools.
This will also help with profiling virtualenvs on Windows with Python 3.7.2 (#81 (comment))