By default, when fed oprofile data, gprof2dot sums samples across all threads and produces a single plot. In many multithreaded programs this points to the wrong place to optimize. For example, in a recent test we had many threads that either wait on condition variables or sleep for intervals, alongside other threads that are CPU bound. In that case gprof2dot shows most of the time being spent in the waiting code (60% of the time in nanosleep, for example).
For oprofile and vtune (I'm not sure whether the other supported profilers can report samples per thread), it would be really useful to generate one plot per thread, or to have a command-line option that selects which thread to process. In one of our use cases we have up to 100 threads or more, many of them short lived, and we would like to look at specific threads. The text output is too wide to be useful for viewing, which is why I started using gprof2dot.
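To make the request concrete, here is a minimal sketch of the difference between summing over all threads and filtering to one thread. This is not gprof2dot code and it does not parse real oprofile output: the `SAMPLES` data, the thread IDs, and the helper names `aggregate`, `per_thread`, and `share` are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (thread_id, function) samples, invented for illustration only.
SAMPLES = (
    [(1, "nanosleep")] * 6   # thread 1 mostly waits
    + [(2, "compute")] * 3   # thread 2 is CPU bound
    + [(2, "memcpy")] * 1
)


def aggregate(samples, thread=None):
    """Count samples per function, optionally restricted to one thread ID."""
    return Counter(fn for tid, fn in samples if thread is None or tid == thread)


def per_thread(samples):
    """Group sample counts by thread ID, one Counter per thread."""
    result = defaultdict(Counter)
    for tid, fn in samples:
        result[tid][fn] += 1
    return dict(result)


def share(counts, fn):
    """Fraction of the counted samples attributed to fn."""
    total = sum(counts.values())
    return counts[fn] / total if total else 0.0


if __name__ == "__main__":
    # Summed over all threads, the waiting code dominates the profile.
    print("all threads:", aggregate(SAMPLES).most_common())
    # Restricted to the CPU-bound thread, the real hot spot shows up.
    print("thread 2:   ", aggregate(SAMPLES, thread=2).most_common())
```

With this toy data, the summed view is dominated by `nanosleep`, while the view for thread 2 alone shows `compute` as the hot spot. A per-thread plot or a thread-selection option would expose that second view.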