CBL-2791: Enable actor stack trace mechanism #1382
This feature will keep track of two pieces of information:
1. For any given execution, the path of enqueue and execution calls that led to the execution
2. For any given actor, the linear history of enqueue and execution calls (regardless of source)
If an exception occurs, this information is dumped to the logs.
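As a rough illustration of the per-actor history in item 2, here is a minimal sketch of a bounded log that prunes its oldest entry as new ones arrive. The names `EventKind`, `HistoryEntry`, and `BoundedHistory` are hypothetical and are not taken from this PR; the actual implementation may store different data.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical names throughout; these are not the types used in the PR.
enum class EventKind { Enqueue, Execute };

struct HistoryEntry {
    EventKind   kind;
    std::string name;   // e.g. the name of the enqueued method
};

class BoundedHistory {
public:
    explicit BoundedHistory(std::size_t limit) : _limit(limit) {
        assert(limit > 0);
    }

    // Records an event, dropping the oldest entry once the limit is reached.
    void record(EventKind kind, std::string name) {
        if (_entries.size() == _limit)
            _entries.pop_front();
        _entries.push_back({kind, std::move(name)});
    }

    std::size_t size() const { return _entries.size(); }

    // Renders the history oldest-first, one entry per line, e.g. for a log dump.
    std::string dump() const {
        std::ostringstream out;
        for (const auto& e : _entries)
            out << (e.kind == EventKind::Enqueue ? "enqueue " : "execute ")
                << e.name << '\n';
        return out.str();
    }

private:
    std::size_t              _limit;
    std::deque<HistoryEntry> _entries;
};
```

In this sketch an exception handler would call `dump()` and write the result to the logs; the limit bounds how much memory each actor's history can hold.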
Mark as draft until resolution of iOS simulator issue
thread_local is not supported in the iOS simulator, and since the underlying unit is a queue instead of a thread, it makes more logical sense to make use of dispatch_queue_set_specific.
@snej The GCD implementation of this was a bit awkward. Let me know if you have any ideas about how to improve it. It's complicated by the fact that the manifest used as the "queue manifest" needs to live long enough to be used by however many recursive calls happen in a given execution. For the threaded mailbox this just meant using a thread-local static shared_ptr and copying it to each context that uses it. However, thread_local is not allowed in the iOS simulator and doesn't really fit well with the queue-based logic, so I tried to make use of the
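The thread-local approach described above for the threaded mailbox can be sketched roughly as follows. This is only an illustration under assumptions: `Manifest`, `sCurrentManifest`, and `captureManifest` are hypothetical names, not the ones used in this PR, and the GCD variant is not shown.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's manifest type.
struct Manifest {
    std::vector<std::string> calls;
};

// Manifest of the execution currently running on this thread, if any.
static thread_local std::shared_ptr<Manifest> sCurrentManifest;

// Wraps a task so it carries the manifest that was current at enqueue time.
// The shared_ptr copy keeps the manifest alive for nested enqueues, even
// after the execution that created it has finished.
std::function<void()> captureManifest(std::string name, std::function<void()> task) {
    assert(task && "task must be callable");
    std::shared_ptr<Manifest> manifest = sCurrentManifest;
    if (!manifest)
        manifest = std::make_shared<Manifest>();   // start of a new call path
    manifest->calls.push_back("enqueue " + name);

    return [manifest, name = std::move(name), task = std::move(task)] {
        manifest->calls.push_back("execute " + name);
        // Restores the previous manifest on scope exit, including on exceptions.
        struct Restore {
            std::shared_ptr<Manifest> previous;
            ~Restore() { sCurrentManifest = std::move(previous); }
        } restore{std::exchange(sCurrentManifest, manifest)};
        task();
    };
}
```

A mailbox would call `captureManifest` when a task is enqueued and run the returned closure on its worker thread, so any task enqueued during that run inherits the same manifest.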
Do we need this for Apple platforms? It sounds like the same info that Xcode's debugger already shows. (At least item 1.)
That's great if you are running in a debugger, but I'm thinking about this information being put into our logs so that we can have it even from the field.
That is a lot of overhead to add to production builds! I think I'd need to be convinced that this is necessary. I can see that it would be useful on some occasions, but it would be slowing everything down and adding to memory bloat. Is this something that can be disabled except by a runtime flag?
I'm trying to balance things out here. If it is enabled or disabled at runtime, then I guarantee we get into a case where it's off and we end up back where we started: with a bunch of intertwined logs that make it difficult to navigate through the flow of an issue like "replicator getting stuck" without the context of the calls that led to that point. I disagree that it adds an excessive amount of memory pressure, since it's going to prune out entries as it receives new ones (I'm certainly open to decreasing the number of entries that are saved, though).
I'm going to start proposing a lot of changes like this because our logs in general often leave us puzzled as to what is going on. I want a way to rectify this situation by collecting some data about the state of the program that can be accessed on demand, or at exception time in this case. Whether or not having thread / queue local stuff adds too much of a performance penalty is something I could debate. In short, we need something here to help us navigate an actor-based world in which the most common form of bug is a race condition or hang. Simply logging things is not enough. What I am after is an answer to the question "in what order did things happen in order to get here?"
EDIT: I also thought of adding this information to logs instead, but I figured that collecting it in memory would be better overall for performance than logging it all in real time.
We really need performance testing in CI so we can see whether a new feature like this affects performance...
That's something that could be arranged, I think.