Specify behavior around context loss and error reporting. #744
Based on @mingmingtasd's work in the Chromium prototype implementation. For webmachinelearning#477
Some points for discussion:
@mingmingtasd and @reillyeon could you do an initial review?
Thanks! @inexorabletash @reillyeon
I will check for context loss in all of the synchronous and asynchronous actions that depend on the context in my Chromium CL.
Something this PR doesn't do is specify the behavior around rejecting in-flight asynchronous operations.
Thoughts on how to do that and to what level of detail?
One generic approach for both of those would be to replace: "Queue an ML task with global to resolve promise with ..." with:
... which I think covers the script-observable behavior, but not that the async steps internally should fail. |
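To make the script-observable part concrete, here is a minimal sketch. The replacement wording itself is not reproduced above, so this assumes it amounts to checking the context's lost state inside the queued task and rejecting instead of resolving. `TaskQueue`, `Deferred`, and `queueSettle` are hypothetical names for illustration, not spec or Chromium identifiers, and a plain object stands in for the promise so the ordering is easy to follow.

```typescript
// Hypothetical model of "queue an ML task to settle the promise".
// None of these names come from the spec or the Chromium prototype.
type Settled<T> =
  | { state: "pending" }
  | { state: "resolved"; value: T }
  | { state: "rejected"; reason: string };

class TaskQueue {
  private tasks: Array<() => void> = [];

  queue(task: () => void): void {
    this.tasks.push(task);
  }

  // Runs queued tasks in FIFO order, including tasks queued while draining.
  drain(): void {
    while (this.tasks.length > 0) {
      const task = this.tasks.shift()!;
      task();
    }
  }
}

// Stand-in for a promise: it settles at most once.
class Deferred<T> {
  result: Settled<T> = { state: "pending" };

  resolve(value: T): void {
    if (this.result.state === "pending") {
      this.result = { state: "resolved", value };
    }
  }

  reject(reason: string): void {
    if (this.result.state === "pending") {
      this.result = { state: "rejected", reason };
    }
  }
}

// Assumed shape of the generic step: the loss check happens when the task
// runs, not when it is queued.
function queueSettle<T>(
  queue: TaskQueue,
  context: { isLost: boolean },
  deferred: Deferred<T>,
  value: T
): void {
  queue.queue(() => {
    if (context.isLost) {
      deferred.reject("context lost");
    } else {
      deferred.resolve(value);
    }
  });
}
```

Because the check runs inside the task, an operation that completed before the loss but whose task has not yet run is still rejected. As noted, this only models what script observes; it says nothing about the asynchronous steps themselves failing.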
There are two separate considerations: what happens to the promise returned by a method and what happens to an asynchronous operation itself. Operations like |
I suppose the following step of execute graph could handle the device lost error:
One question: if the error is a device-lost error, should it run the "context-lost" steps? Like the Chromium prototype does in
Or we could just say "If that returns an error, then queue an ML task with global to reject promise with an 'OperationError' DOMException" similar to
The new steps may not run because these steps may already have been aborted (by "abort these steps") if previous steps fail.
+1. Script timeline:
Device/queue timeline:
Note: an implementation is always free to abort pending async ops immediately (and release buffers).
I think this would be script observable - i.e. what order do the |
@reillyeon @huningxin @bbernhar I think the DirectML backend here has considered this, |
I think we've settled on the idea that destroying an Destroying an |
I have submitted a CL to expose |
Based on @mingmingtasd's work in the Chromium prototype implementation.
A lost promise attribute is added to MLContext, which resolves when the context is lost, and provides an implementation-defined message explaining the reason. Synchronous and asynchronous actions depending on the context will fail if the promise is settled.

This also modifies the omnipresent "has built" tests on MLGraphBuilder methods to be a "can build" test which also checks that the builder's context is not lost.

For #477
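As a rough illustration of the two changes described above, the sketch below models a context with a lost promise and a builder guarded by a "can build" check. `SketchContext`, `SketchGraphBuilder`, `simulateLoss`, and the thrown error are hypothetical stand-ins chosen for this example; they are not the WebNN interfaces or the exception type the spec text uses.

```typescript
// Hypothetical stand-ins for MLContext and MLGraphBuilder; not the real API.
interface LostInfo {
  message: string; // implementation-defined reason for the loss
}

class SketchContext {
  // Resolves once the context is lost; stays pending otherwise.
  readonly lost: Promise<LostInfo>;
  private resolveLost!: (info: LostInfo) => void;
  private lostFlag = false;

  constructor() {
    this.lost = new Promise<LostInfo>((resolve) => {
      this.resolveLost = resolve;
    });
  }

  get isLost(): boolean {
    return this.lostFlag;
  }

  // Test hook that simulates a device loss (assumption: a real
  // implementation would trigger the equivalent internally).
  simulateLoss(message: string): void {
    if (this.lostFlag) return; // a second loss is a no-op
    this.lostFlag = true;
    this.resolveLost({ message });
  }
}

class SketchGraphBuilder {
  private hasBuilt = false;

  constructor(private readonly context: SketchContext) {}

  // "Can build": the builder has not built yet AND its context is not lost.
  canBuild(): boolean {
    return !this.hasBuilt && !this.context.isLost;
  }

  // Stand-in for any builder method guarded by the check.
  input(name: string): string {
    if (!this.canBuild()) {
      throw new Error("cannot build: builder has built or context is lost");
    }
    return name;
  }

  build(): void {
    if (!this.canBuild()) {
      throw new Error("cannot build: builder has built or context is lost");
    }
    this.hasBuilt = true;
  }
}
```

Script could observe the loss with something like `context.lost.then((info) => console.log(info.message))`, while every guarded builder method fails synchronously from the moment the context is lost.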