Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1965: make FunctionDataFetcher handle blocking scopes resulting in a thrown Error as expected #1966

Conversation

wallind
Copy link

@wallind wallind commented Apr 22, 2024

📝 Description

Changes FunctionDataFetcher to always return a CompletableFuture (no matter if scope is suspend or not) in order to cleanly handle cases where runBlockingFunction results in a thrown Error (or descendant of Error). See #1965 for more detail.

🔗 Related Issues

Resolves #1965

*this is my first contribution to this repo; so if I need to do anything differently or in addition please let me know

wallind added 2 commits April 22, 2024 14:26
…leFuture`

This makes the result where `runBlockingFunction` ends in a thrown `Error` (or descendant of `Error`) be handled as expected.

Also for consistency makes `runBlockingFunction` wrap the result in a `CompletableFuture` even for the successful case.

Lastly improves grammar a bit adding some commas where needed.
@@ -49,7 +49,7 @@ open class FunctionDataFetcher(
/**
* Invoke a suspend function or blocking function, passing in the [target] if not null or default to using the source from the environment.
*/
override fun get(environment: DataFetchingEnvironment): Any? {
override fun get(environment: DataFetchingEnvironment): CompletableFuture<Any?> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe that this change is going to cause incompatiblity with the update to graphql-java 22

Copy link
Author

@wallind wallind Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand more specifically what you mean so that maybe I can try to adjust my solution accordingly?

Otherwise, I'd say I'm happy to remove the change to this get function signature as it's not required for my solution to work. Would that be an okay compromise?

*I mainly made the change as it made me have to make less changes to the Test class; where without this change I'll have to cast to CompletableFuturw repeatedly. This felt like an efficiency and type-safety boon.

Copy link
Contributor

@samuelAndalon samuelAndalon Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR; graphql-java 22 introduces polymorphic resolution, in previous versions, even if your data fetcher was returning a materialized object (not a CF), it was anyways creating a CF causing a terrible memory bottleneck where GC pause times were paying the price.

having said that, i wonder, why would you want to catch Errors ?
screenshot taken from Error docs
image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR; graphql-java 22 introduces polymorphic resolution, in previous versions, even if your data fetcher was returning a materialized object (not a CF), it was anyways creating a CF causing a terrible memory bottleneck where GC pause times were paying the price.

😅 I need to read up on this, as tbh that is pretty far out of my depth of current expertise lol. So I'm inclined to defer to your judgement and undo this part of my change regardless of where the larger change lands as accepted or not.

having said that, i wonder, why would you want to catch Errors ?
screenshot taken from Error docs

In regards to this, please see my thoughts/discussion here #1966 (comment) if you would.

} catch (exception: InvocationTargetException) {
throw exception.cause ?: exception
}
protected open fun runBlockingFunction(parameterValues: Map<KParameter, Any?>): CompletableFuture<Any?> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cause an allocation of an object of type CompletableFuture for fields that are already materialized (in-memory).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the CompletableFuture wrap the success-case for consistency, but could also peel that back as it's really only the throw ... case that is the problem and inconsistent with the equivalent behavior for suspend resolvers.

Shall I do that?

@samuelAndalon
Copy link
Contributor

in order to cleanly handle cases where runBlockingFunction results in a thrown Error (or descendant of Error).

what if instead you handle your exceptions in your data fetchers ?, that way, you will have a more resilient way to recover from an error state.

Copy link
Contributor

@samuelAndalon samuelAndalon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

throw exception.cause ?: exception
}
protected open fun runBlockingFunction(parameterValues: Map<KParameter, Any?>): CompletableFuture<Any?> =
CompletableFuture<Any?>().apply {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, this change is going to cause incompatibility with the update to graphql-java 22

@wallind
Copy link
Author

wallind commented Apr 22, 2024

in order to cleanly handle cases where runBlockingFunction results in a thrown Error (or descendant of Error).

what if instead you handle your exceptions in your data fetchers ?, that way, you will have a more resilient way to recover from an error state.

If by "... exceptions ..." you mean thrown Errors, then yeah I get where you're coming from. I could change our code to handle that scenario.

More broadly though, I'm trying to point out an inconsistency between the way suspend and non-suspend scopes are handled in the resolvers atm; an inconsistency that is grounded in the logic in FunctionDataFetcher.

For suspend scopes (because of CompletableFuture) Error is gracefully handled via the errors response sub-structure:

image

and for non-suspend Error it is not, and ends up handling by the server's implementation of error-handling:

image

So while yes my original Issue/PR are written as trying to make non-suspend behave as suspend does currently, I could see the option to actually make the inverse change as being something you'd want based on your response here #1966 (comment).

TLDR do you want me to adjust this work so that Error in the suspend case is not gracefully handled in the errors response structure?

*you'll have to pardon my naivety regarding Exception vs Error as this is a relatively new subject focus for me. After speaking with my colleagues I understand what you mean with things like StackOverflowError and OutOfMemoryError being descendant from Error and more serious in nature.

@samuelAndalon
Copy link
Contributor

samuelAndalon commented Apr 22, 2024

TLDR do you want me to adjust this work so that Error in the suspend case is not gracefully handled in the errors response structure?

yes, that sounds good, we would appreciate that contribution

@dariuszkuc
Copy link
Collaborator

@wallind
Copy link
Author

wallind commented Apr 23, 2024

TLDR do you want me to adjust this work so that Error in the suspend case is not gracefully handled in the errors response structure?

yes, that sounds good, we would appreciate that contribution

Sweet @samuelAndalon! happy to help. Will try to get to that sometime in the next few days...except now I'm a bit confused by @dariuszkuc's suggestion:

Instead of changing the data fetcher I think we should probably just catch the error here -> https://github.com/ExpediaGroup/graphql-kotlin/blob/master/servers/graphql-kotlin-server/src/main/kotlin/com/expediagroup/graphql/server/execution/GraphQLRequestHandler.kt#L99-L137?

Are you saying to make the change I originally proposed, to make non-suspend not fail out but rather be gracefully handled...but in GraphQLRequestHandler? Or rather are you saying to change it so suspend doesn't gracefully handle Error--as @samuelAndalon just affirmed might be preferred--..but to make that change in GraphQLRequestHandler? Not 100% sure I'm following.

This would be the former:

private suspend fun execute(
  graphQLRequest: GraphQLRequest,
  batchGraphQLContext: GraphQLContext,
  dataLoaderRegistry: KotlinDataLoaderRegistry?,
): GraphQLResponse<*> =
  try {
    graphQL.executeAsync(
      graphQLRequest.toExecutionInput(batchGraphQLContext, dataLoaderRegistry),
    ).await().toGraphQLResponse()
  } catch (throwable: Throwable) {
    // Catch any Throwable, Error or Exception, and convert it to a GraphQLError
    val error = throwable.toGraphQLError()
    GraphQLResponse<Any?>(errors = listOf(error.toGraphQLKotlinType()))
  }

and this is the latter:

private suspend fun execute(
  graphQLRequest: GraphQLRequest,
  batchGraphQLContext: GraphQLContext,
  dataLoaderRegistry: KotlinDataLoaderRegistry?,
): GraphQLResponse<*> =
  try {
    val result = graphQL.executeAsync(
      graphQLRequest.toExecutionInput(batchGraphQLContext, dataLoaderRegistry),
    ).await()

    // Look for Error (or any descendant) and re-throw so it's not swallowed
    result.errors?.forEach {
      if (it is ExceptionWhileDataFetching && it.exception is Error) {
        throw it.exception
      }
    }

    result.toGraphQLResponse()
  } catch (exception: Exception) {
    val error = exception.toGraphQLError()
    GraphQLResponse<Any?>(errors = listOf(error.toGraphQLKotlinType()))
  }

Either way, I think I see why you're steering me away from FunctionDataFetcher#runSuspendingFunction as I'd have to make the CompletableFuture resolve within that method to know the failure type....which would be undesirable I'm guessing.

@samuelAndalon
Copy link
Contributor

i understand both approaches and i see reasons to follow one or the other, however, IMHO, we should not attempt to catch Errors, we have to follow what java mentions about Error.

@dariuszkuc
Copy link
Collaborator

Since graphql-java handles exceptionally completed futures (handles Throwable) I am unsure whether you can easily force our function data fetcher to blow up from coroutines in the same way as we do for sync execution. I believe this would require inspecting the result of the coroutine which would mean we would have to wait for it to complete....

@wallind
Copy link
Author

wallind commented Apr 23, 2024

@samuelAndalon and @dariuszkuc I appreciate the thoughtful discussion from both of you, but must admit I am not clear on the desired direction.

For now I am going to defer to @dariuszkuc's suggestion, as he seems to be the primary repository maintainer, and do what I originally intended "modify handling for non-suspend scopes to align with suspend as it exists today, meaning Error thrown from below is gracefully handled"; that would be the following change to GraphQLRequestHandler:

private suspend fun execute(
  graphQLRequest: GraphQLRequest,
  batchGraphQLContext: GraphQLContext,
  dataLoaderRegistry: KotlinDataLoaderRegistry?,
): GraphQLResponse<*> =
  try {
    graphQL.executeAsync(
      graphQLRequest.toExecutionInput(batchGraphQLContext, dataLoaderRegistry),
    ).await().toGraphQLResponse()
  } catch (throwable: Throwable) {
    // Catch any Throwable, Error or Exception, and convert it to a GraphQLError
    val error = throwable.toGraphQLError()
    GraphQLResponse<Any?>(errors = listOf(error.toGraphQLKotlinType()))
  }

If that's not right, or I'm somehow missing the mark don't hesitate to let me know.

*In the meantime I remain open to discussion, and will certainly not be rushing this through!

@dariuszkuc
Copy link
Collaborator

As Sam mentioned, when Error happens it generally would mean some "bad things" happened (like OOM) so I don't think we should be updating the default GraphQL server handler to catch those. Ideally coroutines should behave the same way but I don't think we can easily handle that. While this is an inconsistent behavior between coroutines and blocking functions, I think that this is probably a good tradeoff.

So the TLDR is -> we probably shouldn't change anything unless there is a good way to propagate errors out of the completable futures/errors.

@wallind
Copy link
Author

wallind commented Apr 23, 2024

As Sam mentioned, when Error happens it generally would mean some "bad things" happened (like OOM) so I don't think we should be updating the default GraphQL server handler to catch those. Ideally coroutines should behave the same way but I don't think we can easily handle that. While this is an inconsistent behavior between coroutines and blocking functions, I think that this is probably a good tradeoff.

So the TLDR is -> we probably shouldn't change anything unless there is a good way to propagate errors out of the completable futures/errors.

Understood! I will look into this "unless there is a good way to propagate errors out of the completable futures/errors." bit sometime this week. From the digging I was doing earlier I don't think it is possible, as I'm sure you may already know.

However it sounds like a fun challenge, so I'll spend some time going after it. Thanks again for your time and energy talking this through with me!

*I'll cleanup and close the PR/Issue with non-resolveable outcome/commentary later this week to keep the repo clean too

@dariuszkuc
Copy link
Collaborator

Closing this for now as non-resolvable

@dariuszkuc dariuszkuc closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants