Always explicitly disable gzip automatic decompression on reqwest client used by object_store #6843

Merged: 5 commits, Dec 11, 2024

Conversation

phillipleblanc
Contributor

@phillipleblanc phillipleblanc commented Dec 6, 2024

Which issue does this PR close?

Closes #6842

Rationale for this change

Fixes an issue where enabling a non-default feature (gzip) for reqwest would cause object_store to stop working if using the HTTP store against an HTTP server that supports gzip response compression.

What changes are included in this PR?

Call the no_gzip method on the reqwest ClientBuilder to ensure that even if the gzip feature is enabled, the object_store client will not use the transparent decompression logic.

I considered making this an option instead of always setting it, but since this is such a frustrating footgun to encounter and debug, I think it's better to always set it unless there is a compelling reason not to.

Are there any user-facing changes?

My understanding is that most/all users interacting with object stores do not want the gzip compression logic (since none of the major cloud object store providers support it), so this change should not be breaking.

@github-actions github-actions bot added the object-store Object Store Interface label Dec 6, 2024
@@ -671,6 +671,10 @@ impl ClientOptions {
            builder = builder.danger_accept_invalid_certs(true)
        }

        // Reqwest will remove the `Content-Length` header if it is configured to
        // transparently decompress the body via the non-default `gzip` feature.
        builder = builder.no_gzip();
Contributor

Does this mean that gzipped data content will be left gzipped?

So if I request a resource that the server gzips in its response, the result I get from ObjectStore::get would also be gzipped? 🤔

Contributor Author

@phillipleblanc phillipleblanc Dec 7, 2024

Yes, that is correct. Sorry, I could have made this clearer.

All this affects is what happens when the response has the header Content-Encoding: gzip, which HTTP servers will usually only send when the request has the header Accept-Encoding: gzip. In that case, if the gzip feature is enabled, reqwest transparently decodes the body as a gzip stream and removes the Content-Length header. The no_gzip function explicitly disables that behavior even if the feature is enabled.

The object store APIs will then just return the bytes of the object as they are (including objects that are gzipped).
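
The header behavior described in this thread can be modeled with a small, standard-library-only sketch. This is not reqwest's implementation: `ResponseHeaders` and `observed_headers` are hypothetical names used for illustration, and dropping `Content-Encoding` together with `Content-Length` is an assumption based on reqwest's documented behavior, not something stated in this PR.

```rust
/// Hypothetical, simplified view of the two response headers discussed above.
#[derive(Debug, Clone, PartialEq)]
pub struct ResponseHeaders {
    pub content_encoding: Option<String>,
    pub content_length: Option<u64>,
}

/// Models the headers a caller would observe after the HTTP client has
/// processed a response.
///
/// `auto_gzip` stands for "transparent gzip decompression is active", i.e.
/// the `gzip` feature is enabled and `no_gzip()` was NOT called.
pub fn observed_headers(server: &ResponseHeaders, auto_gzip: bool) -> ResponseHeaders {
    let is_gzip = server
        .content_encoding
        .as_deref()
        .map(|e| e.eq_ignore_ascii_case("gzip"))
        .unwrap_or(false);

    if auto_gzip && is_gzip {
        // The decoded length is not known up front, so the length header is
        // dropped (assumption: the encoding header is dropped with it).
        ResponseHeaders {
            content_encoding: None,
            content_length: None,
        }
    } else {
        // With decompression disabled, headers and body pass through unchanged.
        server.clone()
    }
}
```

With `auto_gzip` set to `false`, which is the state this PR forces, a gzip-encoded response keeps its `Content-Length`, so code that relies on that header keeps working and the caller receives the still-compressed bytes.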

@tustvold tustvold merged commit 50cf8bd into apache:main Dec 11, 2024
15 checks passed
Labels
object-store Object Store Interface
Development

Successfully merging this pull request may close these issues.

object_store errors when reqwest gzip feature is enabled
3 participants