Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

google-cloud-php-storage contains critical data corruption in \Google\Cloud\Storage\Connection\Rest->downloadObject() #7864

Open
indreka opened this issue Nov 27, 2024 · 0 comments

Comments

@indreka
Copy link

indreka commented Nov 27, 2024

Environment details

  • OS: Linux
  • PHP version: 8.1.29
  • Package name and version: google-cloud-php-storage:v1.33.1

Steps to reproduce

  1. try to fetch files from storage bucket when storage.googleapis.com is having intermittent issues and returns server error 500 or 503 and response body like "We encountered an internal error. Please try again." for first request and actual response for next request.
  2. \Google\Cloud\Storage\Connection\Rest->downloadObject returns mangled file content where the beginning of actual file data is replaced with text "We encountered an internal error. Please try again."

Code example

$content = $storageAdapter->read("path/to/someFile")["contents"];

Problematic code in downloadObject function is following:

        $requestOptions['restRetryListener'] = function (
            \Exception $e,
            $retryAttempt,
            &$arguments
        ) use (
            $resultStream,
            $requestedBytes,
            $invocationId
        ) {
            // if the exception has a response for us to use
            if ($e instanceof RequestException && $e->hasResponse()) {
                $msg = (string) $e->getResponse()->getBody();

                $fetchedStream = Utils::streamFor($msg);

                // add the partial response to our stream that we will return
                Utils::copyToStream($fetchedStream, $resultStream);

                // Start from the byte that was last fetched
                $startByte = intval($requestedBytes['startByte']) + $resultStream->getSize();
                $endByte = $requestedBytes['endByte'];

                // modify the range headers to fetch the remaining data
                $arguments[1]['headers']['Range'] = sprintf('bytes=%s-%s', $startByte, $endByte);
                $arguments[0] = $this->modifyRequestForRetry($arguments[0], $retryAttempt, $invocationId);
            }
        };

It tries to handle all RequestException cases by remembering $e->getResponse()->getBody() and assuming that these were the correct bytes and then tries to resume fetching by asking the remaining data (skipping number of bytes already remembered).
As a result, whenever internal server occurs in a way that some content is returned (not just header but actual body), that said content is injected into the file.
We found out by getting intermittent image format exceptions when fetching files from storage bucket during partial outage and suddenly random images got their first 51 bytes replaced with "We encountered an internal error. Please try again.".

We were quite surprised that instead of throwing exception during storage.googleapis.com partial outage, that retry/resume logic started randomly replacing data like this.

Same problematic code exists in current latest 1.44 version:

https://github.com/googleapis/google-cloud-php-storage/blob/ebdec855364c1df9e81755e9626e3ff4687263f4/src/Connection/Rest.php#L321

I understand that the point of that code is to resume file download (useful when there is some network error when 990MB of 1000MB file is downloaded), but any non-network-errors should get ignored, not injected into file content silently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant