Calculate hashes of images and include it in image_details field to improve image caching #5238

Open
5 tasks done
asudox opened this issue Nov 29, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@asudox

asudox commented Nov 29, 2024

Requirements

  • Is this a feature request? For questions or discussions use https://lemmy.ml/c/lemmy_support
  • Did you check to see if this issue already exists?
  • Is this only a feature request? Do not put multiple feature requests in one issue.
  • Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.
  • Do you agree to follow the rules in our Code of Conduct?

Is your proposal related to a problem?

The image_details field in (for example) a GetPost JSON response does not include a hash of the image. A content hash would let clients cache images more effectively.

I assume the link field in image_details could be used as a cache key, but a cache keyed on the link would not recognize duplicates of the same image, whether they are on the same instance or on other instances.

Describe the solution you'd like.

A SHA-256 hash would be calculated and stored when an image is uploaded to an instance, and then returned in the image_details field.
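
As a rough sketch of the idea, assuming the hash is taken over the raw uploaded bytes: the `sha256` key below is a hypothetical field name that does not exist in the current API, and `build_image_details` is an illustrative helper, not Lemmy code.

```python
import hashlib


def sha256_hex(image_bytes: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def build_image_details(link: str, image_bytes: bytes) -> dict:
    """Build an image_details-like dict that also carries the content hash.

    The "sha256" key is hypothetical; it is the field this issue proposes.
    """
    return {"link": link, "sha256": sha256_hex(image_bytes)}
```

Two uploads of the same bytes would then report the same `sha256` value even though their `link` values differ, which is what a client would need to recognize them as duplicates.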

Describe alternatives you've considered.

None.

Additional context

No response

@asudox asudox added the enhancement New feature or request label Nov 29, 2024
@Nutomic
Member

Nutomic commented Nov 29, 2024

Images are already served with all the necessary headers for caching:

cache-control: public, max-age=86400, immutable
etag: W/"1167f-193300d43e0"
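
For reference, a minimal sketch of what these two header values tell a client. The parser is deliberately simplified (it does not handle commas inside quoted strings), and the function names are illustrative only.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict.

    Directives without an argument (such as "public") map to True.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def is_weak_etag(etag: str) -> bool:
    """Weak validators are prefixed with W/."""
    return etag.startswith("W/")


cc = parse_cache_control("public, max-age=86400, immutable")
max_age_hours = int(cc["max-age"]) // 3600  # 86400 seconds is 24 hours
```

Both headers describe the response for one URL, so they let a client reuse or revalidate the image it fetched from that specific link.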

@asudox
Author

asudox commented Nov 29, 2024

> Images are already served with all the necessary headers for caching:
>
> cache-control: public, max-age=86400, immutable
> etag: W/"1167f-193300d43e0"

That caches the image only at its unique link. If there are duplicates of the same image, this does not help.

If, for instance, the same image is uploaded again by another user (on the same instance or on another one), the client would not use the cached copy. Because the duplicate has a different link, the client would make a new request for it, even though an image with the same hash is already in the image cache.
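
To illustrate the difference, here is a minimal sketch of a client-side cache keyed on the content hash instead of the link. It assumes the API already returns a SHA-256 hash next to each link (the proposal in this issue); `ImageCache` and the `fetch` callable are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


class ImageCache:
    """Client-side image cache keyed by SHA-256 digest rather than by URL."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, bytes] = {}

    def get(self, link: str, sha256: str, fetch: Callable[[str], bytes]) -> bytes:
        # Cache hit: the same bytes were already downloaded from some link.
        if sha256 in self._by_hash:
            return self._by_hash[sha256]
        data = fetch(link)
        # Verify the advertised hash before trusting it as a cache key,
        # so a wrong or malicious hash cannot poison the cache.
        if hashlib.sha256(data).hexdigest() != sha256:
            raise ValueError("downloaded image does not match advertised hash")
        self._by_hash[sha256] = data
        return data
```

With this layout, a second post that links to the same image on another instance is served from the cache without a network request, which a cache keyed on the link cannot do.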

@dessalines
Member

Seems like pictrs could handle this case, maybe via redirects or something on duplicate hashes to the same image.

cc @asonix
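
A minimal sketch of what redirecting duplicate hashes to one image could look like on the server side. This is purely illustrative and is not a description of how pict-rs behaves; `DedupStore`, its methods, and the alias names are all hypothetical.

```python
import hashlib
from typing import Dict, Tuple


class DedupStore:
    """Toy image store that keeps one blob per distinct SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}     # digest -> image bytes
        self._canonical: Dict[str, str] = {}   # digest -> first alias seen
        self._aliases: Dict[str, str] = {}     # alias -> digest

    def upload(self, alias: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the first blob and first alias for a given digest.
        self._blobs.setdefault(digest, data)
        self._canonical.setdefault(digest, alias)
        self._aliases[alias] = digest
        return digest

    def resolve(self, alias: str) -> Tuple[str, str]:
        """Return ("serve", alias) or ("redirect", canonical_alias)."""
        canonical = self._canonical[self._aliases[alias]]
        if alias == canonical:
            return ("serve", alias)
        return ("redirect", canonical)

    def blob_count(self) -> int:
        return len(self._blobs)
```

A redirect like this would let URL-based caches converge on one link per image, but only within a single instance, since each instance would have its own store.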

@asudox
Author

asudox commented Nov 29, 2024

> Seems like pictrs could handle this case, maybe via redirects or something on duplicate hashes to the same image.

Well, I guess that would save storage and solve this problem when the duplicate image is on the same instance.
However, with hashes it wouldn't matter whether the image was uploaded to instance X or instance Y; it would still work.

What you suggested could probably be a separate feature request for saving storage, as it does not quite achieve what I meant in this one.

@asudox asudox changed the title from "Calculate hash of images and include it in image_details field to improve image caching" to "Calculate hashes of images and include it in image_details field to improve image caching" Nov 29, 2024
3 participants