-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature to use the Azure or AWS-specific Image Caches of ImageSharp #15016
Comments
@Piedone I agree with the need to move this outside the file system. This once was a topic during triage and I shared the same concern. I just don't think it will have any performance gain. It'll free up resources on the web server for sure. But, I can see it being slightly slower in performance. Either way, I agree with adding this feature would be great. |
Thanks for your feedback! On Azure App Services storage IO is a really narrow bottleneck, so due to not bumping into its throttling even with single page views and especially if the site sees any traffic), I'd expect a performance increase in the average media request too (and the seemingly local storage there is a networked one anyway, with latencies similar to Blob Storage). But we'll see, we're testing it now. In the meantime, I'll start working on this. |
Due to the limitation mentioned above we can't easily manage tenants' caches separately unless using separate containers/buckets. This means that purging the cache of just one tenant is not supported (nor purging the cache at all, unless you delete the blobs manually). Furthermore, cache files are only removed for a tenant when removing the tenant itself if you use a separate container for each tenant. I think this is acceptable. But perhaps I can add a background task that automatically deletes cache files older than What do you think? I created the first draft of this, only for Azure for now: #15028. |
We'll have subpath config in ImageSharp: SixLabors/ImageSharp.Web#351 (reply in thread) |
Opened an ImageSharp PR to have subpath config: SixLabors/ImageSharp.Web#353. |
Note that this is only about caching images resized with ImageSharp, not any other Media file coming from Blob Storage (that I'm still looking into under #14859). Those are directly served by IS, and currently the cache IS uses is only the local file system one. This issue and the corresponding PR are about swapping that out with the Blob Storage cache of IS. I.e., after that, IS will save cache files to Blob Storage and also serve them from them directly. No local files used at all. The request still go through the same pipeline though, so URLs don't change, clients aren't directly connecting to the blob's URL. Yes, I'm aware that the local file system under Azure App Services is actually an Azure File Share. The problem with it is that it has a much lower (default) r/s, i.e. IOPS limit (see here) than Blob Storage (which is 20k r/s). And the actual local file system, which I also tried, is even lower at presumably 500 IOPS (see). I went through these under #14859. So, with this change, we can spread out the IO operations across multiple storages, so hopefully neither of them are a bottleneck, at least until you not to anything extreme. We're still testing the above prototype. The results are a mixed bag yet, but we'll see after gathering some more data. |
Here are the results after about a week:
With Blob Storage, at least I can see which exact operations take time. And while usually it's much faster, frequently Blob Storage is super slow. See e.g. this request: Each Blob Storage operation took more than 1,5s! This is really slow. I see many more examples of the same, where the latency of simple Blob operations for 20-30 KB images take seconds. It's hopeless for us to optimize anything if the underlying Blob operations can take so long. And while the most painful are request that have seconds of latencies, multiple 100s of ms is also way too much for what I'd expect. Note that the App and Blob Storage account are in the same Azure region, so these requests are supposedly not even leaving the datacenter. Looking at the Success Server Latency and Success E2E latency it's the same story (note that this is Success Server Latency, i.e. the time it takes Blob Storage to response, it's not related to network latency): Note how usually the latency is within what you'd expect, <20 ms (though this is not visible in the chart) but then there's a huge spike. Without anything special in the amount of transactions: And it's not because of huge egress/ingress volume: This is one example, but we see Blob latencies spike to seconds, even close to 10s, multiple times a day. @sebastienros can you ask somebody from the Azure team to check if this is normal? From what I can see in public sources, it isn't. |
@Piedone Just wondering what your settings look like. Which blob tear are you using? May be you are on in a non-hot tier. https://learn.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview . Are you using private endpoint? |
It's on the hot tier, yes. BTW I think hot-cold are not performance tiers but pricing ones. The endpoint is public. |
@Piedone I can try to ping them but only if you create an isolated repro (no orchard for instance) that is easy to understand. They might even say "remove ImageSharp" to see if that is the problem. At that point we could even check the SLA and then file a support ticket. If think this is a sensible step before I start reaching out directly. |
I see, though the Success Server Latency (and the other storage metrics) is not really something that would depend on whether it's OC, IS, or anything else on the other end of the line. The SLA for Standard storage transactions I think is 2s (see page 83) for files <=1 MB. It's usually, but not always, within that. I can't find the Premium figure there, but at least for File here it says that should be <10ms. Now I switched over to Premium storage as a test. We'll see in the next few days whether it makes a difference. Interestingly, pricing-wise it should actually be better than the Hot tier of Standard, since while its data storage price is $0,15/GB vs $0,021, read/write operations are cheaper. This is a bit more promising since it seems that CPU became the bottleneck, with some requests resembling this, with the time between the arrows NOT being storage latency (but I assume CPU): Latency metrics, even E2E, are <10ms. We'll see. |
Well, with premium storage I'm still frequently getting >1s Blob latencies, and resized image requests are still topping the slow requests list. E.g. here the quickest Blob access was 400ms, the slowest 1,7s: Note that this is a cache read for an existing cache item, as you can see, it's just fetching the blob. This time though, checking out the storage account's metrics for the time frame of such requests, Success Server Latency there is max 21 ms with E2E latency being <=28 ms. So, this is promising for the wider use case. However, we're doing a standard App Service hosting model, so still, where is the rest of the time spent? Not much remains than the network, but I can't see any issues that might cause this (and then again this doesn't happen all the time). |
I asked this in the Microsoft Q&A too: https://learn.microsoft.com/en-us/answers/questions/1514895/where-are-blobbaseclient-operations-spending-time |
In the meantime, as an additional issue, we see too many IS cache misses. My working theory is that this is caused by the webroot of our app being renewed due to staged publishing. And since via The real, fundamental solution here might be a persistent (and fast) webroot folder. I'm not hopeful about being able to move that in any more granular way than setting I'll try this too. |
I'll be able to share some real-life performance comparisons between various IS cache approaches. |
See: #14859 (comment) |
Is your feature request related to a problem? Please describe.
When caching images resized with ImageSharp, we use
PhysicalFileSystemCache
. This can cause an IO bottleneck, like it happens on Azure App Services, see #14859.Describe the solution you'd like
There are alternative Image Caches:
AzureBlobStorageImageCache
stores resized images in Azure Blob Storage, andAWSS3StorageCache
in AWS S3. These can offer a performance boost to apps running in either cloud by reducing the number of local IO transactions. Furthermore, if you run throwaway webservers that start with a blank slate after every deployment, including containers, then you can use these features to make your image cache persistent.So, we could add features in
OrchardCore.Media.Azure
andOrchardCore.Media.AmazonS3
to optionally use these caches.Note that unfortunately at the moment these caches can't distinguish tenants, see this feature request. We could either contribute that (if the ImageSharp team wants it, to begin with; or add it from a fork of
ImageSharp.Web
) or accept that the cache files will be in the root of an Azure Blob Storage container/AWS S3 bucket for all tenants, making tenant-specific cache purging or clean-up after tenant deletion infeasible.I'm adding a simple prototype below, but I'm yet actively testing
AzureBlobStorageImageCache
on DotNest:Describe alternatives you've considered
Using Output Cache for
/media
, as elaborated under #14859, but that's perhaps good anyway, in addition to this.The text was updated successfully, but these errors were encountered: