Performance issues of Caffeine with time based expiration (vs Guava) #1320

sfc-gh-emammedov · 2023-11-15T15:29:13Z

Hi,

We are considering using Caffeine in one of our projects where Guava is currently used. Before making the switch we wanted to run a performance experiment against both caches.

When building the cache we are relying on time based expiration:

...
    .expireAfterWrite(60L * 60L * 2L, TimeUnit.SECONDS)
    .expireAfterAccess(60L * 60L * 2L, TimeUnit.SECONDS)
    .build()

The benchmark (https://github.com/sfc-gh-emammedov/guava-caffeine-performance-comparison/blob/main/CaffeineVsGuava/src/com/example/cache/CaffeineVsGuavaTest.java) has the following knobs (default setting):

THREAD_COUNT (10) concurrent threads accessing the cache (either read or write)
UNIQUE_ENTITIES_COUNT (1000) number of unique entities stored in cache
TOTAL_OPERATIONS_PER_THREAD (100) number of operations each thread will perform against each unique entity in the cache
READ_RATIO (0.9, i.e. 90%) ratio of read threads (the remaining threads are assigned as write threads)
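To make the default workload concrete, here is a small sketch of the arithmetic implied by the knobs above. The class and method names are hypothetical, and rounding the reader share is an assumption; the linked benchmark may split threads differently.

```java
// Hypothetical holder for the default knob values listed above.
final class BenchmarkKnobs {
  static final int THREAD_COUNT = 10;
  static final int UNIQUE_ENTITIES_COUNT = 1000;
  static final int TOTAL_OPERATIONS_PER_THREAD = 100;
  static final double READ_RATIO = 0.9;

  // Assumption: the reader count is the rounded share of THREAD_COUNT.
  static int readerThreads() {
    return (int) Math.round(THREAD_COUNT * READ_RATIO);
  }

  // The remaining threads are writers.
  static int writerThreads() {
    return THREAD_COUNT - readerThreads();
  }

  // Each thread performs TOTAL_OPERATIONS_PER_THREAD operations per entity.
  static long operationsPerThread() {
    return (long) TOTAL_OPERATIONS_PER_THREAD * UNIQUE_ENTITIES_COUNT;
  }

  // Operations across all threads in one benchmark run.
  static long totalOperations() {
    return operationsPerThread() * THREAD_COUNT;
  }
}
```

Under these assumptions the defaults give 9 reader threads and 1 writer thread, each performing 100,000 operations.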

During the tests we noticed that although read operations are faster with Caffeine, write operations are slower. Using the IntelliJ profiler we observed that a significant amount of time was spent on maintenance-related tasks. In this simplified benchmark that might not be the most expensive operation, but during our internal cache tests the majority of the time was spent in scheduleDrainBuffers and eventually in thread unparking.

We tried different executors for Caffeine, but that did not help either. We tried the following:

Caffeine (Executors.newSingleThreadExecutor())
Caffeine (Executors.newFixedThreadPool(10))
Caffeine (Runnable::run, i.e. running the maintenance on the calling thread)

We then removed time-based expiration completely, and that made the real difference: Caffeine was then much faster than Guava.

Here are the results of benchmark (durations in nanoseconds):

Guava

Avg benchmark dur:32899479
Avg read dur:965
Avg write dur:300

Caffeine

Avg benchmark dur:52712004
Avg read dur:734
Avg write dur:510

Caffeine without expiration

Avg benchmark dur:16337327
Avg read dur:83
Avg write dur:150

We were wondering whether this is expected behaviour for Caffeine when it is configured with time-based expiration, or whether we are missing some key configuration knob that would make Caffeine performant with time-based expiration.

Thank you!


ben-manes · 2023-11-15T16:40:36Z

Thanks for taking the time to benchmark and provide your findings. Here are a few observations:

You should probably use jmh for benchmarks to ensure you do not bias the analysis (see ours as examples)
Enabling both expiration modes doesn't make sense in practice, as here expireAfterWrite is redundant.
You can adjust concurrencyLevel in Guava for higher throughputs
A repeated full scan by all threads is not a realistic distribution; real workloads usually follow a power law such as Zipf, which produces a hot-cold pattern.
Caffeine uses a write buffer to schedule maintenance for a batch of work, assuming that it can hide latencies either due to a low write rate or writes to popular entries. When the buffer is full then it stalls writers as backpressure to avoid runaway growth. You are probably forcing this and since only one thread performs maintenance at a time, it degrades total write throughput.
If using put then we do have a write tolerance to cope with a flood of expireAfterWrite updates, where updates within 1s are considered close enough to downgrade to a lossy read buffer event. That optimization isn't present on computes / merge, though probably could be. If you switch you might see a large speed up as the write buffer is less stressed.
Guava splits the cache into N segments, which improves throughput at various costs. A write-heavy workload is rare, but if needed then we defer that optimization to users who can decide if the tradeoff is worthwhile. That is simply to stripe by the key's hash to choose a cache, e.g. caches[Math.floorMod(key.hashCode(), caches.length)].put(key, value).
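The striping idea in the last point can be sketched as follows. This is a minimal illustration, not part of Caffeine or Guava: StripedCache is a hypothetical wrapper, and each stripe is a plain ConcurrentHashMap standing in for a separately built cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper: spreads keys over N independent stripes so that
// writes to different stripes do not contend on one shared structure.
final class StripedCache<K, V> {
  // Each stripe is a ConcurrentHashMap purely as a stand-in; in real use
  // each one would be an independently built cache instance.
  private final List<Map<K, V>> stripes = new ArrayList<>();

  StripedCache(int stripeCount) {
    if (stripeCount <= 0) {
      throw new IllegalArgumentException("stripeCount must be positive");
    }
    for (int i = 0; i < stripeCount; i++) {
      stripes.add(new ConcurrentHashMap<>());
    }
  }

  // floorMod keeps the index non-negative even when hashCode() is
  // negative, which a plain % would not guarantee.
  int stripeIndex(K key) {
    return Math.floorMod(key.hashCode(), stripes.size());
  }

  V put(K key, V value) {
    return stripes.get(stripeIndex(key)).put(key, value);
  }

  V get(K key) {
    return stripes.get(stripeIndex(key)).get(key);
  }
}
```

One cost of this design is that any size bound or expiration bookkeeping then applies per stripe rather than to the cache as a whole.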

ben-manes · 2023-11-15T18:01:30Z

Using merge is a bit odd since that is a forced write, whereas computeIfAbsent is usually the behavior that you want. That does a read before falling back to a write if absent or expired, and is what both Guava and Caffeine are optimized for. You can see that while merge is less optimized, those more common cases are faster in your benchmark harness. (Note that JMH should still be strongly preferred.)

Merge

Guava

Avg benchmark dur:52921453
Avg read dur:546
Avg write dur:439

Caffeine

Avg benchmark dur:60350082
Avg read dur:362
Avg write dur:553

Caffeine without expiration

Avg benchmark dur:33437848
Avg read dur:51
Avg write dur:314

Put

Guava

Avg benchmark dur:43992562
Avg read dur:516
Avg write dur:338

Caffeine

Avg benchmark dur:26648787
Avg read dur:306
Avg write dur:170

Caffeine without expiration

Avg benchmark dur:13292782
Avg read dur:73
Avg write dur:97

Compute If Absent

Guava

Avg benchmark dur:45823483
Avg read dur:571
Avg write dur:378

Caffeine

Avg benchmark dur:21660199
Avg read dur:484
Avg write dur:155

Caffeine without expiration

Avg benchmark dur:8973124
Avg read dur:130
Avg write dur:62
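The behavioral difference between merge and computeIfAbsent described above can be seen on any ConcurrentMap. The sketch below uses a plain ConcurrentHashMap (not a cache) and a hypothetical helper class; it only counts how often the user function runs for a key that is already present, and says nothing about either library's internal costs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class MergeVsComputeIfAbsent {
  // Returns {computeIfAbsent function calls, merge function calls} after
  // 'attempts' rounds of both operations on a key that is already present.
  static int[] countFunctionCalls(int attempts) {
    ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
    map.put("k", 0);
    AtomicInteger loads = new AtomicInteger();
    AtomicInteger merges = new AtomicInteger();
    for (int i = 0; i < attempts; i++) {
      // Key is present, so the mapping function is skipped (a read).
      map.computeIfAbsent("k", key -> {
        loads.incrementAndGet();
        return 1;
      });
      // Key is present, so the remapping function always runs (a write).
      map.merge("k", 1, (oldV, newV) -> {
        merges.incrementAndGet();
        return oldV + newV;
      });
    }
    return new int[] {loads.get(), merges.get()};
  }
}
```

For a present key, computeIfAbsent never invokes its function while merge invokes it on every call, which is why merge behaves as a write each time.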

sfc-gh-emammedov · 2023-11-16T10:48:33Z

Thank you for the detailed reply!

The reason we are using merge is that there are multiple threads accessing and updating the cache. Each thread fetches the latest data from the database and puts the up-to-date entity into the cache. The entities are versioned. Whenever a thread wants to write an entity into the cache, we need to ensure that the version of the entity is increasing (because otherwise we would be writing a stale value into the cache). merge allows us to do that atomically (based on the implementation of LocalCache in Guava, and AFAIU Caffeine provides the same guarantees). I am not sure if this could be replicated via the put method (without explicit locking involved).
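A minimal sketch of that versioned write, assuming a hypothetical Versioned entity type and helper name (the real entity class is not shown in this thread). It is written against the ConcurrentMap interface, which is also what a cache's asMap() view exposes.

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical entity type carrying a monotonically increasing version.
final class Versioned {
  final long version;
  final String payload;

  Versioned(long version, String payload) {
    this.version = version;
    this.payload = payload;
  }
}

final class VersionedWrites {
  // Atomically stores 'candidate' unless the cached entry already has an
  // equal or higher version; returns whichever value ends up cached.
  static Versioned putIfNewer(
      ConcurrentMap<String, Versioned> map, String key, Versioned candidate) {
    return map.merge(key, candidate,
        (current, incoming) ->
            incoming.version > current.version ? incoming : current);
  }
}
```

Because the comparison runs inside merge, two threads racing with different versions cannot overwrite a newer entity with a stale one.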

You mentioned that put has an optimisation to cope with a flood of expireAfterWrite updates. Is it possible to add a similar optimisation for merge?
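For readers following along, the put tolerance described earlier in the thread can be pictured roughly as below. This is an illustrative sketch only, not Caffeine's code: the class, enum, and method names are hypothetical, and the exact boundary handling is an assumption based on the "within 1s" description.

```java
import java.util.concurrent.TimeUnit;

final class WriteTolerance {
  // Assumed from the thread: updates within 1s count as "close enough".
  static final long TOLERANCE_NANOS = TimeUnit.SECONDS.toNanos(1);

  // Which buffer an update to an existing entry would be recorded in.
  enum Event { READ_BUFFER, WRITE_BUFFER }

  // An update that lands soon after the previous write is downgraded to
  // a (lossy) read event; otherwise it is recorded as a write event.
  static Event classifyUpdate(long lastWriteNanos, long nowNanos) {
    return (nowNanos - lastWriteNanos) < TOLERANCE_NANOS
        ? Event.READ_BUFFER
        : Event.WRITE_BUFFER;
  }
}
```

The point of the downgrade is that a flood of updates to the same entries puts less pressure on the write buffer.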

ben-manes · 2023-11-17T00:58:10Z

That sounds like a good reason to use merge, thanks for clarifying. Note that if you are not already using it in Guava, be aware that their computes have had nasty bugs such as corruption or deadlocks. I helped fix a few, but since it is difficult to get fixes merged there are still open items so consider reviewing the bug list.

I think the optimization could be applied. It was added to resolve a similar benchmark concern (orbit/orbit#144 (comment)). I don't think you'll run into this as a bottleneck in a non-synthetic benchmark since the application and I/O time, as well as the item distribution, will give the cache enough time to flush the write buffer and hide the latency. It's worth doing to alleviate concerns, but it shouldn't be a blocker for you.

sfc-gh-emammedov · 2023-11-17T16:11:17Z

Got you, makes sense.

Just curious, would you have time to help add an optimisation for merge?

ben-manes · 2023-11-17T16:23:52Z

It’s too hard to say given a busy week and the holiday season. I might get to it this weekend, or not. I can’t say tbh.

ben-manes mentioned this issue Aug 15, 2024

Question: difference in performance between Cache#get() and Cache.asMap()#compute #1761

Closed
