Hibernate query cache considered harmful?
Over the last few months we’ve had a whole bunch of people doing a whole bunch of Hibernate performance tuning. I’m going to do a few blogs on some really common things that we ran into that might help you as well. The first one I wanted to talk about was Hibernate query caching.
Hibernate caching overview
Hibernate has several levels and kinds of cache. There is a first level cache of Hibernate entities stored in and scoped specifically to a particular open Session. In Hibernate, entity objects are never shared between sessions.
There is also a second level cache that is shared across sessions. The second level cache is divided into regions of four types: entity, collection, query, and timestamp. Entity and collection regions cache the data from entities and relationships (but not the entities themselves). The query and timestamp caches are related and the subject of our focus here.
The query cache is used to store the result set of a query made against the database. The key of the query cache contains the query string itself and any bind parameters passed with the query. The value consists not of the entity field values but just the primary key identifiers for all of the entities returned by the query. When a query is made that hits the query cache, that set of entity identifiers can be retrieved and then resolved through the first or second level caches instead of retrieving those entities from the database.
The timestamp cache keeps track of the last update timestamp for each table (this timestamp is updated for any table modification). If query caching is on, there is exactly one timestamp cache and it is utilized by all query cache instances. Any time the query cache is checked for a query, the timestamp cache is checked for all tables in the query. If the timestamp of the last update on a table is greater than the time the query results were cached, then the entry is removed and the lookup is a miss.
Using the query cache
Two Hibernate properties control whether the query cache is enabled: hibernate.cache.use_second_level_cache and hibernate.cache.use_query_cache.
The first turns on the second level cache in general and the second turns on the query cache regions. If the second is set to false, the query and timestamp cache regions are not created or used.
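For example, both can be set in hibernate.cfg.xml. The provider class shown here is the Ehcache provider, which is just one of several options:

```xml
<!-- hibernate.cfg.xml fragment: enable the second level cache and the query cache -->
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>
<!-- the cache provider is a separate choice; Ehcache is shown as one example -->
<property name="hibernate.cache.provider_class">net.sf.ehcache.hibernate.EhCacheProvider</property>
```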
By default, turning on query caching does not actually cause any queries to be cached. Instead, Query or Criteria that should be cached must be explicitly marked that way. For example, a Query can be executed in this way:
```java
Query bandsByName = session.createQuery(
        "from Band b where b.name = :name");
bandsByName.setString("name", bandName);
bandsByName.setCacheable(true);
//bandsByName.setCacheRegion("special.region");
List bands = bandsByName.list();
```
or if you’re using JPA with annotations, you can call setHint("org.hibernate.cacheable", true) on a javax.persistence.Query. Note that you can define your own specially named query cache region with setCacheRegion instead of using the default one. This isn’t particularly useful unless you have a very specific requirement to evict everything from a particular query cache or to set eviction policies on a per-query basis. Either way, there is only ever one timestamp cache shared by all query caches.
You can also create a cacheable query through a Criteria:
```java
Criteria crit = session.createCriteria(Company.class);
Company company = (Company) crit
        .add(Restrictions.naturalId().set("taxId", "209384092"))
        .setCacheable(true)
        .uniqueResult();
```
When Hibernate executes a query through one of these constructs, it does something roughly like this:

1. Build a query cache key from the query string and its bind parameters.
2. Look up that key in the query cache region; on a miss, run the query against the database and cache the resulting entity identifiers.
3. On a hit, unless the query is a lookup by immutable natural key (more on that below), check the timestamp cache for every table in the query; if any table was modified after the result was cached, evict the entry and treat the lookup as a miss.
4. Resolve the cached identifiers to entities.
In the final step, entities are assembled from the identifiers held in the result of the lookup and that process may hit the second level cache to hydrate the returned entities.
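To make the flow concrete, here is a toy model of it in plain Java. This is not Hibernate code; the class and field names are made up for illustration, but the key structure (query string plus bind parameters), the per-table timestamp check, and the id-to-entity resolution follow the description above:

```java
import java.util.*;

// Toy model (not Hibernate code) of the query cache lookup flow:
// the query cache stores entity ids keyed by (query string, bind param),
// the timestamp cache stores each table's last-modification time, and a
// cached result is discarded if any table changed after it was cached.
public class QueryCacheModel {
    static long clock = 0;                                  // logical clock standing in for wall time

    static class CachedResult {
        final long cachedAt;
        final List<Long> ids;
        CachedResult(long cachedAt, List<Long> ids) { this.cachedAt = cachedAt; this.ids = ids; }
    }

    static final Map<List<Object>, CachedResult> queryCache = new HashMap<>(); // (query, param) -> ids
    static final Map<String, Long> timestampCache = new HashMap<>();           // table -> last update
    static final Map<Long, String> entityCache = new HashMap<>();              // id -> entity data

    static void tableModified(String table) {
        timestampCache.put(table, ++clock);                 // any insert/update/delete bumps the timestamp
    }

    static void cacheResult(String query, Object param, List<Long> ids) {
        queryCache.put(Arrays.asList(query, param), new CachedResult(++clock, ids));
    }

    // Returns cached entities, or null on a miss (the caller would then hit the database).
    static List<String> lookup(String query, Object param, Set<String> tables) {
        CachedResult hit = queryCache.get(Arrays.asList(query, param));
        if (hit == null) return null;
        for (String table : tables) {                       // timestamp check for every table in the query
            Long ts = timestampCache.get(table);
            if (ts != null && ts > hit.cachedAt) {          // table changed after caching: invalidate
                queryCache.remove(Arrays.asList(query, param));
                return null;
            }
        }
        List<String> entities = new ArrayList<>();
        for (Long id : hit.ids) entities.add(entityCache.get(id)); // resolve ids via the entity cache
        return entities;
    }
}
```

The same model also shows the invalidation problem: any write to a table referenced by a cached query throws the result away, whether or not the write was related to it.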
More caching is better, right?
I think many users who turn on the second level cache also turn on the query cache simply because they don’t know what the query cache does. (I certainly turned it on the first time I tried the second level cache.) Well, as it turns out, that’s a bad idea: in many common scenarios the Hibernate query cache is not just unhelpful but downright harmful to latency and scalability.
If I were to summarize one piece of advice from this whole article it would be: turn off the query cache unless a) you know why you’re turning it on and b) you can measure a real improvement in a realistic load. Let’s look at some reasons why…
- The primary reason the query cache is often not useful is that its results are constantly being invalidated by table modifications. As I mentioned earlier, any table modification updates the timestamp cache. When you do a lookup that returns any entity through the query cache, the cached result may have been invalidated by an insert or update totally unrelated to that entity. For example, you might be querying a Person by social security number. If any other Person in your database has been inserted or updated since the last hit on that SSN, the cached result is thrown out (even though it is likely still valid).
It is quite common to set up a query cache but find that the hit rate is very low as the data is constantly being invalidated. But fear not – there is a way around this issue which is discussed in the next section.
- Probably the next most common query cache problem is memory usage. The query cache is notorious for gobbling up your heap, mainly because of its keys. First, you have the query string itself, which is commonly hundreds of characters long and is frequently repeated with different bind parameters. There is a great discussion of that issue and some advanced ways to avoid it here.

Second, the bind parameters were until recently the actual objects passed in to the Query or Criteria (if that’s how you constructed the query). That made it quite easy to put session-scoped entities and other complex objects into your cache, where they would sit forever, possibly holding references to collections and large chunks of your session cache. This issue is described here and was fixed by HHH-3383 in the recent Hibernate 3.2.7 and 3.3.2 releases, which convert those entities to identifiers.
- Another issue we found internally in testing at Terracotta (but which I haven’t seen mentioned anywhere else) is that turning on the query cache introduces a new source of lock contention in the timestamp cache. This cache has a single coarse lock that is locked for all inserts/updates/deletes to update the timestamp of a table and also for every lookup that occurs through the query cache. Under load, this lock can easily become a bottleneck.
Interestingly, the write side of this contention can become an issue even if NO queries are cached! One place you might see this is multi-threaded table loading: every insert into any table, across all threads, must obtain the same lock in the timestamp cache, even if no queries are being cached. This is why it can be quite dangerous to turn on the query cache if you aren’t actually using it.
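As a sketch of that coarse lock, here is a toy class. Again, this is not Hibernate code, though the method names loosely follow Hibernate’s UpdateTimestampsCache; the point is that every writer and every cached-query reader must pass through the same monitor:

```java
import java.util.*;

// Toy sketch (not Hibernate code) of why the timestamp cache is a contention
// point: one coarse lock guards both the write path (every insert/update/delete
// on any table) and the read path (every cached-query lookup).
public class TimestampCacheContention {
    private final Map<String, Long> tableTimestamps = new HashMap<>();
    private long clock = 0;

    // Called on every table modification, even when no queries are cached.
    public synchronized void preInvalidate(Collection<String> tables) {
        for (String t : tables) tableTimestamps.put(t, ++clock);
    }

    // Called on every query cache lookup; takes the same monitor as preInvalidate.
    public synchronized boolean isUpToDate(Collection<String> tables, long resultTimestamp) {
        for (String t : tables) {
            Long ts = tableTimestamps.get(t);
            if (ts != null && ts > resultTimestamp) return false; // modified since caching
        }
        return true;
    }
}
```

Under load, many threads inserting into completely different tables still serialize through preInvalidate, which is exactly the bottleneck described above.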
- Not really a problem per se, but a common misconfiguration that can occur is with the eviction settings for the timestamp cache. Because you want your query cache to always be able to find the last update timestamp in the timestamp cache, you want to be sure that the timestamp cache does not evict entries before the query cache! In general, it is recommended that the timestamp cache be eternal or evicted on a “time to live” longer than the query cache “time to live”. The timestamp cache should not use “time to idle” based eviction.
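Assuming Ehcache as the cache provider, an ehcache.xml along these lines follows that advice (the region names are the Hibernate 3.x defaults; the sizes and TTL are arbitrary examples):

```xml
<!-- ehcache.xml sketch: keep the timestamp cache eternal, give query results a TTL -->
<cache name="org.hibernate.cache.UpdateTimestampsCache"
       maxElementsInMemory="5000"
       eternal="true"/>
<cache name="org.hibernate.cache.StandardQueryCache"
       maxElementsInMemory="1000"
       eternal="false"
       timeToLiveSeconds="300"/>
```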
You might wonder what happens in the case where a timestamp is evicted too early. I think if I’m reading the code right in UpdateTimestampsCache.isUpToDate(), this situation is treated as the timestamp cache saying no modifications have occurred and allowing a query cache entry to stand. It is possible in that case for updates to have occurred before or after timestamp eviction that would not be noticed and you might actually see stale values. If so, that would be a pretty subtle bug.
When is the query cache useful?
The one case that seems to be a sweet spot for using the query cache is when you frequently need to look up an entity based on a natural key. A natural key is a field or fields that form a unique way to identify an entity in a table but is not the primary key. For example, you might have a Person table with an auto-generated primary key in the database. But several identifiers for a Person might form a natural key, such as social security number or email address.
If you frequently have a user or external input providing you a natural key for lookup, your normal second level entity cache is NOT helpful because it caches based on primary key. In this case, you can use a Criteria-based cached query lookup on an immutable natural key column to do that first hop from natural key to primary key and remember that step for the future.
That query (including the bind parameter holding the natural key value) will be mapped to a result of one primary key id. You can then leverage the normal second level entity cache to do the primary key lookup.
If you noticed in the workflow up above, there is a special check made for whether the lookup is on a natural immutable key. This hint can only be supplied by a Criteria, not by a Query or other means. The columns must be marked as natural keys and flagged as immutable in your mapping file.
If you do this, the check on the timestamp cache is skipped entirely! For an *immutable* natural key, no table modification can change the mapping from natural key to primary key, so the check is unnecessary. Skipping it both speeds up the lookup and avoids the invalidation problem, yielding much higher hit rates on your query cache.
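Here is a toy sketch of that behavior in plain Java (illustrative only, not Hibernate code): because the natural key is immutable, the lookup does no timestamp check, so writes to the table leave the cached mapping intact:

```java
import java.util.*;

// Toy sketch (not Hibernate code) of the immutable natural key optimization:
// the cached query maps natural key -> primary key and, because the natural
// key can never change, the timestamp check is skipped entirely, so unrelated
// writes to the table do not invalidate the entry.
public class NaturalKeyLookup {
    static final Map<String, Long> naturalKeyToPk = new HashMap<>(); // cached query: SSN -> pk
    static final Map<Long, String> entityCache = new HashMap<>();    // second level cache: pk -> entity
    static long personTableTimestamp = 0;                            // bumped on every Person write

    static String findBySsn(String ssn) {
        Long pk = naturalKeyToPk.get(ssn);  // hop 1: natural key -> pk, no timestamp check
        if (pk == null) return null;        // would fall through to the database here
        return entityCache.get(pk);         // hop 2: pk -> entity via the entity cache
    }
}
```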
Note that this still doesn’t avoid the lock contention created in the timestamp cache synchronization.
At Terracotta, we’re working on a new Hibernate second level cache implementation for Terracotta 3.1 and we’ve so far been focusing on performance tuning the entity and collection caches. The numbers we’re seeing with the new clustered cache look great and we can’t wait to have people try it out.
We see dramatic slow-downs any time we turn on the query cache (even if no queries are cached) as that introduces the lock contention in the timestamp cache. [Note that this is in no way specific to Terracotta - we see it with all second-level caches. The contention is in Hibernate's UpdateTimestampsCache, not in any cache provider.]
In the Terracotta cache implementation, we use a special highly concurrent distributed data structure for each cache region. That actually allows us to remove much of the synchronization present in Hibernate above the cache. We have started tinkering with this (probably for the next release) and we think it will dramatically improve the throughput query cache users see by eliminating the lock contention.
As I did the research to write this blog, I also started to see that invalidation by table is a very blunt instrument. It seems like there should be a more general purpose way to solve this natural key lookup problem in a fine-grained way, one that leverages the very common case where lookups and changes happen on a per-key basis and avoids invalidation by table altogether.
References

- Hibernate query cache docs
- Hibernate natural ID query docs
- Hibernate: Truly Understanding the Second-Level and Query Caches – really old but still pretty good. I think this may pre-date collection cache regions.
- Hibernate Caches – basically notes and mind-maps from the cache sections of Java Persistence with Hibernate
- Understanding Caching in Hibernate – Part Two : The Query Cache – nice breakdown of calls
- Hibernate: Cache Queries the Natural Id Way – most detailed explanation of the natural key query cache optimization I’ve seen
- Hibernate Query Cache: A Dirty Little Secret – memory issue with query cache keys
- Hibernate Wars: The Query Cache Strikes Back – yet another query cache memory issue