When does the CPU benefit of having a Hibernate 2nd level cache outweigh the initial hit?


Question

When does the CPU benefit of adding an object to the Hibernate 2nd level object cache outweigh the initial hit?

I am currently using Hibernate without a 2nd level cache. This is for an application that processes music files (www.jthink.net/songkong). It uses Hibernate so that it can scale with more data, i.e. it can process 100,000 songs with little more memory than 1,000 songs. Once songs have been processed they are of no further interest (unless the user runs Undo).

As I understand it, if I enable the 2nd level cache (for my Song class), the initial write of a song to the cache will use more CPU than just writing to the database, and additional modifications to the Song object will also require more CPU. But subsequent retrieval of the song from Ehcache will require fewer resources than retrieving it from the database.

My songs are processed folder by folder and go through a number of stages (on different Executors). When they are queued on the next Executor we just pass the song ids as parameters; otherwise, storing the Song objects themselves would use a lot of heap memory. So when a particular task actually runs on an Executor, the first thing it does is retrieve the songs for those ids.

So there are no particular song ids that are retrieved thousands of times, but every song is typically written between 1 and 4 times and retrieved 10 times. So if we had quite a small cache (because I want to keep heap memory under close control), I would expect the first few folders processed to have their songs added to the cache; then, as they complete, songs from new folders would take their place in the cache.

But my question is, is it worth it?

As a rule of thumb, does 10 retrievals versus 1-4 writes justify using a 2nd level cache, or is it only useful if the ratio is more like 100:1?

Recommended answer

The real answer is: just benchmark it.

Writing to an on-heap cache isn't that costly. So yes, even retrieving an entry once from the cache will be faster than going back to the database.

Beyond that, a cache does mostly two things on top of a HashMap: it evicts and it expires.

Eviction means that you set a maximum size on the cache. When that size is reached, the cache evicts the "oldest" entry to make room for a new one. There are multiple definitions of "oldest". Ehcache samples a set of entries and kicks out the entry in the sample that has gone the longest without being accessed.
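The sampling idea above can be sketched with a plain HashMap. This is only an illustration and not Ehcache's actual implementation: the class name SampledEvictionCache, the logical tick counter, and the sampleSize parameter are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Toy bounded cache that evicts by sampling. Hypothetical; not Ehcache code. */
class SampledEvictionCache<K, V> {
    private static final class Entry<V> {
        final V value;
        long lastAccess;
        Entry(V value, long lastAccess) { this.value = value; this.lastAccess = lastAccess; }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final int maxSize;
    private final int sampleSize;
    private final Random random;
    private long tick = 0; // logical clock, bumped on every get and put

    SampledEvictionCache(int maxSize, int sampleSize, long seed) {
        this.maxSize = maxSize;
        this.sampleSize = sampleSize;
        this.random = new Random(seed);
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.lastAccess = ++tick; // record the access so the entry looks "younger"
        return e.value;
    }

    void put(K key, V value) {
        // Only a brand-new key can push the cache over its maximum size
        if (!map.containsKey(key) && map.size() >= maxSize) {
            evictOne();
        }
        map.put(key, new Entry<>(value, ++tick));
    }

    boolean containsKey(K key) { return map.containsKey(key); }
    int size() { return map.size(); }

    private void evictOne() {
        // Copying the keys is O(n); a real cache would sample without copying.
        List<K> keys = new ArrayList<>(map.keySet());
        int n = Math.min(sampleSize, keys.size());
        K victim = null;
        long oldest = Long.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            // Partial shuffle: pick a random key that has not been sampled yet
            int j = i + random.nextInt(keys.size() - i);
            K candidate = keys.get(j);
            keys.set(j, keys.get(i));
            keys.set(i, candidate);
            long access = map.get(candidate).lastAccess;
            if (access < oldest) { oldest = access; victim = candidate; }
        }
        map.remove(victim); // least recently accessed entry within the sample
    }
}
```

Because only a sample is inspected, the evicted entry is not guaranteed to be the globally least recently used one. That is the trade-off for not having to maintain a fully ordered structure on every access.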

Expiration means that a given entry will be considered stale at some point. For instance, you might want to keep an entry for 1 hour before refreshing it with the latest version from the database. When you get an entry, Ehcache first checks whether it has expired. If it has, Ehcache returns null and removes the entry from the cache. This means that an expired entry stays in the cache until you try to access it.
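A minimal sketch of this "expire on access" behaviour, assuming a fixed time-to-live per entry. The class LazyExpiringCache and its injectable clock are hypothetical names used for illustration only; they are not part of Ehcache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Toy cache with lazy expiration. Hypothetical; not Ehcache code. */
class LazyExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the behaviour can be tested

    LazyExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (clock.getAsLong() >= e.expiresAtMillis) {
            map.remove(key); // the stale entry is only removed here, on access
            return null;
        }
        return e.value;
    }

    /** Number of stored entries, including expired ones nobody has touched yet. */
    int size() { return map.size(); }
}
```

In production code the clock would typically be System::currentTimeMillis. Note that size() still counts an expired entry until something calls get on its key, which is exactly why expired entries keep occupying memory.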

In your case, you will want to load the entry once, keep it in the cache, use it, and finally remove it to save memory. If you have a final step where you know you won't need the entry anymore, just remove it there.

If you don't, you will have to rely on eviction. That is not necessarily a problem, because the eviction algorithm removes expired entries first (why remove a perfectly valid entry if you can remove an expired one?).

You should calculate how long an entry needs to stay in the cache to go through all the Executors. This will be your expiry time (TTL). Then you size your cache to roughly NB_EXECUTORS * NB_STEPS, which is the number of songs currently in use. When a new song is added, the cache will need to evict an old entry. In most cases that entry will already have expired, so no harm is done.
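As a rough illustration of that arithmetic, the helper below derives a TTL and a maximum entry count. The class SongCacheSizing, the per-stage duration, the safety margin, and the example numbers are all assumptions for this sketch; real values should come from measuring your own pipeline.

```java
import java.time.Duration;

/** Hypothetical helper for the TTL and size estimates; values must be measured. */
class SongCacheSizing {
    /** TTL = worst-case time for one song to pass every stage, plus a safety margin. */
    static Duration ttl(Duration worstCasePerStage, int stages, Duration margin) {
        return worstCasePerStage.multipliedBy(stages).plus(margin);
    }

    /** Capacity following the rule of thumb NB_EXECUTORS * NB_STEPS. */
    static long maxEntries(int nbExecutors, int nbSteps) {
        return (long) nbExecutors * nbSteps;
    }
}
```

For example, with an assumed 4 stages of at most 30 seconds each and a 60 second margin, ttl(Duration.ofSeconds(30), 4, Duration.ofSeconds(60)) gives 3 minutes, and maxEntries(4, 50) gives 200 entries.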

To prevent eviction (which can be costly when it does not find an expired entry), you can write a background routine that gets entries, which triggers their expiration. But again, don't do that before being sure, using a benchmark, that it is actually faster.

Finally, you might want to cache songs directly instead of using the Hibernate second-level cache, because it requires fewer operations to get a song. Also, when writing an entry that is in the second-level cache, Hibernate tends to evict it from the cache. Make sure you configure it NOT to do that.

A note about modification. By default, the Ehcache on-heap cache (and only the on-heap cache) stores entries by reference. So if you retrieve a Song object from the cache and then modify it, the entry in the cache is modified as well, since it is actually the one and only instance.

However, that's not how the Hibernate second-level cache works. It keeps something like a database row in the cache, which is converted to a Song and returned to you.

When you save the Song to the database, Hibernate will evict it from the cache, as I said above (you might be able to ask for a cache update in the configuration instead, but I'm not sure about that).

That's why I think you should cache directly instead of using the second-level cache. However, watch out, because the object you get was loaded by Hibernate. You need to detach it from Hibernate before putting it in the cache, and then reattach it in the new executor. Otherwise, if you have collections for instance, strange things can happen.

Now, assume you want to update both the cache and the database every time. There are two ways to do it.

With cache-aside, you update the DB and then update the cache.

With cache-through, you update the cache, which takes care (atomically) of updating the DB. Cache-through is a little more involved since you need to provide a CacheLoaderWriter implementation, but it makes sure the cache and the database are always in sync.
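The difference between the two patterns can be sketched without any caching library. Everything below is hypothetical: SongStore stands in for the database layer (it plays the role a CacheLoaderWriter implementation would play in Ehcache, but is not that interface), songs are represented as plain strings, and a HashMap stands in for the cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the database access layer. */
interface SongStore {
    String load(long id);
    void write(long id, String song);
}

/** In-memory SongStore used only for this example. */
class MapSongStore implements SongStore {
    final Map<Long, String> rows = new HashMap<>();
    public String load(long id) { return rows.get(id); }
    public void write(long id, String song) { rows.put(id, song); }
}

/** Cache-aside: the caller coordinates the store and the cache itself. */
class CacheAsideSongs {
    final Map<Long, String> cache = new HashMap<>();
    private final SongStore store;

    CacheAsideSongs(SongStore store) { this.store = store; }

    void save(long id, String song) {
        store.write(id, song); // 1. update the DB
        cache.put(id, song);   // 2. then update the cache
    }

    String find(long id) {
        String song = cache.get(id);
        if (song == null) {
            song = store.load(id); // cache miss: go to the DB
            if (song != null) cache.put(id, song);
        }
        return song;
    }
}

/** Cache-through: the caller only talks to the cache, which owns the store. */
class WriteThroughSongCache {
    private final Map<Long, String> entries = new HashMap<>();
    private final SongStore store;

    WriteThroughSongCache(SongStore store) { this.store = store; }

    synchronized void put(long id, String song) {
        store.write(id, song); // if this throws, the cache is left untouched
        entries.put(id, song);
    }

    synchronized String get(long id) {
        // Load from the store on a miss; a null result is not cached
        return entries.computeIfAbsent(id, key -> store.load(key));
    }
}
```

With cache-aside, every call site has to remember both steps in the right order. With cache-through, that ordering lives in one place, at the cost of writing the store adapter.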
