Caching strategy, when does caching become pointless?

Question

I'm pretty new to caching strategies and implementations. I'm working on a project that will be database-intensive, but will also have information being updated and changed very regularly.

I've found enough info to know generally how to develop the caching function, but what I'm unsure about is the general strategy.

If I cache all query results and group them by logical things that I can clear on triggers that make sense, I'll probably have tens of thousands (at least) of tiny files in my cache. Would it make more sense to cache only large query results?

I know that this is a somewhat hardware-specific question, but generally speaking, at what volume of files does caching become somewhat pointless? Meaning, if you're loading up the file system with all of these tiny files, does access to them eventually become slow enough that you might as well not have cached the information to start with?

Thanks all, I'm interested in any opinions you have to offer.

Based on the responses saying this is absolutely application-specific, let me pose the question in a way that should be universal:

Assuming that I have an application that depends on one table with 1,000,000 items in it...

Would it be quicker to do a query to retrieve one of those items directly from the database, or to retrieve one of those items from my cache directory with 1,000,000 files, each containing the details of one of those items?

Apparently 100,000 wasn't enough to get a valid answer, let's make it 1,000,000. Anyone want to go for 1,000,000,000? Because I can do it...

Recommended answer

If your MySQL version still has it (the query cache was deprecated in 5.7.20 and removed in 8.0), use MySQL's built-in query cache instead of trying to maintain one yourself. It automatically clears the cached queries for a table when that table is written to. Plus, it works in memory, so it should be very efficient...
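
To make the "clear on write" rule concrete, here is a toy model in Python. It is only a sketch of the idea, not how MySQL implements its query cache; the class name `TableInvalidatingCache` and its methods are made up for illustration, and the caller is assumed to know which tables each query reads.

```python
from collections import defaultdict


class TableInvalidatingCache:
    """Toy query-result cache that drops entries when their table is written.

    Hypothetical illustration only; not MySQL's implementation.
    """

    def __init__(self):
        self._results = {}                 # query text -> cached rows
        self._by_table = defaultdict(set)  # table name -> query texts that read it

    def get(self, query):
        # Returns None on a cache miss.
        return self._results.get(query)

    def put(self, query, tables, rows):
        self._results[query] = rows
        for table in tables:
            self._by_table[table].add(query)

    def on_write(self, table):
        # Any write to a table invalidates every cached query that read it.
        for query in self._by_table.pop(table, set()):
            self._results.pop(query, None)


cache = TableInvalidatingCache()
cache.put("SELECT * FROM items WHERE id = 7", ["items"], [(7, "widget")])
cache.put("SELECT COUNT(*) FROM users", ["users"], [(42,)])
cache.on_write("items")  # e.g. after an UPDATE on items
```

After the write to `items`, the first query is a miss again while the `users` query is still served from cache. This is also why a table that is written to constantly gets little benefit from this kind of cache.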

Also, don't just cache queries. Try to cache entire segments of the application at different stages in the rendering cycle. So you can let MySQL cache the queries, then you cache each individual view (rendered), each individual block, and each page. Then, you can choose whether or not to pull from cache based upon the request.

For example, a non-logged-in user may get the full page directly from cache. But a logged-in user may not be able to (because the page shows their username, etc.). For that user, you may still be able to render half of the views on the page from cache (since they don't depend on the user object). You still get the benefit of caching, but it'll be tiered based on need.
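
A minimal Python sketch of that tiering, assuming plain in-process dictionaries as the cache and a stand-in `render_block` function in place of real template rendering (all names here are hypothetical):

```python
page_cache = {}   # url -> full rendered page (anonymous visitors only)
block_cache = {}  # block name -> rendered HTML for user-independent blocks


def render_block(name, user=None):
    # Stand-in for real (expensive) template rendering.
    return f"<div>{name}</div>" if user is None else f"<div>{name}: {user}</div>"


def cached_block(name):
    # User-independent blocks are rendered once and shared by everyone.
    if name not in block_cache:
        block_cache[name] = render_block(name)
    return block_cache[name]


def render_page(url, user=None):
    if user is None and url in page_cache:
        return page_cache[url]           # whole page straight from cache
    parts = [
        cached_block("navigation"),      # does not depend on the user
        cached_block("article"),         # does not depend on the user
        render_block("greeting", user),  # depends on the user, never cached
    ]
    page = "\n".join(parts)
    if user is None:
        page_cache[url] = page           # only anonymous pages are cached whole
    return page
```

An anonymous request for a URL is rendered once and then served whole from `page_cache`; a logged-in request skips the page tier but still reuses the two shared blocks.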

If you're really expecting a lot of traffic, it's definitely worth looking into Memcached. Let MySQL cache your queries for you, and then store all user-land cache items in Memcached...

Edit: To answer your edit:

Filesystems can become slow if a single directory grows large. As long as you're "namespacing" by directory (so each directory holds only a small portion of the cache files), you should be fine from that standpoint. As for the exact threshold, it really depends on your hardware and filesystem more than anything else. I know ext3 gets quite slow when there are a lot of files in a single directory (I have directories with literally hundreds of thousands of files, and it can take up to half a second to simply stat() one of them, let alone do any kind of directory listing)...
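
One common way to "namespace" by directory is to hash the cache key and use the first few hex characters as nested directory names. The Python sketch below assumes that layout; the function names and the two-level, two-character split are illustrative choices, not something prescribed above. With two levels of two hex characters there are 256 × 256 = 65,536 leaf directories, so 1,000,000 cache files average out to roughly 15 per directory.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_path(root, key, levels=2, width=2):
    # Hash the key so files spread evenly, then use the leading hex
    # characters as nested directory names, e.g. root/ab/cd/abcd....
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*shards, digest)


def write_entry(root, key, data):
    path = cache_path(root, key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


def read_entry(root, key):
    # Returns None when the key has no cache file.
    path = cache_path(root, key)
    return path.read_bytes() if path.exists() else None


with tempfile.TemporaryDirectory() as root:
    write_entry(root, "item:42", b"details of item 42")
    print(read_entry(root, "item:42"))
    print(cache_path(root, "item:42").relative_to(root).parts[:2])
```

Lookups never list a directory; they go straight to a computed path, so the per-directory file count is the only thing the filesystem has to cope with.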

But realize that if you add another server, you're either going to have duplicated caches (which is not a good thing) or going to have to rewrite your entire cache layer. Is there a reason not to go with Memcached right from the start?

Edit 2: To answer your latest edit:

It's still too tough to call. I have an application that has a database with around 1.5 billion rows (growing at around 500k per day). We don't use any caching on it at all because we don't have concurrency issues. And even if we did, we'd be better off throwing more MySQL servers at it rather than adding caching since any form of cache would have such a low hit rate that it wouldn't be worth the development time to add it.

And that's the reason I am so adamant about not caching for speed. There will always be an object that is not in cache. So if you hit a page with one of those objects, it still needs to be fast. As a rule of thumb, I try to cache anything that will be accessed again in the next few minutes (I keep a time-to-live of about 5 minutes in production on other applications anyway). So if items aren't getting more than a few hits in that time span, or the hit rate is very low (less than 90%), I don't bother caching them...
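
To check a rule of thumb like this against real traffic, you need a cache that expires entries and counts hits and misses. Here is a minimal in-process Python sketch; the `TTLCache` class is hypothetical, the 300-second default mirrors the 5-minute time-to-live mentioned above, and the injectable `clock` exists only so the example can simulate time passing.

```python
import time


class TTLCache:
    """Small time-to-live cache that also tracks its own hit rate."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: fall back to the slow path and re-cache.
        self.misses += 1
        value = loader(key)
        self._store[key] = (now + self.ttl, value)
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Simulated clock so expiry can be shown without sleeping.
fake_now = [0.0]
items = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])
load_item = lambda key: f"row for {key}"  # stand-in for a database query

items.get_or_load("item:1", load_item)  # miss, loaded from the "database"
items.get_or_load("item:1", load_item)  # hit, still within 5 minutes
fake_now[0] = 301.0
items.get_or_load("item:1", load_item)  # miss again, the entry has expired
```

If `hit_rate()` stays well below the threshold you care about for a class of items, that is a sign the cache is adding complexity without saving much work for them.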
