Design pattern for memcached data caching

Question

It's easy to wrap optional memcached caching around your existing database queries. For example:

Old (DB only):

function getX
    x = get from db
    return x
end

New (DB with memcache):

function getX
    x = get from memcache
    if found
      return x
    endif

    x = get from db
    set x in memcache
    return x
end
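For concreteness, here is a minimal sketch of that get-from-cache, fall-back-to-DB pattern in Python, assuming a local memcached instance reachable through the pymemcache client and a SQLite database. The get_item function, the item:&lt;pkid&gt; key format and the JSON serialization are illustrative choices, not anything the question prescribes.

import json
import sqlite3

from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
db = sqlite3.connect("app.db")
db.row_factory = sqlite3.Row  # rows become dict-like

def get_item(pkid):
    key = "item:%d" % pkid

    # 1. Try memcache first.
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. Miss: go to the database.
    row = db.execute("SELECT * FROM items WHERE pkid = ?", (pkid,)).fetchone()
    if row is None:
        return None

    # 3. Populate the cache for the next caller, then return.
    item = dict(row)
    cache.set(key, json.dumps(item))
    return item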

The thing is though, that's not always how you want to cache. For instance take the following two queries:

-- get all items (recordset)
SELECT * FROM items;

-- get one item (record)
SELECT * FROM items WHERE pkid = 42;

If I was to use the above pseudo-code to handle the caching, I would be storing all fields of item 42 twice. Once in the big record set and once on its own. Whereas I'd rather do something like this:

SELECT pkid FROM items;

And cache the index of PKs.

So in summary, the data access strategy that will work best for the DB doesn't neatly fit the memcache strategy. Since I want the memcache layer to be optional (i.e. if memcache is down, the site still works) I kind of want to have the best of both worlds, but to do so, I'm pretty sure I'll need to maintain a lot of the queries in 2 different forms (1. fetch index, then records; and 2. fetch recordset in one query). It gets more complicated with pagination. With the DB you'd do LIMIT/OFFSET SQL queries, but with memcache you'd just fetch the index of PK's and then batch-get the relevant slice of the array.
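Under those constraints, the "fetch index, then records" form might look something like the sketch below, which reuses the cache, db and json handles from the earlier example. get_items_page, the items:index key, the item:&lt;pkid&gt; keys and the 60-second TTL on the index are all assumptions made for illustration; the point is that pagination becomes a slice of the cached PK index followed by a batch get.

def get_items_page(page, per_page=25):
    # 1. Get (or rebuild) the index of primary keys.
    raw_index = cache.get("items:index")
    if raw_index is not None:
        pkids = json.loads(raw_index)
    else:
        rows = db.execute("SELECT pkid FROM items ORDER BY pkid").fetchall()
        pkids = [row["pkid"] for row in rows]
        cache.set("items:index", json.dumps(pkids), expire=60)

    # 2. Paginate by slicing the index instead of using LIMIT/OFFSET.
    page_ids = pkids[(page - 1) * per_page : page * per_page]

    # 3. Batch-get whatever records are already cached.
    hits = cache.get_many(["item:%d" % pkid for pkid in page_ids])
    items = {}
    for value in hits.values():
        item = json.loads(value)
        items[item["pkid"]] = item

    # 4. Fetch the misses from the DB in one query and backfill the cache.
    missing = [pkid for pkid in page_ids if pkid not in items]
    if missing:
        placeholders = ",".join("?" * len(missing))
        rows = db.execute(
            "SELECT * FROM items WHERE pkid IN (%s)" % placeholders, missing
        ).fetchall()
        for row in rows:
            item = dict(row)
            items[item["pkid"]] = item
            cache.set("item:%d" % item["pkid"], json.dumps(item))

    # Return the page in index order.
    return [items[pkid] for pkid in page_ids if pkid in items]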

I'm not sure how to neatly design this, does anyone have any suggestions?

Better yet, if you've come up against this yourself, how do you handle it?

Answer

If you're using a cache then, to get the most out of it, you have to accept that your data will always be stale to an extent, and that some portions of the data will be out of sync with each other. Trying to keep all the records up to date by maintaining a single copy is something best left to relational databases, so if this is the behaviour you need then you're probably better off with a powerful 64-bit DB server with a lot of RAM so it can perform its own internal caching.

If you can accept stale data (which you'll need to if real scalability is important) then one approach is to just throw the whole result set into the cache; don't worry about duplication. RAM is cheap. If you find your cache is getting full then just buy more RAM and/or cache servers. For example if you have a query that represents items 1-24 in a set filtered by conditions X and Y then use a cache key that contains all this information, and then when asked for that same search again just return the entire result set from the cache. You either get the full result set from the cache in one hit, or you go to the database.
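As a rough illustration of that approach, reusing the hypothetical cache, db and json handles from the question-side sketches: every distinct combination of filters and page gets its own key, the whole page of rows is stored under it, and duplication across keys is simply tolerated. The key format, the column names x and y, and the 120-second TTL are all assumptions.

def get_filtered_page(x, y, page, per_page=24):
    # One key per (filter, page) combination; real code would normalise
    # or hash the filter values into something key-safe.
    key = "items:x=%s:y=%s:page=%d:per=%d" % (x, y, page, per_page)

    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # whole result set in one hit

    rows = db.execute(
        "SELECT * FROM items WHERE x = ? AND y = ? "
        "ORDER BY pkid LIMIT ? OFFSET ?",
        (x, y, per_page, (page - 1) * per_page),
    ).fetchall()
    results = [dict(row) for row in rows]

    # Duplication with other cached pages is accepted; RAM is cheap.
    # The TTL bounds how stale this page can get.
    cache.set(key, json.dumps(results), expire=120)
    return results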

The hardest thing is working out how much data can be stale, and how stale it can be without either (a) people noticing too much, or (b) breaking business requirements such as minimum update intervals.

This approach works well for read-mostly applications, particularly ones that have paged queries and/or a finite set of filter criteria for the data. It also means that your application works exactly the same with the cache on or off, just with 0% hit rate when the cache is off. It's the approach we take at blinkBox in almost all cases.
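To keep the memcache layer strictly optional in the way the question asks for (the site still works when memcached is down), one common trick is to hide the client behind a wrapper that turns every cache failure into a plain miss. SafeCache below is an illustrative helper rather than part of pymemcache, and it reuses the Client import from the first sketch; with it, a cache outage just looks like a 0% hit rate.

class SafeCache:
    """Treats any memcached failure as a cache miss."""

    def __init__(self, client):
        self._client = client

    def get(self, key):
        try:
            return self._client.get(key)
        except Exception:
            return None  # memcached down -> behave like a miss

    def get_many(self, keys):
        try:
            return self._client.get_many(keys)
        except Exception:
            return {}

    def set(self, key, value, expire=0):
        try:
            self._client.set(key, value, expire=expire)
        except Exception:
            pass  # failing to cache must never break a request

cache = SafeCache(Client(("localhost", 11211)))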
