缓存失效——有通用的解决方案吗? [英] Cache Invalidation — Is there a General Solution?

查看:26
本文介绍了缓存失效——有通用的解决方案吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

计算机科学中只有两个难题:缓存失效和命名事物."

"There are only two hard problems in Computer Science: cache invalidation and naming things."

菲尔·卡尔顿

是否有使缓存失效的通用解决方案或方法;知道条目何时过时,从而保证您始终获得最新数据?

Is there a general solution or method to invalidating a cache; to know when an entry is stale, so you are guaranteed to always get fresh data?

例如,考虑一个从文件中获取数据的函数 getData().它根据文件的最后修改时间对其进行缓存,每次调用时都会检查该时间.
然后添加第二个函数 transformData() 来转换数据,并在下次调用该函数时缓存其结果.它不知道文件 - 你如何添加如果文件被更改,此缓存将失效的依赖项?

For example, consider a function getData() that gets data from a file. It caches it based on the last modified time of the file, which it checks every time it's called.
Then you add a second function transformData() which transforms the data, and caches its result for next time the function is called. It has no knowledge of the file - how do you add the dependency that if the file is changed, this cache becomes invalid?

您可以在每次调用 transformData() 时调用 getData() 并将其与用于构建缓存的值进行比较,但最终可能会非常昂贵.

You could call getData() every time transformData() is called and compare it with the value that was used to build the cache, but that could end up being very costly.

推荐答案

您所说的是生命周期依赖链,即一件事依赖于另一件事,而另一件事可以在其控制之外进行修改.

What you are talking about is lifetime dependency chaining, that one thing is dependent on another which can be modified outside of it's control.

如果你有一个从 a, bc 的幂等函数,其中,如果 ab 相同,c 相同,但检查 b 的成本很高,那么你要么:

If you have an idempotent function from a, b to c where, if a and b are the same then c is the same but the cost of checking b is high then you either:

  1. 接受您有时会使用过时的信息进行操作,不要总是检查b
  2. 尽你所能尽快检查b

你不能吃蛋糕然后吃它...

You cannot have your cake and eat it...

如果您可以在顶部基于 a 分层附加缓存,那么这会影响最初的问题而不是一点点.如果你选择了 1,那么你有任何你给自己的自由,因此可以缓存更多,但必须记住考虑 b 缓存值的有效性.如果您选择 2,您仍然必须每次都检查 b,但如果 b 检出,则可以回退到 a 的缓存中.

If you can layer an additional cache based on a over the top then this affects the initial problem not one bit. If you chose 1 then you have whatever freedom you gave yourself and can thus cache more but must remember to consider the validity of the cached value of b. If you chose 2 you must still check b every time but can fall back on the cache for a if b checks out.

如果您对缓存进行分层,您必须考虑是否由于组合行为而违反了系统的规则".

If you layer caches you must consider whether you have violated the 'rules' of the system as a result of the combined behaviour.

如果你知道 ab 的情况下总是有效的,那么你可以像这样安排你的缓存(伪代码):

If you know that a always has validity if b does then you can arrange your cache like so (pseudocode):

private map<b,map<a,c>> cache // 
private func realFunction    // (a,b) -> c

get(a, b) 
{
    c result;
    map<a,c> endCache;
    if (cache[b] expired or not present)
    {
        remove all b -> * entries in cache;   
        endCache = new map<a,c>();      
        add to cache b -> endCache;
    }
    else
    {
        endCache = cache[b];     
    }
    if (endCache[a] not present)     // important line
    {
        result = realFunction(a,b); 
        endCache[a] = result;
    }
    else   
    {
        result = endCache[a];
    }
    return result;
}

显然连续分层(比如x)是微不足道的,只要在每个阶段新添加的输入的有效性匹配a:bx:bx:a.

Obviously successive layering (say x) is trivial so long as, at each stage the validity of the newly added input matches the a:b relationship for x:b and x:a.

然而,您很有可能获得三个输入,其有效性完全独立(或循环),因此不可能进行分层.这意味着标记为//important 的行必须更改为

However it is quite possible that you could get three inputs whose validity was entirely independent (or was cyclic), so no layering would be possible. This would mean the line marked // important would have to change to

if (endCache[a] 过期或不存在)

if (endCache[a] expired or not present)

这篇关于缓存失效——有通用的解决方案吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆