Pattern for concurrent cache sharing

Question

OK, I was a little unsure how best to name this problem :) But assume this scenario: you're going out and fetching some web pages (with various URLs) and caching them locally. The cache part is pretty easy to solve, even with multiple threads.

However, imagine that one thread starts fetching a URL, and a couple of milliseconds later another thread wants to get the same URL. Is there any good pattern for making the second thread's method wait for the first one to fetch the page, insert it into the cache and return it, so you don't have to do multiple requests? With little enough overhead that it's worth doing even for requests that take about 300-700 ms? And without locking requests for other URLs?

Basically, when requests for identical URLs come in right after each other, I want the second request to "piggyback" on the first request.

I had a loose idea of having a dictionary where you insert an object with the URL as the key when you start fetching a page, and lock on it. If there's already an entry matching the key, the thread gets that object, locks on it, and then tries to get the URL from the actual cache.

I'm a little unsure of the particulars needed to make it really thread-safe, however; using ConcurrentDictionary might be one part of it...

Are there any common patterns or solutions for scenarios like this?

Breakdown of the wrong behavior:

Thread 1: Checks the cache, the page doesn't exist there, so starts fetching the URL

Thread 2: Starts fetching the same URL, since it still doesn't exist in the cache

Thread 1: Finishes and inserts into the cache, returns the page

Thread 2: Finishes and also inserts into the cache (or discards the result), returns the page
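
The wrong behavior above can be reproduced with a naive check-then-fetch cache. The following is only a minimal sketch: `NaiveCacheDemo`, `FakeFetch`, `FetchCount` and the `Barrier` are hypothetical names introduced for illustration, and the barrier exists purely to force both threads to be inside the fetch at the same time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class NaiveCacheDemo
{
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    // Demo-only device: holds the first fetcher until the second one arrives.
    private static readonly Barrier BothFetching = new Barrier(2);

    public static int FetchCount;

    // Hypothetical stand-in for the real web request.
    private static string FakeFetch(string url)
    {
        Interlocked.Increment(ref FetchCount);
        BothFetching.SignalAndWait(TimeSpan.FromSeconds(5));
        return "content of " + url;
    }

    // The dictionary itself is thread-safe, but nothing coordinates the fetchers.
    public static string GetUrlContent(string url)
    {
        if (Cache.TryGetValue(url, out string cached))
            return cached;

        string content = FakeFetch(url);   // both threads end up here
        Cache[url] = content;
        return content;
    }

    // Runs two threads against the same URL and reports how many fetches happened.
    public static int Run()
    {
        const string url = "http://example.com/";
        var t1 = new Thread(() => GetUrlContent(url));
        var t2 = new Thread(() => GetUrlContent(url));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return FetchCount;
    }
}
```

Both threads miss the cache before either has stored a result, so the URL is fetched twice even though the cache itself never gets corrupted.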

Breakdown of the correct behavior:

Thread 1: Checks the cache, the page doesn't exist there, so starts fetching the URL

Thread 2: Wants the same URL, but sees it's currently being fetched, so waits on thread 1

Thread 1: Finishes and inserts into the cache, returns the page

Thread 2: Notices that thread 1 is finished and returns the page that thread 1 fetched

EDIT

Most solutions so far seem to misunderstand the problem and only address the caching. As I said, that isn't the problem. The problem is, when doing an external web fetch, how to make a second request that arrives before the first one has cached its result use the result of the first fetch rather than doing a second one.

Solution

You could use a ConcurrentDictionary<K,V> and a variant of double-checked locking:

public static string GetUrlContent(string url)
{
    object value1 = _cache.GetOrAdd(url, new object());

    if (value1 == null)    // null check only required if content
        return null;       // could legitimately be a null string

    var urlContent = value1 as string;
    if (urlContent != null)
        return urlContent;    // got the content

    // value1 isn't a string which means that it's an object to lock against
    lock (value1)
    {
        object value2 = _cache[url];

        // at this point value2 will *either* be the url content
        // *or* the object that we already hold a lock against
        if (value2 != value1)
            return (string)value2;    // got the content

        urlContent = FetchContentFromTheWeb(url);    // todo
        _cache[url] = urlContent;
        return urlContent;
    }
}

private static readonly ConcurrentDictionary<string, object> _cache =
                                  new ConcurrentDictionary<string, object>();
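
As a possible alternative to the lock-object approach above (not part of the original answer), the same "piggyback" effect can be sketched with `ConcurrentDictionary<string, Lazy<string>>`. The names `LazyUrlCache`, `FakeFetch` and `FetchCount` are hypothetical, and `FakeFetch` merely stands in for `FetchContentFromTheWeb`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class LazyUrlCache
{
    private static readonly ConcurrentDictionary<string, Lazy<string>> Cache =
        new ConcurrentDictionary<string, Lazy<string>>();

    public static int FetchCount;

    // Hypothetical stand-in for FetchContentFromTheWeb; sleeps to simulate latency.
    private static string FakeFetch(string url)
    {
        Interlocked.Increment(ref FetchCount);
        Thread.Sleep(200);
        return "content of " + url;
    }

    public static string GetUrlContent(string url)
    {
        // Under contention GetOrAdd may build more than one Lazy wrapper, but only
        // one is stored and handed to every caller. ExecutionAndPublication then
        // lets a single thread run the factory while the others block on Value.
        Lazy<string> lazy = Cache.GetOrAdd(
            url,
            key => new Lazy<string>(() => FakeFetch(key),
                                    LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }

    // Starts several threads asking for the same URL and reports the fetch count.
    public static int Run()
    {
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() => GetUrlContent("http://example.com/"));
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();
        return FetchCount;
    }
}
```

Requests for different URLs get different `Lazy<string>` instances, so they do not block each other. One caveat of this sketch: in `ExecutionAndPublication` mode an exception thrown by the factory is cached and rethrown on every later access, so a failed fetch would stay in the dictionary unless the entry is removed.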
