When to cache Tasks?

Question

I was watching The zen of async: Best practices for best performance, and Stephen Toub started to talk about Task caching, where instead of caching the results of a task's work you cache the Tasks themselves. As far as I understand, starting a new Task for every job is expensive and should be minimized as much as possible. At around 28:00 he showed this method:

private static readonly ConcurrentDictionary<string, string> s_urlToContents =
    new ConcurrentDictionary<string, string>();

public static async Task<string> GetContentsAsync(string url)
{
    string contents;
    if (!s_urlToContents.TryGetValue(url, out contents))
    {
        // Cache miss: download the page, then store the resulting string.
        var response = await new HttpClient().GetAsync(url);
        contents = await response.EnsureSuccessStatusCode().Content.ReadAsStringAsync();
        s_urlToContents.TryAdd(url, contents);
    }
    return contents;
}

At first glance this looks like a well-thought-out method where you cache the results; I didn't even think about caching the job of getting the contents.

And then he showed this method:

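private static readonly ConcurrentDictionary<string, Task<string>> s_urlToContents =
    new ConcurrentDictionary<string, Task<string>>();

public static Task<string> GetContentsAsync(string url)
{
    Task<string> contents;
    if (!s_urlToContents.TryGetValue(url, out contents))
    {
        contents = GetContentsInternalAsync(url);
        // Cache the Task itself, but only once it has completed successfully.
        // This ContinueWith overload requires a CancellationToken argument.
        contents.ContinueWith(t => { s_urlToContents.TryAdd(url, t); },
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion |
            TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
    }
    return contents;
}

// Renamed from GetContentsAsync: two methods with the same signature would not compile.
private static async Task<string> GetContentsInternalAsync(string url)
{
    var response = await new HttpClient().GetAsync(url);
    return await response.EnsureSuccessStatusCode().Content.ReadAsStringAsync();
}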

I have trouble understanding how this actually helps more than just storing the results.

Does this mean that you're using fewer Tasks to get the data?

Also, how do we know when to cache Tasks? As far as I understand, if you cache in the wrong place you just add a load of overhead and stress the system too much.

Recommended answer

Let's assume you are talking to a remote service which takes the name of a city and returns its zip codes. The service is remote and under load, so we are talking to a method with an asynchronous signature:

interface IZipCodeService
{
    Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName);
}

Since the service needs a while for every request, we would like to implement a local cache for it. Naturally the cache will also have an asynchronous signature, maybe even implementing the same interface (see the Facade pattern). A synchronous signature would break the best practice of never calling asynchronous code synchronously with .Wait(), .Result or similar. At the very least, the cache should leave that decision up to the caller.

So let's do a first iteration on this:

class ZipCodeCache : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, ICollection<ZipCode>> zipCache = new ConcurrentDictionary<string, ICollection<ZipCode>>();

    public ZipCodeCache(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
    {
        ICollection<ZipCode> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            // Already in cache. Returning cached value
            return Task.FromResult(zipCodes);
        }
        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            this.zipCache.TryAdd(cityName, task.Result);
            return task.Result;
        });
    }
}

As you can see, the cache does not cache Task objects but the returned collections of ZipCode values. By doing so, however, it has to construct a Task for every cache hit by calling Task.FromResult, and I think that is exactly what Stephen Toub tries to avoid. A Task object comes with overhead, especially for the garbage collector: every cache hit allocates a new short-lived Task that has to be collected later.
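To make the allocation difference concrete, here is a minimal sketch. The names AllocationDemo, FreshTaskPerHit and CachedTaskPerHit are hypothetical and exist only for illustration. It assumes a non-null string result, for which Task.FromResult hands back a new Task object on each call, whereas a stored Task is simply returned again.

```csharp
using System;
using System.Threading.Tasks;

public static class AllocationDemo
{
    // A Task created once and stored, as a Task cache would do.
    private static readonly Task<string> s_cached = Task.FromResult("12345");

    // Value-cache style: wrap the cached value in a new Task on every hit.
    public static Task<string> FreshTaskPerHit()
    {
        return Task.FromResult("12345");
    }

    // Task-cache style: hand out the stored Task object again.
    public static Task<string> CachedTaskPerHit()
    {
        return s_cached;
    }
}
```

Comparing two results of FreshTaskPerHit with ReferenceEquals shows two distinct objects, while CachedTaskPerHit always yields the same instance, so a hit produces no new garbage.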

The way to work around this is to cache the whole Task object:

class ZipCodeCache2 : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, Task<ICollection<ZipCode>>> zipCache = new ConcurrentDictionary<string, Task<ICollection<ZipCode>>>();

    public ZipCodeCache2(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
    {
        Task<ICollection<ZipCode>> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            return zipCodes;
        }
        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            this.zipCache.TryAdd(cityName, task);
            return task.Result;
        });
    }
}

As you can see, the creation of Tasks by calling Task.FromResult is gone. Furthermore, it is not possible to avoid this Task creation when using the async/await keywords, because internally they will create a Task to return no matter what your code has cached. Something like:

    public async Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
    {
        Task<ICollection<ZipCode>> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            return zipCodes;
        }

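won't compile: inside an async method declared to return Task<ICollection<ZipCode>>, a return statement must supply an ICollection<ZipCode>, not the cached Task itself.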
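A version that does compile has to await the cached Task, and that brings the allocation back. The sketch below uses hypothetical names (AsyncWrapperDemo, GetWithoutAsync, GetWithAsync) and a string result for brevity; it is only meant to show that an async method returns its own Task even when the awaited Task has already completed.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncWrapperDemo
{
    private static readonly Task<string> s_cached = Task.FromResult("cached value");

    // Exposed only so callers can compare references against the stored Task.
    public static Task<string> Cached
    {
        get { return s_cached; }
    }

    // Non-async method: the cached Task object itself is returned.
    public static Task<string> GetWithoutAsync()
    {
        return s_cached;
    }

    // async method: the compiler-generated state machine produces its own Task
    // for the non-null string result, although s_cached is already complete.
    public static async Task<string> GetWithAsync()
    {
        return await s_cached;
    }
}
```

This is why the Task-caching method in the answer is a plain method returning Task rather than an async one.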

Don't get confused by Stephen Toub's ContinueWith flags TaskContinuationOptions.OnlyOnRanToCompletion and TaskContinuationOptions.ExecuteSynchronously. They are (only) another performance optimization which is not related to the main objective of caching Tasks.
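For readers who still want to see what the two flags do, here is a small sketch. ContinuationFlagsDemo and CacheOnSuccess are hypothetical names, not code from the talk. OnlyOnRanToCompletion keeps failed or canceled Tasks out of the cache, and ExecuteSynchronously asks the runtime to run the continuation inline on the thread that completes the Task where possible.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationFlagsDemo
{
    public static readonly ConcurrentDictionary<string, Task<string>> Cache =
        new ConcurrentDictionary<string, Task<string>>();

    // Registers a continuation that stores the Task under the key,
    // but only if the Task finishes successfully.
    public static Task CacheOnSuccess(string key, Task<string> work)
    {
        return work.ContinueWith(
            t => { Cache.TryAdd(key, t); },
            CancellationToken.None,
            // Skip the continuation for faulted or canceled Tasks.
            TaskContinuationOptions.OnlyOnRanToCompletion |
            // Prefer running inline instead of queueing a separate work item.
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
    }
}
```

If the Task passed in faults, the continuation is canceled instead of run, so the key never enters the cache and a later request can try again.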

As with every cache, you should consider some mechanism which cleans the cache from time to time and removes entries which are too old or invalid. You could also implement a policy which limits the cache to n entries and tries to keep the most requested items by introducing some counting.
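One possible shape for age-based cleanup is sketched below. ExpiringTaskCache, its members and the idea of passing the current time in as a parameter are all assumptions made for illustration; a real cache would also need to decide when to call the eviction and how to handle an entry that is replaced while eviction runs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class ExpiringTaskCache<TValue>
{
    private sealed class Entry
    {
        public Task<TValue> Work;
        public DateTime CreatedUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> entries =
        new ConcurrentDictionary<string, Entry>();

    // Stores (or overwrites) the Task for a key together with its creation time.
    public void Add(string key, Task<TValue> task, DateTime nowUtc)
    {
        entries[key] = new Entry { Work = task, CreatedUtc = nowUtc };
    }

    public bool TryGet(string key, out Task<TValue> task)
    {
        Entry entry;
        if (entries.TryGetValue(key, out entry))
        {
            task = entry.Work;
            return true;
        }
        task = null;
        return false;
    }

    // Removes every entry older than maxAge and returns how many were removed.
    public int EvictOlderThan(TimeSpan maxAge, DateTime nowUtc)
    {
        int removed = 0;
        foreach (var pair in entries)
        {
            Entry ignored;
            if (nowUtc - pair.Value.CreatedUtc > maxAge &&
                entries.TryRemove(pair.Key, out ignored))
            {
                removed++;
            }
        }
        return removed;
    }
}
```

Calling EvictOlderThan from a timer, or every so many lookups, would keep stale Tasks from accumulating.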

I did some benchmarking with and without caching of Tasks. You can find the code at http://pastebin.com/SEr2838A, and the results look like this on my machine (with .NET 4.6):

Caching ZipCodes: 00:00:04.6653104
Gen0: 3560 Gen1: 0 Gen2: 0
Caching Tasks: 00:00:03.9452951
Gen0: 1017 Gen1: 0 Gen2: 0
