Simpler solution than TPL Dataflow for parallel async blob deletion

Question

I'm implementing a worker role on Azure which needs to delete blobs from Azure storage. Let's assume my list of blobs has about 10K items.

The simplest synchronous approach would probably be:

Parallel.ForEach(list, x => ((CloudBlob) x).Delete());

Requirements:

  • I want to implement the same thing asynchronously (on a single thread).

  • I want to limit the number of concurrent connections to 50, so that at most 50 of the 10K deletions are in flight at any time. When one deletion completes, a new one can start.

Solution?

So far, after reading this question and this one, it seems that TPL Dataflow is the way to go.

This is such a simple problem that Dataflow seems like overkill. Is there a simpler alternative?

If not, how would this be implemented using dataflow? As I understand, I need a single action block which performs the async delete (do I need await?). When creating my block I should set MaxDegreeOfParallelism to 50. Then I need to post my 10K blobs from the list to the block and then execute with block.Completion.Wait(). Is this correct?

Recommended answer

For something this simple, a SemaphoreSlim should suffice. TPL Dataflow is great, especially if you're looking to limit work in one part of a larger pipeline. However, in your scenario it sounds more like you really do just have one action that you need to throttle.
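For comparison, the Dataflow version described in the question might look roughly like the sketch below. It is only an illustration: FakeDeleteAsync is a hypothetical stand-in for the real blob deletion, and the class and method names are made up for this example. It assumes the System.Threading.Tasks.Dataflow package is referenced. The delegate is async and awaits the deletion so that the block counts an item as finished only when the delete has completed, and Complete() has to be called before Completion can finish. Awaiting Completion is generally preferable to calling Wait() on it.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // from the System.Threading.Tasks.Dataflow package

public static class DataflowDeleteSketch
{
    // Hypothetical stand-in for the real blob deletion; only simulates I/O latency.
    private static Task FakeDeleteAsync(int blobId) => Task.Delay(10);

    // Posts blobCount items to a single ActionBlock and returns how many were processed.
    public static async Task<int> RunAsync(int blobCount, int maxParallel)
    {
        int deleted = 0;

        var block = new ActionBlock<int>(
            async id =>
            {
                // Awaiting here keeps the item "in progress" until the delete finishes.
                await FakeDeleteAsync(id);
                Interlocked.Increment(ref deleted);
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        foreach (var id in Enumerable.Range(0, blobCount))
            block.Post(id); // the block is unbounded by default, so Post accepts every item

        block.Complete();        // signal that no more items will arrive
        await block.Completion;  // completes once every posted item has been processed
        return deleted;
    }
}
```

Calling `await DataflowDeleteSketch.RunAsync(10000, 50)` would correspond to the 10K blobs and limit of 50 from the question.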

Doing it asynchronously with a SemaphoreSlim is quite simple:

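// Allow at most 50 deletions to be in flight at once.
var semaphore = new SemaphoreSlim(50);
var tasks = list.Cast<CloudBlob>().Select(async x =>
{
    // Wait for a free slot; disposing the returned key releases it.
    using (await semaphore.TakeAsync())
        await x.DeleteAsync();
});
// Select is lazy; WhenAll enumerates it, starting every task.
await Task.WhenAll(tasks);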

where TakeAsync is defined as:

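using System;
using System.Threading;
using System.Threading.Tasks;

// Extension methods must live in a static, non-nested class.
public static class SemaphoreSlimExtensions
{
    // Releases the semaphore slot when disposed, so callers can use a using block.
    private sealed class SemaphoreSlimKey : IDisposable
    {
        private readonly SemaphoreSlim _semaphore;
        public SemaphoreSlimKey(SemaphoreSlim semaphore) { _semaphore = semaphore; }
        void IDisposable.Dispose() { _semaphore.Release(); }
    }

    // Waits asynchronously for a slot and returns a key that frees it on Dispose.
    public static async Task<IDisposable> TakeAsync(this SemaphoreSlim semaphore)
    {
        await semaphore.WaitAsync().ConfigureAwait(false);
        return new SemaphoreSlimKey(semaphore);
    }
}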
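If the helper type feels like too much ceremony, the same throttling can be written with a plain try/finally around WaitAsync and Release. The sketch below is a self-contained illustration, not the code from the answer above: Task.Delay is a hypothetical stand-in for the blob deletion, and the ThrottleSketch name and the peak-tracking counters exist only to make the concurrency limit observable.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs `count` simulated deletions with at most `limit` in flight at once
    // and returns the highest number that were observed running concurrently.
    public static async Task<int> RunAsync(int count, int limit)
    {
        var semaphore = new SemaphoreSlim(limit);
        var gate = new object();
        int inFlight = 0, peak = 0;

        var tasks = Enumerable.Range(0, count).Select(async _ =>
        {
            await semaphore.WaitAsync();
            try
            {
                lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
                await Task.Delay(10); // hypothetical stand-in for the blob deletion
                lock (gate) { inFlight--; }
            }
            finally
            {
                // Always free the slot, even if the deletion throws.
                semaphore.Release();
            }
        });

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

The value returned by `await ThrottleSketch.RunAsync(10000, 50)` should never exceed 50, because a task only increments the counter while it holds a semaphore slot.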
