Simpler solution than TPL Dataflow for parallel async blob deletion
Question
I'm implementing a worker role on Azure which needs to delete blobs from Azure storage. Let's assume my list of blobs has about 10K items.
The simplest synchronous approach would probably be:
Parallel.ForEach(list, x => ((CloudBlob) x).Delete());
Requirements:
- I want to implement the same thing asynchronously (on a single thread).
- I want to limit the number of concurrent connections to 50, so the 10K deletions run with at most 50 asynchronous operations in flight at any time. When one deletion completes, a new one can start.
Solution?
So far, after reading this question and this one, it seems that TPL Dataflow is the way to go.
This is such a simple problem, and dataflow seems like overkill. Is there a simpler alternative?
If not, how would this be implemented using dataflow? As I understand it, I need a single action block which performs the async delete (do I need await?). When creating the block I should set MaxDegreeOfParallelism to 50. Then I need to post my 10K blobs from the list to the block and then execute with block.Completion.Wait(). Is this correct?
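For illustration, the dataflow setup described above might look roughly like the sketch below. It is not taken from the original thread: `FakeDeleteAsync` is a hypothetical stand-in for the real blob delete call, integers stand in for blobs, and the class and method names are made up for the example. It assumes the System.Threading.Tasks.Dataflow library is available to the project. Note two details the description leaves out: the block must be told that no more items are coming via `Complete()`, and in async code `await block.Completion` is generally preferable to a blocking `Wait()`.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowThrottleSketch
{
    // Hypothetical stand-in for the real asynchronous blob delete.
    private static Task FakeDeleteAsync(int id) => Task.Delay(10);

    // Processes itemCount fake deletions with at most maxParallel running
    // at once, and returns how many deletions completed.
    public static async Task<int> RunAsync(int itemCount, int maxParallel)
    {
        int deleted = 0;

        var block = new ActionBlock<int>(
            async id =>
            {
                // The delegate is async, so the block treats an item as
                // finished only when the awaited delete has completed.
                await FakeDeleteAsync(id);
                Interlocked.Increment(ref deleted);
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        // With the default (unbounded) capacity, Post accepts every item.
        foreach (var id in Enumerable.Range(0, itemCount))
            block.Post(id);

        // Signal that no more items will arrive, then wait for the queue to drain.
        block.Complete();
        await block.Completion;

        return deleted;
    }
}
```

Calling `await DataflowThrottleSketch.RunAsync(10000, 50)` would mirror the 10K items and limit of 50 from the question.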
Recommended answer
For something this simple, a SemaphoreSlim should suffice. TPL Dataflow is great, especially if you're looking to limit work in one part of a larger pipeline. However, in your scenario it sounds more like you really do just have one action that you need to throttle.
Doing this asynchronously is quite straightforward:
var semaphore = new SemaphoreSlim(50);
var tasks = list.Cast<CloudBlob>().Select(async x =>
{
    using (await semaphore.TakeAsync())
        await x.DeleteAsync();
});
await Task.WhenAll(tasks);
where TakeAsync is defined as:
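using System;
using System.Threading;
using System.Threading.Tasks;
// Extension methods must live in a static class; this class name is illustrative.
public static class SemaphoreSlimExtensions
{
    // Releases one semaphore slot when disposed, so it works with "using".
    private sealed class SemaphoreSlimKey : IDisposable
    {
        private readonly SemaphoreSlim _semaphore;
        public SemaphoreSlimKey(SemaphoreSlim semaphore) { _semaphore = semaphore; }
        void IDisposable.Dispose() { _semaphore.Release(); }
    }
    // Waits asynchronously for a free slot and returns a key that frees it on Dispose.
    public static async Task<IDisposable> TakeAsync(this SemaphoreSlim semaphore)
    {
        await semaphore.WaitAsync().ConfigureAwait(false);
        return new SemaphoreSlimKey(semaphore);
    }
}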
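To see the throttling without an Azure account, here is a small self-contained sketch of the same idea. It is an illustration rather than the answer's code: `Task.Delay` is a stand-in for the blob delete, the class and method names are invented, and it uses a plain `WaitAsync` with try/finally instead of the TakeAsync helper. It records the highest number of operations in flight at once, which should never exceed the semaphore's limit.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreThrottleSketch
{
    // Runs itemCount simulated deletions with at most `limit` in flight,
    // and returns the peak number of simultaneous operations observed.
    public static async Task<int> RunAsync(int itemCount, int limit)
    {
        var semaphore = new SemaphoreSlim(limit);
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        var tasks = Enumerable.Range(0, itemCount).Select(async _ =>
        {
            // Wait (without blocking a thread) until a slot is free.
            await semaphore.WaitAsync();
            try
            {
                lock (gate)
                {
                    inFlight++;
                    if (inFlight > peak) peak = inFlight;
                }

                // Hypothetical stand-in for the real asynchronous delete.
                await Task.Delay(5);

                lock (gate) { inFlight--; }
            }
            finally
            {
                // Always free the slot, even if the operation throws.
                semaphore.Release();
            }
        });

        // Enumerating the query starts every task; each one then waits its turn.
        await Task.WhenAll(tasks);
        return peak;
    }
}
```

The try/finally form does the same job as the `using` with TakeAsync: both guarantee the slot is released when the delete finishes or fails. The helper simply makes the call site shorter.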