Parallelizing IO Bound (Network) ForEach Loop

Question

I have a few different ways of uploading entire directories to Amazon S3 within my application, depending on which options are selected. Currently one of the options uploads multiple directories in parallel. I'm not sure this is a good idea: in some cases it sped up the upload and in others it slowed it down. The speedup appears when there are many small directories, but the upload slows down if there are large directories in the batch. I'm using the Parallel.ForEach loop shown below together with the AWS API's TransferUtility.UploadDirectoryAsync() method:

Parallel.ForEach(dirs, myParallelOptions,
                   async dir => { await MyUploadMethodAsync(dir); });

Here TransferUtility.UploadDirectoryAsync() is called inside MyUploadMethodAsync(). The TransferUtility upload methods already upload the parts of a single file in parallel (if the file is big enough), so uploading the directories in parallel as well may be overkill. We are still limited by the available bandwidth, so this might be a waste, and perhaps I should just use a regular foreach loop with the UploadDirectoryAsync() method. Can anyone provide some insight on whether this is a bad case for parallelization?

Recommended Answer

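Did you actually test this? The way you're using it, Parallel.ForEach may return well before any of the MyUploadMethodAsync calls has completed, because of the async lambda. Parallel.ForEach takes an Action<T>, so the lambda compiles to an async void method that the loop cannot await: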

Parallel.ForEach(dirs, myParallelOptions,
    async dir => { await MyUploadMethodAsync(dir); });
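
To see the early return for yourself, a small experiment along the following lines may help. It is only a sketch: FakeUploadAsync is a hypothetical stand-in for MyUploadMethodAsync that simulates network IO with Task.Delay, and the class and method names are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLambdaDemo
{
    // Hypothetical stand-in for MyUploadMethodAsync: simulates slow network IO.
    private static async Task FakeUploadAsync(string dir, Action onDone)
    {
        await Task.Delay(500);
        onDone();
    }

    // Returns how many uploads had finished at the moment Parallel.ForEach returned.
    public static int CompletedWhenLoopReturns(string[] dirs)
    {
        int completed = 0;

        Parallel.ForEach(dirs, async dir =>
        {
            // The lambda is converted to Action<string>, i.e. async void.
            // From the loop's point of view it returns at the first await.
            await FakeUploadAsync(dir, () => Interlocked.Increment(ref completed));
        });

        return Volatile.Read(ref completed);
    }
}
```

Calling AsyncLambdaDemo.CompletedWhenLoopReturns(new[] { "a", "b", "c" }) should normally report that no upload has completed yet, since the loop returns long before the simulated 500 ms of IO has elapsed.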

Parallel.ForEach is suited for CPU-bound tasks. For IO-bound tasks, you are probably looking for something like this:

// ToArray() starts every upload right away and avoids re-enumerating the lazy query
var tasks = dirs.Select(dir => MyUploadMethodAsync(dir)).ToArray();
await Task.WhenAll(tasks);
// or Task.WaitAll(tasks) if you need a blocking wait
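
Task.WhenAll starts every directory upload at once, which may recreate the slowdown described in the question when the batch contains large directories. One possible way to cap the number of uploads in flight is a SemaphoreSlim gate, sketched below. This is not part of the original answer: the ThrottledUploads class, the uploadAsync delegate (standing in for MyUploadMethodAsync), and the maxConcurrent parameter are all assumptions made for illustration, and a good limit would have to be found by measurement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUploads
{
    // uploadAsync stands in for MyUploadMethodAsync;
    // maxConcurrent is a hypothetical tuning knob, not an AWS SDK setting.
    public static async Task RunAsync(
        IEnumerable<string> dirs, Func<string, Task> uploadAsync, int maxConcurrent)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = dirs.Select(async dir =>
            {
                await gate.WaitAsync();          // wait for a free slot without blocking a thread
                try
                {
                    await uploadAsync(dir);
                }
                finally
                {
                    gate.Release();              // free the slot even if the upload throws
                }
            }).ToArray();                        // materialize so each task is created exactly once

            await Task.WhenAll(tasks);           // completes only after every upload has finished
        }
    }
}
```

A call such as await ThrottledUploads.RunAsync(dirs, MyUploadMethodAsync, 2) would then keep at most two directory uploads running at a time, leaving the per-file part parallelism to TransferUtility.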
