Heavy processing: stage or loop thread?

Question

I need to create a program that processes a huge number of images. The process has about 10 different stages, which need to happen sequentially.

Is it better to create a pipeline where each processing stage has its own thread, with buffers in between, using the pipeline pattern described here: https://msdn.microsoft.com/en-us/library/ff963548.aspx

or to create a thread pool and assign one image to one thread by just using Parallel.ForEach?

And why?

Recommended answer

Honestly, there really is no way to tell without actually benchmarking it. However, you may be able to get both parallelism and a pipeline at the same time by using TPL Dataflow.

Each stage in the pipeline would be a TransformBlock<TInput, TOutput>, and any stage that can be processed in parallel can have its MaxDegreeOfParallelism set.

Here is an example (written in the browser, so it may contain errors). It uses a 3-stage pipeline that reads images from disk, crops them, and writes them back to disk. The read and write stages handle only 1 image at a time, but the crop stage processes 5 images concurrently. The pipeline also allows at most 100 images to be queued for writing and another 100 to be queued for cropping. If the pipeline fills up, it stops reading in images and waits until there is room (preventing overuse of RAM for images).

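// Requires the System.Threading.Tasks.Dataflow NuGet package.
using System.IO;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public async Task CropImages(string directory, int x, int y)
{
    // BoundedCapacity (not DataflowLinkOptions.MaxMessages) limits how many items a
    // block holds; MaxMessages would instead unlink the blocks after 100 messages.
    // The bound of 1 on loadImage is an assumed value so that SendAsync waits below.
    var loadImage = new TransformBlock<string, MyImage>(LoadImageAsync,
        new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
    var cropImage = new TransformBlock<MyImage, MyImage>(image => Crop(image, x, y),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5, BoundedCapacity = 100 });
    var saveImage = new ActionBlock<MyImage>(SaveImageAsync,
        new ExecutionDataflowBlockOptions { BoundedCapacity = 100 });

    var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
    loadImage.LinkTo(cropImage, linkOptions);
    cropImage.LinkTo(saveImage, linkOptions);

    foreach (var file in Directory.EnumerateFiles(directory, "*.jpg"))
    {
        // Waits while loadImage is full, so a backed-up pipeline pauses reading.
        await loadImage.SendAsync(file);
    }
    loadImage.Complete();
    await saveImage.Completion;
}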

private async Task<MyImage> LoadImageAsync(string fileName)
{
    byte[] data = await GetDataAsync(fileName);
    return new MyImage(data, fileName);
}

private MyImage Crop(MyImage image, int x, int y)
{
    image.Crop(x,y);
    return image;
}

private async Task SaveImageAsync(MyImage image)
{
    var fileName = Path.GetFileName(image.FileName);
    var directoryName = Path.GetDirectoryName(image.FileName);
    var newName = Path.Combine(directoryName, "Cropped-" + fileName);
    await SaveDataAsync(image.Bytes, newName);
}
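
For a benchmark against the pipeline above, the other option from the question (one image per worker via Parallel.ForEach) could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name ParallelForEachSketch, the ProcessOne and ProcessAll methods, and the two byte-array "stages" are hypothetical stand-ins for the roughly 10 real processing stages, and images are passed in memory rather than read from disk.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical stand-ins for the real processing stages.
    private static readonly Func<byte[], byte[]>[] Stages =
    {
        data => data.Select(b => (byte)(255 - b)).ToArray(), // e.g. invert pixels
        data => data.Take(data.Length / 2).ToArray(),        // e.g. crop to half
    };

    // Runs every stage, in order, for one image on whichever thread picks it up.
    public static byte[] ProcessOne(byte[] image)
    {
        foreach (var stage in Stages)
        {
            image = stage(image);
        }
        return image;
    }

    // Processes all images, with at most maxParallel images in flight at once.
    public static IDictionary<string, byte[]> ProcessAll(
        IEnumerable<KeyValuePair<string, byte[]>> images, int maxParallel)
    {
        var results = new ConcurrentDictionary<string, byte[]>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(images, options, pair =>
        {
            results[pair.Key] = ProcessOne(pair.Value);
        });
        return results;
    }
}
```

In this shape each image stays on one worker through all stages, which is simple and tends to suit purely CPU-bound work. It has no per-stage parallelism setting and no bounded buffers between stages, so timing both versions on a representative batch of images is the only reliable way to choose.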
