How to schedule my tasks to run in parallel


Problem Description

I am running my tasks with Parallel.ForEach and a MaxDegreeOfParallelism setting, and if one task takes a long time to complete, all the others have to wait for it before the next batch of iterations can proceed.

The scenario is this: my Windows service processes multiple file orders. In the first phase I fetch the file orders from the database, do some manipulation on those records, and then pass that list of file orders, together with MaxDegreeOfParallelism, to my Parallel.ForEach loop.

A few of the file orders contain many documents, say 100 or 200, and those documents can have many pages as well. Processing therefore takes too long, because an order with a small file (few documents or pages) finishes quickly and then sits idle waiting for the others that are still working through large documents.

What I want is that a task that has finished its work should immediately fetch the next file order record from the database and start processing it in parallel with the others.

Two constraints apply: the next fetched file order must not be one that is already being processed by another task, and the number of tasks processing in parallel must never exceed the configured maximum degree of parallelism / maximum scheduled tasks for this service.
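For illustration, below is a rough sketch of the kind of scheduling I mean (not my real code): a fixed pool of worker tasks where each worker, as soon as it is free, claims the next unprocessed file order and works on it. TryClaimNextOrder and the processing delegate are hypothetical placeholders for my own data-access and processing logic.

using System;
using System.Linq;
using System.Threading.Tasks;

public static class FileOrderWorkerPool
{
    // tryClaimNextOrder: should atomically mark one unprocessed file order as "in progress"
    //                    in the database and return it, or null when no orders are left.
    //                    Claiming atomically guarantees that no order is picked up twice.
    // processOrder:      the actual per-order work (documents, pages, ...).
    public static void ProcessAllOrders<TOrder>(
        int maxDegreeOfParallelism,
        Func<TOrder> tryClaimNextOrder,
        Action<TOrder> processOrder) where TOrder : class
    {
        // Start exactly maxDegreeOfParallelism workers, so the cap is never exceeded.
        var workers = Enumerable.Range(0, maxDegreeOfParallelism)
            .Select(_ => Task.Run(() =>
            {
                // Each worker loops: the moment it finishes one order it claims the next,
                // so a small order never waits for a large one running on another worker.
                for (var order = tryClaimNextOrder(); order != null; order = tryClaimNextOrder())
                {
                    processOrder(order);
                }
            }))
            .ToArray();

        Task.WaitAll(workers);
    }
}

Because each worker claims a new order only when it becomes free, and the claim marks the order in the database, both constraints above would be satisfied: no order is processed twice and no more than maxDegreeOfParallelism orders are ever in flight.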

Please suggest a better solution for this.

What I have tried:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

/// <summary>
/// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
/// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
/// </summary>
/// <param name="tasksToRun">The tasks to run.</param>
/// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static void StartAndWaitAllThrottled(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, CancellationToken cancellationToken = new CancellationToken())
{
    StartAndWaitAllThrottled(tasksToRun, maxTasksToRunInParallel, -1, cancellationToken);
}

/// <summary>
/// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
/// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
/// </summary>
/// <param name="tasksToRun">The tasks to run.</param>
/// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
/// <param name="timeoutInMilliseconds">The maximum milliseconds we should allow the max tasks to run in parallel before allowing another task to start. Specify -1 to wait indefinitely.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static void StartAndWaitAllThrottled(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, int timeoutInMilliseconds, CancellationToken cancellationToken = new CancellationToken())
{
    // Convert to a list of tasks so that we don't enumerate over it multiple times needlessly.
    var tasks = tasksToRun.ToList();

    // The semaphore must start with maxTasksToRunInParallel free slots; an initial count of 1
    // would let only a single task start until another one completed and released a slot.
    using (var throttler = new SemaphoreSlim(maxTasksToRunInParallel, maxTasksToRunInParallel))
    {
        var postTaskTasks = new List<Task>();

        // Have each task notify the throttler when it completes so that it decrements the number of tasks currently running.
        tasks.ForEach(t => postTaskTasks.Add(t.ContinueWith(tsk => throttler.Release())));

        // Start running each task.
        foreach (var task in tasks)
        {
            // Increment the number of tasks currently running and wait if too many are running.
            throttler.Wait(timeoutInMilliseconds, cancellationToken);

            cancellationToken.ThrowIfCancellationRequested();
            task.Start();
        }

        // Wait for all of the provided tasks to complete.
        // We wait on the list of "post" tasks instead of the original tasks, otherwise there is a potential
        // race condition where the throttler's using block is exited before some Tasks have had their "post"
        // action completed, which references the throttler, resulting in an exception due to accessing a
        // disposed object.
        Task.WaitAll(postTaskTasks.ToArray(), cancellationToken);
    }
}
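For context, this is roughly how I call it; GetFileOrdersFromDb and ProcessFileOrder are placeholders for my real data-access and processing methods, and the tasks are created unstarted because StartAndWaitAllThrottled calls Start() on them itself:

// Hypothetical usage sketch.
// fileOrders comes from the first-phase database query; ProcessFileOrder is the per-order work.
var fileOrders = GetFileOrdersFromDb();

// Create cold (unstarted) tasks: StartAndWaitAllThrottled throws if a task is already running.
var tasksToRun = fileOrders
    .Select(order => new Task(() => ProcessFileOrder(order)))
    .ToList();

// Run at most 4 orders at a time (use the service's configured degree of parallelism here).
StartAndWaitAllThrottled(tasksToRun, maxTasksToRunInParallel: 4);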

Recommended Answer

When doing I/O operations, it is not a good idea to use parallel processing; most of the time this will slow things down, especially when a hard disk is being used.
With newer SSDs or RAID it might work, but I would not recommend it.
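A rough way to check this on your own hardware (a sketch only; the folder path is a placeholder) is to time a sequential pass and a Parallel.ForEach pass over the same files. Note that the operating system's file cache will favour whichever pass runs second, so run the comparison both ways.

using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class IoParallelismCheck
{
    static void Main()
    {
        // Placeholder folder; point this at a copy of some typical order documents.
        var files = Directory.GetFiles(@"C:\SampleOrders");

        var sw = Stopwatch.StartNew();
        foreach (var file in files)
        {
            File.ReadAllBytes(file);            // sequential read of every file
        }
        Console.WriteLine("Sequential:  " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        Parallel.ForEach(files,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            file => File.ReadAllBytes(file));   // same reads, up to 4 at a time
        Console.WriteLine("Parallel(4): " + sw.ElapsedMilliseconds + " ms");
    }
}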


