Converting from a ForEach loop to a Parallel.ForEach loop when summarizing into a double slows things down
Question
I have a section of C# code as follows. This code summarizes a column of doubles in a DataTable:
var data = this.Db.ExecuteRead(query, this.Score.Name);
var time = 0.0;

foreach (DataRow row in data.Rows)
{
    time += this.ParseDouble(row[0].ToString()) / MillisecondsPerMinute;
}
This code takes 4 seconds to execute. I wanted to speed it up, so I parallelized it as follows:
Parallel.ForEach(
    data.AsEnumerable(),
    row =>
    {
        time += this.ParseDouble(row[0].ToString()) / MillisecondsPerMinute;
    });
This code takes 3 seconds to execute, but it also causes collisions: I don't think updating a 'double' is thread-safe. This was expected. I then added a Mutex to make it thread-safe:
Parallel.ForEach(
    data.AsEnumerable(),
    row =>
    {
        mut.WaitOne();
        time += this.ParseDouble(row[0].ToString()) / MillisecondsPerMinute;
        mut.ReleaseMutex();
    });
This code is much slower: it takes 15 seconds to execute, but it produces accurate results. My question is, am I better off staying with the standard 'ForEach' here, or can I implement the multithreading in a better way?
For reference, here is the ParseDouble method:
protected double ParseDouble(string text)
{
    double value;
    if (!double.TryParse(text, out value))
    {
        throw new DoubleExpectedException();
    }

    return value;
}
Answer
Here are some approaches. First, a simple Parallel.ForEach that reduces the protected region (the lock) to the absolute minimum required: the update of the shared state. This should minimize contention for the lock. (Note that the C# lock statement uses the lightweight Monitor class, which is considerably cheaper than a Mutex, a kernel-level synchronization object.)
DataTable data = this.Db.ExecuteRead(query, this.Score.Name);
double totalTime = 0.0;

Parallel.ForEach(data.AsEnumerable(), row =>
{
    double time = Double.Parse(row[0].ToString()) / MillisecondsPerMinute;
    lock (data) { totalTime += time; }
});
A PLINQ approach. Easy and safe, but probably not the most efficient:
double totalTime = data
    .AsEnumerable()
    .AsParallel()
    .Select(row => Double.Parse(row[0].ToString()) / MillisecondsPerMinute)
    .Sum();
The combination of Parallel.ForEach and Partitioner.Create should give the best performance, because it allows the workload to be chunked:
double totalTime = 0.0;

Parallel.ForEach(Partitioner.Create(0, data.Rows.Count), () => 0.0D,
    (range, state, accumulator) =>
    {
        for (int i = range.Item1; i < range.Item2; i++)
        {
            DataRow row = data.Rows[i];
            accumulator += Double.Parse(row[0].ToString()) / MillisecondsPerMinute;
        }
        return accumulator;
    },
    accumulator =>
    {
        lock (data) { totalTime += accumulator; }
    });