Thread contention when allocating memory
Question
In C#, I ran a toy program that creates many small objects (I know this is best avoided; I just want to study the problem). For the same total number of objects created, a single thread runs faster than one thread per processor (Parallel.For).
The basic unit of work consists of creating a list (actually an array) containing 20k small objects (here long[4], for simplicity):
private static void CreateList()
{
    long[][] list = new long[20000][];
    for (var i = 0; i < 20000; i++)
        list[i] = new long[4];
}
If I create 1000 lists in a single thread, it runs in 1.5 s. If I create 1000 lists with several threads (each responsible for a subset of the 1000 lists), it runs in 2 s.
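The behaviour is essentially the same when:
- using classic small objects instead of long[4]
- using a real List instead of an array
- using a different number of objects
Can you explain why? Is there a "lock" in the memory manager? Is it related to garbage collection?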
Code details:
public static void Main()
{
    Benchmark(1000, CreateList);
}

private static void Benchmark(int repeat, Action action)
{
    Console.WriteLine("Single thread");
    Benchmark(delegate ()
    {
        for (int i = 0; i < repeat; i++)
            action();
    });
    Console.WriteLine("Multi thread");
    Benchmark(delegate ()
    {
        Parallel.For(0, repeat, i => action());
    });
}

private static void Benchmark(Action action)
{
    for (int i = 0; i < 10; i++)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        action();
        sw.Stop();
        Console.WriteLine("Time : " + sw.Elapsed.TotalSeconds);
    }
}
Recommended answer
It is normal for the memory manager to use some kind of synchronization (a semaphore or similar), but multi-threaded applications that allocate a lot of memory perform very poorly with the default C# garbage collector. With the proper garbage collector settings, things are much better.
You should:
- enable server garbage collection
- (possibly) disable concurrent GC
Server GC allows a better degree of parallelism between threads because memory allocation is partially independent. In this kind of situation, performance can change radically on machines with several cores.
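To see how much the collector is involved on a given machine, one option is to count generation 0 collections around each run. The following is a minimal sketch, not part of the original answer: the names AllocationProbe and Gen0CollectionsDuring are hypothetical, and it assumes the same CreateList workload as in the question. GC.CollectionCount is process-wide, so the numbers are only indicative.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AllocationProbe
{
    // Same workload as CreateList in the question: 20,000 small arrays.
    public static void CreateList()
    {
        long[][] list = new long[20000][];
        for (int i = 0; i < list.Length; i++)
            list[i] = new long[4];
    }

    // Runs the action once, reports the elapsed time, and returns how many
    // generation 0 collections happened in the process while it ran.
    public static int Gen0CollectionsDuring(Action action, out TimeSpan elapsed)
    {
        int before = GC.CollectionCount(0);
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        elapsed = sw.Elapsed;
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        const int repeat = 1000;
        TimeSpan elapsed;

        int single = Gen0CollectionsDuring(() =>
        {
            for (int i = 0; i < repeat; i++) CreateList();
        }, out elapsed);
        Console.WriteLine("Single thread: " + elapsed.TotalSeconds + " s, gen0 collections: " + single);

        int multi = Gen0CollectionsDuring(() =>
            Parallel.For(0, repeat, i => CreateList()), out elapsed);
        Console.WriteLine("Multi thread:  " + elapsed.TotalSeconds + " s, gen0 collections: " + multi);
    }
}
```

Running this once with the default settings and once with server GC enabled gives a rough before/after comparison; the actual timings depend on the machine and runtime version.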
In short, add this to your project's configuration file (for a .NET Framework application, typically App.config, inside the <configuration> element):
<runtime>
  <gcServer enabled="true"/>
  <gcConcurrent enabled="false"/>
</runtime>
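It can be worth confirming that the runtime actually picked up these settings, since a misplaced config file is silently ignored. A small sketch using System.Runtime.GCSettings follows; the class name GcModeReport is hypothetical. As far as I know, LatencyMode reports Batch when concurrent GC is disabled and Interactive when it is enabled.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Builds a one-line summary of the GC settings in effect for this process.
    public static string Describe()
    {
        return "Server GC: " + GCSettings.IsServerGC
             + ", latency mode: " + GCSettings.LatencyMode
             + ", logical processors: " + Environment.ProcessorCount;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

If the summary still shows "Server GC: False" after the change, the configuration file is probably not the one the executable loads.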
You can find more information in Fundamentals of Garbage Collection.