Worse multithreaded performance on better system (possibly due to Deedle)

Question

We are dealing with a multithreaded C# service that uses Deedle. Tests on our current quad-core system versus the octa-core target system show that the service is about two times slower on the target system instead of two times faster. Even when the number of threads is restricted to two, the target system is still almost 40% slower.

Analysis shows a lot of waiting in Deedle (and F#) code, which makes the target system effectively run on two cores. Non-Deedle test programs show normal behaviour and superior memory bandwidth on the target system.

Any ideas on what could cause this, or how best to approach this situation?

It seems most of the waiting time is spent in calls to Invoke.

Recommended answer

The problem turned out to be caused by a combination of Windows 7, .NET 4.5 (or actually the 4.0 runtime), and the heavy use of tail recursion in F#/Deedle.

Using Visual Studio's Concurrency Visualizer, I had already found that most of the time is spent waiting in Invoke calls. On closer inspection, these calls lead to the following call trace:

ntdll.dll:RtlEnterCriticalSection
ntdll.dll:RtlpLookupDynamicFunctionEntry
ntdll.dll:RtlLookupFunctionEntry
clr.dll:JIT_TailCall
<some Deedle/F# thing>.Invoke

Searching for these functions turned up multiple articles and forum threads indicating that using F# can result in a lot of calls to JIT_TailCall, and that .NET 4.6 has a new JIT compiler that seems to deal with some issues relating to these calls. Although I didn't find anything mentioning problems relating to locking/synchronisation, this did give me the idea that updating to .NET 4.6 might be a solution.
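To make the connection to Invoke concrete, here is a minimal F# sketch of the kind of code that produces calls in tail position. The function names (isEven, isOdd, sumCps) are hypothetical and not taken from Deedle. With tail calls enabled (the usual setting for optimized F# builds), the compiler can emit explicit tail calls for code like this; whether a given call is actually routed through the runtime's JIT_TailCall helper depends on the JIT and the call signature, so treat this as an illustration rather than a reproduction of the issue.

```fsharp
// Mutually recursive functions: each recursive call is in tail position,
// so the F# compiler may emit it as an explicit tail call.
let rec isEven (n: int) : bool =
    if n = 0 then true else isOdd (n - 1)
and isOdd (n: int) : bool =
    if n = 0 then false else isEven (n - 1)

// Continuation-passing style: 'k' is a closure (an FSharpFunc object),
// and applying it in tail position is a tail call to its Invoke method.
let rec sumCps (xs: int list) (k: int -> int) : int =
    match xs with
    | [] -> k 0
    | x :: rest -> sumCps rest (fun acc -> k (acc + x))

printfn "%b" (isEven 10000)          // prints "true"
printfn "%d" (sumCps [1 .. 100] id)  // prints "5050"
```

F# library code written in this functional style tends to contain many such calls, which would explain why the Invoke frames dominate the trace above.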

However, on my own Windows 8.1 system, which also uses .NET 4.5, the problem doesn't occur. After searching a bit for similar Invoke calls, I found that the call trace on this system looks as follows:

ntdll.dll:RtlAcquireSRWLockShared
ntdll.dll:RtlpLookupDynamicFunctionEntry
ntdll.dll:RtlLookupFunctionEntry
clr.dll:JIT_TailCall
<some Deedle/F# thing>.Invoke

Apparently, in Windows 8(.1) the locking mechanism was changed to something less strict: the trace shows a shared lock being acquired (RtlAcquireSRWLockShared) where Windows 7 enters a critical section (RtlEnterCriticalSection). This results in far less waiting for the lock.
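The difference between an exclusive lock and a shared lock can be sketched with .NET's own primitives. This is only an analogy: it uses Monitor and ReaderWriterLockSlim, not the ntdll functions from the traces, and the helper name tryFromOtherThread is hypothetical. It shows that a second thread cannot enter an exclusive lock that is already held, while it can take a read lock that another thread already holds.

```fsharp
open System.Threading

// Runs 'attempt' on a separate thread and returns its result.
let tryFromOtherThread (attempt: unit -> bool) : bool =
    let result = ref false
    let t = Thread(ThreadStart(fun () -> result.Value <- attempt ()))
    t.Start()
    t.Join()
    result.Value

let gate = obj ()
let rw = new ReaderWriterLockSlim()

// Exclusive lock: while the main thread holds it, another thread is refused.
Monitor.Enter gate
let exclusiveSecondEntry =
    tryFromOtherThread (fun () ->
        let got = Monitor.TryEnter(gate, 0)
        if got then Monitor.Exit gate
        got)
Monitor.Exit gate

// Shared (read) lock: another thread can enter while the main thread holds it.
rw.EnterReadLock()
let sharedSecondEntry =
    tryFromOtherThread (fun () ->
        let got = rw.TryEnterReadLock(0)
        if got then rw.ExitReadLock()
        got)
rw.ExitReadLock()

printfn "exclusive: %b, shared: %b" exclusiveSecondEntry sharedSecondEntry
// prints "exclusive: false, shared: true"
```

If many threads only need to read the same structure, as a function-entry lookup would, a shared lock lets them proceed in parallel instead of queueing behind each other.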

So F#'s heavy use of tail recursion only caused problems on the target system, which combines Windows 7's strict locking with .NET 4.5's less efficient JIT compiler. After updating to .NET 4.6, the problem disappeared and our service is running as expected.
