Multi-threaded C program much slower in OS X than Linux


Question

I wrote this for an OS class assignment that I have already completed and handed in. I posted this question yesterday, but because of "Academic Honesty" regulations I took it down until after the submission deadline.

The goal was to learn how to use critical sections. There is a data array holding 100 monotonically increasing numbers, 0...99, and 40 threads that each randomly swap two elements 2,000,000 times. Once a second a Checker goes through the array and makes sure that each number appears exactly once (which means that no unsynchronized parallel access happened).

Here are the Linux times:

real    0m5.102s
user    0m5.087s
sys     0m0.000s

And the OS X times:

real    6m54.139s
user    0m41.873s
sys     6m43.792s

I run a Vagrant box with ubuntu/trusty64 on the same machine that runs OS X. It is a 2012 rMBP with a quad-core i7 at 2.3 GHz (up to 3.2 GHz).

If I understand correctly, sys is system overhead (time spent in the kernel on the program's behalf), which I have no control over, and even then, 41s of user time suggests that perhaps the threads are running serially.
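
To see where the time goes from inside the program rather than from the `time` command, the same user/sys split can be read with `getrusage`. The following is only a sketch: `cpuSeconds` and `printCpuSeconds` are hypothetical helper names, not part of the original code. The figures are CPU time summed over all threads of the process.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: stores the CPU seconds this process has used so far
   in *user and *sys. Returns 0 on success, -1 if getrusage fails. */
int cpuSeconds(double *user, double *sys)
{
    assert(user != NULL && sys != NULL);

    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    /* ru_utime: time spent running the program's own code (user mode).
       ru_stime: time the kernel spent working on the program's behalf. */
    *user = (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
    *sys  = (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return 0;
}

/* Hypothetical helper: prints both figures with a caller-supplied label. */
void printCpuSeconds(const char *label)
{
    double user, sys;
    if (cpuSeconds(&user, &sys) == 0)
        printf("%s: user %.3fs, sys %.3fs\n", label, user, sys);
}
```

Calling `printCpuSeconds` before the threads are created and again after they are joined would show how much of the run is spent in the kernel.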

I can post all the code if needed, but I will post the bits I think are relevant. I am using pthreads since that is what Linux provides, and I assumed they work on OS X as well.

Swapper threads are created to run the swapManyTimes routine:

for (int i = 0; i < NUM_THREADS; i++) {
    int err = pthread_create(&(threads[i]), NULL, swapManyTimes, NULL);
}

Swapper thread critical section, run in a for loop 2 million times:

pthread_mutex_lock(&mutex);    // begin critical section
int tmpFirst = data[first];
data[first] = data[second];
data[second] = tmpFirst;
pthread_mutex_unlock(&mutex);  // end critical section
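
For context, here is a minimal sketch of how the loop around that critical section might look. The question does not show how `first` and `second` are chosen, so the index selection here is an assumption: `nextIndex` is a hypothetical helper built on a small per-thread linear congruential generator, and `runSwappers` is a hypothetical driver that passes each thread a seed instead of the NULL argument used in the original.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define DATA_SIZE 100      /* from the question: values 0..99 */
#define NUM_SWAPS 2000000  /* from the question: swaps per thread */

static int data[DATA_SIZE];
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* ASSUMPTION: a tiny per-thread pseudo-random generator, so that threads
   share no random-number state. The original's method is not shown. */
int nextIndex(unsigned int *seed)
{
    *seed = *seed * 1103515245u + 12345u;
    return (int)((*seed >> 16) % DATA_SIZE);
}

/* Swapper thread routine. ASSUMPTION: arg carries a small integer seed. */
void *swapManyTimes(void *arg)
{
    unsigned int seed = (unsigned int)(uintptr_t)arg;

    for (long n = 0; n < NUM_SWAPS; n++) {
        /* Pick the indices before locking so that the critical section
           contains only the swap itself. */
        int first = nextIndex(&seed);
        int second = nextIndex(&seed);

        pthread_mutex_lock(&mutex);    /* begin critical section */
        int tmpFirst = data[first];
        data[first] = data[second];
        data[second] = tmpFirst;
        pthread_mutex_unlock(&mutex);  /* end critical section */
    }
    return NULL;
}

/* Hypothetical driver: fills data with 0..DATA_SIZE-1, starts numThreads
   swappers and waits for all of them to finish. */
void runSwappers(int numThreads)
{
    pthread_t *threads = malloc((size_t)numThreads * sizeof *threads);
    assert(threads != NULL);

    for (int i = 0; i < DATA_SIZE; i++)
        data[i] = i;

    for (int i = 0; i < numThreads; i++) {
        int err = pthread_create(&threads[i], NULL, swapManyTimes,
                                 (void *)(uintptr_t)(i + 1));
        assert(err == 0);  /* pthread_create returns 0 on success */
        (void)err;
    }
    for (int i = 0; i < numThreads; i++)
        pthread_join(threads[i], NULL);

    free(threads);
}
```

Calling `runSwappers(40)` would correspond to the 40 threads described in the question. Every swap goes through the same mutex, so all threads contend for one lock.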

Only one Checker thread is created, in the same way as the Swappers. It goes over the data array and marks the index corresponding to each value as true. Afterwards, it checks how many indices are still unmarked, like this:

pthread_mutex_lock(&mutex);
for (int i = 0; i < DATA_SIZE; i++) {
    int value = data[i];
    consistency[value] = 1;
}
pthread_mutex_unlock(&mutex); 

It runs once a second, calling sleep(1) at the end of each pass through its while(1) loop. After all Swapper threads are joined, this thread is cancelled and joined as well.
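
The Checker described above could be sketched roughly as follows. The names `countMissingValues` and `checkForever` are hypothetical, and the counting step after the marking loop is reconstructed from the description, not taken from the original code. With the default deferred cancellation, the thread can only be cancelled at a cancellation point such as `sleep`, so in this sketch it is never cancelled while holding the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DATA_SIZE 100  /* from the question: values 0..99 */

static int data[DATA_SIZE];
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns how many of the values 0..DATA_SIZE-1 are missing from data.
   A result of 0 means every value still appears exactly once. */
int countMissingValues(void)
{
    int consistency[DATA_SIZE];
    memset(consistency, 0, sizeof consistency);

    pthread_mutex_lock(&mutex);
    for (int i = 0; i < DATA_SIZE; i++) {
        int value = data[i];
        assert(value >= 0 && value < DATA_SIZE);
        consistency[value] = 1;  /* mark this value as seen */
    }
    pthread_mutex_unlock(&mutex);

    int missing = 0;
    for (int i = 0; i < DATA_SIZE; i++) {
        if (!consistency[i])
            missing++;
    }
    return missing;
}

/* Checker thread routine: checks once a second until it is cancelled. */
void *checkForever(void *arg)
{
    (void)arg;
    while (1) {
        int missing = countMissingValues();
        if (missing > 0)
            fprintf(stderr, "inconsistent: %d values missing\n", missing);
        sleep(1);  /* cancellation point: pthread_cancel takes effect here */
    }
    return NULL;
}
```

The main thread would call `pthread_cancel` followed by `pthread_join` on the Checker once all Swappers have been joined, as described above.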

I would be happy to provide any more information that can help figure out why this sucks so much on Mac. I'm not really looking for help with code optimization, unless that is what's tripping up OS X. I've tried building it with both clang and gcc-4.9 on OS X.

Recommended answer

OS X and Linux implement pthreads differently, which causes this slow behavior. Specifically, OS X does not use spinlocks (pthreads is defined by POSIX rather than ISO C, and POSIX has treated spinlock support as optional). This can lead to very, very slow performance in code like this example, where many threads contend for a single mutex.
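
One way to experiment with the spinning idea using only portable pthread calls is to try the mutex a bounded number of times with `pthread_mutex_trylock` before falling back to a blocking `pthread_mutex_lock`. This is a sketch under assumptions, not a fix confirmed by the answer: `lockWithSpin`, `incrementMany`, `runDemo`, `SPIN_LIMIT` and `MAX_WORKERS` are hypothetical names and values, the counter stands in for the swap, and whether this helps on OS X would have to be measured.

```c
#include <assert.h>
#include <pthread.h>

#define SPIN_LIMIT 1000  /* hypothetical tuning value, not from the original */
#define MAX_WORKERS 64   /* hypothetical cap for this demo */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

/* Try the lock up to SPIN_LIMIT times without blocking, then fall back to
   a normal blocking lock so that the thread cannot spin forever. */
void lockWithSpin(pthread_mutex_t *m)
{
    for (int i = 0; i < SPIN_LIMIT; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return;                   /* acquired without sleeping */
    }
    int err = pthread_mutex_lock(m);  /* still busy: block as usual */
    assert(err == 0);
    (void)err;
}

/* Demo worker: arg points to the number of increments to perform. */
void *incrementMany(void *arg)
{
    long iterations = *(const long *)arg;
    for (long n = 0; n < iterations; n++) {
        lockWithSpin(&mutex);
        counter++;                    /* stand-in for the swap */
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

/* Runs numThreads workers and returns the final counter value. */
long runDemo(int numThreads, long iterations)
{
    pthread_t threads[MAX_WORKERS];
    assert(numThreads > 0 && numThreads <= MAX_WORKERS);

    counter = 0;
    for (int i = 0; i < numThreads; i++) {
        int err = pthread_create(&threads[i], NULL, incrementMany, &iterations);
        assert(err == 0);
        (void)err;
    }
    for (int i = 0; i < numThreads; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

The fallback to `pthread_mutex_lock` keeps the behavior correct even when spinning never succeeds. The spin phase only changes how long a thread waits before it asks the kernel to put it to sleep.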
