Cost of context switch between threads of same process, on Linux

Question

Is there any good empirical data on the cost of context switching between threads of the same process on Linux (x86 and x86_64, mainly, are of interest)? I'm talking about the number of cycles or nanoseconds between the last instruction one thread executes in userspace before getting put to sleep voluntarily or involuntarily, and the first instruction a different thread of the same process executes after waking up on the same cpu/core.

I wrote a quick test program that constantly performs rdtsc in 2 threads assigned to the same cpu/core, stores the result in a volatile variable, and compares to its sister thread's corresponding volatile variable. The first time it detects a change in the sister thread's value, it prints the difference, then goes back to looping. I'm getting minimum/median counts of about 8900/9600 cycles this way on an Atom D510 cpu. Does this procedure seem reasonable, and do the numbers seem believable?
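
A minimal sketch of such a test (an illustrative reconstruction, not the exact program): it assumes GCC or Clang on x86-64 Linux, pins both threads to CPU 0 via pthread_setaffinity_np, and is built with -pthread. The printf is itself a syscall and perturbs the loop, so treat the numbers as rough.

    /* Two threads pinned to one core, each publishing rdtsc readings
     * through a volatile slot and watching the sibling's slot. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <x86intrin.h>                 /* __rdtsc() */

    static volatile unsigned long long stamp[2];

    static void pin_to_cpu0(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }

    static void *worker(void *arg)
    {
        int self = (int)(long)arg, other = 1 - self;
        pin_to_cpu0();                     /* both threads share CPU 0 */

        unsigned long long last_seen = 0;
        for (;;) {
            unsigned long long now = __rdtsc();
            stamp[self] = now;
            unsigned long long sib = stamp[other];
            if (sib != last_seen) {
                /* The sibling ran since our last look: the gap between
                 * its final store and our rdtsc spans the switch. */
                if (last_seen != 0)
                    printf("%llu cycles\n", now - sib);
                last_seen = sib;
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[2];
        pthread_create(&t[0], NULL, worker, (void *)0L);
        pthread_create(&t[1], NULL, worker, (void *)1L);
        pthread_join(t[0], NULL);          /* loops run until killed */
        return 0;
    }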

My goal is to estimate whether, on modern systems, a thread-per-connection server model could be competitive with or even outperform select-type multiplexing. This seems plausible in theory, as the transition from performing IO on fd X to fd Y involves merely going to sleep in one thread and waking up in another, rather than multiple syscalls, but it depends on the overhead of context switching.

Answer

(Disclaimer: This isn't a direct answer to the question, it's just some suggestions that I hope will be helpful).

Firstly, the numbers you're getting certainly sound like they're within the ballpark. Note, however, that the interrupt / trap latency can vary a lot among different CPU models implementing the same ISA. It's also a different story if your threads have used floating point or vector operations, because if they haven't the kernel avoids saving/restoring the floating point or vector unit state.

You should be able to get more accurate numbers by using the kernel tracing infrastructure - perf sched in particular is designed to measure and analyse scheduler latency.
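
For example (the subcommand names are stable, but exact options vary across perf versions):

    perf sched record -- sleep 10    # trace scheduler events for 10 seconds
    perf sched latency               # per-task wakeup-to-run latency summary
    perf bench sched pipe            # ready-made pipe ping-pong switch benchmark

The last one, perf bench sched pipe, bounces a token between two tasks over a pipe, which is close to the voluntary-switch scenario discussed below.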

If your goal is to model thread-per-connection servers, then you probably shouldn't be measuring involuntary context switch latency - usually in such a server, the majority of context switches will be voluntary, as a thread blocks in read() waiting for more data from the network. Therefore, a better testbed might involve measuring the latency from one thread blocking in a read() to another being woken up from the same.
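
For instance, here is a minimal sketch of such a testbed, substituting a pipe for the network socket (illustrative only; assumes Linux and pthreads, built with -pthread). Each hop blocks in read() and is woken by the sibling's write(), so the per-switch figure is an upper bound that still includes the read/write syscall overhead:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    #define ROUNDS 100000

    static int ab[2], ba[2];               /* pipes: main->echo, echo->main */

    static void pin_to_cpu0(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }

    static void *echo(void *arg)
    {
        (void)arg;
        pin_to_cpu0();
        char c;
        for (int i = 0; i < ROUNDS; i++) {
            read(ab[0], &c, 1);            /* block until pinged... */
            write(ba[1], &c, 1);           /* ...then wake the sibling */
        }
        return NULL;
    }

    int main(void)
    {
        pipe(ab);
        pipe(ba);
        pthread_t t;
        pthread_create(&t, NULL, echo, NULL);
        pin_to_cpu0();                     /* share CPU 0 with the echo thread */

        struct timespec t0, t1;
        char c = 'x';
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ROUNDS; i++) {
            write(ab[1], &c, 1);
            read(ba[0], &c, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        pthread_join(t, NULL);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        /* Each round trip contains two voluntary context switches. */
        printf("%.0f ns per switch (upper bound)\n", ns / ROUNDS / 2);
        return 0;
    }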

Note that in a well-written multiplexing server under heavy load, the transition from fd X to fd Y will often involve the same single system call (as the server iterates over a list of active file descriptors returned from a single epoll()). One thread also ought to have less cache footprint than multiple threads, simply through having only one stack. I suspect the only way to settle the matter (for some definition of "settle") might be to have a benchmark shootout...
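
A minimal sketch of the event-loop shape being described (handle_io is a hypothetical per-connection handler, not a real API):

    #include <sys/epoll.h>

    void event_loop(int epfd, void (*handle_io)(int fd))
    {
        struct epoll_event events[64];
        for (;;) {
            /* One syscall yields a whole batch of ready descriptors... */
            int n = epoll_wait(epfd, events, 64, -1);
            /* ...and moving from fd X to fd Y is just the next
             * iteration, with no further kernel entry. */
            for (int i = 0; i < n; i++)
                handle_io(events[i].data.fd);
        }
    }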
