What is the overhead of a context-switch?

Question

Originally I believed the overhead to a context-switch was the TLB being flushed. However I just saw on wikipedia:

http://en.wikipedia.org/wiki/Translation_lookaside_buffer

In 2008, both Intel (Nehalem)[18] and AMD (SVM)[19] have introduced tags as part of the TLB entry and dedicated hardware that checks the tag during lookup. Even though these are not fully exploited, it is envisioned that in the future, these tags will identify the address space to which every TLB entry belongs. Thus a context switch will not result in the flushing of the TLB – but just changing the tag of the current address space to the tag of the address space of the new task.

Does the above confirm for newer Intel CPUs the TLB doesn't get flushed on context switches?

Does this mean there is no real overhead now in a context-switch?

(I am trying to understand the performance penalty of a context-switch)

Answer

As Wikipedia notes in its Context switch article, "a context switch is the process of storing and restoring the state (context) of a process so that execution can be resumed from the same point at a later time." I'll assume a context switch between two processes of the same OS, not a user/kernel mode transition (syscall), which is much faster and needs no TLB flush.

So the OS kernel needs a significant amount of time to save the execution state of the currently running process to memory (all registers, really all of them, plus many special control structures), and then to load the execution state of the other process (reading it back in from memory). A TLB flush, if needed, adds some time to the switch, but it is only a small part of the total overhead.
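You can observe how often the kernel switches your own process in and out using `getrusage`, which reports voluntary context switches (the process blocked, e.g. on I/O or sleep) and involuntary ones (the scheduler preempted it). A minimal sketch, Unix-only since it relies on the `resource` module:

```python
import resource
import time

def switch_counts():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_nvcsw, ru.ru_nivcsw

before = switch_counts()
time.sleep(0.1)  # sleeping blocks the process: at least one voluntary switch
after = switch_counts()

print("voluntary:", after[0] - before[0],
      "involuntary:", after[1] - before[1])
```

This only counts switches, not their cost, but it is a quick way to see how switch-heavy a workload is.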

If you want to measure context-switch latency, there is the lmbench benchmark tool http://www.bitmover.com/lmbench/ with its lat_ctx test http://www.bitmover.com/lmbench/lat_ctx.8.html

I couldn't find results for Nehalem (is lmbench included in the Phoronix Test Suite?), but for a Core 2 running a modern Linux kernel a context switch may cost 5-7 microseconds.

There are also results from a lower-quality test, http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html, which reports 1-3 microseconds per context switch. The exact effect of not flushing the TLB can't be isolated from those results.
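The basic technique behind such measurements is a pipe "ping-pong": two processes alternately block reading from pipes, so every round trip forces the scheduler to switch between them. A rough sketch of the idea (not the blog post's actual code, and an upper bound since pipe syscall overhead is included); Unix-only because of `os.fork`:

```python
import os
import time

def measure_pingpong(rounds=10_000):
    """Estimate context-switch latency by bouncing a byte between two processes."""
    r1, w1 = os.pipe()  # parent -> child
    r2, w2 = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back, then exit.
        os.close(w1); os.close(r2)
        for _ in range(rounds):
            os.read(r1, 1)
            os.write(w2, b"x")
        os._exit(0)
    os.close(r1); os.close(w2)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(w1, b"x")
        os.read(r2, 1)  # blocks until the child has run: a context switch
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    # Each round trip involves two context switches (parent->child->parent).
    return elapsed / (2 * rounds)

if __name__ == "__main__":
    print(f"~{measure_pingpong() * 1e6:.1f} us per switch (upper bound)")
```

On a multi-core machine the two processes may run on different cores, so for a strict single-core measurement you would additionally pin both to one CPU (e.g. with `taskset` or `os.sched_setaffinity`), as the lmbench and blog measurements effectively do.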

UPDATE: Your question really ought to be about virtualization, not about process context switches.

RWT says in its article about Nehalem, "Inside Nehalem: Intel’s Future Processor and System. TLBs, Page Tables and Synchronization" (April 2, 2008, by David Kanter), that Nehalem added a VPID to the TLB to make virtual machine/host switches (vmentry/vmexit) faster:

Nehalem’s TLB entries have also changed subtly by introducing a "Virtual Processor ID" or VPID. Every TLB entry caches a virtual to physical address translation ... that translation is specific to a given process and virtual machine. Intel’s older CPUs would flush the TLBs whenever the processor switched between the virtualized guest and the host instance, to ensure that processes only accessed memory they were allowed to touch. The VPID tracks which VM a given translation entry in the TLB is associated with, so that when a VM exit and re-entry occurs, the TLBs do not have to be flushed for safety. .... The VPID is helpful for virtualization performance by lowering the overhead of VM transitions; Intel estimates that the latency of a round trip VM transition in Nehalem is 40% compared to Merom (i.e. the 65nm Core 2) and about a third lower than the 45nm Penryn.

Also, you should know that in the fragment you cited in the question, the "[18]" link points to "G. Neiger, A. Santoni, F. Leung, D. Rodgers, and R. Uhlig. Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization. Intel Technology Journal, 10(3).", so this is a feature for efficient virtualization (fast guest-host switches).
