How to use the APIC to create IPIs to wake the APs for SMP in x86 assembly?

Question

In a post-boot environment (no OS), how would one use the BSP (first core/processor) to create IPIs for the APs (all other cores/processors)? Essentially, how does one wake the other cores and set their instruction pointers when starting from one?

Recommended answer

WARNING: I've assumed 80x86 here. If it's not 80x86 then I don't know :-)

First you need to find out how many other CPUs exist and what their APIC IDs are, and determine the physical address of the local APICs. To do this you parse ACPI tables (see MADT/APIC in the ACPI specification). If you can't find valid ACPI tables (e.g. computer is too old) there's an older "MultiProcessor Specification" that defines its own tables with the same information in it. Note that the "MultiProcessor Specification" is deprecated now (and there are some computers with dummy MultiProcessor tables) which is why you need to check the ACPI tables first.
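As a rough sketch of the table walk, assuming you have already located the MADT (signature "APIC") and can read it as a byte buffer: the function below collects the APIC IDs of enabled CPUs from the "Processor Local APIC" entries (type 0) and reads the 32-bit local APIC address from the table header. The names (madt_collect_apic_ids, read_le32) are made up for illustration. It does not verify the checksum, and it ignores the x2APIC entries and the 64-bit address override entry that a complete parser would also need to handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_HEADER_LEN    44u /* 36-byte table header + 4-byte local APIC address + 4-byte flags */
#define MADT_TYPE_LAPIC    0u  /* "Processor Local APIC" entry */
#define LAPIC_FLAG_ENABLED 1u

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk a MADT image and collect the APIC IDs of enabled CPUs.
 * Returns the number of IDs stored in ids[] (at most max_ids), or -1 if the
 * table looks malformed. *lapic_addr receives the 32-bit local APIC address.
 */
static int madt_collect_apic_ids(const uint8_t *madt, size_t len,
                                 uint32_t *lapic_addr,
                                 uint8_t *ids, int max_ids)
{
    if (len < MADT_HEADER_LEN || memcmp(madt, "APIC", 4) != 0)
        return -1;

    uint32_t table_len = read_le32(madt + 4);
    if (table_len < MADT_HEADER_LEN || table_len > len)
        return -1;

    *lapic_addr = read_le32(madt + 36);

    int count = 0;
    size_t off = MADT_HEADER_LEN;
    while (off + 2 <= table_len) {
        uint8_t type = madt[off];
        uint8_t elen = madt[off + 1];
        if (elen < 2 || off + elen > table_len)
            return -1;                       /* entry runs past the table */
        if (type == MADT_TYPE_LAPIC && elen >= 8) {
            uint8_t apic_id = madt[off + 3];
            uint32_t flags = read_le32(madt + off + 4);
            if ((flags & LAPIC_FLAG_ENABLED) && count < max_ids)
                ids[count++] = apic_id;
        }
        off += elen;                         /* skip entry types we do not understand */
    }
    return count;
}
```

Skipping disabled entries here matters later: those are CPUs the firmware decided should not be started.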

The next step is to determine what type of local APIC you have. There are 3 cases - old external "82489DX" local APICs (not built into the CPU itself), xAPIC and x2APIC.

Start by checking CPUID to determine if the local APIC is x2APIC. If it is, you have 2 choices - you can use x2APIC, or you can use "xAPIC compatibility mode". For "xAPIC compatibility mode" you can only use 8-bit APIC IDs and won't be able to support computers with lots of CPUs (e.g. 255 or more CPUs). I'd recommend using x2APIC (even if you don't care about computers with lots of CPUs) as it's faster. If you do use x2APIC mode then you'll need to switch the local APIC into this mode.
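A minimal sketch of the check and the mode switch, written as pure functions so the bit manipulation is visible. It assumes the usual layout: x2APIC support is reported in CPUID leaf 1, ECX bit 21, and the mode is enabled through bits 10 and 11 of the IA32_APIC_BASE MSR (0x1B). The function names are hypothetical, and actually executing CPUID, RDMSR and WRMSR (the latter two need ring 0) is left to your own wrappers.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_X2APIC   (1u << 21)    /* CPUID leaf 1, ECX bit 21 */
#define IA32_APIC_BASE_MSR  0x1Bu
#define APIC_BASE_EXTD      (1ull << 10)  /* x2APIC mode enable */
#define APIC_BASE_EN        (1ull << 11)  /* APIC global enable */

/* True if the ECX value returned by CPUID leaf 1 reports x2APIC support. */
static bool cpu_has_x2apic(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_X2APIC) != 0;
}

/*
 * Given the current IA32_APIC_BASE value, return the value to write back
 * in order to switch the local APIC into x2APIC mode.
 */
static uint64_t apic_base_with_x2apic(uint64_t apic_base_msr)
{
    return apic_base_msr | APIC_BASE_EN | APIC_BASE_EXTD;
}
```

On real hardware this would be used roughly as: read MSR 0x1B, pass the value through apic_base_with_x2apic(), and write the result back.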

Otherwise, if it's not x2APIC, read the local APIC's version register. If the local APIC's version is 0x10 or higher then it's xAPIC, and if it's 0x0F or lower then it's an external "82489DX" local APIC.
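The version test is small enough to show directly. This sketch assumes you have already read the 32-bit version register (offset 0x30 in the memory-mapped local APIC page) and that the version number sits in its low 8 bits; the enum and function names are invented for the example.

```c
#include <stdint.h>

enum lapic_kind { LAPIC_82489DX, LAPIC_XAPIC };

/* Classify the local APIC from the raw value of its version register. */
static enum lapic_kind classify_lapic(uint32_t version_reg)
{
    uint8_t version = (uint8_t)(version_reg & 0xFFu);  /* low byte holds the version */
    return version >= 0x10 ? LAPIC_XAPIC : LAPIC_82489DX;
}
```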

The old external "82489DX" local APICs were used in 80486 and older computers, and these are extremely rare (they were already very rare 20 years ago, and most have died and/or been replaced and thrown away since). Because a different sequence is used to start other CPUs, and because computers that have these local APICs are extremely rare (e.g. you will probably never be able to test your code), it makes a lot of sense to not bother supporting these computers. If you support these old computers at all, I'd recommend treating them as "single-CPU only" and simply not starting any other CPUs if the local APIC is "82489DX". For this reason I won't describe the method used to start them here (it is described in Intel's "MultiProcessor Specification" if you're curious).

For xAPIC and x2APIC, the sequence for starting another CPU is essentially the same (just different ways of accessing the local APIC - MSRs or memory mapped). I'd recommend using (e.g.) function pointers to hide these differences, so that later code can call a "send IPI" function via the function pointer without caring if the local APIC is x2APIC or xAPIC.
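One way the function-pointer idea could look, as a sketch only. It assumes the usual register layout: in xAPIC mode the interrupt command register is two 32-bit registers at offsets 0x300 (low) and 0x310 (high, destination in bits 24-31), and in x2APIC mode it is a single 64-bit MSR (0x830) with the destination in the upper 32 bits. xapic_regs and wrmsr_hook are hypothetical hooks your own code would set up, and the xAPIC version omits waiting for the delivery status bit to clear.

```c
#include <stdint.h>

#define XAPIC_ICR_LOW   0x300u  /* byte offsets in the memory-mapped register page */
#define XAPIC_ICR_HIGH  0x310u
#define X2APIC_ICR_MSR  0x830u

typedef void (*send_ipi_fn)(uint32_t apic_id, uint32_t icr_low);

/* Hypothetical hooks provided by the surrounding code. */
static volatile uint32_t *xapic_regs;                   /* mapped local APIC page */
static void (*wrmsr_hook)(uint32_t msr, uint64_t value); /* wraps the WRMSR instruction */

/* xAPIC: destination goes in the high register, writing the low one sends the IPI. */
static void xapic_send_ipi(uint32_t apic_id, uint32_t icr_low)
{
    xapic_regs[XAPIC_ICR_HIGH / 4] = (apic_id & 0xFFu) << 24;
    xapic_regs[XAPIC_ICR_LOW / 4]  = icr_low;
}

/* x2APIC: the whole command is one 64-bit value. */
static uint64_t x2apic_icr_value(uint32_t apic_id, uint32_t icr_low)
{
    return ((uint64_t)apic_id << 32) | icr_low;
}

static void x2apic_send_ipi(uint32_t apic_id, uint32_t icr_low)
{
    wrmsr_hook(X2APIC_ICR_MSR, x2apic_icr_value(apic_id, icr_low));
}

/* Pick the implementation once; later code only ever calls through the pointer. */
static send_ipi_fn select_send_ipi(int use_x2apic)
{
    return use_x2apic ? x2apic_send_ipi : xapic_send_ipi;
}
```

Later code would then do something like send_ipi = select_send_ipi(mode) once during initialisation and call send_ipi(apic_id, value) everywhere else.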

To actually start another CPU you need to send a sequence of IPIs (Inter Processor Interrupts) to it. Intel's method goes like this:

Send an INIT IPI to the CPU you're starting
Wait for 10 ms
Send a STARTUP IPI to the CPU you're starting
Wait for 200 us
Send another STARTUP IPI to the CPU you're starting
Wait for 200 us
Wait for started CPU to set a flag (so you know it started)
    If flag was set by other CPU, other CPU was started successfully
    Else if time-out, other CPU failed to start
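The IPI and delay part of that sequence can be sketched as a small table of steps plus a runner. The ICR values assume the common encoding (0x4500 for INIT with level assert, 0x4600 plus the vector for STARTUP); the step structure and the send_ipi/delay_us hooks are hypothetical names, not part of any real API. Waiting for the started CPU's flag would follow after the last step.

```c
#include <stddef.h>
#include <stdint.h>

#define ICR_INIT     0x00004500u  /* delivery mode INIT, level assert */
#define ICR_STARTUP  0x00004600u  /* delivery mode STARTUP; low 8 bits hold the vector */

enum step_kind { STEP_SEND_IPI, STEP_DELAY_US };

struct step {
    enum step_kind kind;
    uint32_t arg;  /* ICR low word for STEP_SEND_IPI, microseconds for STEP_DELAY_US */
};

/* Fill out[] with the INIT, STARTUP, STARTUP sequence; returns the number of steps. */
static size_t intel_startup_steps(uint8_t vector, struct step out[6])
{
    const struct step steps[6] = {
        { STEP_SEND_IPI, ICR_INIT },
        { STEP_DELAY_US, 10000 },                 /* 10 ms */
        { STEP_SEND_IPI, ICR_STARTUP | vector },
        { STEP_DELAY_US, 200 },
        { STEP_SEND_IPI, ICR_STARTUP | vector },
        { STEP_DELAY_US, 200 },
    };
    for (size_t i = 0; i < 6; i++)
        out[i] = steps[i];
    return 6;
}

/* Run the steps through caller-supplied hooks. */
static void run_steps(const struct step *s, size_t n, uint32_t apic_id,
                      void (*send_ipi)(uint32_t apic_id, uint32_t icr_low),
                      void (*delay_us)(uint32_t us))
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].kind == STEP_SEND_IPI)
            send_ipi(apic_id, s[i].arg);
        else
            delay_us(s[i].arg);
    }
}
```

Keeping the sequence as data makes it easy to check on a host machine before it ever touches a local APIC.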

There are 2 problems with Intel's method. Often the other CPU will be started by the first STARTUP IPI, and in some cases this can lead to problems (e.g. if the other CPU's startup code does something like total_CPUs++; then each CPU might execute it twice). To avoid this problem you can add extra synchronisation (e.g. the other CPU waits for an "I know you started" flag to be set by the first CPU before it continues). The second problem with Intel's method is measuring those delays. Typically an OS starts the other CPUs, then figures out what features the CPUs support and what hardware is present afterwards, and doesn't have precise timers set up to measure those 200 us delays accurately.
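To make the extra synchronisation concrete, here is one possible shape for the handshake using C11 atomics. All of the names (ap_started, ap_may_go, total_cpus and the four functions) are assumptions for this sketch. The point is that anything that must happen exactly once, such as the counter increment, is placed after the wait, so re-running the early part of the startup code does no harm.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared variables, visible to both the starting CPU and the AP. */
static atomic_bool ap_started;      /* set by the AP: "I started" */
static atomic_bool ap_may_go;       /* set by the starting CPU: "I know you started" */
static atomic_int  total_cpus = 1;  /* the first CPU counts as one */

/* First thing the AP does; harmless if the startup code runs a second time. */
static void ap_announce(void)
{
    atomic_store(&ap_started, true);
}

/* The AP spins here until the starting CPU acknowledges it. */
static void ap_wait_for_go(void)
{
    while (!atomic_load(&ap_may_go))
        ;  /* a real AP would execute PAUSE inside this loop */
}

/* Runs only after the handshake, so it executes once per AP. */
static void ap_finish_startup(void)
{
    atomic_fetch_add(&total_cpus, 1);
}

/* Starting-CPU side: acknowledge the AP if it has announced itself. */
static bool bsp_acknowledge(void)
{
    if (!atomic_load(&ap_started))
        return false;
    atomic_store(&ap_may_go, true);
    return true;
}
```

In a real trampoline the AP side would be a few lines of assembly doing the same loads and stores on variables at known addresses.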

To avoid those problems, I use an alternative method that goes like this:

Send an INIT IPI to the CPU you're starting
Wait for 10 ms
Send a STARTUP IPI to the CPU you're starting
Wait for started CPU to set a flag (so you know it started) with a short timeout (e.g. 1 ms)
    If flag was set by other CPU, other CPU was started successfully
    Else if time-out
        Send another STARTUP IPI to the CPU you're starting
        Wait for started CPU to set a flag with a long timeout (e.g. 200 ms)
            If flag was set by other CPU, other CPU was started successfully
            Else if time-out, other CPU failed to start
If CPU started successfully
    Set flag to tell other CPU it can continue
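The alternative method translates fairly directly into code. The sketch below assumes a set of caller-supplied hooks (struct ap_ops, a name invented here) for sending IPIs, delaying, and reading and writing the two flags, and it polls the flag in 100 us slices, which is an arbitrary choice. The struct sim part is only a host-side stand-in for hardware so the logic can be exercised without a real AP.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ICR_INIT     0x00004500u
#define ICR_STARTUP  0x00004600u

/* Hooks supplied by the caller; all names here are hypothetical. */
struct ap_ops {
    void (*send_ipi)(void *ctx, uint32_t apic_id, uint32_t icr_low);
    void (*delay_us)(void *ctx, uint32_t us);
    bool (*ap_started)(void *ctx);  /* reads the "I started" flag */
    void (*release_ap)(void *ctx);  /* sets the "you may continue" flag */
    void *ctx;
};

/* Poll the started flag in 100 us slices until it is set or the timeout expires. */
static bool wait_started(const struct ap_ops *ops, uint32_t timeout_us)
{
    for (uint32_t waited = 0; waited < timeout_us; waited += 100) {
        if (ops->ap_started(ops->ctx))
            return true;
        ops->delay_us(ops->ctx, 100);
    }
    return ops->ap_started(ops->ctx);
}

/* Returns true if the AP started and was told to continue. */
static bool start_ap(const struct ap_ops *ops, uint32_t apic_id, uint8_t vector)
{
    ops->send_ipi(ops->ctx, apic_id, ICR_INIT);
    ops->delay_us(ops->ctx, 10000);                      /* 10 ms */
    ops->send_ipi(ops->ctx, apic_id, ICR_STARTUP | vector);
    if (!wait_started(ops, 1000)) {                      /* short timeout: 1 ms */
        ops->send_ipi(ops->ctx, apic_id, ICR_STARTUP | vector);
        if (!wait_started(ops, 200000))                  /* long timeout: 200 ms */
            return false;
    }
    ops->release_ap(ops->ctx);
    return true;
}

/* Host-side stand-in: the fake AP "starts" after a given number of STARTUP IPIs. */
struct sim {
    int  sipis_needed;  /* 0 means the fake AP never starts */
    int  sipis_sent;
    bool started;
    bool released;
};

static void sim_send_ipi(void *ctx, uint32_t apic_id, uint32_t icr_low)
{
    struct sim *s = ctx;
    (void)apic_id;
    /* Bits 8-10 are the delivery mode; 110b means STARTUP. */
    if ((icr_low & 0x700u) == 0x600u && ++s->sipis_sent == s->sipis_needed)
        s->started = true;
}
static void sim_delay_us(void *ctx, uint32_t us) { (void)ctx; (void)us; }
static bool sim_ap_started(void *ctx) { return ((struct sim *)ctx)->started; }
static void sim_release_ap(void *ctx) { ((struct sim *)ctx)->released = true; }

static const struct ap_ops sim_ops_template = {
    sim_send_ipi, sim_delay_us, sim_ap_started, sim_release_ap, NULL
};
```

Note that only the 10 ms delay after INIT needs to be reasonably accurate here; the two timeouts only need to be "long enough", which is what makes this variant easier to live with before precise timers are available.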

Also note that you need to start CPUs individually. I've seen people start all CPUs at the same time using the "broadcast IPI to all but self" feature - this is wrong and broken and dodgy (don't do it unless you're writing firmware). The problem with this is that some CPUs may be faulty (e.g. failed their BIST/built-in self test) and some CPUs may be disabled (e.g. hyper-threading when hyper-threading is disabled in firmware); and the "broadcast IPI to all but self" method can start CPUs that should never have been started.
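In code, "start CPUs individually" comes down to building your own list of targets from the firmware tables and never setting the destination shorthand bits in the ICR. A small sketch, assuming a hypothetical cpu_info array filled in from ACPI and the usual position of the shorthand field (bits 18-19, where 00 means no shorthand):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cpu_info {
    uint32_t apic_id;
    bool     enabled;  /* as reported by the firmware tables */
};

#define ICR_SHORTHAND_MASK (3u << 18)  /* 00 = no shorthand, 11 = all excluding self */

/* Copy the APIC IDs that should be started into out[]; returns how many. */
static size_t cpus_to_start(const struct cpu_info *cpus, size_t n,
                            uint32_t own_apic_id,
                            uint32_t *out, size_t max_out)
{
    size_t count = 0;
    for (size_t i = 0; i < n && count < max_out; i++) {
        if (!cpus[i].enabled || cpus[i].apic_id == own_apic_id)
            continue;  /* skip disabled CPUs and the CPU running this code */
        out[count++] = cpus[i].apic_id;
    }
    return count;
}

/* True if an ICR value targets one CPU rather than using a broadcast shorthand. */
static bool icr_is_unicast(uint32_t icr_low)
{
    return (icr_low & ICR_SHORTHAND_MASK) == 0;
}
```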

Finally, for computers with a large number of CPUs it can take a relatively long time to start them all if you're starting them one at a time. For example, if it takes 11 ms to start each CPU and there are 128 CPUs, then it'd take 1.4 seconds. If you want to boot fast there are ways to avoid this. For example, the first CPU can start the second CPU, then the 1st and 2nd CPU can start the 3rd and 4th CPU, then those four CPUs can start the next four CPUs, etc. In this way you can start 128 CPUs in 77 ms instead of 1.4 seconds.
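The arithmetic behind those figures: if every running CPU starts one more CPU per round, the number of running CPUs doubles each round, so 128 CPUs need 7 rounds. The helper names below are made up, and the model assumes each start takes the same fixed time.

```c
/* Rounds needed when every running CPU starts one more CPU per round. */
static unsigned doubling_rounds(unsigned total_cpus)
{
    unsigned rounds = 0;
    for (unsigned long long running = 1; running < total_cpus; running *= 2)
        rounds++;
    return rounds;
}

/* Total time when CPUs are started in doubling rounds. */
static unsigned parallel_startup_ms(unsigned total_cpus, unsigned ms_per_start)
{
    return doubling_rounds(total_cpus) * ms_per_start;
}

/* Total time when the first CPU starts all the others one at a time. */
static unsigned sequential_startup_ms(unsigned total_cpus, unsigned ms_per_start)
{
    if (total_cpus == 0)
        return 0;
    return (total_cpus - 1) * ms_per_start;  /* the first CPU is already running */
}
```

With 128 CPUs and 11 ms per start this gives 7 * 11 = 77 ms for the doubling scheme and 127 * 11 = 1397 ms (about 1.4 seconds) for the sequential one.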

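Note: I'd recommend starting CPUs one at a time and making sure that works before you attempt any kind of parallel startup (it's something you can worry about afterwards, once you know the rest works).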

The address that the other CPUs will begin executing at is encoded in the "vector" field of the STARTUP IPI. The CPU will start executing code (in real mode) with CS = vector * 256 and IP = 0. The vector field is 8-bit, so the highest starting address you can use is 0x000FF000 (0xFF00:0x0000 in real mode). However, this is the legacy ROM area (in practice the starting address would have to be lower). Typically you'd copy a little piece of startup code to a suitable address, where the startup code handles synchronisation (e.g. setting an "I started" flag that another CPU can see and waiting to be told it's OK to continue) and then does things like enabling protected/long mode and setting up a stack before jumping to an entry point in the OS's normal code. This little piece of startup code is called the "AP CPU startup trampoline". This is also what makes the "parallel startup" a little complicated, as each CPU being started needs its own separate synchronisation flags and stack, and because these things are normally implemented with variables in the trampoline (e.g. mov esp,[cs:stackTop]) it means you end up with multiple trampolines.
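The vector arithmetic can be written down as a few helpers (names invented for this sketch). Since CS = vector * 256 and real-mode addresses are CS * 16 + IP, the physical start address is vector * 4096, which also means the trampoline has to sit on a 4 KiB boundary below 1 MiB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Physical address where an AP starts for a given STARTUP IPI vector. */
static uint32_t sipi_vector_to_address(uint8_t vector)
{
    return (uint32_t)vector << 12;  /* (vector * 256) * 16 + 0 */
}

/* Real-mode code segment the AP starts with for that vector. */
static uint16_t sipi_vector_to_cs(uint8_t vector)
{
    return (uint16_t)((uint16_t)vector << 8);
}

/*
 * Convert a trampoline address into a vector. Fails if the address is not
 * 4 KiB aligned or is not below 1 MiB, since no vector can express it.
 */
static bool trampoline_to_vector(uint32_t phys_addr, uint8_t *vector)
{
    if ((phys_addr & 0xFFFu) != 0 || phys_addr >= 0x100000u)
        return false;
    *vector = (uint8_t)(phys_addr >> 12);
    return true;
}
```

For example, a trampoline copied to physical address 0x8000 corresponds to vector 0x08, and the AP begins at 0x0800:0x0000.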
