Interrupting an assembly instruction while it is operating


Question

When an interrupt arrives at the CPU and is acknowledged, it is handled by saving the current address before jumping into the handler. Otherwise it is ignored.

I wonder whether a single assembly instruction can itself be interrupted.

For example:

mvi a, 03h ; put the value 3 into the accumulator (8080 assembly)

Can a one-line instruction be interrupted? Or, if not, is it atomic?

Is there a guarantee that a "one-line assembly instruction" is always atomic?

What if there is no "lock" keyword, as in 8080 assembly? How is atomicity provided then?

For example, what if I want to operate on a 64-bit sum, there is no way to do it with a "one-line instruction", and an interrupt arrives while the sum is being updated? How can that be prevented at the assembly level?

The concept is starting to come together for me.

Recommended answer

Yes, all "normal" ISAs including 8080 and x86 guarantee that instructions are atomic with respect to interrupts on the same core. Either an instruction has fully executed and all its architectural effects are visible (in the interrupt handler), or none of them are. Any deviations from this rule are generally carefully documented.

For example, Intel's x86 manual vol.3 (~1000 page PDF) does make a point of specifically saying this:

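6.6 PROGRAM OR TASK RESTART
To allow the restarting of program or task following the handling of an exception or an interrupt, all exceptions (except aborts) are guaranteed to report exceptions on an instruction boundary. All interrupts are guaranteed to be taken on an instruction boundary.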

An old paragraph in Intel's vol.1 manual talks about single-core systems using cmpxchg without a lock prefix to read-modify-write atomically (with respect to other software, not hardware DMA access).

The CMPXCHG instruction is commonly used for testing and modifying semaphores. It checks to see if a semaphore is free. If the semaphore is free, it is marked allocated; otherwise it gets the ID of the current owner. This is all done in one uninterruptible operation [because it's a single instruction]. In a single-processor system, the CMPXCHG instruction eliminates the need to switch to protection level 0 (to disable interrupts) before executing multiple instructions to test and modify a semaphore.

For multiple processor systems, CMPXCHG can be combined with the LOCK prefix to perform the compare and exchange operation atomically. (See "Locked Atomic Operations" in Chapter 8, "Multiple-Processor Management," of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for more information on atomic operations.)
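The semaphore pattern Intel describes can be sketched in portable C. This is only an illustration under assumptions: `sem_try_acquire`, `SEM_FREE`, and the owner-ID convention are hypothetical names, not anything from Intel's manual. It uses C11 `atomic_compare_exchange_strong`, which x86 compilers typically implement with `lock cmpxchg`, so it is the multiprocessor-safe form rather than the unlocked single-core form.

```c
#include <stdatomic.h>

#define SEM_FREE 0   /* hypothetical convention: 0 means "nobody owns it" */

/* Try to mark the semaphore as owned by my_id (which must be non-zero).
 * Returns SEM_FREE if we acquired it, otherwise the ID of the current owner.
 * The test and the modification happen as one atomic operation. */
int sem_try_acquire(atomic_int *sem, int my_id)
{
    int expected = SEM_FREE;
    if (atomic_compare_exchange_strong(sem, &expected, my_id))
        return SEM_FREE;
    /* On failure, compare_exchange stored the value it found in `expected`. */
    return expected;
}
```

A caller that gets a non-zero return value knows who holds the semaphore without having done a separate, interruptible load.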

(For more about the lock prefix and how it's implemented vs. non-locked add [mem], 1, see Can num++ be atomic for 'int num'?)

As Intel points out in that first paragraph, one way to achieve multi-instruction atomicity is to disable interrupts, then re-enable when you're done. This is better than using a mutex to protect a larger integer, especially if you're talking about data shared between the main program and an interrupt handler. If an interrupt happens while the main program holds the lock, the handler can't wait for the lock to be released; that would never happen.

Disabling interrupts is usually pretty cheap on simple in-order pipelines, or especially microcontrollers. (Sometimes you need to save the previous interrupt state, instead of unconditionally enabling interrupts. E.g. a function that might be called with interrupts already disabled.)

Anyway, disabling interrupts is how you can do something atomically with a 64-bit integer on 8080.
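User-space code can't execute the instructions that mask hardware interrupts, so the following is only an analogy: POSIX signals stand in for interrupts, and `sigprocmask` stands in for "save the interrupt state, disable, then restore". All names here (`half_lo`, `half_hi`, `update_protected`, and so on) are hypothetical, and `raise` is used to force the "interrupt" to arrive in the middle of a two-step update.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* A wide value kept as two halves; outside an update, half_lo == half_hi. */
static volatile uint32_t half_lo, half_hi;
static volatile sig_atomic_t torn_reads;

/* The "interrupt handler": counts how often it sees a half-finished update. */
static void on_signal(int sig)
{
    (void)sig;
    if (half_lo != half_hi)
        torn_reads++;
}

/* Two-step update with the "interrupt" arriving between the steps. */
void update_unprotected(uint32_t v)
{
    half_lo = v;
    raise(SIGUSR1);          /* handler runs right here and sees a torn value */
    half_hi = v;
}

/* Same update, but with the signal blocked for the duration. */
void update_protected(uint32_t v)
{
    sigset_t block, saved;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &saved);   /* save old state, then mask */
    half_lo = v;
    raise(SIGUSR1);                           /* stays pending while blocked */
    half_hi = v;
    sigprocmask(SIG_SETMASK, &saved, NULL);   /* restore; handler runs now */
}

/* Reset the state, run one update, and report how many torn reads occurred. */
int count_torn(void (*update)(uint32_t))
{
    struct sigaction sa;
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);

    half_lo = half_hi = 0;
    torn_reads = 0;
    update(1);
    return (int)torn_reads;
}
```

Restoring the saved mask, rather than unconditionally unblocking, mirrors the point about functions that may be called with interrupts already disabled.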

A few long-running instructions are interruptible, according to rules documented for that instruction.

e.g. x86's rep-string instructions, like rep movsb (single-instruction memcpy of arbitrary size) are architecturally equivalent to repeating the base instruction (movsb) RCX times, decrementing RCX each time and incrementing or decrementing the pointer inputs (RSI and RDI). An interrupt arriving during a copy can set RCX to starting_value - bytes_copied and (if RCX is then non-zero) leave RIP pointing to the instruction, so on resuming after the interrupt the rep movsb will run again and do the rest of the copy.
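The resumable behaviour can be modelled in C. This is a sketch of the architectural semantics only, not of how any CPU implements it: `struct rep_state`, `rep_movsb_step`, and the `budget` parameter (how many iterations run before an "interrupt" is taken) are all hypothetical, and the direction flag is assumed clear so the pointers increment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers rep movsb reads and updates. */
struct rep_state {
    const uint8_t *rsi;   /* source pointer */
    uint8_t *rdi;         /* destination pointer */
    size_t rcx;           /* bytes still to copy */
};

/* Run the instruction until it finishes or `budget` iterations have run.
 * Returns 1 if it completed (RIP would move to the next instruction),
 * 0 if it was interrupted (RIP would still point at rep movsb).
 * Either way, the state is consistent at an iteration boundary. */
int rep_movsb_step(struct rep_state *s, size_t budget)
{
    while (s->rcx != 0 && budget != 0) {
        *s->rdi++ = *s->rsi++;   /* one movsb with DF = 0 */
        s->rcx--;
        budget--;
    }
    return s->rcx == 0;
}
```

Calling `rep_movsb_step` again with the same state plays the role of returning from the interrupt: the copy picks up where it stopped because the progress lives in RCX, RSI, and RDI.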

Other x86 examples include SIMD gather loads (AVX2/AVX512) and scatter stores (AVX512). E.g. vpgatherdd ymm0, [rdi + ymm1*4], ymm2 does up to 8 32-bit loads, according to which elements of ymm2 are set. And the results are merged into ymm0.

In the normal case (no interrupts, no page faults or other synchronous exceptions during the gather), you get the data in the destination register, and the mask register ends up zeroed. The mask register thus gives the CPU somewhere to store progress.
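A simplified C model may make the role of the mask clearer. The assumptions are loud ones: `struct gather_state`, `gather_step`, and `fault_lane` are hypothetical, the mask is modelled as one byte per element rather than the sign bit of each vector element, and elements are processed strictly from the lowest lane upward.

```c
#include <stdint.h>

#define LANES 8

/* Hypothetical model of vpgatherdd dst, [base + idx*4], mask. */
struct gather_state {
    int32_t dst[LANES];    /* destination register */
    int32_t idx[LANES];    /* index register */
    uint8_t mask[LANES];   /* non-zero = element still needs loading */
};

/* Load every element whose mask is set. If `fault_lane` is reached while
 * its mask is still set, stop there and return that lane number, keeping
 * the completed elements in dst and their mask entries cleared.
 * Returns -1 when the whole gather has completed (mask all zero). */
int gather_step(struct gather_state *g, const int32_t *base, int fault_lane)
{
    for (int i = 0; i < LANES; i++) {
        if (!g->mask[i])
            continue;              /* already done, or never requested */
        if (i == fault_lane)
            return i;              /* partial progress is preserved */
        g->dst[i] = base[g->idx[i]];
        g->mask[i] = 0;            /* record that this element is done */
    }
    return -1;
}
```

Re-running `gather_step` after the fault is handled only touches the elements whose mask entries are still set, which is the sense in which the mask stores progress.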

Gather and scatter are slow, and might need to trigger multiple page faults, so for synchronous exceptions this guarantees forward progress even under pathological conditions where handling a page fault unmaps all other pages. But more relevantly, it means avoiding redoing TLB misses if a middle element page faults, and not discarding work if an async interrupt arrives.

Some other long-running instructions (like wbinvd which flushes all data caches across all cores) are not architecturally interruptible, or even microarchitecturally abortable (to discard partial work and go handle an interrupt). It's privileged so user-space can't execute it as a denial-of-service attack causing high interrupt latency.

A related example of documenting funny behaviour is when x86 popad goes off the top of the stack (segment limit). This is for an exception (not an external interrupt), documented earlier in the vol.3 manual, in section 6.5 EXCEPTION CLASSIFICATIONS (i.e. fault / trap / abort, see the PDF for more details.)

NOTE
One exception subset normally reported as a fault is not restartable. Such exceptions result in loss of some processor state. For example, executing a POPAD instruction where the stack frame crosses over the end of the stack segment causes a fault to be reported. In this situation, the exception handler sees that the instruction pointer (CS:EIP) has been restored as if the POPAD instruction had not been executed. However, internal processor state (the general-purpose registers) will have been modified. Such cases are considered programming errors. An application causing this class of exceptions should be terminated by the operating system.

Note that this is only if popad itself causes an exception, not for any other reason. An external interrupt can't split popad the way it can for rep movsb or vpgatherdd.

(I guess for the purposes of popad faulting, it effectively works iteratively, popping 1 register at a time and logically modifying RSP/ESP/SP as well as the target register. Instead of checking the whole region it's going to load for segment limit before starting, because that would require an extra add, I guess.)

CPUs like modern x86 with out-of-order execution and splitting complex instructions into multiple uops still ensure this is the case. When an interrupt arrives, the CPU has to pick a point between two instructions it's in the middle of running as the location where the interrupt architecturally happens. It has to discard any work that's already done on decoding or starting to execute any later instructions. Assuming the interrupt returns, they'll be re-fetched and start over again executing.

See also: When an interrupt occurs, what happens to instructions in the pipeline?

As Andy Glew says, current CPUs don't rename the privilege level, so what logically happens (interrupt/exception handler executes after earlier instructions finish) matches what actually happens.

Fun fact, though: x86 interrupts aren't fully serializing, at least not guaranteed on paper. (In x86 terminology, instructions like cpuid and iret are defined as serializing; drain the OoO back-end and store buffer, and anything else that might possibly matter. That's a very strong barrier and lots of other things aren't, e.g. mfence.)

In practice (because CPUs don't in practice rename the privilege level), there won't be any old user-space instructions/uops in the out-of-order back-end still in flight when an interrupt handler runs.

Async (external) interrupts may also drain the store buffer, depending on how we interpret the wording of Intel's SDM vol.3 11.10: "the contents of the store buffer are always drained to memory in the following situations:" ... "When an exception or interrupt is generated". Clearly that applies to exceptions (where the CPU core itself generates the interrupt), and might also mean before servicing an interrupt.

(Store data from retired store instructions is not speculative; it definitely will happen, and the CPU has already dropped the state it would need to be able to roll back to before that store instruction. So a large store buffer full of scattered cache-miss stores can hurt interrupt latency. Either from waiting for it to drain before any interrupt-handler instructions can run at all, or at least before any in/out or locked instruction in an ISR can happen if it turns out that the store buffer isn't drained.)

Related: Sandpile (https://www.sandpile.org/x86/coherent.htm) has a table of things that are serializing. Interrupts and exceptions aren't. But again, this doesn't mean they don't drain the store buffer. This would be testable with an experiment: look for StoreLoad reordering between a store in user-space and a load (of a different shared variable) in an ISR, as observed by another core.

Part of this section doesn't really belong in this answer and should be moved somewhere else. It's here because discussion in comments on What happens to expected memory semantics (such as read after write) when a thread is scheduled on a different CPU core? cited this as a source for the probably wrong claim that interrupts don't drain the store buffer, which I wrote after misinterpreting "not serializing".
