Reproducing Unexpected Behavior w/Cross-Modifying Code on x86-64 CPUs

Question

What are some ideas for cross-modifying code that could trigger unexpected behavior on x86 or x86-64 systems, where everything in the cross-modifying code is done correctly except that the executing processor does not execute a serializing instruction before executing the modified code?

As noted below, I have a Core 2 Duo E6600 processor to test on, which is explicitly mentioned as a processor that is prone to issues regarding this. I will test any ideas shared with me on this machine and give updates.

On x86 and x64 systems, the official guidance for writing cross-modifying code is to do the following:

; Action of Modifying Processor
Store modified code (as data) into code segment;
Memory_Flag ← 1; 

; Action of Executing Processor
WHILE (Memory_Flag ≠ 1)
  Wait for code to update;
ELIHW;
Execute serializing instruction; (* For example, CPUID instruction *)
Begin executing modified code;

The serializing instruction is explicitly mentioned as necessary in the errata for some processors. For example, Intel Core 2 Duo E6000 series have the following erratum: (from http://www.mathemainzel.info/files/intelX6800andintelE6000.pdf)

The act of one processor, or system bus master, writing data into a currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code (XMC). XMC that does not force the second processor to execute a synchronizing instruction, prior to execution of the new code, is called unsynchronized XMC.

Software using unsynchronized XMC to modify the instruction byte stream of a processor can see unexpected or unpredictable execution behavior from the processor that is executing the modified code.

There is some speculation about why this happens at http://linux.kernel.narkive.com/FDc9TB0d/patch-linux-kernel-markers :

When the i-fetch has been done and the micro-ops are in the trace cache then there's no longer a direct correlation between the original machine instruction boundaries and the micro ops. This is due to optimization. For example (artificial one for illustrative purposes):

mov eax,ebx
mov memory,eax
mov eax,1

(using intel notation not ATT - force of habit)

In the trace cache there would be no micro ops to update eax with ebx.

Altering the "mov eax,ebx" to "mov ecx,ebx" on the fly invalidates the optimized trace cache, hence the only recourse is a GPF. If the modification doesn't invalidate the trace cache then no GPF. The question is: "can we predict the circumstances when the trace cache has not been invalidated", and the answer in general is no since the microarchitecture is not public. But one can guess that modifying the single byte opcode with an interrupting instruction - int3 - doesn't cause an inconsistency that can't be handled. And that's what Intel confirmed. Go ahead and store int3 without the need to synchronise (i.e. force the trace cache to be flushed).

https://sourceware.org/ml/systemtap/2005-q3/msg00208.html :

When we became aware of this I had a long discussion with Intel's microarchitecture guys. It turns out that the reason for this erratum (which incidentally Intel does not intend to fix) is because the trace cache - the stream of micro-ops resulting from instruction interpretation - cannot be guaranteed to be valid. Reading between the lines I assume this issue arises because of optimization done in the trace cache, where it is no longer possible to identify the original instruction boundaries. If the CPU discovers that the trace cache has been invalidated because of unsynchronized cross-modification then instruction execution will be aborted with a GPF. Further discussion with Intel revealed that replacing the first opcode byte with an int3 would not be subject to this erratum.
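
The int3 exception that both quotes mention can be illustrated with the store ordering below. It is only a sketch: the function name patch_via_int3 and the optional sync hook are hypothetical, the three-step sequence is the pattern commonly associated with breakpoint-based patching rather than something either post spells out, and a real patcher would also need a breakpoint handler for any processor that hits the int3 while the patch is in flight. The code works on an ordinary byte buffer, so it can be tried without executable memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC  /* single-byte breakpoint instruction */

/*
 * Rewrites the instruction at `site` so that its first opcode byte is an
 * int3 for as long as the remaining bytes are being changed.
 * `sync` is a hypothetical hook (may be NULL) where a patcher could make
 * the other processors serialize between steps.
 */
void patch_via_int3(volatile uint8_t *site, const uint8_t *new_insn,
                    size_t len, void (*sync)(void))
{
    size_t i;

    assert(len >= 1);

    site[0] = INT3_OPCODE;          /* step 1: one-byte store of int3 */
    if (sync)
        sync();

    for (i = 1; i < len; i++)       /* step 2: rewrite the tail bytes */
        site[i] = new_insn[i];
    if (sync)
        sync();

    site[0] = new_insn[0];          /* step 3: replace int3 with new opcode */
    if (sync)
        sync();
}
```

For instance, applying it to a buffer holding 89 D8 (mov eax,ebx) with the replacement 89 D9 (mov ecx,ebx) leaves the buffer holding 89 D9, and at no point does the buffer start with the old opcode followed by new tail bytes.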

Beyond what I've posted here, there's not too much I've seen on the internet regarding this issue. Additionally, I haven't found any public examples of people getting bitten by failing to execute the serializing instruction when using cross-modifying code on x86 and x86-64 systems.

I have a computer running an Intel Core 2 Duo E6600 Processor, which is explicitly documented as being prone to this problem, and I have not been able to write code that triggers this issue.

Writing code to do this is a personal curiosity for me. In production code, I'd just follow the rules, but I figure there's probably something for me to learn in reproducing this.

Recommended answer

Think of a processor that has a very long instruction pipeline where registers and memory are only modified in the last pipeline stage. When you write self modifying code for this processor and modify an instruction in memory that is already present in the pipeline, the modification will have no effect. In this case the behaviour of the program depends on how long the pipeline of the processor is.

To make new processors with longer pipelines behave exactly as older models, Intel processors include a mechanism that flushes (empties) the pipeline if this case is detected. After the flush, the modified code is fetched into the pipeline, so the new processor behaves exactly as old ones.

A serializing instruction is another way to flush the pipeline. When it reaches the end of the pipeline, the pipeline is flushed and starts fetching again after the serializing instruction.

So what the erratum is essentially saying is that some processor models do not check whether writes from other processors overwrite instructions that are already executing in their pipeline. The check works only for local writes, not for external writes. But if you insert a serializing instruction you force the processor to flush the pipeline and everything will behave as expected.

To reproduce the behaviour described in the erratum you need to make sure that the code you are modifying from one processor is inside the pipeline of the other processor. Take a look at branch prediction (which decides which code path is inside the pipeline) and synchronization primitives.
