How is x86 instruction cache synchronized?

Question

I like examples, so I wrote a bit of self-modifying code in C...

#include <stdio.h>
#include <sys/mman.h> // linux

int main(void) {
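    unsigned char *c = mmap(NULL, 7, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|
                            MAP_ANONYMOUS, -1, 0); // get executable memory
    if (c == MAP_FAILED) { // mmap reports failure with MAP_FAILED, not NULL
        perror("mmap");
        return 1;
    }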
    c[0] = 0b11000111; // mov (x86_64), immediate mode, full-sized (32 bits)
    c[1] = 0b11000000; // to register rax (000) which holds the return value
                       // according to linux x86_64 calling convention 
    c[6] = 0b11000011; // return
    for (c[2] = 0; c[2] < 30; c[2]++) { // incr immediate data after every run
        // rest of immediate data (c[3..5]) is already set to 0 by MAP_ANONYMOUS
        printf("%d ", ((int (*)(void)) c)()); // cast c to func ptr, call ptr
    }
    putchar('\n');
    return 0;
}

...which works, apparently:

>>> gcc -Wall -Wextra -std=c11 -D_GNU_SOURCE -o test test.c; ./test
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

But honestly, I didn't expect it to work at all. I expected the instruction containing c[2] = 0 to be cached upon the first call to c, after which all consecutive calls to c would ignore the repeated changes made to c (unless I somehow explicitly invalidated the cache). Luckily, my CPU appears to be smarter than that.

I guess the CPU compares RAM (assuming c even resides in RAM) with the instruction cache whenever the instruction pointer makes a large-ish jump (as with the call to the mmapped memory above), and invalidates the cache when it doesn't match (all of it?), but I'm hoping to get more precise information on that. In particular, I'd like to know whether this behavior can be considered predictable (barring any differences of hardware and OS) and relied on.

(I probably should refer to the Intel manual, but that thing is thousands of pages long and I tend to get lost in it...)

Recommended answer

What you are doing is usually referred to as self-modifying code. Intel's platforms (and probably AMD's too) do the work of maintaining i/d cache coherency for you, as the manual points out (Manual 3A, System Programming):

11.6 SELF-MODIFYING CODE

A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated.

But this assertion holds only as long as the same linear address is used for modifying and fetching, which is not the case for debuggers and binary loaders, since they don't run in the same address space:

Applications that include self-modifying code use the same linear address for modifying and fetching the instruction. Systems software, such as a debugger, that might possibly modify an instruction using a different linear address than that used to fetch the instruction, will execute a serializing operation, such as a CPUID instruction, before the modified instruction is executed, which will automatically resynchronize the instruction cache and prefetch queue.
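
As a rough illustration of the "serializing operation" the manual mentions, here is a minimal sketch for GCC or Clang on x86-64. The helper names serialize_with_cpuid and patch_byte_via_alias are hypothetical, not part of any library, and write_alias is assumed to be a second mapping of the memory the instructions are fetched from.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: execute CPUID only for its serializing side effect.
// The "memory" clobber keeps the compiler from moving loads or stores
// across the instruction. Returns EAX for leaf 0 (highest basic leaf).
static inline unsigned int serialize_with_cpuid(void) {
    unsigned int eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)ebx;
    (void)edx;
    return eax;
}

// Hypothetical patch routine for the "different linear address" case:
// store through the writable alias, then serialize before the patched
// instruction is executed through its other address.
static void patch_byte_via_alias(volatile unsigned char *write_alias,
                                 unsigned char value) {
    assert(write_alias != NULL);
    *write_alias = value;
    serialize_with_cpuid();
}
```

A caller would invoke patch_byte_via_alias(alias + 2, 42) and only then jump to the patched code. When the write and the fetch use the same linear address, as in the question, the quoted text says no such step is needed.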

By contrast, many other architectures, such as PowerPC, always require a serialization operation, and it must be done explicitly (E500 Core Manual):

3.3.1.2.1 Self-Modifying Code

When a processor modifies any memory location that can contain an instruction, software must ensure that the instruction cache is made consistent with data memory and that the modifications are made visible to the instruction fetching mechanism. This must be done even if the cache is disabled or if the page is marked caching-inhibited.

It is interesting to note that PowerPC requires a context-synchronizing instruction to be issued even when caches are disabled; I suspect it forces a flush of deeper data-processing units such as the load/store buffers.

The code you proposed is unreliable on architectures without snooping or advanced cache-coherency facilities, and is therefore likely to fail there.
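
If you want the patching step itself to be less dependent on the architecture, one option (my suggestion, not something the manuals above prescribe) is the GCC/Clang builtin __builtin___clear_cache, called after every write to code. A minimal sketch, where patch_code is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy new machine code into a buffer and ask the
// compiler runtime to make it visible to instruction fetch.
// On x86 the builtin typically emits no instructions, since the hardware
// keeps the caches coherent; on targets without that guarantee it emits
// or calls the required cache maintenance.
static void patch_code(unsigned char *dst, const unsigned char *src,
                       size_t len) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)(dst + len));
}
```

In the program from the question, that would mean building the 7 instruction bytes in a local array and calling patch_code(c, bytes, 7) before each call through the function pointer. The machine code bytes themselves are still x86-specific, of course.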

Hope this helps.
