Why isn't RDTSC a serializing instruction?


Question

The Intel manuals for the RDTSC instruction warn that out-of-order execution can change when RDTSC is actually executed, so they recommend inserting a CPUID instruction in front of it, because CPUID serializes the instruction stream (CPUID is never executed out of order). My question is simple: if they had the ability to make instructions serializing, why didn't they make RDTSC serializing? The entire point of it appears to be to get cycle-accurate timings. Is there a situation in which you would not want to precede it with a serializing instruction?

Newer Intel CPUs have a separate RDTSCP instruction that waits for all earlier instructions to execute before it reads the counter (strictly speaking, the manuals do not class it as a fully serializing instruction). Intel opted to introduce a separate instruction rather than change the behavior of RDTSC, which suggests to me that there has to be some situation where a potentially out-of-order timing is what you want. What is it?

Recommended answer

If you are trying to use rdtsc to see whether a branch mispredicts, the non-serializing version is what you want.

//math here
rdtsc
branch if zero to done
//do some work that always takes 1 cycle
done: rdtsc

If the branch is predicted correctly, the delta will be small (maybe even negative?). If the branch is mispredicted, the delta will be large.

With the serializing version, the first rdtsc waits for the math to finish, so the branch condition is already resolved by the time the branch runs and the delta tells you little about the prediction.
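
For the opposite case, where you do want the timestamp pinned to a point in the instruction stream, here is a minimal sketch in C. It assumes GCC or Clang on x86 with `<x86intrin.h>`. The helper names (`tsc_unordered`, `tsc_begin`, `tsc_end`, `measure_ordered`) are made up for illustration. It uses LFENCE rather than the CPUID approach mentioned in the question, relying on LFENCE waiting for earlier instructions to complete, which holds on Intel CPUs and on AMD CPUs where LFENCE is configured as dispatch-serializing.

```c
#include <stdint.h>
#include <x86intrin.h>  /* GCC/Clang on x86: __rdtsc, __rdtscp, _mm_lfence */

/* Plain RDTSC: the CPU may execute it before earlier instructions finish. */
static inline uint64_t tsc_unordered(void) {
    return __rdtsc();
}

/* Start of a timed region: the first LFENCE waits for earlier instructions
 * to complete, the second keeps later ones from starting before the read. */
static inline uint64_t tsc_begin(void) {
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* End of a timed region: RDTSCP waits for earlier instructions to execute;
 * the trailing LFENCE keeps later instructions from moving ahead of it. */
static inline uint64_t tsc_end(void) {
    unsigned int aux;  /* receives IA32_TSC_AUX; unused here */
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();
    return t;
}

/* Hypothetical workload: time a simple loop of n additions. */
static uint64_t measure_ordered(unsigned n) {
    volatile uint64_t sink = 0;  /* volatile so the loop is not optimized away */
    uint64_t t0 = tsc_begin();
    for (unsigned i = 0; i < n; i++) {
        sink += i;
    }
    uint64_t t1 = tsc_end();
    return t1 - t0;  /* elapsed TSC ticks, not necessarily core cycles */
}
```

Calling `measure_ordered(1000)` returns the tick count for the loop including the fence overhead. Swapping `tsc_begin` and `tsc_end` for `tsc_unordered` gives the unordered behavior the answer relies on.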
