Getting CPU cycles using RDTSC - why does the value of RDTSC always increase?

Question

I want to get the CPU cycles at a specific point. I use this function at that point:

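static __inline__ unsigned long long rdtsc(void)
{
    unsigned int lo, hi;
    /* 0x0f 0x31 is RDTSC: low half in EAX, high half in EDX.
       "=A" means EDX:EAX only on 32-bit x86, so read both halves. */
    __asm__ volatile (".byte 0x0f, 0x31" : "=a" (lo), "=d" (hi));
    return ((unsigned long long)hi << 32) | lo;
}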

The problem is that it always returns an increasing number (on every run). It is as if it were referring to absolute time.

Am I using the function incorrectly?

Recommended answer

As long as your thread stays on the same CPU core, the RDTSC instruction will keep returning an increasing number until it wraps around. For a 2GHz CPU, this happens after 292 years, so it is not a real issue. You probably won't see it happen. If you expect to live that long, make sure your computer reboots, say, every 50 years.
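The 292-year figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the counter is 64 bits wide and advances at a constant rate; the helper name tsc_wrap_years is made up for illustration.

```c
/* Years until a 64-bit counter that advances `hz` times per second wraps.
   Hypothetical helper; assumes a constant tick rate. */
static double tsc_wrap_years(double hz)
{
    const double ticks = 18446744073709551616.0;          /* 2^64 */
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return ticks / hz / seconds_per_year;
}
```

Calling tsc_wrap_years(2e9) gives a value slightly above 292, which matches the figure quoted for a 2GHz CPU.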

The problem with RDTSC is that you have no guarantee that the counter starts at the same point in time on all cores of an older multicore CPU, nor on all CPUs of an older multi-CPU board.
Modern systems usually do not have this problem, and on older systems it can be worked around by setting the thread's affinity so that it runs on only one CPU. This is not good for application performance, so one should not generally do it, but for measuring ticks it is just fine.
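On Linux, the affinity workaround might look like the sketch below. It assumes glibc, where sched_getcpu, sched_setaffinity and the CPU_* macros are available once _GNU_SOURCE is defined; the function name pin_to_current_cpu is only illustrative.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to the CPU it is currently running on, so that
   later RDTSC reads all come from the same core.
   Returns 0 on success, -1 on failure (errno is set). */
static int pin_to_current_cpu(void)
{
    int cpu = sched_getcpu();
    if (cpu < 0)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Pinning to the current CPU rather than to a fixed CPU number avoids failing when that CPU is not in the set the process is allowed to use.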

(Another "problem" is that many people use RDTSC for measuring time, which is not what it does, but you wrote that you want CPU cycles, so that is fine. If you do use RDTSC to measure time, you may have surprises when power saving or hyperboost or whatever the multitude of frequency-changing techniques are called kicks in. For actual time, the clock_gettime syscall is surprisingly good under Linux.)
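For measuring actual time rather than cycles, a clock_gettime based helper could look like this. It is a sketch assuming a POSIX system; monotonic_ns is an illustrative name, and CLOCK_MONOTONIC is one reasonable choice of clock, not the only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds (hypothetical helper). */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Elapsed time is then the difference of two readings, and it does not depend on the CPU's current frequency.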

I would just write rdtsc inside the asm statement, which works just fine for me and is more readable than some obscure hex code. Assuming it's the correct hex code (and since it does not crash and returns an ever-increasing number, it seems so), your code is good.
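A version using the mnemonic might look like the following sketch, assuming GCC or Clang on x86 or x86-64. It reads the two halves separately rather than using the "=A" constraint, because "=A" names the EDX:EAX pair only on 32-bit x86. The name read_tsc is illustrative.

```c
#include <stdint.h>

/* Read the time-stamp counter using the mnemonic instead of opcode bytes.
   Assumes GCC/Clang inline asm on x86 or x86-64. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    /* RDTSC puts the low 32 bits in EAX and the high 32 bits in EDX. */
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```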

If you want to measure the number of ticks a piece of code takes, you want a tick difference: just subtract two values of the ever-increasing counter. Something like uint64_t t0 = rdtsc(); ... uint64_t t1 = rdtsc() - t0;
Note that if very accurate measurements isolated from surrounding code are necessary, you need to serialize, that is, stall the pipeline, prior to calling rdtsc (or use rdtscp, which is only supported on newer processors). The one serializing instruction that can be used at every privilege level is cpuid.
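The tick difference with cpuid as the serializing instruction can be sketched as follows, assuming GCC or Clang on x86 or x86-64. The names read_tsc_serialized, ticks_for and spin are made up for illustration, and the result includes the overhead of the two counter reads.

```c
#include <stdint.h>

/* Execute CPUID (serializing) and then read the time-stamp counter. */
static inline uint64_t read_tsc_serialized(void)
{
    uint32_t eax = 0, ebx, ecx = 0, edx, lo, hi;
    /* CPUID leaf 0; the outputs are discarded, only the serialization matters. */
    __asm__ volatile ("cpuid"
                      : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                      :
                      : "memory");
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
    (void)ebx;
    (void)edx;
    return ((uint64_t)hi << 32) | lo;
}

/* Ticks spent in fn(arg), including the cost of the two counter reads. */
static uint64_t ticks_for(void (*fn)(void *), void *arg)
{
    uint64_t t0 = read_tsc_serialized();
    fn(arg);
    return read_tsc_serialized() - t0;
}

/* Hypothetical workload: increment a counter 100000 times. */
static void spin(void *arg)
{
    volatile unsigned *n = arg;
    for (unsigned i = 0; i < 100000; i++)
        (*n)++;
}
```

A call such as ticks_for(spin, &counter) returns the tick count for one run of the workload. Measuring an empty function the same way gives an estimate of the read overhead to subtract.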

In reply to a further question in the comments:

The TSC starts at zero when you turn on the computer (and the BIOS resets all counters on all CPUs to the same value, though some BIOSes a few years ago did not do so reliably).

Thus, from your program's point of view, the counter started "some unknown time in the past", and it always increases with every clock tick the CPU sees. Therefore if you execute the instruction returning that counter now and any time later in a different process, it will return a greater value (unless the CPU was suspended or turned off in between). Different runs of the same program get bigger numbers, because the counter keeps growing. Always.

Now, clock_gettime(CLOCK_PROCESS_CPUTIME_ID) is a different matter. This is the CPU time that the OS has given to the process. It starts at zero when your process starts. A new process starts at zero, too. Thus, two processes run one after the other will get very similar or identical numbers, not ever-growing ones.

clock_gettime(CLOCK_MONOTONIC_RAW) is closer to how RDTSC works (and on some older systems is implemented with it). It returns an ever-increasing value. Nowadays, this is typically an HPET. However, this is really time, not ticks. If your computer goes into a low-power state (e.g. running at 1/2 normal frequency), it will still advance at the same pace.
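The difference between the two clocks shows up when a process sleeps: the monotonic clock keeps advancing while the process CPU time barely moves. The sketch below assumes Linux (CLOCK_MONOTONIC_RAW is Linux-specific); clock_ns and sleep_and_measure are illustrative names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Reading of the given clock in nanoseconds (hypothetical helper). */
static uint64_t clock_ns(clockid_t id)
{
    struct timespec ts;
    clock_gettime(id, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Sleep for `ms` milliseconds and report how far each clock advanced. */
static void sleep_and_measure(long ms, uint64_t *wall_ns, uint64_t *cpu_ns)
{
    uint64_t w0 = clock_ns(CLOCK_MONOTONIC_RAW);
    uint64_t c0 = clock_ns(CLOCK_PROCESS_CPUTIME_ID);

    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&req, NULL);

    *wall_ns = clock_ns(CLOCK_MONOTONIC_RAW) - w0;
    *cpu_ns  = clock_ns(CLOCK_PROCESS_CPUTIME_ID) - c0;
}
```

After a 50 ms sleep, wall_ns should be close to 50 million while cpu_ns stays far smaller, since a sleeping process is given almost no CPU time.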
