Measuring latencies of memory

Question

I was going through this link, which presents statistics on the latencies of main memory and of the L1 and L2 caches.

I was wondering whether it is possible to measure the same latencies with C/C++ code, without using the benchmark tools.

Recommended answer

The benchmark tools, like LMBench, are written in C. So when you ask if it can be done in C, the answer is quite simply, "yes".

LMBench tests memory latency (in lat_mem_rd.c) by doing repeated pointer indirections. This is the same thing as following a linked list, except there is no content in the list, just a pointer to the next cell.

struct cell { struct cell *next; };

struct cell *ptr = ...;
for (size_t i = 0; i < count; i++) {
    /* Each load depends on the result of the previous one. */
    ptr = ptr->next;
    ptr = ptr->next;
    /* ... 100 of these, unrolled ... */
    ptr = ptr->next;
    ptr = ptr->next;
}
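
The loop above can be wrapped into a small timed routine. What follows is a minimal sketch, not LMBench's code: the names make_ring, chase, and ns_per_load are hypothetical, and it assumes a POSIX system where clock_gettime() with CLOCK_MONOTONIC is available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict -std=c11 */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct cell { struct cell *next; };

/* Link n cells into a ring in array order (hypothetical helper). */
static void make_ring(struct cell *cells, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        cells[i].next = &cells[(i + 1) % n];
}

/* Follow `steps` links from `start`; every load depends on the previous one. */
static struct cell *chase(struct cell *start, size_t steps)
{
    struct cell *ptr = start;
    for (size_t i = 0; i < steps; i++)
        ptr = ptr->next;
    return ptr;
}

/* Average nanoseconds per dependent load over `steps` links.
   The final pointer is written to *out so the caller can consume it. */
static double ns_per_load(struct cell *start, size_t steps, struct cell **out)
{
    struct timespec t0, t1;
    assert(steps > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *out = chase(start, steps);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

A caller would allocate an array of cells, call make_ring() on it, and then call ns_per_load() with a step count large enough that the two clock reads are negligible. The result also includes loop overhead, which is one reason to unroll the loop as in the snippet above.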

By adjusting the size of the list, you can control whether the memory accesses hit L1 cache, L2 cache, or main memory. If you are testing L2 cache or main memory, however, you will need to ensure that each memory access is to a cache line old enough that it has been evicted from the faster caches by the time you access it again. Some caches also support prefetching, so with a "strided" access pattern you may still end up hitting a faster cache for certain strides.
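
One common way to make the access order hard for a prefetcher to predict is to link the cells in random order. The sketch below is a swapped-in technique, not necessarily what LMBench does. It assumes a 64-byte cache line (LINE_SIZE), and the names padded_cell and make_random_ring are hypothetical. It uses Sattolo's algorithm so that the list forms a single cycle that visits every cell.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LINE_SIZE 64  /* ASSUMED cache-line size; check your CPU */

/* One cell per (assumed) cache line, so each hop touches a different line. */
struct padded_cell {
    struct padded_cell *next;
    char pad[LINE_SIZE - sizeof(struct padded_cell *)];
};

/* Allocate n cells and link them into one random cycle using Sattolo's
   algorithm, which always yields a permutation consisting of a single cycle.
   Returns NULL on allocation failure; the caller frees the result.
   Note: rand() may have a small range (RAND_MAX can be as low as 32767),
   so a better generator is advisable for very large n. */
static struct padded_cell *make_random_ring(size_t n, unsigned seed)
{
    assert(n > 0);
    struct padded_cell *cells = malloc(n * sizeof *cells);
    size_t *order = malloc(n * sizeof *order);
    if (!cells || !order) {
        free(cells);
        free(order);
        return NULL;
    }

    for (size_t i = 0; i < n; i++)
        order[i] = i;
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* 0 <= j < i, never j == i */
        size_t tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
    /* order[] is now a cyclic permutation: cell i points to cell order[i]. */
    for (size_t i = 0; i < n; i++)
        cells[i].next = &cells[order[i]];
    free(order);
    return cells;
}
```

The working set is n * LINE_SIZE bytes, so n can be chosen to fit within L1, within L2, or well beyond the last-level cache; the actual cache sizes depend on the CPU. For large working sets, random access also incurs TLB misses, which become part of the measured latency.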

You will also need to be sure to enable optimizations (-O2, with GCC/Clang). Otherwise ptr may get stored on the stack, increasing the latency. Finally, you will need to make sure that the compiler does not consider ptr to be a "dead" variable. A sophisticated compiler might notice that the above code doesn't actually do anything. Sometimes when writing benchmarks, the compiler is the enemy. The LMBench code has a function use_pointer() just for this purpose.
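
One way to stop the compiler from treating the pointer as dead is to publish the final value through a volatile object. This is only a sketch: sink, keep_alive, and chase_and_keep are hypothetical names, and it is not a reproduction of LMBench's use_pointer(), whose implementation is not shown here.

```c
#include <assert.h>
#include <stddef.h>

struct cell { struct cell *next; };

/* Volatile sink: a store to it is an observable side effect that the
   compiler must perform, so the value being stored has to be computed. */
static void *volatile sink;

/* Hypothetical stand-in for a "use the pointer" helper. */
static void keep_alive(void *p)
{
    sink = p;
}

/* Chase `steps` links, then publish the result so the loop is not dead code. */
static struct cell *chase_and_keep(struct cell *start, size_t steps)
{
    assert(start != NULL);
    struct cell *ptr = start;
    for (size_t i = 0; i < steps; i++)
        ptr = ptr->next;
    keep_alive(ptr);
    return ptr;
}
```

Printing the final pointer or returning it from main() would serve the same purpose; the volatile store is simply cheap and keeps I/O out of the measurement.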
