Main memory bandwidth measurement


Question

I want to measure main memory bandwidth. While looking for a methodology, I found that:

  1. Many people use the 'bcopy' function to copy bytes from a source to a destination, measure the time taken, and report the result as the bandwidth.
  2. Another way is to allocate an array and walk through it (with some stride) - this basically gives the time to read the entire array.

I tried (1) with a data size of 1 GB and got a bandwidth of '700 MB/sec' (I used rdtsc to count the number of cycles elapsed during the copy). But I suspect this is not correct, because my RAM configuration is as follows:

  1. Speed: 1333 MHz
  2. Bus width: 32 bits

According to Wikipedia, the theoretical bandwidth is calculated as follows:

clock speed * bus width * number of bits per clock cycle per line (2 for DDR3 RAM): 1333 MHz * 32 * 2 ~= 8 GB/sec.

So my result is completely different from the estimated bandwidth. Any idea what I am doing wrong?

=========


Another question: bcopy involves both a read and a write. Does that mean I should divide the calculated bandwidth by two to get the read-only or write-only bandwidth? I would also like to confirm whether bandwidth is just the inverse of latency. Please suggest any other ways of measuring the bandwidth.

Recommended answer

I can't comment on the effectiveness of bcopy, but the most straightforward approach is the second method you described (with a stride of 1). Also, you are confusing bits with bytes in your memory bandwidth equation: 32 bits = 4 bytes. Modern computers use 64-bit-wide memory buses. So your effective transfer rate (assuming DDR3) is:

1333 MHz * 64 bits / (8 bits/byte) ~= 10666 MB/s (also classified as PC3-10666)

The 1333 MHz figure already has the 2 transfers per clock factored in.

Check out the wiki page for more info: http://en.wikipedia.org/wiki/DDR3_SDRAM

Regarding your results, try again with the array access. Malloc 1 GB and traverse the entire thing. Sum the elements of the array and print the result so that the compiler doesn't treat the loop as dead code.

Something like this:

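#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int main(void) {
    size_t size = 1024UL * 1024 * 1024;   /* 1 GB */
    long sum = 0;                         /* must start at zero */
    char *array = malloc(size);
    if (!array) return 1;
    memset(array, 1, size);               /* touch every page before timing */
    clock_t start = clock();              /* start timer here */
    for (size_t i = 0; i < size; i++)
        sum += array[i];
    double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;  /* end timer */
    printf("time taken: %f \tsum is %ld\n", seconds, sum);
    free(array);
    return 0;
}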
