When is a program limited by the memory bandwidth?

Question

I want to know if a program that I am using and which requires a lot of memory is limited by the memory bandwidth.

When do you expect this to happen? Did it ever happen to you in a real-life scenario?

I found several articles discussing this issue, including:

  • http://www.cs.virginia.edu/~mccalpin/papers/bandwidth/node12.html
  • http://www.cs.virginia.edu/~mccalpin/papers/bandwidth/node13.html
  • http://ispass.org/ucas5/session2_3_ibm.pdf

The first link is a bit old, but it suggests that you need to perform fewer than about 1-40 floating point operations per floating point variable in order to see this effect (correct me if I'm wrong).

How can I measure the memory bandwidth that a given program is using and how do I measure the (peak) bandwidth that my system can offer?

I don't want to discuss any complicated cache issues here. I'm only interested in the communication between the CPU and the memory.

Recommended answer

To benchmark your system's memory performance try the STREAM benchmark. Study the benchmark tasks and the results you get carefully since they provide the basic data about your memory that you need to do anything further. You need to figure out the effect(s) of cache(s) -- you do have to understand them -- and when the bandwidth hits a peak.
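For a first feel of what such a measurement looks like, here is a minimal sketch in the spirit of STREAM's Copy kernel. It is not the STREAM benchmark itself: the function name `copy_bandwidth_gb_per_s`, the 64 MiB default buffer size, and the repeat count are my own assumptions, and the bulk `bytearray` copy only approximates what a tuned C kernel would reach. Use the real STREAM code for numbers you intend to rely on.

```python
import time

def copy_bandwidth_gb_per_s(n_bytes=64 * 1024 * 1024, repeats=5):
    """Rough copy bandwidth in GB/s (illustrative, not the real STREAM)."""
    # The buffers should be much larger than the last-level cache,
    # otherwise the copy is served from cache rather than main memory.
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one bulk copy: reads n_bytes, writes n_bytes
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # keep the fastest run
    # Count both the read and the write traffic of the copy.
    return 2 * n_bytes / best / 1e9

if __name__ == "__main__":
    print(f"approx. copy bandwidth: {copy_bandwidth_gb_per_s():.1f} GB/s")
```

Running it with a range of `n_bytes` values, from well below to well above the cache size, is one way to see the cache effects mentioned above.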

To figure out the memory performance of your program:

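  1. Measure the execution time for a range of problem sizes.
  2. Calculate, by hand, the amount of data the program reads from and writes to memory for the same range of problem sizes.
  3. Divide the memory usage by the time.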
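The three steps can be sketched as follows. The `triad` kernel is a hypothetical stand-in for your own program, and `bytes_moved` is the hand count for that particular kernel only; you would replace both. Note that a pure-Python loop like this is dominated by interpreter overhead rather than memory traffic, so the sketch illustrates the bookkeeping, not a realistic bandwidth figure.

```python
import time
from array import array

def triad(a, b, c, s):
    """Hypothetical stand-in kernel: a[i] = b[i] + s * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + s * c[i]

def bytes_moved(n, bytes_per_element=8):
    """Step 2: hand count for triad, which reads b and c and writes a."""
    return 3 * n * bytes_per_element

def achieved_bandwidth(sizes):
    """Return {n: bytes per second} for each problem size n."""
    results = {}
    for n in sizes:
        # array('d') stores raw 8-byte doubles, matching the hand count.
        a = array("d", [0.0]) * n
        b = array("d", [1.0]) * n
        c = array("d", [2.0]) * n
        start = time.perf_counter()            # step 1: time the run
        triad(a, b, c, 3.0)
        elapsed = time.perf_counter() - start
        results[n] = bytes_moved(n) / elapsed  # step 3: data / time
    return results

if __name__ == "__main__":
    for n, bw in achieved_bandwidth([10_000, 100_000, 1_000_000]).items():
        print(f"n={n:>9}: {bw / 1e9:.3f} GB/s")
```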

WARNING: this is a crude approach and should only be used to figure out whether you ought to pay attention to memory bandwidth issues. If your crude figuring tells you that your program uses less than 50% of the available memory bandwidth (the figures you got from the STREAM benchmark), then you shouldn't give it any more thought.
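As a small illustration of that 50% rule of thumb, the check below compares a program's estimated bandwidth with the STREAM figure. The function name `worth_investigating` and the numbers in the usage lines are made up for the example; only the 50% threshold comes from the text above.

```python
def worth_investigating(program_bw, stream_bw, threshold=0.5):
    """True if the program uses at least `threshold` of the measured bandwidth.

    Both bandwidths must be in the same unit (for example GB/s).
    """
    if stream_bw <= 0:
        raise ValueError("stream_bw must be positive")
    return program_bw / stream_bw >= threshold

if __name__ == "__main__":
    # Made-up figures: a program estimated at 4 GB/s or 15 GB/s
    # on a machine where STREAM reports 20 GB/s.
    print(worth_investigating(4.0, 20.0))   # 20% of the available bandwidth
    print(worth_investigating(15.0, 20.0))  # 75% of the available bandwidth
```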

This crude approach works best when your program manipulates relatively few very large data structures with simple access patterns. This does describe a lot of high-performance scientific programs but perhaps not a lot of other types of program.

If your program is using virtual memory or doing I/O as it executes, then memory bandwidth is not your problem, at least not until you have sorted out disk bandwidth.

Finally, yes, every time I run one of our scientific codes the speed of execution is limited by memory bandwidth. As a rule of thumb, if a code executes 10% of the FLOPS that the processor specification promises I'm happy.
