为什么在这个C ++ for循环的执行时间有显着的差异？ [英] Why is there a significant difference in this C++ for loop's execution time?

查看：211 发布时间：2016/10/22 17:17:35 c++ performance nested-loops

本文介绍了为什么在这个C ++ for循环的执行时间有显着的差异？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我经历了循环，发现访问循环有重大差异。
我不明白在这两种情况下会导致这种差异的原因是什么？

第一个例子：

执行时间; 8秒

  for（int kk = 0; kk< 1000; kk ++）
 {
 sum = 0; 
 for（int i = 0; i <1024; i ++）
 for（int j = 0; j <1024; j ++）
 {
 sum + i] [j]。 
} 
}

第二个例子：

执行时间：23秒

  for（int kk = 0; kk< 1000; kk ++） 
 {
 sum = 0; 
 for（int i = 0; i <1024; i ++）
 for（int j = 0; j <1024; j ++）
 {
 sum + j] [i]。 
} 
}

  matrix [i] [j]

到

  matrix [j] [i] 
  
 
 
 ？
解决方案
缓存
 
 
   matrix [i] [j]  j] [i] ，因为 matrix [i] [j] 具有更多的连续存储器访问机会。
 
 
 例如，当我们访问 matrix [i] [0] 时，缓存可以加载包含 [i] [0] ，因此，访问 matrix [i] [1] ， ] ，...，将受益于缓存速度，因为 matrix [i] [1] ， matrix [i ] [2] ，...靠近 matrix [i] [0] 。
 
 $ b $然而，当我们访问 matrix [j] [0] 时，它远离 matrix [j-1] [0] / code>，并且可能未被缓存，并且不能受益于缓存速度。特别是，矩阵通常作为连续的大段存储器存储，并且cacher可以断言存储器访问的行为并且总是缓存存储器。
 
 
 这是为什么 matrix [i] [j] 更快。这是典型的基于CPU缓存的性能优化。
 
I was going through loops and found a significant difference in accessing loops.
I can't understand what is the thing that causes such difference in both cases?

First Example:

Execution Time; 8 seconds
for (int kk = 0; kk < 1000; kk++)
{
    sum = 0;
    for (int i = 0; i < 1024; i++)
        for (int j = 0; j < 1024; j++)
        {
            sum += matrix[i][j];
        }
}
Second Example:

Execution Time: 23 seconds
for (int kk = 0; kk < 1000; kk++)
{
    sum = 0;
    for (int i = 0; i < 1024; i++)
        for (int j = 0; j < 1024; j++)
        {
            sum += matrix[j][i];
        }
}
What causes so much execution time difference just exchanging
matrix[i][j] 
to 
matrix[j][i]
?
 解决方案 
It's an issue of memory cache.

matrix[i][j] has better cache hits than matrix[j][i], since matrix[i][j] has more continuous memory accessing chances.

For example, when we access matrix[i][0], the cache may load a continuous segment of memory containing matrix[i][0], thus, accessing matrix[i][1], matrix[i][2], ..., will benefit from caching speed, since matrix[i][1], matrix[i][2], ... are near to matrix[i][0].

However, when we access matrix[j][0], it is far from matrix[j - 1][0] and may not been cached, and can not benefit from caching speed. Especially, a matrix is normally stored as a continuous big segment of memory, and the cacher may predicate the behavior of memory accessing and always cache the memory.

That's why matrix[i][j] is faster. This is typical in CPU cache based performance optimizing.

                        这篇关于为什么在这个C ++ for循环的执行时间有显着的差异？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

为什么在这个C ++ for循环的执行时间有显着的差异？ [英] Why is there a significant difference in this C++ for loop's execution time?

问题描述

相关文章

C/C++开发最新文章

热门教程

热门工具

登录关闭

为什么在这个C ++ for循环的执行时间有显着的差异？ [英] Why is there a significant difference in this C++ for loop&#39;s execution time?

问题描述

相关文章

C/C++开发最新文章

热门教程

热门工具

登录 关闭

为什么在这个C ++ for循环的执行时间有显着的差异？ [英] Why is there a significant difference in this C++ for loop's execution time?

登录关闭