What factors make iterators so slow in debug mode (VC++ 2012)

Problem description

I've got a vector of 10000 random numbers (each taken mod 100) and I'd like to count how many pairs of those numbers sum to 100. I've written the following:

auto noPairsSumTo100 = 0;
const auto itEnd = end(myNums);
for (auto it1 = begin(myNums); it1 != itEnd ; ++it1) {
  for (auto it2 = it1; it2 != itEnd ; ++it2) {
    if (*it1 + *it2 == 100) {
      noPairsSumTo100 ++;
    }
  }
}

On my machine this takes about 21.6 seconds to run in debug mode. If I set _ITERATOR_DEBUG_LEVEL=0 (which sets both _SECURE_SCL and _HAS_ITERATOR_DEBUGGING to 0) the execution time is reduced to ~9.5 seconds. Replacing the != comparisons with < reduces the time further to ~8.5 seconds.
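For readers who have not used this setting, here is a minimal sketch of one way to apply it from source, assuming an MSVC toolchain. The macro has to be defined before the first standard library header is included, and with the same value in every translation unit and library that gets linked together. The helper `iteratorDebugLevel` is a hypothetical name used only to show which value the translation unit was compiled with.

```cpp
// Assumption: MSVC standard library. Define the macro before ANY standard
// header, and use the same value in every translation unit of the program.
#define _ITERATOR_DEBUG_LEVEL 0

#include <vector>

// Hypothetical helper (not part of any library): reports the level this
// translation unit was compiled with.
int iteratorDebugLevel() {
    return _ITERATOR_DEBUG_LEVEL;
}
```

The same effect can usually be had by passing the definition on the compiler command line (for example /D_ITERATOR_DEBUG_LEVEL=0), which avoids depending on include order.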

If I implement the same algorithm by indexing the vectors like this:

auto noPairsSumTo100 = 0;
const auto itEnd = end(myNums);
for (auto index1 = 0; index1 < noTerms; ++index1) {
  for (auto index2 = index1; index2 < noTerms; ++index2) {
    if (myNums[index1] + myNums[index2] == 100) {
      noPairsSumTo100 ++;
    }
  }
}

It takes about 2.1 seconds to run in debug mode. I think this is as close as I can make the algorithms aside from iterator usage. My question is, what makes the first implementation take ~4 times longer than the second?

Note that both versions of the algorithm take about 34 milliseconds to run in release mode, so the difference is optimised out.
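To reproduce the comparison, the two loops can be wrapped in functions and timed with the same harness. The following is only a sketch: the function names `countPairsWithIterators`, `countPairsWithIndices` and `elapsedMs` are made up for illustration, and the index version uses std::size_t rather than the `noTerms` variable from the snippet above, whose definition is not shown.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Iterator version: same loop shape as the first snippet in the question.
int countPairsWithIterators(const std::vector<int>& nums) {
    int count = 0;
    const auto itEnd = nums.end();
    for (auto it1 = nums.begin(); it1 != itEnd; ++it1) {
        for (auto it2 = it1; it2 != itEnd; ++it2) {
            if (*it1 + *it2 == 100) {
                ++count;
            }
        }
    }
    return count;
}

// Index version: same loop shape as the second snippet in the question.
int countPairsWithIndices(const std::vector<int>& nums) {
    int count = 0;
    const std::size_t n = nums.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            if (nums[i] + nums[j] == 100) {
                ++count;
            }
        }
    }
    return count;
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsedMs(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Each function can then be timed by passing a lambda that calls it to `elapsedMs`. In an optimised build it helps to store and print the returned count so the compiler cannot discard the loop.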

Recommended answer

Bounds checking aside, debug builds of STL code produce an enormous number of function calls. Some innocent-looking lines like:

if (a.empty())

can produce as many as 8 (or more) function calls. Some (all?) STL implementations are not optimized for debug builds at all.
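To make the layering concrete, here is a toy sketch. It is not the real MSVC implementation: `ToyVector`, `ToyIterator`, `g_calls` and `callsForEmpty` are hypothetical names, and the structure only assumes that `empty()` is written as `begin() == end()` on top of a small iterator wrapper. Each function bumps a counter to stand in for a call that a debug build would not inline.

```cpp
// Counts every function that would be a real call when nothing is inlined.
static int g_calls = 0;

// Hypothetical toy iterator wrapper; NOT the real library type.
struct ToyIterator {
    const int* ptr;
    explicit ToyIterator(const int* p) : ptr(p) { ++g_calls; }
    const int* base() const { ++g_calls; return ptr; }
};

// Comparison goes through base() on both sides, adding two more calls.
inline bool operator==(const ToyIterator& a, const ToyIterator& b) {
    ++g_calls;
    return a.base() == b.base();
}

// Hypothetical toy container whose empty() is layered on begin()/end().
struct ToyVector {
    const int* first;
    const int* last;
    ToyIterator begin() const { ++g_calls; return ToyIterator(first); }
    ToyIterator end() const { ++g_calls; return ToyIterator(last); }
    bool empty() const { ++g_calls; return begin() == end(); }
};

// Returns how many counted calls a single empty() triggers.
int callsForEmpty(const ToyVector& v) {
    g_calls = 0;
    (void)v.empty();
    return g_calls;
}
```

In this toy layout a single `empty()` costs eight counted calls: `empty`, `begin`, `end`, two iterator constructors, `operator==` and two `base` calls. An optimising build would normally collapse all of them into one pointer comparison.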

A common performance issue with the STL is that developers assume function inlining always works. It doesn't. If too many inline functions are nested inside one another, the ones at the bottom of the call chain do not get inlined, and the function call overhead alone causes a massive performance hit. This is very common with containers of containers:

map< string, map< int, string>>

Operations on the outer map can cause functions that were meant to be inlined to remain ordinary function calls.
