虚拟函数和性能C ++ [英] Virtual Functions and Performance C++

查看:199
本文介绍了虚拟函数和性能C ++的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在你嘲弄重复的标题之前,另一个问题不适合我在这里问(IMO)。所以。

Before you cringe at the duplicate title, the other question wasn't suited to what I ask here (IMO). So.

我真的想在我的应用程序中使用虚函数使事情更容易一百倍(不是那么简单的OOP)。但是我在某个地方看到了他们的性能成本,看到没有什么,只是同样老的炒作的过早优化,我决定给它一个快速的旋转在一个小的基准测试使用:

I am really wanting to use virtual functions in my application to make things a hundred times easier (isn't that what OOP is all about ;)). But I read somewhere they came at a performance cost, seeing nothing but the same old contrived hype of premature optimization, I decided to give it a quick whirl in a small benchmark test using:

CProfiler.cpp

#include "CProfiler.h"

CProfiler::CProfiler(void (*func)(void), unsigned int iterations) {
    gettimeofday(&a, 0);
    for (;iterations > 0; iterations --) {
        func();
    }
    gettimeofday(&b, 0);
    result = (b.tv_sec * (unsigned int)1e6 + b.tv_usec) - (a.tv_sec * (unsigned int)1e6 + a.tv_usec);
};

main.cpp

#include "CProfiler.h"

#include <iostream>

class CC {
  protected:
    int width, height, area;
  };

class VCC {
  protected:
    int width, height, area;
  public:
    virtual void set_area () {}
  };

class CS: public CC {
  public:
    void set_area () { area = width * height; }
  };

class VCS: public VCC {
  public:
    void set_area () {  area = width * height; }
  };

void profileNonVirtual() {
    CS *abc = new CS;
    abc->set_area();
    delete abc;
}

void profileVirtual() {
    VCS *abc = new VCS;
    abc->set_area();
    delete abc;
}

int main() {
    int iterations = 5000;
    CProfiler prf2(&profileNonVirtual, iterations);
    CProfiler prf(&profileVirtual, iterations);

    std::cout << prf.result;
    std::cout << "\n";
    std::cout << prf2.result;

    return 0;
}

最初我只做了100次和10000次迭代,结果令人担忧: 4ms为非虚拟化,250ms为虚拟化!我几乎去了nooooooo里面,但后来我把迭代增加到大约500,000;看到结果几乎完全相同(如果没有启用优化标志,可能会慢5%)。

At first I only did 100 and 10000 iterations, and the results were worrying: 4ms for non virtualised, and 250ms for the virtualised! I almost went "nooooooo" inside, but then I upped the iterations to around 500,000; to see the results become almost completely identical (maybe 5% slower without optimization flags enabled).

我的问题是,为什么会有这么大的变化,迭代相比高量?纯粹是因为在许多迭代中,虚拟函数在缓存中很热?

My question is, why was there such a significant change with a low amount of iterations compared to high amount? Was it purely because the virtual functions are hot in cache at that many iterations?

免责声明

我理解我的'profiling'代码不是完美的,但它,因为它,给出了一个事情的估计,这是所有重要的在这一层。

Disclaimer
I understand that my 'profiling' code is not perfect, but it, as it has, gives an estimate of things, which is all that matters at this level. Also I am asking these questions to learn, not to solely optimize my application.

推荐答案

扩展 Charles的回答

这里的问题是,你的循环不只是测试虚拟调用本身(内存分配可能使虚拟调用的开销比较大),所以他的建议是更改代码,以便只测试虚拟调用。

The problem here is that your loop is doing more than just testing the virtual call itself (the memory allocation probably dwarfs the virtual call overhead anyway), so his suggestion is to change the code so that only the virtual call is tested.

函数是模板,因为模板可以被内联,而调用通过函数指针是不可能的。

Here the benchmark function is template, because template may be inlined while call through function pointers are unlikely to.

template <typename Type>
double benchmark(Type const& t, size_t iterations)
{
  timeval a, b;
  gettimeofday(&a, 0);
  for (;iterations > 0; --iterations) {
    t.getArea();
  }
  gettimeofday(&b, 0);
  return (b.tv_sec * (unsigned int)1e6 + b.tv_usec) -
         (a.tv_sec * (unsigned int)1e6 + a.tv_usec);
}

课程:

struct Regular
{
  Regular(size_t w, size_t h): _width(w), _height(h) {}

  size_t getArea() const;

  size_t _width;
  size_t _height;
};

// The following line in another translation unit
// to avoid inlining
size_t Regular::getArea() const { return _width * _height; }

struct Base
{
  Base(size_t w, size_t h): _width(w), _height(h) {}

  virtual size_t getArea() const = 0;

  size_t _width;
  size_t _height;
};

struct Derived: Base
{
  Derived(size_t w, size_t h): Base(w, h) {}

  virtual size_t getArea() const;
};

// The following two functions in another translation unit
// to avoid inlining
size_t Derived::getArea() const  { return _width * _height; }

std::auto_ptr<Base> generateDerived()
{
  return std::auto_ptr<Base>(new Derived(3,7));
}

而测量:

int main(int argc, char* argv[])
{
  if (argc != 2) {
    std::cerr << "Usage: %prog iterations\n";
    return 1;
  }

  Regular regular(3, 7);
  std::auto_ptr<Base> derived = generateDerived();

  double regTime = benchmark<Regular>(regular, atoi(argv[1]));
  double derTime = benchmark<Base>(*derived, atoi(argv[1]));

  std::cout << "Regular: " << regTime << "\nDerived: " << derTime << "\n";

  return 0;
}



注意:这将测试虚拟调用的开销一个常规函数。功能不同(因为在第二种情况下没有运行时分派),但是这是最糟糕的开销。

运行结果(gcc.3.4.2,-O2,SLES10 quadcore服务器)注意:单位,以防止内嵌

Results of the run (gcc.3.4.2, -O2, SLES10 quadcore server) note: with the functions definitions in another translation unit, to prevent inlining

> ./test 5000000
Regular: 17041
Derived: 17194

这篇关于虚拟函数和性能C ++的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆