Avoiding virtual function calls in numerical C++

Question

I'm writing some numerical simulation code in C++. In this simulation, there are some things that are "local", having a floating point value at every point on a two-dimensional grid, and others that are "global", having only a single global floating point value.

Aside from this difference, the two types of object behave similarly, and so I would like to be able to have an array that contains both types of object. However, because this is a numerical simulation, I need to do this in a way that (a) avoids virtual function call overheads as much as possible, and (b) allows the compiler to use optimisations as much as possible, and in particular allows the compiler to do SIMD auto-vectorisation where possible.

Currently I'm finding myself writing code like this (which, I now realise, will not actually work as intended):

class Base {};

class Local: public Base {
public:
    float data[size];
    // plus constructors etc.
};

class Global: public Base {
public:
    float data;
    // ...
};

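// Arguments are taken by reference so the updates to a are visible to the caller.
void doStuff(Local &a, const Local &b) {
    for (int i = 0; i < size; ++i) {
        a.data[i] += b.data[i];
    }
}

void doStuff(Local &a, const Global &b) {
    for (int i = 0; i < size; ++i) {
        a.data[i] += b.data;
    }
}

void doStuff(Global &a, const Local &b) {
    for (int i = 0; i < size; ++i) {
        a.data += b.data[i];
    }
}

void doStuff(Global &a, const Global &b) {
    a.data += b.data*size;  // no loop needed: the same value is added size times
}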



My code is a bit more complex than this - the array is two dimensional, and there are several doStuff-type functions that have three rather than two arguments, so I have to write eight specialisations for each one.

The reason this doesn't work as intended is that the types of the arguments to doStuff are not actually known at compile time. What I want to do is to have an array of Base * and to call doStuff on two of its members. I then want the correct specialisation of doStuff to be called for the specific types of its arguments. (It doesn't matter if there's a virtual method call involved in doStuff - I just want to avoid them in the inner loop.)

The point of doing it this way rather than (for example) overloading operator[] is that the compiler can (hopefully) do SIMD auto-vectorisation for doStuff(Local, Local) and doStuff(Local, Global), and I can lose the loop entirely in doStuff(Global, Global). Perhaps there are other compiler optimisations that can happen in these functions as well.

However, it's annoying to have to write such repetitive code. Consequently I'm wondering whether there's a way to achieve this using templates, so that I can just write one function doStuff(Base, Base) and code equivalent to the above will be generated. (I hope that gcc is smart enough to optimise away the loop in the case of doStuff(Global, Global).)

I stress that the following solution is not what I'm looking for, since it involves a virtual function call on every iteration through the loop, which adds overhead and probably prevents a lot of compiler optimisations.

class Base {
public:
    virtual float &operator[](int) = 0;
    virtual ~Base() {}
};

class Local: public Base {
    float data[size];
public:
    float &operator[](int i) {
        return data[i];
    }
    // …
};

class Global: public Base {
    float data;
public:
    float &operator[](int i) {
        return data;
    }
    // ...
};

void doStuff(Base &a, Base &b) {
    for (int i = 0; i < size; ++i) {
        a[i] += b[i];  // two virtual calls per iteration
    }
}

I would like to achieve a similar effect to the above, but without the overhead of a virtual function call on every iteration through the inner loop. (Unless I'm completely wrong, and the compiler can actually optimise away all the virtual function calls and generate code like the above. In that case you could save me a lot of time by telling me this!)

I did have a look at CRTP, but it's not obvious how to adapt it to this case, at least not to me, because of the multiple overloaded arguments to doStuff.

Recommended answer

You almost have the answer. A template function like this should work (though I don't know where size is coming from):

template<typename A, typename B>
void doStuff(A & a, B & b) {
    for (int i = 0; i < size; ++i) {
        a[i] += b[i];
    }
}

Here you still have an overloaded operator[], but it isn't virtual, so each call is resolved at compile time and can be inlined into the loop.
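
To make the idea concrete, here is a minimal self-contained sketch. It assumes that size is a compile-time constant (the value 8 is arbitrary) and that Local and Global need no common base class for this part; the bounds check in Local is my own addition and is not in the question.

```cpp
#include <cassert>

constexpr int size = 8;  // assumption: the grid size is known at compile time

struct Local {
    float data[size] = {};
    // Non-virtual, so the call can be inlined into the loop body.
    float &operator[](int i) {
        assert(i >= 0 && i < size);  // debug-only check, compiled out with -DNDEBUG
        return data[i];
    }
};

struct Global {
    float data = 0.0f;
    // Ignores the index: every grid point sees the same value.
    float &operator[](int) { return data; }
};

// One definition; the compiler instantiates it separately for each
// (A, B) combination that is actually used.
template <typename A, typename B>
void doStuff(A &a, B &b) {
    for (int i = 0; i < size; ++i) {
        a[i] += b[i];
    }
}
```

Calling doStuff(l, g) with a Local l and a Global g instantiates the Local/Global version, and so on for the other combinations. Whether a given loop really gets vectorised depends on the compiler and flags; with GCC, -fopt-info-vec reports which loops were vectorised.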

If you don't know at call time what types you have, but you have a fixed number of derived types, then creating a static dispatch is an option:

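// Note: dynamic_cast requires Base to have at least one virtual function.
void doStuff( Base & a, Base & b ) {
    Local * a_local = dynamic_cast<Local*>(&a);
    Global * a_global = dynamic_cast<Global*>(&a);
    // same for b
    if( a_local && b_local ) {
        doStuffImpl(*a_local, *b_local);
    } else if( a_local && b_global ) {
        doStuffImpl(*a_local, *b_global);
    } // ... and so on for the remaining combinations
}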

You'll notice the code in the if block is essentially the same for every condition, assuming doStuffImpl is a template function. I'd suggest wrapping this up in a macro to reduce the code overhead. You may also wish to track the type on your own and not use dynamic_cast: have an enum in your Base class which explicitly lists the types. This is a safety mechanism that basically prevents unknown derived classes from appearing at doStuff.
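
A rough sketch of the enum variant follows. The names Kind, kind and withStaticType are made up for illustration, size is assumed to be a compile-time constant, and I've swapped the macro for a small helper template that takes a generic lambda (this needs C++14).

```cpp
#include <cassert>

constexpr int size = 8;  // assumption: compile-time grid size

enum class Kind { Local, Global };  // hypothetical type tag

struct Base {
    const Kind kind;
protected:
    explicit Base(Kind k) : kind(k) {}
};

struct Local : Base {
    float data[size] = {};
    Local() : Base(Kind::Local) {}
    float &operator[](int i) { return data[i]; }
};

struct Global : Base {
    float data = 0.0f;
    Global() : Base(Kind::Global) {}
    float &operator[](int) { return data; }
};

template <typename A, typename B>
void doStuffImpl(A &a, B &b) {
    for (int i = 0; i < size; ++i) a[i] += b[i];
}

// Hypothetical helper standing in for the macro: recovers the static
// type of x from its tag and passes the downcast reference to f.
template <typename F>
void withStaticType(Base &x, F &&f) {
    switch (x.kind) {
    case Kind::Local:  f(static_cast<Local &>(x));  return;
    case Kind::Global: f(static_cast<Global &>(x)); return;
    }
    assert(false && "unknown Kind");
}

// The branching happens once per call, outside the inner loop.
void doStuff(Base &a, Base &b) {
    withStaticType(a, [&](auto &sa) {
        withStaticType(b, [&](auto &sb) { doStuffImpl(sa, sb); });
    });
}
```

Nesting the helper once per argument covers all four combinations without writing them out, and a three-argument function would need one more level of nesting rather than eight hand-written branches. Adding a new derived type means extending the enum and the switch in one place.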

Unfortunately this type of approach is required. It's the only way to convert from dynamic types to static ones. And if you wish to use templates you need the static ones.
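
If C++17 is available, the same dynamic-to-static conversion can be delegated to the standard library instead of being written by hand. This is a different technique from the one above, offered only as a sketch: it stores the objects in a std::variant rather than behind Base *, the alias name Field is made up, and size is again assumed to be a compile-time constant.

```cpp
#include <cassert>
#include <variant>

constexpr int size = 8;  // assumption: compile-time grid size

struct Local {
    float data[size] = {};
    float &operator[](int i) { return data[i]; }
};

struct Global {
    float data = 0.0f;
    float &operator[](int) { return data; }
};

using Field = std::variant<Local, Global>;  // hypothetical alias

void doStuff(Field &a, Field &b) {
    // Precondition of this sketch: a and b are distinct objects.
    assert(&a != &b && "operands must be distinct");
    // std::visit selects the right instantiation of the generic lambda
    // once per call; the loop inside contains no dynamic dispatch.
    std::visit([](auto &sa, auto &sb) {
        for (int i = 0; i < size; ++i) sa[i] += sb[i];
    }, a, b);
}
```

An array of Field then plays the role of the array of Base *. The dispatch cost is still paid once per doStuff call, so this does not contradict the point above; it only moves the boilerplate into the library.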
