Inline member operators vs inline operators C++

Question

If I have two structs:

struct A
{
    float x, y;
    inline A operator*(A b) 
    {
        A out;
        out.x = x * b.x;
        out.y = y * b.y;
        return out;
    } 
};

and an equivalent struct:

struct B
{
    float x, y;
};

inline B operator*(B a, B b) 
{
    B out;
    out.x = a.x * b.x;
    out.y = a.y * b.y;
    return out;
} 

Would you know of any reason for B's operator* to compile any differently, or run any slower or faster than A's operator* (the actual actions that go on inside the functions should be irrelevant)?

What I mean is... would declaring the inline operator as a member, vs not as a member, have any generic effect on the speed of the actual function, whatsoever?

I've got a number of different structs that currently follow the inline member operator style... But I was wanting to modify it to be valid C code, instead; so before I do that I wanted to know if there would be any changes to performance/compilation.

Recommended answer

The way you have it written, I'd expect B's operator* to run slightly slower. This is because the "under the hood" implementation of A::operator* is like:

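// Conceptual only: this is not valid C++, because `this` cannot be declared
// as a parameter. It shows the hidden first argument a member function receives.
inline A A::operator*(A* this, A b)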
{ 
    A out;
    out.x = this->x * b.x;
    out.y = this->y * b.y;
    return out;
}

So A passes a pointer to its left-hand-side argument to the function, while B has to make a copy of that parameter before calling the function. Both have to make copies of their right-hand-side parameters.

Your code would be much better off, and probably would implement the same for A and B, if you wrote it using references and made it const correct:

struct A
{
    float x, y;
    inline A operator*(const A& b) const 
    {
        A out;
        out.x = x * b.x;
        out.y = y * b.y;
        return out;
    } 
};

struct B
{
    float x, y;
};

inline B operator*(const B& a, const B& b) 
{
    B out;
    out.x = a.x * b.x;
    out.y = a.y * b.y;
    return out;
}

You still want to return objects, not references, since the results are effectively temporaries (you're not returning a modified existing object).
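As a rough usage sketch of the const-correct versions, the block below shows that both forms accept const operands and temporaries and return a fresh object by value. The helper names scale_a and scale_b are hypothetical and exist only for this illustration.

```cpp
// Sketch only: const-correct member operator vs. free operator.
struct A
{
    float x, y;
    // const member function: callable on const objects and temporaries
    A operator*(const A& b) const
    {
        A out;
        out.x = x * b.x;
        out.y = y * b.y;
        return out;  // returned by value: the result is a new object
    }
};

struct B
{
    float x, y;
};

inline B operator*(const B& a, const B& b)
{
    B out;
    out.x = a.x * b.x;
    out.y = a.y * b.y;
    return out;
}

// Hypothetical helpers. `v` is const, so `v * ...` on A compiles only
// because A::operator* is itself declared const.
inline A scale_a(const A& v) { return v * A{2.0f, 2.0f}; }
inline B scale_b(const B& v) { return v * B{2.0f, 2.0f}; }
```

With the original non-const A::operator*(A b), the expression inside scale_a would be rejected, since a non-const member function cannot be called on a const object.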

Addendum

However, with the const pass-by-reference for both arguments, in B, would it make it effectively faster than A, due to the dereferencing?

First off, both involve the same dereferencing when you spell out all the code. (Remember, accessing members of this implies a pointer dereference.)
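A small sketch of that point: inside a member function, an unqualified member name is shorthand for access through this. The struct V and its member function names are hypothetical, chosen only for this illustration.

```cpp
// Sketch only: unqualified member access is access through `this`.
struct V
{
    float x, y;

    // Implicit form: `x` and `y` are looked up through the hidden `this`.
    float product_implicit() const { return x * y; }

    // Explicit form: the same accesses spelled out.
    float product_explicit() const { return this->x * this->y; }

    // Both spellings name the same object, so the addresses match.
    bool same_address() const { return &x == &this->x; }
};
```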

But even then, it depends on how smart your compiler is. In this case, let's say it looks at your structure and decides it can't stuff it in a register because it's two floats, so it will use pointers to access them. So the dereferenced pointer case (which is what references get implemented as) is the best you'll get. The assembly is going to look something like this (this is pseudo-assembly-code):

// Setup for the function. Usually already done by the inlining.
r1 <- this
r2 <- &result
r3 <- &b

// Actual function.
r4 <- r1[0]
r4 <- r4 * r3[0]
r2[0] <- r4
r4 <- r1[4]
r4 <- r4 * r3[4]
r2[4] <- r4

This is assuming a RISC-like architecture (say, ARM). x86 probably uses fewer steps, but it gets expanded to about this level of detail by the instruction decoder anyway. The point is that it's all fixed-offset dereferences of pointers in registers, which is about as fast as it will get. The optimizer can try to be smarter and implement the objects across several registers, but that kind of optimizer is a lot harder to write. (Though I have a sneaking suspicion that an LLVM-type compiler/optimizer could do that optimization easily if result were merely a temporary object that is not preserved.)

So, since you're using this, you have an implicit pointer dereference. But what if the object were on the stack? Doesn't help; stack variables turn into fixed-offset dereferences of the stack pointer (or frame pointer, if used). So you're dereferencing a pointer somewhere in the end, unless your compiler is bright enough to take your object and spread it across multiple registers.

Feel free to pass the -S option to gcc to get the generated assembly for the final code and see what's really happening in your case.
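One possible way to set up that comparison is sketched below. The file name compare.cpp and the wrapper names mul_member and mul_free are assumptions made for this example. The wrappers are deliberately not inline, so each operator leaves a named function in the assembly output that can be compared side by side.

```cpp
// compare.cpp (hypothetical file name)
// One way to emit assembly with GCC:
//   g++ -O2 -S compare.cpp -o compare.s
// Then compare the bodies of mul_member and mul_free in compare.s.

struct A
{
    float x, y;
    A operator*(const A& b) const
    {
        A out;
        out.x = x * b.x;
        out.y = y * b.y;
        return out;
    }
};

struct B
{
    float x, y;
};

inline B operator*(const B& a, const B& b)
{
    B out;
    out.x = a.x * b.x;
    out.y = a.y * b.y;
    return out;
}

// Non-inline wrappers with external linkage, so the compiler must emit them.
A mul_member(const A& l, const A& r) { return l * r; }
B mul_free(const B& l, const B& r) { return l * r; }
```

What the two function bodies look like depends on the compiler, target, and optimization level, so it is worth checking with the flags you actually build with.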
