C++: Structs slower to access than basic variables?


Question

I found some code that had "optimization" like this:

void somefunc(SomeStruct param){
    float x = param.x; // param.x and x are both floats. supposedly this makes it faster access
    float y = param.y;
    float z = param.z;
}

The comments said that this makes variable access faster, but I've always thought that accessing a struct member is just as fast as accessing a plain variable.

Could someone clear this up for me?

Recommended answer

The usual rules for optimization (Michael A. Jackson) apply: 1. Don't do it. 2. (For experts only:) Don't do it yet.

That being said, let's assume it's the innermost loop that takes 80% of the time of a performance-critical application. Even then, I doubt you will ever see any difference. Let's use this piece of code for instance:

struct Xyz {
    float x, y, z;
};

float f(Xyz param){
    return param.x + param.y + param.z;
}

float g(Xyz param){
    float x = param.x;
    float y = param.y;
    float z = param.z;
    return x + y + z;
}

Running it through LLVM shows: only with optimizations disabled do the two behave as written (g copies the struct members into locals, then sums those; f sums the values fetched from param directly). With standard optimization levels, both result in identical code (extracting the values once, then summing them).
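To repeat this comparison yourself, here is a minimal sketch of a test file. It assumes the two functions from above live in a file called xyz.cpp (a hypothetical name) and adds a helper, check_equivalence, which is not part of the original code and only confirms that both versions return the same value. With clang, something like `clang++ -O0 -S -emit-llvm xyz.cpp -o -` versus `clang++ -O2 -S -emit-llvm xyz.cpp -o -` should print the IR for each optimization level; with GCC, `g++ -O2 -S xyz.cpp -o -` prints the assembly instead.

```cpp
#include <cassert>

struct Xyz {
    float x, y, z;
};

// Sums the members straight from the parameter.
float f(Xyz param) {
    return param.x + param.y + param.z;
}

// Copies the members into locals first (the supposed "optimization").
float g(Xyz param) {
    float x = param.x;
    float y = param.y;
    float z = param.z;
    return x + y + z;
}

// Hypothetical helper, not from the original post: a sanity check that
// both versions compute the same result for one sample input.
void check_equivalence() {
    Xyz p{1.0f, 2.0f, 3.0f};
    assert(f(p) == g(p));
}
```

The sanity check says nothing about speed. The interesting part is comparing the generated code for f and g at each optimization level.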

For short code, this "optimization" is actually harmful, as it copies the floats needlessly. For longer code using the members in several places, it might help a teensy bit if you actively tell your compiler to be stupid (that is, compile with optimizations disabled). A quick test with 65 (instead of 2) additions of the members/locals confirms this: with no optimizations, f repeatedly loads the struct members while g reuses the already extracted locals. The optimized versions are again identical and both extract the members only once. (Surprisingly, there's no strength reduction turning the additions into multiplications even with LTO enabled, but that just indicates the LLVM version used isn't optimizing too aggressively anyway, so it should work just as well in other compilers.)
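A scaled-down sketch of that longer test might look like the following. It uses four repeated sums rather than the 65 additions mentioned above, and the names f_repeated and g_repeated are made up for illustration; the exact code used in the original test is not shown in the answer.

```cpp
#include <cassert>

struct Xyz {
    float x, y, z;
};

// Reads the struct members at every use site.
float f_repeated(Xyz param) {
    float sum = 0.0f;
    sum += param.x + param.y + param.z;
    sum += param.x + param.y + param.z;
    sum += param.x + param.y + param.z;
    sum += param.x + param.y + param.z;
    return sum;
}

// Extracts the members into locals once, then reuses the locals.
float g_repeated(Xyz param) {
    float x = param.x;
    float y = param.y;
    float z = param.z;
    float sum = 0.0f;
    sum += x + y + z;
    sum += x + y + z;
    sum += x + y + z;
    sum += x + y + z;
    return sum;
}
```

Compiled without optimizations, f_repeated would be expected to reload the members for each line while g_repeated reads its locals; with optimizations on, the generated code for the two should match, as described above.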

So, the bottom line is: unless you know your code will have to be compiled by a compiler that's so outrageously stupid and/or ancient that it won't optimize anything, you now have proof that the compiler will make both ways equivalent, and you can thus do away with this crime against readability and brevity committed in the name of performance. (Repeat the experiment for your particular compiler if necessary.)
