Is there any performance difference between x+=y and x=x+y (x and y are both simple types) in C#?

Question

In C/C++,

The compound-assignment operators combine the simple-assignment operator with another binary operator. Compound-assignment operators perform the operation specified by the additional operator, then assign the result to the left operand. For example, a compound-assignment expression such as

expression1 += expression2

can be understood as

expression1 = expression1 + expression2

However, the compound-assignment expression is not equivalent to the expanded version because the compound-assignment expression evaluates expression1 only once, while the expanded version evaluates expression1 twice: in the addition operation and in the assignment operation.

(Quoted from Microsoft Docs)


For example:

  1. For i+=2;, i would be modified directly, without any new object being created.
  2. For i=i+2;, a copy of i would be created first. The copy would be modified and then assigned back to i.

        i_copied = i;
        i_copied += 2;
        i = i_copied;

Without any compiler optimizations, the second method constructs a useless instance, which degrades performance.


In C#, operators like += cannot be overloaded, and all simple types such as int or double are declared as readonly struct (does that mean all structs in C# are actually immutable?).

I wonder whether C# has an expression that forces an object to be modified directly (at least for simple types), without creating any useless instances.

Also, can the C# compiler optimize the expression x=x+y into x+=y, as expected, if the constructors and destructors have no side effects?

Solution


C#

When you compile C# into a .NET assembly, the code is stored as MSIL (Microsoft Intermediate Language), which keeps it portable. The .NET runtime then JIT-compiles it for execution.

MSIL is a stack-based language. It does not know the details of the target hardware (such as how many registers the CPU has). There is only one way to write that addition:

    ldloc.0
    ldloc.1
    add
    stloc.0

Push the first local onto the stack, push the second, add※ them, then pop the result into the first local.

※: add pops two elements from the stack, adds them, and pushes the result back onto the stack.

Thus, both x=x+y and x+=y will yield the same code.


Of course, further optimizations happen afterwards. The JIT compiler converts the MSIL into actual machine code.

This is what I see with SharpLab:

    mov ecx, [ebp-4]
    add ecx, [ebp-8]
    mov [ebp-4], ecx

So, we copy [ebp-4] into ecx, add [ebp-8] to it, and then copy ecx back to [ebp-4].

So... Is the register ecx a useless instance?


Well, that is SharpLab, and that is JIT. A different compiler could in theory convert the code into something different on a different platform.

You can compile .NET code AOT to a native image, which can be more aggressive with optimizations. That said, I do not see how you would improve upon a simple addition. Oh, I know: it might see that you do not use the value and remove the addition, or see that you always add the same values and replace the result with a constant.

It might be worth noting that the modern .NET JIT can continue to optimize code during execution (tiered compilation): it quickly produces a poorly optimized native version of the code and later, once a better version is ready, replaces it. This design follows from the fact that on a JIT runtime, performance depends on both the time it takes to create the native code and the time that native code takes to run.


C++

Let us see what C++ does. This is what I see for both x = x + y and x += y using godbolt (default settings※):

    mov     eax, DWORD PTR [rbp-8]
    add     DWORD PTR [rbp-4], eax
    mov     eax, DWORD PTR [rbp-4]

The instructions mov, add, mov match the ones we got from SharpLab, with a different choice of registers.

※: x86-64 gcc 9.3 with -g -o /tmp/compiler-explorer-compiler2020424-22672-17cap6k.bjoj/output.s -masm=intel -S -fdiagnostics-color=always /tmp/compiler-explorer-compiler2020424-22672-17cap6k.bjoj/example.cpp

Adding the compiler option -O made the code go away, which makes sense because I was not using the result.
