What's the fastest way to copy two adjacent bytes in C?

Problem description

Ok so let's start with the most obvious solution:

memcpy(Ptr, (const char[]){'a', 'b'}, 2);

Calling a library function carries quite some overhead. Compilers sometimes don't optimize the call away, and I wouldn't rely on compiler optimizations anyway: even though GCC is smart, if I port the program to a more exotic platform with a poor compiler, I don't want to depend on it.

So now there's a more direct approach:

Ptr[0] = 'a';
Ptr[1] = 'b';

It doesn't involve any library function overhead, but it makes two separate assignments. Third, we have a type pun:

*(uint16_t*)Ptr = *(uint16_t*)(unsigned char[]){'a', 'b'};

Which one should I use in a bottleneck? What's the fastest way to copy only two bytes in C?

Regards,
Hank Sauri

Solution

Only two of the approaches you suggested are correct:

memcpy(Ptr, (const char[]){'a', 'b'}, 2);

and

Ptr[0] = 'a';
Ptr[1] = 'b';

With x86-64 GCC 10.2, both compile to identical code:

mov     eax, 25185
mov     WORD PTR [something], ax

This is possible because of the as-if rule.
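
As a side note on the listing above, the constant 25185 is simply the two characters packed into one 16-bit value with the first byte in the low half, which is what a little-endian x86 store writes to memory. A minimal sketch of that arithmetic, assuming an ASCII execution character set ('a' is 97, 'b' is 98); `pack_le16` and `check_constant` are hypothetical helper names, not something from the compiler output:

```c
#include <assert.h>
#include <stdint.h>

/* Pack two bytes the way a little-endian 16-bit store lays them out:
   the first byte in memory is the least significant one. */
static uint16_t pack_le16(unsigned char first, unsigned char second) {
    return (uint16_t)(first | (second << 8));
}

/* Assumes ASCII: 'a' == 0x61 and 'b' == 0x62, so the value is 0x6261. */
static void check_constant(void) {
    assert(pack_le16('a', 'b') == 0x6261);
    assert(pack_le16('a', 'b') == 25185);
}
```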

Since a good compiler can figure out that these are identical, use the one that is easier to write in your case. If you're setting one or two bytes, use the latter; if you're setting several, use the former, or use a string literal instead of a compound literal array.
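
To make the options concrete, here is a small sketch of the two well-defined variants from the question next to the string-literal form mentioned above. The function names (`put_ab_memcpy`, `put_ab_assign`, `put_ab_string`, `check_variants`) are made up for illustration, and the destination is assumed to have room for at least two bytes:

```c
#include <assert.h>
#include <string.h>

/* Variant 1: memcpy from a compound literal, as in the question. */
static void put_ab_memcpy(char *ptr) {
    memcpy(ptr, (const char[]){'a', 'b'}, 2);
}

/* Variant 2: two separate byte assignments. */
static void put_ab_assign(char *ptr) {
    ptr[0] = 'a';
    ptr[1] = 'b';
}

/* Variant 3: memcpy from a string literal. Only 2 bytes are copied,
   so the terminating '\0' of "ab" is not written. */
static void put_ab_string(char *ptr) {
    memcpy(ptr, "ab", 2);
}

/* All three must leave the same two bytes in the destination. */
static void check_variants(void) {
    char x[2], y[2], z[2];
    put_ab_memcpy(x);
    put_ab_assign(y);
    put_ab_string(z);
    assert(memcmp(x, y, 2) == 0);
    assert(memcmp(y, z, 2) == 0);
}
```

Whether a given compiler merges each variant into a single 16-bit store is up to that compiler; the sketch only shows that the observable result is the same.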


The third one you suggested

*(uint16_t*)Ptr = *(uint16_t*)(unsigned char[]){'a', 'b'};

also compiles to the same code when using x86-64 GCC 10.2, i.e. it would behave identically in this case.

In addition, though, it has two to four points of undefined behaviour: it violates strict aliasing twice (once at the source, once at the destination), and each of those accesses may also be unaligned. Undefined behaviour does not mean that the code must fail to work as you intended, but neither does it mean that it has to work as you intended. The behaviour is simply undefined, and it can fail on any processor, including x86. Why would you care so much about performance on a bad compiler that you would write code that can fail on a good compiler?!
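
If you really do want to move the two bytes as one 16-bit unit, a well-defined way to express that is to go through `memcpy` on a `uint16_t` object instead of casting pointers. This is only a sketch under that assumption; `load_u16`, `store_u16` and `check_unaligned_copy` are hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read 16 bits from any address without aliasing or alignment problems. */
static uint16_t load_u16(const void *src) {
    uint16_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Write 16 bits to any address without aliasing or alignment problems. */
static void store_u16(void *dst, uint16_t value) {
    memcpy(dst, &value, sizeof value);
}

/* Copy 'a','b' to an address that may not be 2-byte aligned. */
static void check_unaligned_copy(void) {
    unsigned char buf[5] = {0};
    const unsigned char src[2] = {'a', 'b'};
    store_u16(buf + 1, load_u16(src));
    assert(buf[1] == 'a' && buf[2] == 'b');   /* same byte order as the source */
    assert(buf[0] == 0 && buf[3] == 0);       /* neighbours untouched */
}
```

Because the bytes are copied in memory order in both directions, the result does not depend on the endianness of the machine.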
