为什么要使用XOR用文字而不是反转(按位NOT) [英] Why use xor with a literal instead of inversion (bitwise not)
问题描述
我所遇到这个CRC32 code ,并很好奇为什么笔者会选择使用
CRC = CRC ^〜0U;
而不是
CRC =〜CRC;
据我所知,他们是等价的。
我甚至拆解在Visual Studio 2010的两个版本。
未优化编译:
CRC = CRC ^〜0U;
009D13F4 MOV EAX,DWORD PTR [CRC]
009D13F7 XOR EAX,0FFFFFFFFH
009D13FA MOV DWORD PTR [CRC],EAX CRC =〜CRC;
011C13F4 MOV EAX,DWORD PTR [CRC]
011C13F7不EAX
011C13F9 MOV DWORD PTR [CRC],EAX
我也不能考虑每个指令获取因为两者应该采取1周期完成的周期数证明code。事实上,的 XOR 的可能会被其从某处加载的文字,虽然我不能肯定这有一个点球。
所以我想离开,这可能是只是一个preferred的方式来描述算法,而不是一个最优化...这是正确?
编辑1:
由于我刚刚意识到类型的CRC
变量可能是必须提到我,包括整个code(更少的查找表,太大了)在这里,所以你不必跟随链接。
uint32_t的CRC32(CRC uint32_t的,常量无效* buf中,为size_t大小)
{
常量uint8_t有* P; P = BUF;
CRC = CRC ^〜0U; 而(size--)
{
CRC = crc32_tab [(CRC ^ * P +)及为0xFF] ^(CRC>→8);
} 返回CRC ^〜0U;
}
编辑2:
既然有人提出了一个事实,即优化的建立将是兴趣,我做了一个,包括它下面的。
优化编译:
请注意,全功能(包括在下面的最后一次编辑)的内联。
// CRC = CRC ^〜0U;
zeroCrc = 0;
zeroCrc = CRC32(zeroCrc,zeroBufferSmall,sizeof的(zeroBufferSmall));
00971148 MOV ECX,14H
0097114D LEA EDX,[EBP-40H]
00971150或EAX,0FFFFFFFFH
00971153 MOVZX ESI,字节PTR [EDX]
00971156异或ESI,EAX
00971158和ESI,0FFh的
0097115E SHR EAX,8
00971161 XOR EAX,DWORD PTR ___ defaultmatherr + 4(973018h)ESI * 4]
00971168添加EDX,EBX
0097116A子ECX,EBX
0097116C JNE主+ 153H(971153h)
0097116E不EAX
00971170 MOV EBX,EAX// CRC =〜CRC;
zeroCrc = 0;
zeroCrc = CRC32(zeroCrc,zeroBufferSmall,sizeof的(zeroBufferSmall));
01251148 MOV ECX,14H
0125114D LEA EDX,[EBP-40H]
01251150或EAX,0FFFFFFFFH
01251153 MOVZX ESI,字节PTR [EDX]
01251156异或ESI,EAX
01251158和ESI,0FFh的
0125115E SHR EAX,8
01251161 XOR EAX,DWORD PTR ___ defaultmatherr + 4(1253018h)ESI * 4]
01251168添加EDX,EBX
0125116A子ECX,EBX
0125116C JNE主+ 153H(1251153h)
0125116E不EAX
01251170 MOV EBX,EAX
东西没有人没有提到;如果code被编译的机器上16位 unsigned int类型
那么这两个code片段是的不同的
CRC
指定为32位无符号整型。 〜CRC
将反转所有位,但如果 unsigned int类型
是16bit那么 CRC = CRC ^ 〜0U
只会反转低16位。
我不知道有足够的了解了CRC算法知道这是否是有意还是一个bug,或许可以HIVERT澄清;虽然看着张贴OP样品code,它的确做出后面的循环中的差异。
NB。对不起张贴这是一个答案,因为它不是一个答案,但它太大只适合评论:)
I have come across this CRC32 code and was curious why the author would choose to use
crc = crc ^ ~0U;
instead of
crc = ~crc;
As far as I can tell, they are equivalent.
I have even disassembled the two versions in Visual Studio 2010.
Not optimized build:
crc = crc ^ ~0U;
009D13F4 mov eax,dword ptr [crc]
009D13F7 xor eax,0FFFFFFFFh
009D13FA mov dword ptr [crc],eax
crc = ~crc;
011C13F4 mov eax,dword ptr [crc]
011C13F7 not eax
011C13F9 mov dword ptr [crc],eax
I also cannot justify the code by thinking about the number of cycles that each instruction takes since both should be taking 1 cycle to complete. In fact, the xor might have a penalty by having to load the literal from somewhere, though I am not certain of this.
So I'm left thinking that it is possibly just a preferred way to describe the algorithm, rather than an optimization... Would that be correct?
Edit 1:
Since I just realized that the type of the crc
variable is probably important to mention I am including the whole code (less the lookup table, way too big) here so you don't have to follow the link.
uint32_t crc32(uint32_t crc, const void *buf, size_t size)
{
const uint8_t *p;
p = buf;
crc = crc ^ ~0U;
while (size--)
{
crc = crc32_tab[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
}
return crc ^ ~0U;
}
Edit 2:
Since someone has brought up the fact that an optimized build would be of interest, I have made one and included it below.
Optimized build:
Do note that the whole function (included in the last edit below) was inlined.
// crc = crc ^ ~0U;
zeroCrc = 0;
zeroCrc = crc32(zeroCrc, zeroBufferSmall, sizeof(zeroBufferSmall));
00971148 mov ecx,14h
0097114D lea edx,[ebp-40h]
00971150 or eax,0FFFFFFFFh
00971153 movzx esi,byte ptr [edx]
00971156 xor esi,eax
00971158 and esi,0FFh
0097115E shr eax,8
00971161 xor eax,dword ptr ___defaultmatherr+4 (973018h)[esi*4]
00971168 add edx,ebx
0097116A sub ecx,ebx
0097116C jne main+153h (971153h)
0097116E not eax
00971170 mov ebx,eax
// crc = ~crc;
zeroCrc = 0;
zeroCrc = crc32(zeroCrc, zeroBufferSmall, sizeof(zeroBufferSmall));
01251148 mov ecx,14h
0125114D lea edx,[ebp-40h]
01251150 or eax,0FFFFFFFFh
01251153 movzx esi,byte ptr [edx]
01251156 xor esi,eax
01251158 and esi,0FFh
0125115E shr eax,8
01251161 xor eax,dword ptr ___defaultmatherr+4 (1253018h)[esi*4]
01251168 add edx,ebx
0125116A sub ecx,ebx
0125116C jne main+153h (1251153h)
0125116E not eax
01251170 mov ebx,eax
Something nobody's mentioned yet; if this code is being compiled on a machine with 16 bit unsigned int
then these two code snippets are different.
crc
is specified as a 32-bit unsigned integral type. ~crc
will invert all bits, but if unsigned int
is 16bit then crc = crc ^ ~0U
will only invert the lower 16 bits.
I don't know enough about the CRC algorithm to know whether this is intentional or a bug, perhaps hivert can clarify; although looking at the sample code posted by OP, it certainly does make a difference to the loop that follows.
NB. Sorry for posting this as an "answer" because it isn't an answer, but it's too big to just fit in a comment :)
这篇关于为什么要使用XOR用文字而不是反转(按位NOT)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!