Determine cause of segfault when using -O3?


Problem description


I'm having trouble determining the cause of a segfault when a program is compiled with -O3 with GCC 4.8/4.9/5.1. For GCC 4.9.x, I've seen it on Cygwin, Debian 8 (x64) and Fedora 21 (x64). Others have experienced it on GCC 4.8 and 5.1.

The program is fine under -O2, fine with other versions of GCC, and fine under other compilers (like MSVC, ICC and Clang).

Below is the crash under GDB, but nothing is jumping out at me. The source code from misc.cpp:26 is shown further down, but it's a simple XOR:

((word64*)buf)[i] ^= ((word64*)mask)[i];

The code in question checks for 64-bit word alignment prior to the cast. From the disassembly under -O3, I know it has something to do with the vmovdqa instruction:

(gdb) disass 0x0000000000539fc3
...

   0x0000000000539fbc <+220>:   vxorps 0x0(%r13,%r10,1),%ymm0,%ymm0
=> 0x0000000000539fc3 <+227>:   vmovdqa %ymm0,0x0(%r13,%r10,1)
   0x0000000000539fca <+234>:   add    $0x20,%r10

It appears GCC is using vector instructions at -O3 (AVX here, judging by the ymm registers), and not using them at -O2. (Thanks to Alejandro for the suggestion.)

I'm going to naively ask: does vmovdqa have alignment requirements greater than a 64-bit word? If so, why is GCC selecting it when the words are not 128-bit aligned?

What is causing the segfault here? How do I troubleshoot it further?


Also see Bug 66852 - vmovdqa instructions issued on 64-bit aligned array, causes segfault. It was filed in response to this issue, so it is unconfirmed at the moment.


$ gdb ./cryptest.exe 
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
...
(gdb) r v
...
Testing MessageDigest algorithm SHA-3-224.
.....
Program received signal SIGSEGV, Segmentation fault.
0x0000000000539fc3 in CryptoPP::xorbuf (buf=0x98549a "efghijde", 
    mask=mask@entry=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., count=count@entry=0x5e) at misc.cpp:26
26                  ((word64*)buf)[i] ^= ((word64*)mask)[i];


(gdb) where
#0  0x0000000000539fc3 in CryptoPP::xorbuf (buf=0x98549a "efghijde", 
    mask=mask@entry=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., count=count@entry=0x5e) at misc.cpp:26
#1  0x0000000000561eb0 in CryptoPP::SHA3::Update (this=0x985480, 
    input=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., 
    length=0x5e) at sha3.cpp:264
#2  0x00000000005bac1a in CryptoPP::HashVerificationFilter::NextPutMultiple (
    this=0x7fffffffd390, 
    inString=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., 
    length=0x5e) at filters.cpp:786
#3  0x00000000005bd8a2 in NextPutMaybeModifiable (modifiable=<optimized out>, 
    length=0x5e, 
    inString=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., 
    this=0x7fffffffd390) at filters.h:200
#4  CryptoPP::FilterWithBufferedInput::PutMaybeModifiable (
    this=0x7fffffffd390, 
    inString=0x7fffffffbfeb "efghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu", 'a' <repeats 106 times>..., 
    length=<optimized out>, messageEnd=0x0, blocking=<optimized out>, 
...


-O3 disassembly and register values.

(gdb) disass 0x0000000000539fc3
Dump of assembler code for function CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long):
   0x0000000000539ee0 <+0>: lea    0x8(%rsp),%r10
   0x0000000000539ee5 <+5>: and    $0xffffffffffffffe0,%rsp
   0x0000000000539ee9 <+9>: mov    %rdx,%rax
   0x0000000000539eec <+12>:    pushq  -0x8(%r10)
   0x0000000000539ef0 <+16>:    push   %rbp
   0x0000000000539ef1 <+17>:    shr    $0x3,%rax
   0x0000000000539ef5 <+21>:    mov    %rsp,%rbp
   0x0000000000539ef8 <+24>:    push   %r15
   0x0000000000539efa <+26>:    push   %r14
   0x0000000000539efc <+28>:    push   %r13
   0x0000000000539efe <+30>:    push   %r12
   0x0000000000539f00 <+32>:    push   %r10
   0x0000000000539f02 <+34>:    push   %rbx
   0x0000000000539f03 <+35>:    je     0x53a00a <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+298>
   0x0000000000539f09 <+41>:    lea    0x20(%rdi),%rcx
   0x0000000000539f0d <+45>:    cmp    %rcx,%rsi
   0x0000000000539f10 <+48>:    lea    0x20(%rsi),%rcx
   0x0000000000539f14 <+52>:    setae  %r8b
   0x0000000000539f18 <+56>:    cmp    %rcx,%rdi
   0x0000000000539f1b <+59>:    setae  %cl
   0x0000000000539f1e <+62>:    or     %cl,%r8b
   0x0000000000539f21 <+65>:    je     0x53a300 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+1056>
   0x0000000000539f27 <+71>:    cmp    $0x8,%rax
   0x0000000000539f2b <+75>:    jbe    0x53a300 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+1056>
   0x0000000000539f31 <+81>:    mov    %rdi,%rcx
   0x0000000000539f34 <+84>:    and    $0x1f,%ecx
   0x0000000000539f37 <+87>:    shr    $0x3,%rcx
   0x0000000000539f3b <+91>:    neg    %rcx
   0x0000000000539f3e <+94>:    and    $0x3,%ecx
   0x0000000000539f41 <+97>:    cmp    %rax,%rcx
   0x0000000000539f44 <+100>:   cmova  %rax,%rcx
   0x0000000000539f48 <+104>:   xor    %r8d,%r8d
   0x0000000000539f4b <+107>:   test   %rcx,%rcx
   0x0000000000539f4e <+110>:   je     0x539f80 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+160>
   0x0000000000539f50 <+112>:   mov    (%rsi),%r8
   0x0000000000539f53 <+115>:   xor    %r8,(%rdi)
   0x0000000000539f56 <+118>:   cmp    $0x1,%rcx
   0x0000000000539f5a <+122>:   je     0x53a371 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+1169>
   0x0000000000539f60 <+128>:   mov    0x8(%rsi),%r8
   0x0000000000539f64 <+132>:   xor    %r8,0x8(%rdi)
   0x0000000000539f68 <+136>:   cmp    $0x3,%rcx
   0x0000000000539f6c <+140>:   jne    0x53a366 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+1158>
   0x0000000000539f72 <+146>:   mov    0x10(%rsi),%r8
   0x0000000000539f76 <+150>:   xor    %r8,0x10(%rdi)
   0x0000000000539f7a <+154>:   mov    $0x3,%r8d
   0x0000000000539f80 <+160>:   mov    %rax,%r11
   0x0000000000539f83 <+163>:   xor    %r10d,%r10d
   0x0000000000539f86 <+166>:   sub    %rcx,%r11
   0x0000000000539f89 <+169>:   shl    $0x3,%rcx
   0x0000000000539f8d <+173>:   xor    %ebx,%ebx
   0x0000000000539f8f <+175>:   lea    -0x4(%r11),%r9
   0x0000000000539f93 <+179>:   lea    (%rdi,%rcx,1),%r13
   0x0000000000539f97 <+183>:   shr    $0x2,%r9
   0x0000000000539f9b <+187>:   add    %rsi,%rcx
   0x0000000000539f9e <+190>:   add    $0x1,%r9
   0x0000000000539fa2 <+194>:   lea    0x0(,%r9,4),%r12
   0x0000000000539faa <+202>:   add    $0x1,%rbx
   0x0000000000539fae <+206>:   vmovdqu (%rcx,%r10,1),%xmm0
   0x0000000000539fb4 <+212>:   vinsertf128 $0x1,0x10(%rcx,%r10,1),%ymm0,%ymm0
   0x0000000000539fbc <+220>:   vxorps 0x0(%r13,%r10,1),%ymm0,%ymm0
=> 0x0000000000539fc3 <+227>:   vmovdqa %ymm0,0x0(%r13,%r10,1)
   0x0000000000539fca <+234>:   add    $0x20,%r10
   0x0000000000539fce <+238>:   cmp    %r9,%rbx
   0x0000000000539fd1 <+241>:   jb     0x539faa <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+202>
   0x0000000000539fd3 <+243>:   lea    (%r8,%r12,1),%rcx
   0x0000000000539fd7 <+247>:   cmp    %r12,%r11
   0x0000000000539fda <+250>:   je     0x53a006 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+294>
   0x0000000000539fdc <+252>:   mov    (%rsi,%rcx,8),%r8
   0x0000000000539fe0 <+256>:   xor    %r8,(%rdi,%rcx,8)
   0x0000000000539fe4 <+260>:   lea    0x1(%rcx),%r8
   0x0000000000539fe8 <+264>:   cmp    %r8,%rax
   0x0000000000539feb <+267>:   jbe    0x53a006 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+294>
   0x0000000000539fed <+269>:   add    $0x2,%rcx
   0x0000000000539ff1 <+273>:   mov    (%rsi,%r8,8),%r9
   0x0000000000539ff5 <+277>:   xor    %r9,(%rdi,%r8,8)
   0x0000000000539ff9 <+281>:   cmp    %rcx,%rax
   0x0000000000539ffc <+284>:   jbe    0x53a006 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+294>
   0x0000000000539ffe <+286>:   mov    (%rsi,%rcx,8),%r8
   0x000000000053a002 <+290>:   xor    %r8,(%rdi,%rcx,8)
   0x000000000053a006 <+294>:   shl    $0x3,%rax

And:

(gdb) info r ymm0 r13 r10
ymm0           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 
  v4_double = {0x8000000000000000, 0x8000000000000000, 0x8000000000000000, 
    0x8000000000000000}, v32_int8 = {0x66, 0x67, 0x68, 0x69, 0x6a, 0x6b, 0x65, 
    0x66, 0x67, 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x66, 0x67, 0x68, 0x69, 0x6a, 
    0x6b, 0x6c, 0x6d, 0x67, 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x68, 
    0x69}, v16_int16 = {0x6766, 0x6968, 0x6b6a, 0x6665, 0x6867, 0x6a69, 
    0x6c6b, 0x6766, 0x6968, 0x6b6a, 0x6d6c, 0x6867, 0x6a69, 0x6c6b, 0x6e6d, 
    0x6968}, v8_int32 = {0x69686766, 0x66656b6a, 0x6a696867, 0x67666c6b, 
    0x6b6a6968, 0x68676d6c, 0x6c6b6a69, 0x69686e6d}, v4_int64 = {
    0x66656b6a69686766, 0x67666c6b6a696867, 0x68676d6c6b6a6968, 
    0x69686e6d6c6b6a69}, v2_int128 = {0x67666c6b6a69686766656b6a69686766, 
    0x69686e6d6c6b6a6968676d6c6b6a6968}}
r13            0x9854a2 0x9854a2
r10            0x0  0x0
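
The register dump above can be checked by hand. The sketch below is an illustration added for clarity, not code from the post: it hard-codes the addresses printed by GDB and tests them against the 32-byte alignment that vmovdqa requires for a ymm memory operand. The names is_aligned, peeled_words, kBuf and kStore are hypothetical, and peeled_words only mirrors the arithmetic at offsets +81 to +94 of the -O3 disassembly (the cmova clamp against count/8 is left out).

```cpp
#include <cstddef>
#include <cstdint>

// True when addr is a multiple of alignment (alignment must be a power of two).
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Mirrors the prologue at +81..+94: number of 8-byte words GCC peels off
// before the vector loop, assuming the pointer is already 8-byte aligned.
constexpr std::uintptr_t peeled_words(std::uintptr_t addr) {
    return (0 - ((addr & 0x1f) >> 3)) & 0x3;
}

// Values copied from the GDB session above.
constexpr std::uintptr_t kBuf   = 0x98549a;  // buf argument of xorbuf
constexpr std::uintptr_t kStore = 0x9854a2;  // r13 + r10, target of the faulting vmovdqa

static_assert(!is_aligned(kBuf, 8),    "buf is not even 8-byte aligned");
static_assert(!is_aligned(kStore, 32), "store address is not 32-byte aligned");
static_assert(kBuf + 8 * peeled_words(kBuf) == kStore,
              "one peeled word explains r13");
```

Under these assumptions, buf itself is only 2-byte aligned, so a prologue that assumes an 8-byte aligned word64 pointer cannot reach a 32-byte boundary by peeling whole words. That points at the cast of a misaligned pointer to word64* as the likely trigger, although the post does not show why the IsAligned<word64> check accepted this pointer.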


When compiled with -O2 and a breakpoint on the line in question, here's the disassembly. ((word64*)buf)[i] ^= ((word64*)mask)[i]; moved to line 31:

Breakpoint 1, CryptoPP::xorbuf (buf=0x985488 "", 
    mask=mask@entry=0x7fffffffc01d "The quick brown fox", 'a' <repeats 181 times>..., count=count@entry=0x13) at misc.cpp:31
31                  ((word64*)buf)[i] ^= ((word64*)mask)[i];
(gdb) disass
Dump of assembler code for function CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long):
   0x0000000000532150 <+0>: mov    %rdx,%rcx
   0x0000000000532153 <+3>: shr    $0x3,%rcx
   0x0000000000532157 <+7>: je     0x532170 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+32>
   0x0000000000532159 <+9>: xor    %eax,%eax
=> 0x000000000053215b <+11>:    mov    (%rsi,%rax,8),%r8
   0x000000000053215f <+15>:    xor    %r8,(%rdi,%rax,8)
   0x0000000000532163 <+19>:    add    $0x1,%rax
   0x0000000000532167 <+23>:    cmp    %rcx,%rax
   0x000000000053216a <+26>:    jne    0x53215b <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+11>
   0x000000000053216c <+28>:    shl    $0x3,%rcx
   0x0000000000532170 <+32>:    sub    %rcx,%rdx
   0x0000000000532173 <+35>:    je     0x5321d0 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+128>
   0x0000000000532175 <+37>:    mov    %rdx,%r8
   0x0000000000532178 <+40>:    add    %rcx,%rdi
   0x000000000053217b <+43>:    add    %rcx,%rsi
   0x000000000053217e <+46>:    shr    $0x2,%r8
   0x0000000000532182 <+50>:    je     0x5321a8 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+88>
   0x0000000000532184 <+52>:    xor    %eax,%eax
   0x0000000000532186 <+54>:    nopw   %cs:0x0(%rax,%rax,1)
   0x0000000000532190 <+64>:    mov    (%rsi,%rax,4),%ecx
   0x0000000000532193 <+67>:    xor    %ecx,(%rdi,%rax,4)
   0x0000000000532196 <+70>:    add    $0x1,%rax
   0x000000000053219a <+74>:    cmp    %r8,%rax
   0x000000000053219d <+77>:    jne    0x532190 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+64>
   0x000000000053219f <+79>:    shl    $0x2,%r8
   0x00000000005321a3 <+83>:    sub    %r8,%rdx
   0x00000000005321a6 <+86>:    je     0x5321d8 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+136>
   0x00000000005321a8 <+88>:    lea    (%rdi,%r8,1),%rcx
   0x00000000005321ac <+92>:    xor    %eax,%eax
   0x00000000005321ae <+94>:    lea    (%rsi,%r8,1),%rdi
   0x00000000005321b2 <+98>:    nopw   0x0(%rax,%rax,1)
   0x00000000005321b8 <+104>:   movzbl (%rdi,%rax,1),%esi
   0x00000000005321bc <+108>:   xor    %sil,(%rcx,%rax,1)
   0x00000000005321c0 <+112>:   add    $0x1,%rax
   0x00000000005321c4 <+116>:   cmp    %rdx,%rax
   0x00000000005321c7 <+119>:   jb     0x5321b8 <CryptoPP::xorbuf(unsigned char*, unsigned char const*, unsigned long)+104>
   0x00000000005321c9 <+121>:   retq   
   0x00000000005321ca <+122>:   nopw   0x0(%rax,%rax,1)
   0x00000000005321d0 <+128>:   retq   
   0x00000000005321d1 <+129>:   nopl   0x0(%rax)
   0x00000000005321d8 <+136>:   retq   
End of assembler dump.


From misc.cpp, line 26 is ((word64*)buf)[i] ^= ((word64*)mask)[i];.

void xorbuf(byte *buf, const byte *mask, size_t count)
{
    size_t i;

    if (IsAligned<word32>(buf) && IsAligned<word32>(mask))
    {
        if (!CRYPTOPP_BOOL_SLOW_WORD64 && IsAligned<word64>(buf) && IsAligned<word64>(mask))
        {
            for (i=0; i<count/8; i++)
                ((word64*)buf)[i] ^= ((word64*)mask)[i];
            count -= 8*i;
            if (!count)
                return;
            buf += 8*i;
            mask += 8*i;
        }

        for (i=0; i<count/4; i++)
            ((word32*)buf)[i] ^= ((word32*)mask)[i];
        count -= 4*i;
        if (!count)
            return;
        buf += 4*i;
        mask += 4*i;
    }

    for (i=0; i<count; i++)
        buf[i] ^= mask[i];
}

Solution

You could compile with g++ -Wall -Wextra -O3 -g. You want to enable warnings, because some of them may be generated only by GCC passes that are enabled at -O3. You also want debugging info (-g) so you can use gdb, but be aware that debugging info is not always reliable under strong optimization.

You might have some pointer aliasing issues. Perhaps use (or remove) the restrict qualifier (spelled __restrict__ in GNU C++).
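
One way to sidestep both aliasing and alignment assumptions is to avoid the pointer casts altogether. The following is a minimal sketch, assuming a byte-for-byte equivalent of xorbuf is acceptable; xorbuf_portable is a hypothetical name and this is not the Crypto++ implementation. It moves each 8-byte chunk through std::memcpy, which is defined for any alignment, and compilers typically lower such fixed-size copies to ordinary loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: XOR mask into buf without casting to word64*.
void xorbuf_portable(unsigned char* buf, const unsigned char* mask, std::size_t count) {
    assert((buf != nullptr && mask != nullptr) || count == 0);

    std::size_t i = 0;
    // Process 8 bytes at a time; memcpy makes no alignment assumption.
    for (; i + 8 <= count; i += 8) {
        std::uint64_t a, b;
        std::memcpy(&a, buf + i, sizeof a);
        std::memcpy(&b, mask + i, sizeof b);
        a ^= b;
        std::memcpy(buf + i, &a, sizeof a);
    }
    // Remaining tail bytes.
    for (; i < count; ++i)
        buf[i] ^= mask[i];
}
```

Because the compiler is never told that buf points at a word64, it cannot assume 8-byte alignment when it vectorizes the loop, so any vector stores it emits have to tolerate unaligned addresses.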

Be sure to avoid undefined behavior. You might pass -fsanitize= options (notably -fsanitize=address and -fsanitize=undefined) to the g++ compiler (preferably version 5). Also try valgrind.
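
As a rough illustration of what the sanitizer can catch, the sketch below reproduces the shape of the word64 loop in a standalone function. The name xor_words_cast and the file name in the comment are hypothetical. The function is only valid when both pointers are 8-byte aligned and really refer to 64-bit objects; built with -fsanitize=undefined, a call with a misaligned pointer should be reported at run time by the alignment check instead of silently working at -O2 and crashing at -O3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Possible build line (file name is an assumption):
//   g++ -O1 -g -fsanitize=address,undefined demo.cpp

// Same shape as the word64 loop in xorbuf. Precondition: buf and mask are
// 8-byte aligned. Passing e.g. (aligned_ptr + 2) is undefined behavior,
// which is exactly what -fsanitize=undefined is meant to flag.
void xor_words_cast(unsigned char* buf, const unsigned char* mask, std::size_t count) {
    assert(count % 8 == 0);
    for (std::size_t i = 0; i < count / 8; ++i)
        ((std::uint64_t*)buf)[i] ^= ((const std::uint64_t*)mask)[i];
}
```

Calling it on properly aligned storage, for instance arrays declared as alignas(8) std::uint64_t, stays within the precondition and should produce no sanitizer report.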

BTW, you could use dump options like -fdump-tree-all (warning: they produce hundreds of files!) to better understand the internal behavior of g++; you might even customize your GCC compiler with MELT.

Also, if you are looking at the generated assembly, compile with g++ -Wall -S -O3 -fverbose-asm, since -fverbose-asm asks GCC to emit some assembler comments "explaining" (not much, but a tiny bit) the compiled code.
