gdb behaves differently for symbols in the .bss, vs. symbols in .data
Problem description
I recently started learning assembly language for the Intel x86-64 architecture using YASM. While solving one of the tasks suggested in a book (by Ray Seyfarth), I ran into the following problem:
When I place some characters into a buffer in the .bss section, I still see an empty string when I examine it in gdb. Characters placed into a buffer in the .data section show up in gdb as expected.
segment .bss
result resb 75
buf resw 100
usage resq 1
segment .data
str_test db 0, 0, 0, 0
segment .text
global main
main:
mov rbx, 'A'
mov [buf], rbx ; LINE - 1 STILL GET EMPTY STRING AFTER THAT INSTRUCTION
mov [str_test], rbx ; LINE - 2 PLACES CHARACTER NICELY.
ret
In gdb I get:
after LINE 1: x/s &buf, result - 0x7ffff7dd2740: ""
after LINE 2: x/s &str_test, result - 0x601030: "A"
It looks like &buf isn't evaluating to the correct address, so gdb still sees all zeros. According to /proc/PID/maps, 0x7ffff7dd2740 isn't in the BSS of the process being debugged, so that makes no sense. Why does &buf evaluate to the wrong address while &str_test evaluates to the right one? Neither is a "global" symbol, but we did build with debug info.
Tested with GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10 on x86-64 Ubuntu 15.10.
I'm building with:
yasm -felf64 -Worphan-labels -gdwarf2 buf-test.asm
gcc -g buf-test.o -o buf-test
nm on the executable shows the correct symbol addresses:
$ nm -n buf-test # numeric sort, heavily edited to omit symbols from glibc
...
0000000000601028 D __data_start
0000000000601038 d str_test
...
000000000060103c B __bss_start
0000000000601040 b result
000000000060108b b buf
0000000000601153 b usage
(Editor's note: I rewrote a lot of the question because the weirdness is in gdb's behaviour, not in the OP's asm!)
Recommended answer
glibc includes a symbol named buf, as well.
(gdb) info variables ^buf$
All variables matching regular expression "^buf$":
File strerror.c:
static char *buf;
Non-debugging symbols:
0x000000000060108b buf <-- this is our buf
0x00007ffff7dd6400 buf <-- this is glibc's buf
gdb happens to choose the symbol from glibc over the symbol from the executable. This is why ptype buf shows char *.
Using a different name for the buffer avoids the problem, and so does adding global buf to make it a global symbol. You also wouldn't have the problem if you wrote a stand-alone program that doesn't link libc (i.e. one that defines _start and makes an exit system call instead of executing ret).
Note that 0x00007ffff7dd6400 (the address of buf on my system; different from yours) is not actually a stack address. It looks like one at a glance, but it has a different number of f digits after the 7. Sorry for the confusion in comments and in an earlier edit of the question.
Shared libraries are also loaded near the top of the low 47 bits of virtual address space, near where the stack is mapped. They're position-independent, but a library's BSS has to be in the right place relative to its code. Checking /proc/PID/maps again more carefully, gdb's &buf is in fact in the rwx block of anonymous memory (not mapped to any file) right next to the mapping for libc-2.21.so.
7ffff7a0f000-7ffff7bcf000 r-xp 00000000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7bcf000-7ffff7dcf000 ---p 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dcf000-7ffff7dd3000 r-xp 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd3000-7ffff7dd5000 rwxp 001c4000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd5000-7ffff7dd9000 rwxp 00000000 00:00 0 <--- &buf is in this mapping
...
7ffffffdd000-7ffffffff000 rwxp 00000000 00:00 0 [stack] <---- more FFs before the first non-FF than in &buf.
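The same lookup can be done mechanically instead of by eye. The following is a minimal Python sketch, not something from the original answer: find_mapping is a hypothetical helper, and the sample text is copied from the listing above with the arrow annotations and the elided lines dropped.

```python
# Hypothetical helper: find the /proc/PID/maps line whose range contains an address.
# MAPS is copied from the listing above (annotations and elided lines omitted).
MAPS = """\
7ffff7a0f000-7ffff7bcf000 r-xp 00000000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7bcf000-7ffff7dcf000 ---p 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dcf000-7ffff7dd3000 r-xp 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd3000-7ffff7dd5000 rwxp 001c4000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd5000-7ffff7dd9000 rwxp 00000000 00:00 0
7ffffffdd000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
"""


def find_mapping(maps_text, addr):
    """Return (start, end, perms, name) for the mapping containing addr, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a mapping line
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= addr < end:
            # The sixth field (path or [stack]) is absent for anonymous memory.
            name = fields[5] if len(fields) > 5 else ""
            return start, end, fields[1], name
    return None


glibc_buf = 0x00007FFFF7DD6400  # address gdb listed for glibc's buf in the answer
hit = find_mapping(MAPS, glibc_buf)
print(hit)
```

For a live process, the text of /proc/PID/maps for the debugged PID could be passed in place of MAPS. With the sample above, the address lands in the anonymous rwx block next to libc, not in the [stack] mapping.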
A normal call instruction with a rel32 encoding can't reach a library function from the executable, but it doesn't need to: GNU/Linux shared libraries have to support symbol interposition, so calls to library functions actually go to the PLT, where an indirect jmp (with a pointer from the GOT) reaches the final destination.
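To make the reach argument concrete, here is a rough Python sketch of the arithmetic. The call-site and PLT addresses are assumptions chosen for illustration (a non-PIE executable with code near 0x400000); only the libc address comes from the maps listing in this answer, and rel32_reachable is a hypothetical helper name.

```python
# call rel32 encodes a signed 32-bit displacement, measured from the end of
# the call instruction, so the target must lie within roughly +/- 2 GiB.
REL32_MIN, REL32_MAX = -(1 << 31), (1 << 31) - 1


def rel32_reachable(next_insn_addr, target):
    """True if target fits in a rel32 displacement from next_insn_addr."""
    return REL32_MIN <= target - next_insn_addr <= REL32_MAX


call_site_end = 0x400500       # assumed: address just past a call in the executable
plt_stub = 0x400400            # assumed: a PLT entry inside the same executable
libc_text = 0x7FFFF7A0F000     # start of libc's r-xp mapping in the listing

print(rel32_reachable(call_site_end, libc_text))  # direct call into libc
print(rel32_reachable(call_site_end, plt_stub))   # call to a nearby PLT stub
```

Under these assumptions the first check fails and the second succeeds, which is why the direct call goes to the nearby PLT stub and the long-distance hop is made through a 64-bit pointer loaded from the GOT.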