GNU as: how to load a .bss/.data symbol to a register?
My problem is very basic. I'm writing my first bare-metal program in assembly. The architecture is ARMv7-M, the assembler is GNU as, and I'm writing in UAL.
I have a variable in .bss (or .data, doesn't matter) declared as follows:
.lcomm a_variable, 4
Then I want to read its value somewhere in the program. For that I first load its address into a register and then load the value of the variable itself into another register:
adr r0, a_variable
ldr r1, [r0, #0]
So far so good. The compiled object contains my a_variable symbol:
00000000 b a_variable
And the generated instructions look like this:
0: f2af 0004 subw r0, pc, #4
4: 6801 ldr r1, [r0, #0]
The problem begins when I link the object into the resulting image. ld relocates the a_variable symbol into the final .bss section at a new address:
20001074 b a_variable
But the final code remains the same, so the program really tries to read a_variable from address 0x0 rather than from 0x20001074.
I expected ld to substitute the new address somehow, because it seems to do so when you link objects compiled by GCC. For example, if I write a piece of C code doing something similar:
static int a_variable;
void foo(void)
{
    a_variable = 5;
}
...then I get the following instructions in the object file:
0: f240 0300 movw r3, #0
4: f2c0 0300 movt r3, #0
8: 2005 movs r0, #5
a: 6018 str r0, [r3, #0]
...but the final image looks like this:
800c: f242 338c movw r3, #9100 ; 0x238c
8010: f2c0 0301 movt r3, #1
8014: 2005 movs r0, #5
8016: 6018 str r0, [r3, #0]
So ld appears to have substituted the real address for the placeholder that was left in the object file.
My question is: why doesn't this work for hand-written assembler code? What am I missing?
The ADR instruction only works when used with a nearby symbol (+/- 4095 in Thumb2 mode) defined in the same section and source file. The GNU assembler should have given an error for referencing a symbol in a different section. In ARM mode your code generates the error
Error: symbol .bss is in a different section
but there's apparently a bug in how GAS handles the ADR instruction in Thumb mode that causes it to silently accept the reference.
Instead, you can use either the LDR pseudo-instruction (ldr Rd, =symbol) or the MOVW/MOVT instructions to load an arbitrary 32-bit constant, including an address, into a register. LDR puts the address into a constant pool and loads it from there, while the MOVW/MOVT instructions form the constant in two steps, just as your compiler does. The former takes only 6 bytes (2 for the instruction, 4 for the constant); the latter two instructions take 8 bytes. For example:
.syntax unified
.arch armv7-m
.code 16
.bss
.lcomm a_variable, 4
.text
ldr r1, =a_variable
movw r2, #:lower16:a_variable
movt r2, #:upper16:a_variable
When assembled, linked, and disassembled, this gives:
$ arm-linux-gnueabi-as -o test.o test.s
$ arm-linux-gnueabi-ld -Tbss=f0000000 test.o
arm-linux-gnueabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000010074
$ arm-linux-gnueabi-objdump -d a.out
...
00010074 <.text>:
10074: 4902 ldr r1, [pc, #8] ; (10080 <__bss_start-0x10f80>)
10076: f240 0200 movw r2, #0
1007a: f2cf 0200 movt r2, #61440 ; 0xf000
1007e: 0000 movs r0, r0
10080: f0000000 .word 0xf0000000
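Applied to the code in the question, the fix might look like the sketch below. It assumes the same a_variable declaration and unified syntax as the example above. The registers r0 and r1 simply mirror the question, and .thumb is used here as the equivalent of .code 16. Treat it as an illustration rather than a drop-in replacement.

```asm
        .syntax unified
        .arch armv7-m
        .thumb                          @ same effect as .code 16

        .bss
        .lcomm a_variable, 4

        .text
        @ Variant 1: the address lives in a literal pool word that ld relocates.
        ldr     r0, =a_variable         @ r0 = address of a_variable
        ldr     r1, [r0]                @ r1 = value stored in a_variable

        @ Variant 2: build the address in two halves, as GCC did in the question.
        movw    r0, #:lower16:a_variable
        movt    r0, #:upper16:a_variable
        ldr     r1, [r0]                @ r1 = value stored in a_variable
```

In both variants the object file should carry relocations against a_variable (they can be listed with objdump -r), which is what gives ld something to patch at link time.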