GNU as: how to load a .bss/.data symbol to a register?

Question

My problem is very basic. I'm writing my first bare-metal program in assembly. The architecture is ARMv7-M, I'm using GNU as, and I'm writing in UAL.

I have a variable in .bss (or .data, doesn't matter) declared as follows:

.lcomm a_variable, 4

Then I want to read its value somewhere in the program. For that I first load its address into a register and then load the value of the variable itself into another register:

adr     r0, a_variable
ldr     r1, [r0, #0]

So far so good. The compiled object contains my a_variable symbol:

00000000 b a_variable

And the generated instructions look like this:

0:  f2af 0004   subw    r0, pc, #4
4:  6801        ldr     r1, [r0, #0]

The problem begins when I want to link the object into the resulting image. ld relocates the a_variable symbol into the final .bss section at a new address:

20001074 b a_variable

But the final code remains the same, and the program really tries to read a_variable from address 0x0 rather than from 0x20001074.

I expected ld to somehow substitute the new address, because it seems to do so when you link objects compiled by GCC. I mean, if I write a piece of C code doing something similar:

static int a_variable;
void foo(void)
{
    a_variable = 5;
}

...then I get the following instructions in the object file:

0:  f240 0300   movw    r3, #0
4:  f2c0 0300   movt    r3, #0
8:  2005        movs    r0, #5
a:  6018        str r0, [r3, #0]

...but the final image looks like this:

800c:       f242 338c       movw    r3, #9100       ; 0x238c
8010:       f2c0 0301       movt    r3, #1
8014:       2005            movs    r0, #5
8016:       6018            str     r0, [r3, #0]

So ld appears to have substituted the real address for the placeholder that was left.

My question is: why doesn't this work in the case of hand-written assembler code? What am I missing?

Answer

The ADR instruction only works when used with a nearby symbol (+/- 4095 in Thumb2 mode) defined in the same section and source file. The GNU assembler should have given an error for referencing a symbol in a different section. In ARM mode your code generates an "Error: symbol .bss is in a different section" error, but there's apparently a bug in how GAS handles the ADR instruction in Thumb mode that causes it to silently accept it.

Instead, you can use either the LDR or the MOVW/MOVT instructions to load an arbitrary 32-bit constant, including an address, into a register. The LDR instruction will put the address into a constant pool and load it from there, while the MOVW/MOVT instructions form the constant in two steps, just as your compiler does. The former takes only 6 bytes (2 for the instruction, 4 for the constant, plus any alignment padding before the constant); the latter two instructions take 8 bytes. For example:

    .syntax unified
    .arch armv7-m
    .code 16

    .bss
    .lcomm a_variable, 4

    .text

    ldr     r1, =a_variable
    movw    r2, #:lower16:a_variable
    movt    r2, #:upper16:a_variable

Which when assembled, linked and disassembled gives:

$ arm-linux-gnueabi-as -o test.o test.s
$ arm-linux-gnueabi-ld -Tbss=f0000000 test.o
arm-linux-gnueabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000010074
$ arm-linux-gnueabi-objdump -d a.out
...    
00010074 <.text>:
   10074:       4902            ldr     r1, [pc, #8]    ; (10080 <__bss_start-0x10f80>)
   10076:       f240 0200       movw    r2, #0
   1007a:       f2cf 0200       movt    r2, #61440      ; 0xf000
   1007e:       0000            movs    r0, r0
   10080:       f0000000        .word   0xf0000000
