How do storing into and loading from memory work? Which addresses are affected when you store a 32-bit word?

Problem description

I am working on a binary analysis project where I am building a lifter that translates assembly to LLVM. I built a memory model, but I am a bit confused about how the ARM str and ldr instructions work on memory. Take a memory address such as 0000b8f0, at which I would like to store a 64-bit value of decimal 20000000. Does the str instruction store the entire 20000000 at address 0000b8f0, or does it divide it into bytes and store the first byte at 0000b8f0, the 2nd byte at 0000b8f1, the 3rd byte at 0000b8f2, and so on? The same goes for loading from an address (0000b8f0): does the ldr instruction take just the byte stored at 0000b8f0, or the full set of bytes from 0000b8f0-0000b8f4?

Sorry if my question is very basic, but I need to make sure I correctly implement the effects of str and ldr on my memory model.

Solution

Logically (see footnote 1), memory is an array of 8-bit bytes.

Word loads/stores access more than one byte at once, just like SIMD intrinsics in C, or like the opposite of ((char*)&my_int)[2] to load the 3rd byte of an int.

C's memory model was designed around a byte-addressable machine that supports wider accesses (like PDP-11 or ARM), so it's what you're used to if you understand how char* works in C for accessing the object-representation of other objects, e.g. why memcpy works.

(I didn't use a C example of pointing an int* at a char array because the strict-aliasing rule in C makes that undefined behaviour. Only char* is allowed to alias other types in ISO C. Asm has well-defined behaviour for accessing bytes of memory with any width, with any partial or full overlap with earlier stores, as does GNU C when compiled with -fno-strict-aliasing to disable type-based alias analysis / optimization.)
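As a minimal sketch of the char* and memcpy points above, the following C fragment reads individual bytes of a 32-bit object and puts them back together. The helper names byte_of_u32 and u32_from_bytes are made up for illustration. Which byte lands at which index depends on the endianness of the machine running the code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return byte i of v's object representation, read through an
   unsigned char pointer (allowed to alias any object in ISO C). */
static uint8_t byte_of_u32(uint32_t v, size_t i)
{
    assert(i < sizeof v);
    return ((const unsigned char *)&v)[i];
}

/* Reassemble a uint32_t from 4 bytes with memcpy, the well-defined
   alternative to pointing a uint32_t* at a byte array. */
static uint32_t u32_from_bytes(const uint8_t bytes[4])
{
    uint32_t v;
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

Copying the four bytes out with byte_of_u32 and feeding them back to u32_from_bytes in the same order returns the original value on any host, because both directions use the host's own byte order.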


str is a 32-bit word store; it writes all 4 bytes at once. If you were to load from 0000b8f1, ...2, or ...3, you'd get the 2nd, 3rd, or 4th byte, so str is equivalent to 4 separate strb instructions (with shifts to extract the right bytes), except for the obvious lack of atomicity and performance.
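For a lifter's memory model, the equivalence with four strb instructions could be sketched roughly as below. This assumes a little-endian target and a flat byte array. The Memory type, MEM_SIZE, and the function names are hypothetical and not taken from any particular framework. Atomicity and alignment faults are not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 0x10000u   /* hypothetical 64 KiB address space */

typedef struct { uint8_t bytes[MEM_SIZE]; } Memory;

/* strb: write one byte at addr. */
static void store8(Memory *m, uint32_t addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    m->bytes[addr] = value;
}

/* ldrb: read one byte at addr. */
static uint8_t load8(const Memory *m, uint32_t addr)
{
    assert(addr < MEM_SIZE);
    return m->bytes[addr];
}

/* str (little-endian assumed): affects addr, addr+1, addr+2, addr+3.
   Each byte is extracted from the register value with a shift. */
static void store32_le(Memory *m, uint32_t addr, uint32_t value)
{
    for (uint32_t i = 0; i < 4; i++)
        store8(m, addr + i, (uint8_t)(value >> (8 * i)));
}

/* ldr (little-endian assumed): reads the same 4 bytes back. */
static uint32_t load32_le(const Memory *m, uint32_t addr)
{
    uint32_t value = 0;
    for (uint32_t i = 0; i < 4; i++)
        value |= (uint32_t)load8(m, addr + i) << (8 * i);
    return value;
}
```

With this sketch, storing decimal 20000000 (0x01312D00) at 0xb8f0 puts 0x00 at 0xb8f0, 0x2D at 0xb8f1, 0x31 at 0xb8f2, and 0x01 at 0xb8f3, and leaves 0xb8f4 untouched.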

str always stores 4 bytes from a 32-bit register. If a register holds a value like 2, that means the upper bytes are all zero.

ARM can be big- or little-endian. I think modern ARM systems are most often little-endian, like x86, so the least-significant byte of a value is stored at the lowest address.
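If the memory model needs to cover both byte orders, one possible shape is a store helper that takes the endianness as a parameter. This is only a sketch: store32_endian and its parameters are illustrative names, and how a real lifter learns the target's byte order is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value into a byte array in the chosen byte order.
   `mem`, `size`, and `big_endian` are illustrative parameters. */
static void store32_endian(uint8_t *mem, size_t size, size_t addr,
                           uint32_t value, bool big_endian)
{
    assert(addr <= size && size - addr >= 4);   /* all 4 bytes in range */
    for (size_t i = 0; i < 4; i++) {
        /* Little-endian: the least-significant byte goes to the lowest address.
           Big-endian: the most-significant byte goes to the lowest address. */
        unsigned shift = big_endian ? 8u * (3u - (unsigned)i)
                                    : 8u * (unsigned)i;
        mem[addr + i] = (uint8_t)(value >> shift);
    }
}
```

Storing the value 2 writes the bytes 02 00 00 00 in little-endian order and 00 00 00 02 in big-endian order. Either way, all four bytes are written, including the zero upper bytes.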


The byte at 0000b8f0 can't hold 20000000 on its own; a byte isn't that large, if that's what you're asking.

Note that 0000b8f4 is the low byte of the next word; it's a 4-byte-aligned address.

Also, storing an int64_t with the value 20000000 would require two 32-bit stores, e.g. two str instructions, or an ARMv8 stp to do a 64-bit store of a pair of registers, or an stm store-multiple instruction with two registers. Or eight strb byte-store instructions.
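In a byte-array memory model, the 64-bit case could be sketched as two 32-bit stores to adjacent words. This again assumes little-endian order, so the low half goes to the lower address. put32_le and put64_le are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit little-endian store: writes addr .. addr+3. */
static void put32_le(uint8_t *mem, size_t size, size_t addr, uint32_t value)
{
    assert(addr <= size && size - addr >= 4);
    for (size_t i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/* A 64-bit value as two 32-bit stores: writes addr .. addr+7. */
static void put64_le(uint8_t *mem, size_t size, size_t addr, uint64_t value)
{
    put32_le(mem, size, addr,     (uint32_t)value);          /* low word  */
    put32_le(mem, size, addr + 4, (uint32_t)(value >> 32));  /* high word */
}
```

For 20000000 the high word is zero, so the eight bytes written are 00 2D 31 01 00 00 00 00.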


Footnote 1: That's from a software point of view, not how memory controllers, data buses, or DRAM chips are physically organized, or even caches. Thus byte stores, and sometimes even byte loads, can be less efficient than whole-word accesses on ARM, even apart from moving only 1/4 or 1/8 as much data as str or stp.
