How to isolate byte and word array elements in a 64-bit register

Question

I can tell this is a super simple problem, but I have yet to figure it out. Basically, I just want to be able to take one element of an array, add and subtract some numbers from it using registers, and then put the result into my result variable.

segment .data
  a      dw  4, 234, -212
  b      db  112, -78, 50
  result dq  0
segment .text       
  global main
main:
  mov   rax, [a]        

I know the solution has something to do with offsets and indexing, but I don't get how I am supposed to be able to get just one array element into a register.

What can I do?

Recommended answer

If you want to treat your values as signed, you want movsx. Assuming NASM syntax:

default rel
; ... declarations and whatever    

    movsx   rax, word [a + 1*2]    ; a is an array of dw = words
    movsx   rcx, byte [b + 1*1]    ; b is an array of db = bytes

    add     rax, rcx
    mov     [result], rax         ; result is a qword

(MASM or GNU .intel_syntax would use word ptr instead of word; just add ptr to the size specifier for the memory operand.)

The 1 can be a register, as in [a + rsi*2] or [b + rsi], so you can easily loop over your arrays. See "Referencing the contents of a memory location. (x86 addressing modes)".

I wrote 1*2 instead of just 2 to indicate that it's index 1 (the 2nd array element), scaled by the element size. The assembler will evaluate the constant expression and just use the same (RIP-relative) addressing mode it would for [a] but with a different offset.
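As a rough C sketch of what those two movsx loads compute, the byte offset is the index times the element size, and the narrow value is then sign-extended to 64 bits. The function names load_word_signed, load_byte_signed, and sum_signed are made up for illustration; the arrays reuse the values from the question.

```c
#include <stdint.h>
#include <string.h>

/* Same data as the question: a is an array of words, b an array of bytes. */
static const int16_t a[] = {4, 234, -212};
static const int8_t  b[] = {112, -78, 50};

/* Hypothetical helper: read element `index` of a by computing the byte
   offset by hand (index * 2), as [a + index*2] does, then widen it the
   way movsx does. */
int64_t load_word_signed(size_t index)
{
    int16_t w;
    memcpy(&w, (const unsigned char *)a + index * sizeof(int16_t), sizeof w);
    return (int64_t)w;   /* sign extension to 64 bits */
}

/* Hypothetical helper: same idea for the byte array (scale factor 1). */
int64_t load_byte_signed(size_t index)
{
    int8_t v;
    memcpy(&v, (const unsigned char *)b + index * sizeof(int8_t), sizeof v);
    return (int64_t)v;   /* sign extension to 64 bits */
}

/* Mirrors the movsx / movsx / add sequence for one index. */
int64_t sum_signed(size_t index)
{
    return load_word_signed(index) + load_byte_signed(index);
}
```

With index 1 this adds 234 and -78, which is what the assembly stores into result.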

If you need it to work in position-independent code (where you can't use a [disp32 + register] addressing mode with a 32-bit absolute address for the symbol), do lea rdi, [a] (RIP-relative LEA) first and then use [rdi + rsi*2].
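In C terms, a base pointer plus a scaled index is ordinary array indexing through a pointer. The following is only a sketch of such a loop; sum_pairs and its parameter names are hypothetical and not part of the original answer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum words[i] + bytes[i] over n elements,
   treating every element as signed. */
int64_t sum_pairs(const int16_t *words, const int8_t *bytes, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        /* words[i] lives at byte address words + i*2 (base + index*2);
           bytes[i] lives at bytes + i*1 (base + index). */
        total += (int64_t)words[i] + (int64_t)bytes[i];
    }
    return total;
}
```

Here words plays the role of the register holding the array address and i plays the role of the index register.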

If you wanted zero-extension, you'd use movzx:

    movzx   eax, word [a + 1*2]    ; a is an array of dw = words
    movzx   ecx, byte [b + 1*1]    ; b is an array of db = bytes
    ; word and byte zero-extended into 64-bit registers:
    ; explicitly to 32-bit by MOVZX, and implicitly to 64-bit by writing a 32-bit reg

    ; add     eax, ecx              ; can't overflow 32 bits, still zero-extended to 64
    sub     rax, rcx              ; want the full width 64-bit signed result 
    mov     [result], rax         ; result is a qword

If you knew the upper bits of your full result would always be zero, just use EAX (32-bit operand-size) everywhere except at the end. See "The advantages of using 32bit registers/instructions in x86-64".

This code corresponds to C like the following:

#include <stdint.h>

static uint16_t a[] = {...};
static uint8_t b[] = {...};
static int64_t result;

void foo(){
    // both elements are zero-extended, then subtracted at 64-bit width
    int64_t rax = a[1] - (int64_t)b[1];
    result = rax;    // why not just return this like a normal person instead of storing?
}

Speaking of which, you can look at compiler output on the Godbolt compiler explorer and see these instructions and addressing modes.
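To make the difference between sign extension and zero extension concrete, here is a small sketch that widens the same bit pattern both ways. The helper names are invented for this example, and the arithmetic is written out explicitly rather than relying on a narrowing cast.

```c
#include <stdint.h>

/* Hypothetical model of movsx from a byte: bit 7 is the sign bit. */
int64_t sign_extend_byte(uint8_t bits)
{
    return bits >= 0x80 ? (int64_t)bits - 0x100 : (int64_t)bits;
}

/* Hypothetical model of movzx from a byte: upper bits become zero. */
int64_t zero_extend_byte(uint8_t bits)
{
    return (int64_t)bits;
}

/* Hypothetical model of movsx from a word: bit 15 is the sign bit. */
int64_t sign_extend_word(uint16_t bits)
{
    return bits >= 0x8000 ? (int64_t)bits - 0x10000 : (int64_t)bits;
}

/* Hypothetical model of movzx from a word. */
int64_t zero_extend_word(uint16_t bits)
{
    return (int64_t)bits;
}
```

The byte stored for -78 in the question's b array is 0xB2, so a sign-extending load gives -78 while a zero-extending load gives 178. Likewise the word stored for -212 is 0xFF2C, which zero-extends to 65324.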

Note that mov al, [b + 1] would load a byte and merge it into the low byte of RAX.

You normally don't want this; movzx is the normal way to load a byte in modern x86. Modern x86 CPUs decode x86 to RISC-like internal uops for register renaming + Out-of-Order execution. movzx avoids any false dependency on the old value of the full register. It's analogous to ARM ldrb, MIPS lbu, and so on.

Merging into the low byte or word of RAX is a weird CISC thing that x86 can do but RISCs can't.
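The merge can be modelled in C with a mask, which also shows why the result depends on the register's old value. This is only an illustrative model; merge_low_byte and load_zero_extended are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of `mov al, [mem]`: only bits 0..7 are replaced,
   bits 8..63 keep whatever the register held before. */
uint64_t merge_low_byte(uint64_t old_rax, uint8_t loaded)
{
    return (old_rax & ~UINT64_C(0xFF)) | (uint64_t)loaded;
}

/* Hypothetical model of `movzx eax, byte [mem]`: the whole register is
   overwritten, so the old value is not an input at all. */
uint64_t load_zero_extended(uint8_t loaded)
{
    return (uint64_t)loaded;
}
```

The first function needs old_rax as an input and the second does not, which is the dependency on the old register value that movzx avoids.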

You can safely read 8-bit and 16-bit registers (and you need to for a word store), but generally avoid writing partial registers unless you have a good reason and you understand the possible performance implications (see "Why doesn't GCC use partial registers?"). One such good reason is that you've xor-zeroed the full destination ahead of cmp + setcc al.
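Both cases have simple C counterparts, sketched below with hypothetical helper names. A comparison result widened to 64 bits is the kind of code for which a compiler may choose the xor-zero + cmp + setcc pattern (an assumption about typical output, not a guarantee), and storing the low 16 bits of a wider value is the word store that reads a 16-bit register.

```c
#include <stdint.h>

/* Hypothetical helper: yields 0 or 1 in a full-width value, the C-level
   equivalent of zeroing the register and then setting its low byte. */
uint64_t less_than_flag(int64_t x, int64_t y)
{
    return (uint64_t)(x < y);
}

/* Hypothetical helper: a word store only reads the low 16 bits of the
   wider value, like `mov [dst], ax`. */
void store_low_word(uint16_t *dst, uint64_t reg)
{
    *dst = (uint16_t)(reg & 0xFFFF);
}
```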
