How to compare addresses in x86, using the end pointer of a static array as a loop condition?

Question

One of the challenge questions in Programming from the Ground Up is "to modify the program to use an ending address rather than the number 0 to know when to stop."

I am finding this difficult because, up to this point, the book has only introduced the movl, cmpl, incl (along with the addressing modes) and jmp instructions. Basically everything in the code snippet below is what has been introduced so far. All the solutions I have found involve instructions not yet introduced in the book. The code below finds the maximum value in the set.

.section .data
data_items:             #These are the data items
.long 3,67,34,222,45,75,54,34,44,33,22,11,66,0

.section .text
.globl _start
_start:
    movl $0, %edi                   # move 0 into the index register
    movl data_items(,%edi,4), %eax  # load the first item of data (4 bytes)
    movl %eax, %ebx                 # since this is the first item, %eax is
                                    # the biggest
start_loop:                     # start loop
    cmpl $0, %eax                   # check to see if we’ve hit the end
    je loop_exit
    incl %edi                       # load next value
    movl data_items(,%edi,4), %eax
    cmpl %ebx, %eax                 # compare values
    jle start_loop                  # jump to loop beginning if the new
                                    # one isn’t bigger
    movl %eax, %ebx                 # move the value as the largest
    jmp start_loop                  # jump to loop beginning
loop_exit:
    # %ebx is the status code for the exit system call
    # and it already has the maximum number
    movl $1, %eax   #1 is the exit() syscall
    int $0x80

Note that this question is distinctly different from the subsequent question, which asks to modify the program to use a length count rather than the number 0. To me it seems like the address of the last number in the array should be stored in a register and then compared with the address held in the pointer. I can't figure out a way to do this that fits the progression of the book, since it has only introduced the bare bones thus far.

Recommended answer

You can do this with only mov and cmp; no lea is required to calculate an end pointer. (You don't have a length anywhere to use with LEA anyway.)

You should add a new label at the end of the array, so you can refer to that position in memory (i.e. its address). Also remove the terminating 0 from the array, because we're using addresses instead of a sentinel value.

.section .data
data_items:
  .long 3,67,34,222,45,75,54,34,44,33,22,11,66     # ,0   remove the sentinel / terminator
data_items_end:                                  # and add this new label

You don't need that address in a register; you can use cmp $data_items_end, %reg to use it as an immediate, with the linker filling the right bytes into the machine code, just as it does for your mov data_items(,%edi,4), %eax. (cmp symbol, %reg would compare with the memory at that address; $symbol is the address as an immediate in AT&T syntax.)

What you do need in a register is the start address, so you can increment and dereference it. (For a function that takes a pointer + length, you could compute the end address in a register.)

_start:
    mov  $data_items, %edi       # int *ptr = &data_items[0]
    mov  (%edi), %ebx            # current max
   # setting %eax is unnecessary here, it's always written before being read in this and the original version
loop_start:
    add  $4, %edi                # ptr++  (4 byte elements)
    cmp  $data_items_end, %edi
    je   loop_exit               # if (ptr == endp) break
    ...                  # compare with (%edi) and update %ebx if greater.
    jmp  loop_start
  ...
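
The pointer + length case mentioned in the parenthetical above can be sketched in C, the language the answer's comments already use. The name max_with_end_pointer and its parameters are hypothetical, and the sketch assumes the array holds at least one element, as the assembly does. It is one possible rendering of the loop shape, not the book's own solution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest of the `len` ints starting at `ptr`.
   Assumes len >= 1, because the first element seeds the maximum. */
int max_with_end_pointer(const int *ptr, size_t len)
{
    assert(len >= 1);
    const int *endp = ptr + len;   /* end address: one past the last element */
    int max = *ptr;                /* mov (%edi), %ebx */

    for (;;) {
        ptr++;                     /* add $4, %edi */
        if (ptr == endp)           /* cmp <end>, %edi ; je loop_exit */
            break;
        if (*ptr > max)            /* compare with (%edi), update if greater */
            max = *ptr;
    }
    return max;
}
```

With a static array the end address is a link-time constant (the data_items_end label), so only the start pointer has to live in a register; here both are ordinary variables because the length arrives at run time.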

A do{}while loop structure like compilers use would be more efficient, especially since you know the array contains more than 1 element, so you don't need to check for the case where the loop body should run 0 times. Notice that there is no unconditional jmp that has to execute on every iteration in addition to the cmp/jcc.

_start:
    mov  $data_items, %edi       # int *ptr = &data_items[0]
    mov  (%edi), %ebx            # current max

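loop_start:                    # do{
  ## maybe update max (the first pass just re-checks element 0):
    mov  (%edi), %eax            # tmp = *ptr;
    cmp  %ebx, %eax
    cmovg %eax, %ebx             # max = (tmp > max) ? tmp : max;
  ## end of loop body

    add  $4, %edi                # ptr++;  (4 byte elements)
                                 # incrementing after the load means we never
                                 # read the 4 bytes at data_items_end itself
    cmp  $data_items_end, %edi
    jne  loop_start            # }while(ptr != endp)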
## end of loop, but nothing jumps here so no label is needed.

    mov  $1, %eax
    int  $0x80             # SYS_exit(%ebx)

I used cmp/cmovg (conditional move) instead of branching simply because it takes fewer instructions to type and leaves no branching within the loop, which makes the loop structure easier to see.
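
As a rough C sketch of the same do{}while shape, the following loop dereferences before it increments, so the last load is the final element and nothing at the end address is read. The name max_do_while is hypothetical, and the sketch assumes ptr != endp on entry (a non-empty array).

```c
#include <assert.h>

/* Hypothetical helper: largest int in the half-open range [ptr, endp).
   Assumes the range is non-empty, so the body may run before any check. */
int max_do_while(const int *ptr, const int *endp)
{
    assert(ptr != endp);
    int max = *ptr;                      /* current max */

    do {
        int tmp = *ptr;                  /* tmp = *ptr */
        max = (tmp > max) ? tmp : max;   /* the select that cmovg performs */
        ptr++;                           /* advance one 4-byte element */
    } while (ptr != endp);               /* single compare-and-branch per pass */

    return max;
}
```

The only branch per iteration is the one at the bottom of the loop, which is the property the do{}while form is meant to provide.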

Other examples of looping and pointers:

  • Assembly Language (x86): How to create a loop to calculate Fibonacci sequence - functions that take a pointer+length as args, and use LEA to calculate an end pointer. (x86-64 NASM syntax)
  • How to check an "array's length" in Assembly Language (ASM) - defining an assemble-time constant based on the length of a .long static array, instead of putting a label at the end.
  • Copying to arrays in NASM - some tricks for writing efficient loops that loop over two arrays, e.g. indexing one relative to the other to still only use one increment but avoid indexed addressing modes. Or counting a negative index up towards zero, so you can still loop forwards in memory but still not need a separate cmp instruction, just inc / jnz.
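
The negative-index idea from the last item can be sketched in C as follows. This is an illustration only: the name max_negative_index is hypothetical, the array is assumed non-empty, and whether a compiler actually emits an inc / jnz pair for it is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index from the end pointer with a negative offset
   that counts up to zero, so the loop test is simply "is i zero yet?". */
int max_negative_index(const int *base, size_t len)
{
    assert(len >= 1);
    const int *endp = base + len;        /* one past the last element */
    ptrdiff_t i = -(ptrdiff_t)len;       /* endp[i] is the first element */
    int max = endp[i];

    do {
        if (endp[i] > max)
            max = endp[i];
        i++;                             /* still walks forwards in memory */
    } while (i != 0);                    /* no separate compare against an end */

    return max;
}
```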
