Cycle Through and Print argv[] in x64 ASM

Problem Description

I have been working on what is essentially a while loop to go through all CLI arguments. While working on a solution that prints only one element I noticed a few things; this was the thought process that led me here.

I noticed that if I did lea 16(%rsp), %someRegisterToWrite, I was able to get/print argv[1]. Next I tried lea 24(%rsp), %someRTW and this gave me access to argv[2]. I kept going up to see if it would continue to work and it did.

My thought was to keep adding 8 to %someRTW and increment a "counter" until the counter was equal to argc. The following code works great when a single argument is entered, but it prints nothing with 2 arguments, and when I enter 3 arguments it prints the first 2 with no space in between.

.section __DATA,__data
.section __TEXT,__text
.globl _main
_main:
    lea (%rsp), %rbx        #argc
    lea 16(%rsp), %rcx      #argv[1]
    mov $0x2, %r14          #counter
    L1:
    mov (%rcx), %rsi        #%rsi = user_addr_t cbuf
    mov (%rcx), %r10
    mov 16(%rcx), %r11      
    sub %r10, %r11          #Get number of bytes until next arg
    mov $0x2000004, %eax    #4 = write
    mov $1, %edi            #edi = file descriptor 
    mov %r11, %rdx          #user_size_t nbyte
    syscall
    cmp (%rbx), %r14        #if counter < argc
    jb L2
    jge L3
    L2:
    inc %r14                
    mov 8(%rcx), %rcx       #mov 24(%rsp) back into %rcx
    mov $0x2000004, %eax
    mov $0x20, %rsi         #0x20 = space
    mov $2, %rdx
    syscall
    jmp L1
    L3:
    xor %rax, %rax
    xor %edi, %edi
    mov $0x2000001, %eax
    syscall

Solution

I am going to assume that on 64-bit OS/X you are assembling and linking in such a way that you intentionally bypass the C runtime code. One example would be a static build without the C runtime startup files and the System library, in which you specify that _main is your program entry point. _start is generally the process entry point unless overridden.

In this scenario the 64-bit kernel will load the macho64 program into memory and set up the process stack with the program arguments, and environment variables among other things. Apple OS/X process stack state at startup is the same as what is documented in the System V x86-64 ABI in Section 3.4:
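As a rough sketch of that layout, based only on the offsets used elsewhere in this answer (argc at 0(%rsp), the program name pointer at 8(%rsp), the first parameter pointer at 16(%rsp), and a NULL(0) after the last argument pointer), the entry state could be read like this. The choice of registers is arbitrary and only for illustration, and the offsets hold only at the entry point, before anything is pushed:

```asm
# Stack as seen at the process entry point (nothing pushed yet):
#   0(%rsp)             argc
#   8(%rsp)             argv[0]  pointer to program name
#   16(%rsp)            argv[1]  pointer to 1st parameter
#   ...
#   8+8*argc(%rsp)      NULL(0)  end of the argument pointers
#   16+8*argc(%rsp)     first environment pointer (list also ends in NULL(0))
    mov  (%rsp), %rdi               # rdi = argc
    lea  8(%rsp), %rsi              # rsi = address of argv[0], i.e. argv
    lea  16(%rsp,%rdi,8), %rdx      # rdx = address just past argv's NULL, i.e. envp
```

Note that LEA here computes the address of a stack slot; a MOV from that address is still needed to fetch the string pointer stored in it.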

One observation is that the list of argument pointers is terminated with a NULL(0) address. You can use this to loop through all parameters until you find the NULL(0) address as an alternative to relying on the value in argc.


The Problems

One problem is that your code assumes that registers are all preserved across a SYSCALL. The SYSCALL instruction itself will destroy the contents of RCX and R11:

SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)

SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR

One way to avoid this is to try and use registers other than RCX and R11. Otherwise you will have to save/restore them across a SYSCALL if you need their values to be untouched. The kernel will also clobber RAX with a return value.
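If you do want to keep values in RCX or R11, a minimal sketch of the save/restore approach might look like the following. It assumes RSI and RDX already describe the buffer to write, and that any RSP-relative loads you need were done before the pushes (each push moves RSP by 8):

```asm
    push %rcx               # SYSCALL overwrites RCX with the return address
    push %r11               # SYSCALL overwrites R11 with the saved RFLAGS
    mov  $0x2000004, %eax   # write
    mov  $1, %edi           # file descriptor 1 (stdout)
    syscall                 # RAX now holds the kernel's return value
    pop  %r11               # restore in reverse order of the pushes
    pop  %rcx
```

Keeping loop state in a register such as RBX or R14 avoids the extra stack traffic, which is the approach the revised code takes.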

A list of the Apple OS/X system calls provides the details of all the available kernel functions. In 64-bit OS/X code each of the system call numbers has 0x2000000 added to it:

In 64-bit systems, Mach system calls are positive, but are prefixed with 0x2000000 — which clearly separates and disambiguates them from the POSIX calls, which are prefixed with 0x1000000


Your method to compute the length of a command line argument will not work. The address of one argument doesn't necessarily have to be placed in memory after the previous one. The proper way is to write code that starts at the beginning of the argument you are interested in and searches for a NUL(0) terminating character.


This code to print a space or separator character won't work:

mov 8(%rcx), %rcx       #mov 24(%rsp) back into %rcx
mov $0x2000004, %eax
mov $0x20, %rsi         #0x20 = space
mov $2, %rdx
syscall

When using the sys_write system call the RSI register is a pointer to a character buffer. You can't pass an immediate value like 0x20 (space). You need to put the space or some other separator (like a new line) into a buffer and pass that buffer through RSI.
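A minimal sketch of writing a single space this way is shown below. The label `space` is a hypothetical name introduced for this example; the length is 1 because the buffer holds exactly one byte:

```asm
.section __DATA, __const
space: .ascii " "               # hypothetical one-byte separator buffer

.section __TEXT, __text
    # ... somewhere inside the loop, after an argument has been printed ...
    mov  $0x2000004, %eax       # write
    mov  $1, %edi               # file descriptor 1 (stdout)
    lea  space(%rip), %rsi      # RSI = address of the buffer, not the character
    mov  $1, %edx               # number of bytes to write
    syscall
```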


Revised Code

This code applies the ideas above, along with some additional cleanup, and writes each of the command line parameters (excluding the program name) to standard output. Each one is followed by a newline. Newline on Darwin OS/X is 0x0a (\n).

# In 64-bit OSX syscall numbers = 0x2000000+(32-bit syscall #)
SYS_EXIT  = 0x2000001
SYS_WRITE = 0x2000004

STDOUT    = 1

.section __DATA, __const
newline: .ascii "\n"
newline_end:
NEWLINE_LEN = newline_end-newline

.section __TEXT, __text
.globl _main
_main:
    mov (%rsp), %r8             # 0(%rsp) = # args. This code doesn't use it
                                #    Only save it to R8 as an example.
    lea 16(%rsp), %rbx          # 8(%rsp)=pointer to prog name
                                # 16(%rsp)=pointer to 1st parameter
.argloop:
    mov (%rbx), %rsi            # Get current cmd line parameter pointer
    test %rsi, %rsi
    jz .exit                    # If it's zero we are finished

    # Compute length of current cmd line parameter
    # Starting at the address in RSI (current parameter) search until
    # we find a NUL(0) terminating character.
    # rdx = length not including terminating NUL character

    xor %edx, %edx              # RDX = character index = 0
    mov %edx, %eax              # RAX = terminating character NUL(0) to look for
.strlenloop:
         inc %rdx               # advance to next character index
         cmpb %al, -1(%rsi,%rdx)# Is character at previous char index
                                #     a NUL(0) character?
         jne .strlenloop        # If it isn't a NUL(0) char then loop again
    dec %rdx                    # We don't want strlen to include NUL(0)

    # Display the cmd line argument
    # sys_write requires:
    #    rdi = output device number
    #    rsi = pointer to string (command line argument)
    #    rdx = length
    #
    mov $STDOUT, %edi
    mov $SYS_WRITE, %eax
    syscall

    # display a new line
    mov $NEWLINE_LEN, %edx
    lea newline(%rip), %rsi     # We use RIP addressing for the
                                #     string address
    mov $SYS_WRITE, %eax
    syscall

    add $8, %rbx                # Go to next cmd line argument pointer
                                #     In 64-bit pointers are 8 bytes
    # lea 8(%rbx), %rbx         # This LEA instruction can replace the
                                #     ADD since we don't care about the flags
                                #     rbx = 8 + rbx (flags unaltered)
    jmp .argloop

.exit:
    # Exit the program
    # sys_exit requires:
    #    rdi = return value
    #
    xor %edi, %edi
    mov $SYS_EXIT, %eax
    syscall

If you intend to use code like strlen in various places then I recommend creating a function that performs that operation. I have hard coded strlen into the code for simplicity. If you are looking to improve on the efficiency of your strlen implementation then a good place to start would be Agner Fog's Optimizing subroutines in assembly language.
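As a sketch of what such a function could look like (the name `str_len` and its register interface are assumptions made for this example, not something from the code above), a simple byte-at-a-time version is:

```asm
# Hypothetical helper, shown only as an illustration.
# In:  rdi = pointer to a NUL(0) terminated string
# Out: rax = length, not counting the terminating NUL(0)
# Modifies only RAX and the flags.
str_len:
    xor  %eax, %eax             # rax = character index = 0
1:
    cmpb $0, (%rdi,%rax)        # is the character at this index NUL(0)?
    je   2f                     # yes: rax is the length
    inc  %rax                   # no: advance to the next character
    jmp  1b
2:
    ret
```

Inside the argument loop it could be used by loading the current pointer with `mov (%rbx), %rdi`, calling `str_len`, and then copying RAX to RDX and RDI to RSI before the write system call. Because RBX holds the position in the argument list, the return address that `call` pushes on the stack does not disturb the loop.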

This code should compile and link to a static executable without C runtime using:

gcc -e _main progargs.s -o progargs -nostartfiles -static
