How to write self-modifying code in x86 assembly


Question

I'm looking at writing a JIT compiler for a hobby virtual machine I've been working on recently. I know a bit of assembly (I'm mainly a C programmer; I can read most assembly with a reference for the opcodes I don't understand, and write some simple programs), but I'm having a hard time understanding the few examples of self-modifying code I've found online.

Here is one such example: http://asm.sourceforge.net/articles/smc.html

The example program provided makes about four different modifications when run, none of which are clearly explained. Linux kernel interrupts are used several times without explanation or detail. (The author moves data into several registers before invoking the interrupts. I assume he is passing arguments, but these arguments aren't explained at all, leaving the reader to guess.)

What I'm looking for is the simplest, most straightforward code example of a self-modifying program: something I can look at and use to understand how self-modifying code in x86 assembly has to be written, and how it works. Are there any resources you can point me to, or any examples you can give that would adequately demonstrate this?

I'm using NASM as my assembler.

I'm also running this code on Linux.

Recommended answer

Wow, this turned out to be a lot more painful than I expected. 100% of the pain was Linux protecting the program from being overwritten and/or from executing data.

Two solutions are shown below. A lot of googling was involved: the fairly simple part, putting some instruction bytes in memory and executing them, was mine; the mprotect call and the page-size alignment were culled from Google searches, things I had to learn for this example.

The self-modifying code itself is straightforward. If you take the program, or at least just the two simple functions, compile it and then disassemble it, you get the opcodes for those instructions. Alternatively, use nasm to assemble blocks of assembly, etc. From this I determined the opcodes to load an immediate into eax and then return.
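As a small sketch of that encoding: the disassembly quoted in the program's comments shows `mov $0xd,%eax` as `b8 0d 00 00 00` and `retq` as `c3`, that is, opcode 0xb8 followed by the 32-bit immediate in little-endian order, then 0xc3. The helper name `emit_mov_eax_ret` is hypothetical and only illustrates how a JIT might generate the same six bytes for any constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original program): writes
 *   b8 imm32   mov eax, imm32
 *   c3         ret
 * into buf, which must have room for 6 bytes.
 * Returns the number of bytes written. */
size_t emit_mov_eax_ret(unsigned char *buf, uint32_t imm)
{
    assert(buf != NULL);
    buf[0] = 0xb8;                        /* opcode: mov eax, imm32 */
    buf[1] = (unsigned char)(imm);        /* immediate, lowest byte first */
    buf[2] = (unsigned char)(imm >> 8);
    buf[3] = (unsigned char)(imm >> 16);
    buf[4] = (unsigned char)(imm >> 24);
    buf[5] = 0xc3;                        /* opcode: ret */
    return 6;
}
```

Calling `emit_mov_eax_ret(buf, 0x0d)` fills `buf` with b8 0d 00 00 00 c3, the same bytes the example program stores by hand.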

Ideally you would simply put those bytes in some RAM and execute that RAM. To get Linux to allow that you have to change the protection, which means you have to pass mprotect a pointer that is aligned on a page boundary. So allocate more than you need, find the address within that allocation that lies on a page boundary, call mprotect from that address, put your opcodes in that memory, and then execute it.
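An alternative to over-allocating with malloc and aligning by hand is to ask for the memory with mmap, which returns page-aligned memory. The sketch below swaps in that technique; it is not what the program in this answer does. It assumes Linux on x86 or x86-64, the function name `run_generated` is hypothetical, and the 0xFFFFFFFF error value is an arbitrary choice. The page is written while it is read/write and only then switched to read/execute.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: builds "mov eax, value; ret" in a fresh page,
 * calls it, and returns what the generated code returned.
 * Returns 0xFFFFFFFF on failure (arbitrary choice for this sketch). */
uint32_t run_generated(uint32_t value)
{
    unsigned char code[6] = { 0xb8, 0, 0, 0, 0, 0xc3 };
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0) return 0xFFFFFFFFu;
    assert(sizeof code <= (size_t)pagesize);

    /* x86 is little-endian, so copying the value gives the imm32 bytes. */
    memcpy(code + 1, &value, sizeof value);

    /* mmap hands back page-aligned memory; start with read/write only. */
    void *mem = mmap(NULL, (size_t)pagesize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 0xFFFFFFFFu;
    memcpy(mem, code, sizeof code);

    /* Switch the page from writable to executable before calling it. */
    if (mprotect(mem, (size_t)pagesize, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)pagesize);
        return 0xFFFFFFFFu;
    }

    uint32_t (*fn)(void) = (uint32_t (*)(void))mem;
    uint32_t result = fn();

    munmap(mem, (size_t)pagesize);
    return result;
}
```

With this helper, `run_generated(13)` should return 13 on an x86 Linux system that permits executable anonymous mappings.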

The second example takes an existing function compiled into the program. Again, because of the protection mechanism, you cannot simply point at it and change bytes; you have to remove its write protection first. So you back up to the preceding page boundary and call mprotect with that address and enough bytes to cover the code to be modified. Then you can change the bytes/opcodes of that function in any way you want (so long as you don't spill over into any function you want to keep using) and execute it. In this case you can see that fun() works, then I change it to simply return a value, call it again, and now it has been modified.
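The "back up to the prior page boundary ... and enough bytes to cover the code" step can be sketched as plain address arithmetic. The helper name `page_span` is hypothetical; it assumes the page size is a power of two and handles the case where the bytes to patch straddle two pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the address and length of the code to patch,
 * compute the page-aligned start address and the byte count that an
 * mprotect call would need to cover it. pagesize must be a power of two. */
void page_span(uintptr_t addr, size_t len, size_t pagesize,
               uintptr_t *start, size_t *span)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    uintptr_t mask = ~((uintptr_t)pagesize - 1);
    uintptr_t end = (addr + len + pagesize - 1) & mask;  /* round end up */
    *start = addr & mask;                                /* round start down */
    *span = (size_t)(end - *start);
}
```

For a 6-byte patch at 0x400687 with 4096-byte pages this gives a start of 0x400000 and a span of 4096; if the same patch began at 0x400ffe, the span would grow to 8192 because the bytes cross into the next page.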

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>   /* getpagesize */

unsigned char *testfun;

unsigned int fun ( unsigned int a )
{
    return(a+13);
}

unsigned int fun2 ( void )
{
    return(13);
}

int main ( void )
{
    unsigned int ra;
    unsigned long pagesize; /* long, so ~(pagesize-1) keeps the upper address bits on 64-bit */
    unsigned char *ptr;
    unsigned int offset;

    pagesize=getpagesize();
    testfun=malloc(1023+pagesize+1);
    if(testfun==NULL) return(1);
    //need to align the address on a page boundary
    printf("%p\n",testfun);
    testfun = (unsigned char *)(((long)testfun + pagesize-1) & ~((long)pagesize-1));
    printf("%p\n",testfun);

    if(mprotect(testfun, 1024, PROT_READ|PROT_EXEC|PROT_WRITE))
    {
        printf("mprotect failed\n");
        return(1);
    }

    //400687: b8 0d 00 00 00          mov    $0xd,%eax
    //40068d: c3                      retq

    testfun[ 0]=0xb8;
    testfun[ 1]=0x0d;
    testfun[ 2]=0x00;
    testfun[ 3]=0x00;
    testfun[ 4]=0x00;
    testfun[ 5]=0xc3;

    ra=((unsigned int (*)())testfun)();
    printf("0x%02X\n",ra);


    testfun[ 0]=0xb8;
    testfun[ 1]=0x20;
    testfun[ 2]=0x00;
    testfun[ 3]=0x00;
    testfun[ 4]=0x00;
    testfun[ 5]=0xc3;

    ra=((unsigned int (*)())testfun)();
    printf("0x%02X\n",ra);


    printf("%p\n",fun);
    //split the address of fun into its page base and the offset within that page
    offset=(unsigned int)(((long)fun)&(pagesize-1));
    ptr=(unsigned char *)((long)fun&(~((long)pagesize-1)));

    printf("%p 0x%X\n",ptr,offset);

    if(mprotect(ptr, pagesize, PROT_READ|PROT_EXEC|PROT_WRITE))
    {
        printf("mprotect failed\n");
        return(1);
    }

    //for(ra=0;ra<20;ra++) printf("0x%02X,",ptr[offset+ra]); printf("\n");

    ra=4;
    ra=fun(ra);
    printf("0x%02X\n",ra);

    ptr[offset+0]=0xb8;
    ptr[offset+1]=0x22;
    ptr[offset+2]=0x00;
    ptr[offset+3]=0x00;
    ptr[offset+4]=0x00;
    ptr[offset+5]=0xc3;

    //build without optimization: an optimizing compiler may inline fun() here and hide the change
    ra=4;
    ra=fun(ra);
    printf("0x%02X\n",ra);

    return(0);
}
