Assembly code for simple coding/decoding of a string: confusion?

Question

I am studying for my exam and I am confused by this assembly code. It is a program in which the user first enters a string; the string is then encoded and printed, then decoded and printed.

What confuses me is the (de)coding part. With "LEA bx, MyString" the memory address of MyString is saved in register bx. Then the coding takes place. What is the purpose of this?

INC bx
MOV cl, [bx]
XOR ch, ch

coding:
    INC bx
    MOV dl, [bx]
    XOR dl, ah
    MOV [bx], dl
LOOP coding

Why increment the memory address? Doesn't that change the address? And why increment bx again inside the loop? These pointers just confuse me. I get the part where the character at address bx is moved to dl, then coded with the mask, then stored back. I'm just confused by this incrementing of the memory address. Does that mean it starts from the 3rd character instead of the first, and then codes the 3rd and later characters of the string with the mask? What's up with the first two, then? Sorry if the questions are stupid, thanks!

Here is the full code:

.MODEL small
.DATA
    STR_LENGTH EQU 30
    BUFF_LENGTH EQU STR_LENGTH + 3
    MyString DB BUFF_LENGTH DUP (0)
    Coder_Mask DB 128
.STACK
.CODE

NewLine MACRO
    MOV dl, 10
    MOV ah, 02h
    INT 21h
    MOV dl, 13
    MOV ah, 02h
    INT 21h
ENDM    

DeCode MACRO bx, ah
LOCAL coding

    INC bx
    MOV cl, [bx]
    XOR ch, ch

    coding:
        INC bx
        MOV dl, [bx]
        XOR dl, ah
        MOV [bx], dl
    LOOP coding
ENDM

WriteString MACRO bx
LOCAL writing

    INC bx
    MOV cl, [bx]
    XOR ch, ch

    writing:
        INC bx
        MOV dl, [bx]
        MOV ah, 02h
        INT 21h
    LOOP writing
ENDM

Start:
    MOV ax, @DATA
    MOV ds, ax

    LEA bx, MyString
    MOV cl, BUFF_LENGTH
    MOV [bx], cl
    LEA dx, MyString
    MOV ah, 0Ah
    INT 21h

    NewLine
    LEA bx, MyString
    WriteString bx

    LEA bx, MyString
    MOV ah, Coder_Mask
    DeCode bx, ah

    NewLine
    LEA bx, MyString
    WriteString bx

    NewLine

    LEA bx, MyString
    MOV ah, Coder_Mask
    DeCode bx, ah

    NewLine
    LEA bx, MyString
    WriteString bx

    MOV ax, 4C00h
    INT 21h
END Start

Recommended answer

You need to understand the structure of the memory, that is, how the string is stored.

The teacher's code has no comments at all, so either it was your task to figure it out (and you did not manage to), or, for reasons of diplomacy, I will not comment any further on your teacher.

The string buffer has the structure that MS-DOS uses for function 0Ah of int 21h:

MyString:
    db string_maximum_size     ; maximum characters to store into buffer
    db character_actually_read ; characters read by INT 21h: 0Ah function
    db string_maximum_size DUP (0)  ; the string characters

So after entering the string "hello", the memory at address MyString will contain:
33, 5, 104 ('h'), 101 ('e'), 108 ('l'), 108 ('l'), 111 ('o'), then 13 (the carriage return, which DOS stores after the text but does not include in the count), followed by 25 zeroes (left over from DUP (0)).
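The same layout can be written out as data declarations, which makes the byte offsets that bx walks through easier to see. This is only an illustration under the assumption that the user typed "hello" and pressed Enter; the label MyStringSnapshot is hypothetical, and in the real program DOS fills these bytes at run time.

```asm
; Snapshot of the 33-byte buffer after typing "hello" + Enter.
; Illustration only: MyStringSnapshot is a hypothetical name.
MyStringSnapshot LABEL BYTE
    DB 33           ; offset 0: maximum size, stored by the program (BUFF_LENGTH)
    DB 5            ; offset 1: number of characters read, stored by DOS
    DB 'h'          ; offset 2: first character (BX points here after the 2nd INC)
    DB 'ello'       ; offsets 3..6: remaining characters
    DB 13           ; offset 7: carriage return, stored but not counted
    DB 25 DUP (0)   ; offsets 8..32: untouched remainder of the buffer
```

Read this way, the two INC bx instructions do not skip any characters of the string: they step over the two header bytes at offsets 0 and 1.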

Actually, I think your code has a bug: it sets the maximum size to the total buffer size, BUFF_LENGTH EQU STR_LENGTH + 3, while from the interrupt description I would expect the first byte to be smaller than the buffer, for example just STR_LENGTH. You can verify this by entering a string longer than 30 characters and checking in a debugger whether the memory after the MyString buffer gets overwritten. Also, only 2 bytes are used for the maximum size and the actual size, so the third extra byte only makes sense as room for the carriage return that DOS stores after the text.
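One possible way to declare the buffer so that the first byte matches the space that actually follows it is sketched below. It assumes the documented INT 21h/0Ah layout, in which the first byte counts the terminating carriage return and the text area starts at offset 2; MAX_INPUT is a hypothetical name introduced for the illustration.

```asm
.DATA
    STR_LENGTH  EQU 30                  ; characters the user may type
    MAX_INPUT   EQU STR_LENGTH + 1      ; plus the carriage return (hypothetical name)

    MyString    DB MAX_INPUT            ; offset 0: capacity of the text area
                DB 0                    ; offset 1: DOS stores the count read here
                DB MAX_INPUT DUP (0)    ; offsets 2..32: text area, 31 bytes

.CODE
    ; The size byte is already initialized in .DATA,
    ; so no MOV [bx], cl is needed before the call.
    LEA dx, MyString                    ; DS:DX -> input buffer
    MOV ah, 0Ah                         ; DOS buffered keyboard input
    INT 21h
```

The buffer is still 33 bytes in total, but DOS is now told that only 31 of them are available for text, so it cannot write past the end.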

Now, this is what happens in the code:

LEA bx,[MyString]   ; bx = address of first byte of buffer (contains maximum size)
INC bx              ; bx now points to actual size
; instead LEA bx,[MyString+1] could have been used, skipping one INC bx
MOV cl,[bx]         ; cl = actual string size
XOR ch,ch           ; ch = 0 (extending 8 bit value in cl to unsigned 16 bit in cx)
; other option on 386+ CPU is MOVZX cx,BYTE PTR [bx]
; or XOR cx,cx  MOV cl,[bx]
INC bx              ; bx now points to the first character

The loop then does whatever it wants with the content of [bx], incrementing bx again on each iteration to reach the next character, until the cx counter reaches 0.
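The whole walk over the buffer can be tried without any keyboard input by pre-filling a buffer in the same layout. The following is a minimal sketch, not the teacher's code: Sample, Mask8 and XorBuffer are hypothetical names, the macro is replaced by a procedure, and the JCXZ guard is an addition (with CX = 0, LOOP would otherwise run 65536 times on an empty string).

```asm
.MODEL small
.STACK 100h
.DATA
    ; Buffer in the INT 21h/0Ah layout, as if the user had typed "hello"
    Sample  DB 31, 5, 'hello', 13, 25 DUP (0)
    Mask8   DB 128
.CODE
; XorBuffer (hypothetical helper): XOR every stored character with AH
; in:  BX = address of byte 0 of the buffer, AH = mask
; out: characters changed in place; BX and CX are clobbered
XorBuffer PROC
    INC bx              ; BX -> offset 1, the actual length
    MOV cl, [bx]        ; CL = number of characters
    XOR ch, ch          ; CX = length as a 16-bit counter
    JCXZ done           ; empty string: nothing to process
next_char:
    INC bx              ; BX -> next character (offset 2 on the first pass)
    XOR [bx], ah        ; flip the masked bits of that byte in place
    LOOP next_char      ; CX = CX - 1, repeat while CX <> 0
done:
    RET
XorBuffer ENDP

Start:
    MOV ax, @DATA
    MOV ds, ax

    LEA bx, Sample
    MOV ah, Mask8
    CALL XorBuffer      ; encode: each character has its top bit flipped

    LEA bx, Sample
    MOV ah, Mask8
    CALL XorBuffer      ; decode: the same mask restores the original bytes

    MOV ax, 4C00h
    INT 21h
END Start
```

Stepping through this in a debugger with a memory window on Sample shows offsets 2..6 changing on the first call and changing back on the second, while the two header bytes and the carriage return stay untouched.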

You should definitely start up a debugger, step through that code instruction by instruction, point the memory window at MyString, and watch how bx is used to access particular bytes there and how those INC bx instructions fit in.

This will explain it better than anything else.

One more thing. I actually kept one secret to myself, and it is an integral part of your question.

So, "How did I know?": always remember that computers are computational machines. You put in a program (a list of instructions), you put in some numbers, you let it execute the instructions, and you get the resulting numbers out.

I had the code (the instructions). The next thing I looked for in your code was how the string is defined. I found that it is entered by the user and read by an int 21h function. So I googled that function: how it works and what data it returns. Snap: suddenly everything made sense (except the max size bug, which I decided is simply a bug by your lecturer; it's easy to make a bug in ASM, even for seasoned programmers).

So always make sure you understand all the instructions, and that you understand well what the input data are (their structure and values). Then you can run everything in your head, just like on the CPU, to find out how the input data turn into output data. It's a purely deterministic computational process: you do not need to guess anything, because what happens next is exactly defined at every stage of the computation.
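As one small instance of running the data through by hand, here is what the mask does to a single character. The fragment assumes the character 'h' and the mask value 128 from Coder_Mask; it is a standalone illustration, not part of the original program.

```asm
    MOV dl, 'h'     ; DL = 68h = 0110 1000b (ASCII 'h', decimal 104)
    XOR dl, 128     ; DL = E8h = 1110 1000b: only the top bit is flipped (encoded)
    XOR dl, 128     ; DL = 68h again: the same XOR undoes itself (decoded)
```

Because XOR with the same mask twice restores the original value, the program can use one DeCode macro for both encoding and decoding.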

If you know exactly what those definitions are, it's actually straightforward, easier than any high-level abstraction, just a lot more tedious.

When you are new to ASM, it's much easier to watch this happen in a debugger than to do it in your head, and it will also help you understand ASM much faster.
