REPNZ SCAS Assembly Instruction Specifics

Question

I am trying to reverse engineer a binary and the following instruction is confusing me, can anyone clarify what exactly this does?

=>0x804854e:    repnz scas al,BYTE PTR es:[edi]
  0x8048550:    not    ecx

Where:

EAX: 0x0
ECX: 0xffffffff
EDI: 0xbffff3dc ("aaaaaa\n")
ZF:  1

I see that it is somehow decrementing ECX by 1 each iteration, and that EDI is incrementing along the length of the string. I know it calculates the length of the string, but I'm not quite sure exactly HOW it's happening, or why "al" is involved.

Recommended answer

I'll try to explain it by reversing the code back into C.

Intel's Instruction Set Reference (Volume 2 of Software Developer's Manual) is invaluable for this kind of reverse engineering.

The logic for REPNE and SCASB combined:

while (ecx != 0) {
    temp = al - *(BYTE *)edi;
    SetStatusFlags(temp);
    if (DF == 0)   // DF = Direction Flag
        edi = edi + 1;
    else
        edi = edi - 1;
    ecx = ecx - 1;
    if (ZF == 1) break;
}

Or, more simply:

while (ecx != 0) {
    ZF = (al == *(BYTE *)edi);
    if (DF == 0)
        edi++;
    else
        edi--;
    ecx--;
    if (ZF) break;
}
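
To make the role of al concrete, here is a minimal C sketch of the same loop in a form that can be compiled. The struct scas_regs and the function repne_scasb are hypothetical names invented for this illustration; they only model the registers and flags discussed above and are not part of any real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register snapshot; field names mirror the x86 registers. */
struct scas_regs {
    uint32_t ecx;        /* remaining iteration count */
    const uint8_t *edi;  /* current scan pointer (ES:EDI) */
    uint8_t al;          /* byte being searched for */
    int zf;              /* zero flag: 1 if the last compare matched */
    int df;              /* direction flag: 0 = forward, 1 = backward */
};

/* Models REPNE SCASB: scan until a byte equals al or ecx reaches zero. */
void repne_scasb(struct scas_regs *r)
{
    assert(r != NULL);
    while (r->ecx != 0) {
        r->zf = (r->al == *r->edi);   /* compare al with the byte at edi */
        r->edi += r->df ? -1 : 1;     /* step in the direction given by DF */
        r->ecx--;                     /* count this iteration */
        if (r->zf)
            break;                    /* REPNE stops on the first match */
    }
}
```

With the register values from the question (al = 0, ecx = 0xffffffff, DF assumed clear, edi pointing at "aaaaaa\n"), this model stops after 8 iterations: 7 mismatches followed by the match on the terminating zero byte, leaving ecx = 0xfffffff7. With a non-zero al, the same loop would instead stop at the first occurrence of that byte.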

String Length

However, the above is insufficient to explain how it computes the length of a string. Based on the presence of the not ecx in your question, I'm assuming the snippet belongs to this idiom (or similar) for computing string length using REPNE SCASB:

sub ecx, ecx
sub al, al
not ecx
cld
repne scasb
not ecx
dec ecx

Translating to C and using our logic from the previous section, we get:

ecx = (unsigned)-1;
al = 0;
DF = 0;
while (ecx != 0) {
    ZF = (al == *(BYTE *)edi);
    if (DF == 0)
        edi++;
    else
        edi--;
    ecx--;
    if (ZF) break;
}
ecx = ~ecx;
ecx--;

Simplifying using al = 0 and DF = 0:

ecx = (unsigned)-1;
while (ecx != 0) {
    ZF = (0 == *(BYTE *)edi);
    edi++;
    ecx--;
    if (ZF) break;
}
ecx = ~ecx;
ecx--;

Things to note:

  • In two's complement notation, flipping the bits of ecx is equivalent to computing -1 - ecx.
  • In the loop, ecx is decremented before the loop breaks, so it is decremented length(edi) + 1 times in total.
  • In practice, ecx never reaches zero in the loop, since the string would have to occupy the entire address space.
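
The first point can be checked directly with unsigned 32-bit arithmetic, where -1 is represented as 0xffffffff. The helper name check_not_identity is hypothetical and exists only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Asserts that flipping the bits of x equals -1 - x modulo 2^32. */
void check_not_identity(uint32_t x)
{
    uint32_t flipped = ~x;                  /* what the NOT instruction computes */
    uint32_t computed = (uint32_t)-1 - x;   /* -1 - x, wrapping around */
    assert(flipped == computed);
}
```

For example, with x = 0xfffffff7 (the value left in ecx after scanning a 7-character string), both expressions give 8.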

So after the loop above, ecx contains -1 - (length(edi) + 1), which is the same as -(length(edi) + 2). Flipping the bits gives length(edi) + 1, and the final decrement gives length(edi).
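
This arithmetic can be checked by running the simplified loop with a 32-bit counter. The sketch below is an illustration only: strlen_repne_scasb is a hypothetical name, and uint32_t stands in for the ecx register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the simplified loop: al = 0, DF = 0, ecx starts at 0xffffffff. */
uint32_t strlen_repne_scasb(const char *edi)
{
    assert(edi != NULL);
    uint32_t ecx = UINT32_MAX;    /* sub ecx, ecx ; not ecx */
    while (ecx != 0) {
        int zf = (*edi == 0);     /* scasb with al = 0 */
        edi++;
        ecx--;
        if (zf)
            break;
    }
    ecx = ~ecx;                   /* not ecx: length + 1 */
    ecx--;                        /* dec ecx: length */
    return ecx;
}
```

For the string "aaaaaa\n" from the question, the scan decrements ecx 8 times (7 characters plus the terminator), leaving 0xfffffff7. The not turns that into 8, and the dec gives the length, 7.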

Or rearranging the loop and simplifying:

const char *s = edi;
size_t c = (size_t)-1;      // c == -1
while (*s++ != '\0') c--;   // c == -1 - length(s)
c = ~c;                     // c == length(s)

And reversing the count:

size_t c = 0;
while (*s++ != '\0') c++;

Which is the strlen function in C:

size_t strlen(const char *s) {
    size_t c = 0;
    while (*s++ != '\0') c++;
    return c;
}
