How to tell if a binary sequence is x86 machine code?

Question

We all know that on the x86 architecture, data and code are mixed together in memory and on disk. But how can we tell them apart?

I need the method for a paper, so I don't expect 100% accuracy. 80% would be fine, and even rough ideas would help :)

Recommended answer

Statistically determine which instructions are common in executables.

For example, some of the common instructions may be add, subtract, and so on.

For the unknown binary sequence, treat it as machine code and look at the frequency of the various instructions used (here you can probably assume that instructions start correctly at byte boundaries).

If decoding hits an invalid instruction, the sequence is very likely not machine code, at least not when decoded from that starting offset.

Otherwise, check whether the percentage frequency of the instructions used matches what is usual for executables.
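The frequency comparison can be sketched roughly as follows. This is only an illustration: it assumes the byte sequence has already been decoded into a list of mnemonic strings by some external disassembler (that step is not shown), the `REFERENCE` profile is made-up placeholder data rather than measured statistics, and cosine similarity with a 0.5 threshold is just one possible way to express a loose fit.

```python
from collections import Counter
from math import sqrt


def mnemonic_profile(mnemonics):
    """Return the relative frequency of each mnemonic in a decoded sequence."""
    counts = Counter(mnemonics)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (0.0 to 1.0)."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical reference profile (placeholder numbers, not real measurements).
# In practice it would be built by disassembling known executables.
REFERENCE = mnemonic_profile(
    ["mov"] * 35 + ["push"] * 10 + ["call"] * 9 + ["pop"] * 6 + ["cmp"] * 6
    + ["add"] * 6 + ["jmp"] * 5 + ["lea"] * 5 + ["je"] * 5 + ["test"] * 4
    + ["sub"] * 4 + ["jne"] * 3 + ["ret"] * 2
)


def looks_like_code(mnemonics, threshold=0.5):
    """Loose fit: accept if the profile is reasonably close to the reference."""
    profile = mnemonic_profile(mnemonics)
    return cosine_similarity(profile, REFERENCE) >= threshold


# Example inputs: a typical function body versus rarely used instructions.
code_like = ["push", "mov", "sub", "mov", "mov", "call",
             "test", "je", "mov", "add", "pop", "ret"]
data_like = ["in", "out", "aaa", "das", "insb", "outsd", "arpl", "xlatb"]
print(looks_like_code(code_like), looks_like_code(data_like))
```

The threshold is deliberately low, since the test is meant to be loose; a stricter value would have to be tuned against real samples.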

Also, when an instruction takes operands that refer to locations (e.g. registers or memory/data addresses), record them. Then check whether the same locations are accessed again nearby.

This can be done by sorting the data locations used in descending order of usage frequency and seeing whether the shape of the decreasing frequency curve roughly matches what is usual.
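A minimal sketch of these two locality checks is given below. It assumes the operand locations have already been extracted from the decoded instructions as a list of strings (the extraction itself is not shown), and the helper names `reuse_rate`, `ranked_frequencies`, and `top_share`, as well as the window size, are illustrative choices rather than part of the original answer.

```python
from collections import Counter


def reuse_rate(locations, window=8):
    """Fraction of accesses whose location also appears among the
    previous `window` accesses (a rough 'accessed nearby' measure)."""
    if len(locations) < 2:
        return 0.0
    hits = 0
    for i in range(1, len(locations)):
        recent = locations[max(0, i - window):i]
        if locations[i] in recent:
            hits += 1
    return hits / (len(locations) - 1)


def ranked_frequencies(locations):
    """Access counts per location, sorted in descending order."""
    return sorted(Counter(locations).values(), reverse=True)


def top_share(locations, k=3):
    """Share of all accesses that go to the k most-used locations."""
    ranked = ranked_frequencies(locations)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Example: operands of a small function reuse a few registers and stack slots.
code_like = ["eax", "[ebp-4]", "eax", "ebx", "[ebp-4]", "eax", "[ebp-8]", "eax"]
print(ranked_frequencies(code_like), reuse_rate(code_like), top_share(code_like, k=2))
```

Real code tends to concentrate its accesses on a few registers and stack slots, so a steeply falling ranked curve and a high reuse rate point toward code, while a flat curve with little reuse points toward data decoded by accident.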

Data (non-machine code) is unlikely to pass these statistical tests.

Note that when I say "fit", a very loose fit is enough. Even if the sequence is quite far from what is normal, it is probably still code, unless there is almost no statistical correlation at all.
