Question regarding Assembly and computer programs

Question

I read this article: http://en.wikipedia.org/wiki/Assembly_language

It says:

Take, for example, the instruction that tells an x86/IA-32 processor to move an immediate 8-bit value into a register. The binary code for this instruction is 10110 followed by a 3-bit identifier for which register to use. The identifier for the AL register is 000, so the following machine code loads the AL register with the data 01100001.[4]

10110000 01100001

It explains how it is easier to write it as:

MOV AL, 61h       ; Load AL with 97 decimal (61 hex)

Now here are my questions.

So, computer programs/executables are just binary data (0's and 1's)?

When viewed with a disassembler like OllyDbg, it just tries to revert those 0's and 1's back to some Assembly (Intel?) language, and the output is mostly correct?

If I have this 10110000 01100001 program on my SSD and I write a C#/PHP/wtvr application that reads the contents of the file and outputs them as bits, will I see these exact 10110000 01100001 figures?

How does the operating system do the actual "execution"? How does it tell the processor "hey, take these bits and run them"? Can I do that in C#/C++ directly?

Recommended answer

So, computer programs/executables are just binary data (0's and 1's)?

Yes, just like images, videos and other data.

When viewed with a disassembler like OllyDbg, it just tries to revert those 0's and 1's back to some Assembly (Intel?) language, and the output is mostly correct?

Yes. In this exact case it will always be correct, because mov al, 61h is always assembled to 0xB0 0x61 (usually written as B0 61 in the Intel 64 and IA-32 Architectures Software Developer's Manuals and other places) in 16-, 32- and 64-bit mode. Note that 0xB0 0x61 = 0b10110000 0b01100001.

You can find the encoding for different instructions in Volume 2A. For example, the entry for this instruction is "B0+ rb MOV r8, imm8 E Valid Valid Move imm8 to r8." on page 3-644.
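To make the "B0+ rb" notation concrete, here is a minimal sketch in Python that builds the two bytes by hand. The function name encode_mov_r8_imm8 is made up for this illustration. The register numbers other than AL = 000 are not in the quoted text; they are the standard x86 numbering for the legacy 8-bit registers.

```python
# Standard x86 numbering of the legacy 8-bit registers (only AL = 0 is
# stated in the quoted article; the rest are added here for illustration).
REG8_IDS = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def encode_mov_r8_imm8(register: str, immediate: int) -> bytes:
    """Encode `MOV r8, imm8` as opcode (B0 + register id) followed by imm8."""
    if not 0 <= immediate <= 0xFF:
        raise ValueError("immediate must fit in 8 bits")
    opcode = 0xB0 + REG8_IDS[register.lower()]  # 10110 followed by 3 register bits
    return bytes([opcode, immediate])


machine_code = encode_mov_r8_imm8("al", 0x61)
print(machine_code.hex(" "))                          # b0 61
print(" ".join(f"{b:08b}" for b in machine_code))     # 10110000 01100001
```

The same helper gives B1 61 for CL, which is all the "+ rb" means: the register number is added to the base opcode.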

Other instructions have different meanings depending on whether they are interpreted in 16-, 32- or 64-bit mode. Consider this short sequence of bytes: 66 83 C0 04 41 80 C0 05

In 16-bit mode they mean:

00000000  6683C004          add eax,byte +0x4
00000004  41                inc cx
00000005  80C005            add al,0x5

In 32-bit mode they mean:

00000000  6683C004          add ax,byte +0x4
00000004  41                inc ecx
00000005  80C005            add al,0x5

And finally in 64-bit mode:

00000000  6683C004          add ax,byte +0x4
00000004  4180C005          add r8b,0x5

So the instructions cannot always be disassembled correctly without knowing the context (and this is not even taking into account that things other than code can reside in the text segment, and that the code can do nasty stuff like generate code on the fly or modify itself).
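The three listings can be reproduced with a toy decoder. This is only a sketch: toy_disassemble is a hypothetical helper that understands just the few encodings in this byte sequence (the 66 operand-size prefix, the 41 REX prefix, inc, and add with an 8-bit immediate) and raises an error for anything else. A real disassembler handles far more cases.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def toy_disassemble(code: bytes, mode: int) -> list:
    """Decode only the encodings used in the example, for mode 16, 32 or 64."""
    if mode not in (16, 32, 64):
        raise ValueError("mode must be 16, 32 or 64")
    lines = []
    i = 0
    while i < len(code):
        # 0x66 toggles between 16-bit and 32-bit operands.
        size_prefix = code[i] == 0x66
        if size_prefix:
            i += 1
        # In 64-bit mode 0x40..0x4F are REX prefixes, not inc/dec.
        rex_b = False
        if mode == 64 and 0x40 <= code[i] <= 0x4F:
            if code[i] != 0x41:
                raise ValueError("only the REX.B prefix (0x41) is supported")
            rex_b = True
            i += 1
        opcode = code[i]
        # Default operand size is 16 bits in 16-bit mode and 32 bits otherwise.
        wide = (mode != 16) != size_prefix
        regs = REG32 if wide else REG16
        if mode != 64 and 0x40 <= opcode <= 0x47:
            lines.append(f"inc {regs[opcode & 7]}")
            i += 1
        elif opcode in (0x80, 0x83) and code[i + 1] >> 3 == 0b11000:
            # ModRM with mod=11 and reg=000 selects "add register, imm8".
            rm, imm = code[i + 1] & 7, code[i + 2]
            if opcode == 0x83 and (rex_b or imm >= 0x80):
                raise ValueError("unsupported form of opcode 0x83")
            if opcode == 0x80:
                reg = f"r{8 + rm}b" if rex_b else REG8[rm]
            else:
                reg = regs[rm]
            lines.append(f"add {reg},{imm:#x}")
            i += 3
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")
    return lines


CODE = bytes.fromhex("66 83 C0 04 41 80 C0 05")
for mode in (16, 32, 64):
    print(mode, toy_disassemble(CODE, mode))
```

The bytes are identical in all three runs. The only thing that changes is the mode passed to the decoder.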

If I have this 10110000 01100001 program on my SSD and I write a C#/PHP/wtvr application that reads the contents of the file and outputs them as bits, will I see these exact 10110000 01100001 figures?

Yes, in the sense that if the application contains the mov al, 61h instruction, the file will contain the bytes 0xB0 and 0x61.
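As a quick check, here is a small sketch in Python (any language that can read a file in binary mode would do). It assumes a hypothetical file that holds nothing but those two bytes. A real executable also contains headers and other data around the instructions, so you would see many more bytes there. The helper name file_as_bits is made up for this example.

```python
import os
import tempfile


def file_as_bits(path: str) -> str:
    """Read a file in binary mode and render every byte as 8 binary digits."""
    with open(path, "rb") as handle:
        data = handle.read()
    return " ".join(f"{byte:08b}" for byte in data)


# Create a throwaway file holding only the two bytes of `mov al, 61h`.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([0xB0, 0x61]))
    path = tmp.name

bits = file_as_bits(path)
os.remove(path)
print(bits)  # 10110000 01100001
```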

How does the operating system do the actual "execution"? How does it tell the processor "hey, take these bits and run them"? Can I do that in C#/C++ directly?

After loading the code into memory (and once the memory is correctly set up permission-wise), it can just jump to or call the code and have it run. One thing you have to realize: even though the operating system is just another program, it is a special program, since it got to the processor first! It runs in a special supervisor (or hypervisor) mode that allows it to do things normal (user) programs aren't allowed to do, like setting up preemptive multitasking, which makes sure processes are automatically made to yield.

The first processor is also responsible for waking up the other cores/processors on a multi-core/multi-processor machine. See this SO question.

Calling code that you load yourself directly from C++ (I don't think it is possible in C# without resorting to unsafe/native code) requires platform-specific tricks. For Windows you probably want to look at VirtualProtect, and under Linux at mprotect(2). Or, perhaps more realistically, the code comes from a file which is then mapped using either file mapping on Windows or mmap(2) on Linux.
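The same idea can be tried from Python instead of C++, as a rough sketch rather than a recommended technique. It makes several assumptions: an x86 machine, a Unix-like system whose mmap module exposes PROT_EXEC, and a policy that allows memory to be writable and executable at once. It asks for an executable mapping up front instead of calling mprotect(2) afterwards. The trailing C3 (ret) byte is added here so that control returns to the caller, and run_machine_code is a hypothetical helper that returns None where these assumptions do not hold.

```python
import ctypes
import mmap
import platform


def run_machine_code(code: bytes):
    """Copy raw x86 machine code into an executable page and call it.

    Returns the low 8 bits of the result (the AL register), or None when
    the platform does not match the assumptions above.
    """
    is_x86 = platform.machine().lower() in ("x86_64", "amd64", "i386", "i686", "x86")
    if not is_x86 or not hasattr(mmap, "PROT_EXEC"):
        return None
    try:
        # Anonymous mapping that is readable, writable and executable.
        page = mmap.mmap(-1, mmap.PAGESIZE,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    except OSError:
        return None  # e.g. a security policy that forbids such mappings
    page.write(code)
    address = ctypes.addressof(ctypes.c_char.from_buffer(page))
    # Treat the address as a C function taking no arguments, returning uint8.
    function = ctypes.CFUNCTYPE(ctypes.c_uint8)(address)
    return function()


# mov al, 61h followed by ret (C3)
result = run_machine_code(bytes([0xB0, 0x61, 0xC3]))
print(result)
```

Where the assumptions hold, the value that comes back is whatever the code left in AL. A mistake in the bytes will crash the whole process instead of raising an exception, which is a good reminder that the processor really does just run whatever bits it is pointed at.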
