Question regarding Assembly and computer programs
Question
I read this article: http://en.wikipedia.org/wiki/Assembly_language
It says:
Take, for example, the instruction that tells an x86/IA-32 processor to move an immediate 8-bit value into a register. The binary code for this instruction is 10110 followed by a 3-bit identifier for which register to use. The identifier for the AL register is 000, so the following machine code loads the AL register with the data 01100001.[4]
10110000 01100001
It explains how it is easier to write it as:
MOV AL, 61h ; Load AL with 97 decimal (61 hex)
Now here are my questions.
So, computer programs/executables are just binary data (0's and 1's)?
When viewed with a disassembler like OllyDbg it just tries to revert those 0's and 1's back to some Assembly (Intel?) language and the output is mostly correct?
If I have this 10110000 01100001 program on my SSD and I write a C#/PHP/wtvr application that reads the contents of the file and outputs them as bits, will I see these exact 10110000 01100001 figures?
How does the operating system do the actual "execution"? How does it tell the processor that "hey, take these bits and run them"? Can I do that in C#/C++ directly?
Recommended answer
So, computer programs/executables are just binary data (0's and 1's)?
Yes, just like images, videos and other data.
When viewed with a disassembler like OllyDbg it just tries to revert those 0's and 1's back to some Assembly (Intel?) language and the output is mostly correct?
Yes, in this exact case it will always be correct, as mov al, 61h is always assembled to 0xB0 0x61 (in the Intel 64 and IA-32 Architectures Software Developer's Manuals and other places usually written as B0 61) in 16-, 32- and 64-bit mode. Note that 0xB0 0x61 = 0b10110000 0b01100001.
You can find the encoding for different instructions in Volume 2A. For example, the entry for this instruction is "B0+ rb MOV r8, imm8 E Valid Valid Move imm8 to r8." on page 3-644.
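To make the "B0+ rb" notation concrete, here is a minimal Python sketch that builds the two bytes by hand. The function name encode_mov_r8_imm8 and the REG8_IDS table are made up for this illustration; the register numbers are the standard 3-bit identifiers for the 8-bit registers.

```python
# 3-bit identifiers of the legacy 8-bit registers (the "rb" part of "B0+ rb").
REG8_IDS = {
    "al": 0b000, "cl": 0b001, "dl": 0b010, "bl": 0b011,
    "ah": 0b100, "ch": 0b101, "dh": 0b110, "bh": 0b111,
}


def encode_mov_r8_imm8(reg, imm):
    """Encode MOV r8, imm8 as opcode byte (B0 + register id) followed by imm8."""
    if not 0 <= imm <= 0xFF:
        raise ValueError("imm8 must fit in one byte")
    return bytes([0xB0 + REG8_IDS[reg], imm])


encoded = encode_mov_r8_imm8("al", 0x61)
print(" ".join(f"{b:02X}" for b in encoded))   # hex form, as the manuals write it
print(" ".join(f"{b:08b}" for b in encoded))   # binary form, as in the question
```

Swapping "al" for "cl" changes only the low three bits of the first byte, which gives B1 61.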
Other instructions have different meanings depending on whether they are interpreted in 16-, 32- or 64-bit mode. Consider this short sequence of bytes: 66 83 C0 04 41 80 C0 05
In 16-bit mode they mean:
00000000 6683C004 add eax,byte +0x4
00000004 41 inc cx
00000005 80C005 add al,0x5
In 32-bit mode they mean:
00000000 6683C004 add ax,byte +0x4
00000004 41 inc ecx
00000005 80C005 add al,0x5
And finally in 64-bit mode:
00000000 6683C004 add ax,byte +0x4
00000004 4180C005 add r8b,0x5
So the instructions cannot always be disassembled correctly without knowing the context (and this is not even taking into account that things other than code can reside in the text segment, and that code can do nasty stuff like generating code on the fly or modifying itself).
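The mode dependence can be reproduced with a toy decoder. This is only a sketch in Python: it understands just the four byte patterns that occur in the sequence above (the 66 prefix, 83 /0 and 80 /0 with a register operand, and the 40-4F range), and it ignores details such as REX.W and sign extension of the immediate. The names decode and reg_name are invented for the example.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def reg_name(index, bits, rex_b=False):
    """Name a register from its 3-bit index, operand size and REX.B bit."""
    if rex_b:
        return f"r{index + 8}" + {8: "b", 16: "w", 32: "d"}[bits]
    if bits == 8:
        return REG8[index]
    return REG16[index] if bits == 16 else "e" + REG16[index]


def decode(code, mode):
    """Toy decoder for the example bytes only; mode is 16, 32 or 64."""
    out, i = [], 0
    while i < len(code):
        size = 16 if mode == 16 else 32        # default operand size
        rex_b = False
        if code[i] == 0x66:                    # operand-size override prefix
            size = 32 if size == 16 else 16
            i += 1
        if mode == 64 and 0x40 <= code[i] <= 0x4F:
            rex_b = bool(code[i] & 0x01)       # 40-4F are REX prefixes here
            i += 1
        op = code[i]
        if mode != 64 and 0x40 <= op <= 0x47:  # inc r16/r32 outside 64-bit mode
            out.append(f"inc {reg_name(op - 0x40, size)}")
            i += 1
        elif op in (0x80, 0x83):               # group 1: op r/m, imm8
            modrm, imm = code[i + 1], code[i + 2]
            if modrm >> 6 != 0b11 or (modrm >> 3) & 0b111 != 0:
                raise ValueError("toy decoder handles only ADD with a register")
            bits = 8 if op == 0x80 else size
            out.append(f"add {reg_name(modrm & 0b111, bits, rex_b)}, {hex(imm)}")
            i += 3
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")
    return out


SEQUENCE = bytes.fromhex("6683C0044180C005")
for mode in (16, 32, 64):
    print(mode, decode(SEQUENCE, mode))
```

The same eight bytes come out as three instructions in 16- and 32-bit mode (with different register widths) and as two instructions in 64-bit mode, because the 41 byte is swallowed as a prefix.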
If I have this 10110000 01100001 program on my SSD and I write a C#/PHP/wtvr application that reads the contents of the file and output them as bits, will I see these exact 10110000 01100001 figures?
Yes, in the sense that if the application contains the mov al, 61h instruction, the file will contain the bytes 0xB0 and 0x61.
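You can check this yourself with a few lines in any language. Here is a small Python sketch (the question mentions C#/PHP, but the idea is identical): it writes the two bytes to a temporary file standing in for the "program", reads the file back and prints each byte as eight bits. The helper name file_as_bits is made up. Keep in mind that a real executable also has headers and other sections around the code, so the two bytes would sit somewhere inside the file rather than being the whole file.

```python
import os
import tempfile


def file_as_bits(path):
    """Return the file's contents as space-separated groups of 8 bits."""
    with open(path, "rb") as f:
        return " ".join(f"{byte:08b}" for byte in f.read())


# Stand-in for the two-byte "program" from the question.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([0xB0, 0x61]))

bits = file_as_bits(tmp.name)
os.remove(tmp.name)
print(bits)  # prints "10110000 01100001"
```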
How does the operating system do the actual "execution"? How does it tell the processor that "hey, take these bits and run them"? Can I do that in C#/C++ directly?
After loading the code into memory (with the memory set up correctly permission-wise), it can just jump to it or call it and have it run. One thing you have to realize is that, even though the operating system is just another program, it is a special program, since it got to the processor first! It runs in a special supervisor (or hypervisor) mode that allows it to do things normal (user) programs aren't allowed to do, like setting up preemptive multitasking, which makes sure that processes automatically yield the processor.
The first processor is also responsible for waking up the other cores/processors on a multi-core/multi-processor machine. See this SO question.
Calling code that you load yourself directly in C++ (I don't think it is possible in C# without resorting to unsafe/native code) requires platform-specific tricks. For Windows you probably want to look at VirtualProtect, and under Linux mprotect(2). Or, perhaps more realistically, load the code from a file which is then mapped using either this process for Windows or mmap(2) for Linux.
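As a rough illustration of the idea, here is a sketch in Python rather than C++ (to keep it short). It assumes x86 Linux and a system that permits memory which is writable and executable at the same time; anywhere else the function simply returns None. It maps an anonymous page through Python's mmap module (a wrapper around mmap(2)), copies a few bytes of machine code into it and calls the address through ctypes. The name run_machine_code is invented, and a xor eax, eax plus a ret are added around the mov al, 61h so that the bytes form a callable function with a well-defined return value.

```python
import ctypes
import mmap
import platform
import sys

# xor eax, eax ; mov al, 61h ; ret
CODE = bytes([0x31, 0xC0, 0xB0, 0x61, 0xC3])


def run_machine_code(code):
    """Run raw x86 bytes on Linux; return None where this sketch does not apply."""
    if not sys.platform.startswith("linux"):
        return None
    if platform.machine() not in ("x86_64", "i686", "i386"):
        return None
    try:
        # Anonymous mapping that is readable, writable and executable.
        buf = mmap.mmap(-1, mmap.PAGESIZE,
                        flags=mmap.MAP_PRIVATE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    except OSError:
        return None  # the system refuses such mappings (assumed hardening)
    buf.write(code)
    # Treat the start of the mapping as a C function taking no arguments
    # and returning an int, then call it.
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    func = ctypes.CFUNCTYPE(ctypes.c_int)(address)
    return func()


result = run_machine_code(CODE)
print(result)
```

Where the sketch applies, the call should return 97 (61 hex), the value the code left in the low byte of eax. In C or C++ the shape is the same: obtain executable memory, copy the bytes in, cast the address to a function pointer and call it.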