Does the compiler actually produce Machine Code?


Question

I've been reading that in most cases (gcc, for example) the compiler reads source code in a high-level language and spits out the corresponding machine code. Now, machine code is by definition the code that a processor can understand directly. So machine code should depend only on the machine (processor) and be independent of the OS. But this is not the case. Even if two different operating systems are running on the same processor, I cannot run the same compiled file (.exe for Windows or .out for Linux) on both operating systems.

So, what am I missing? Is the output of gcc (and of most compilers) not machine code? Or is machine code not the lowest level of code, and does the OS translate it further into a set of instructions that the processor can execute?

Solution

You are confusing a few things. A retargetable compiler like gcc, and other generic compilers, compile source files to objects. The linker then links those objects with other libraries as needed to make a so-called binary. The operating system can read and parse that binary, load its loadable blocks, and start execution.
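To make the "binary that the operating system parses" point concrete, here is a minimal sketch that inspects the first bytes of an executable. The function name `describe_executable` and the two lookup tables are made up for this illustration; the magic numbers and field offsets are the ones used by the ELF and PE file formats. It only covers a handful of machine types and is not a full parser.

```python
import struct

# Small subset of machine identifiers (assumption: only a few common ones listed).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}
PE_MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0xAA64: "AArch64"}


def describe_executable(data: bytes) -> str:
    """Guess container format and target CPU from the start of a file."""
    # ELF: 4-byte magic, byte 5 is the byte order, e_machine is 16 bits at offset 18.
    if data[:4] == b"\x7fELF" and len(data) >= 20:
        byte_order = "<" if data[5] == 1 else ">"
        (machine,) = struct.unpack_from(byte_order + "H", data, 18)
        return "ELF / " + ELF_MACHINES.get(machine, "other")
    # PE: DOS stub starts with "MZ"; offset 0x3C points at the "PE\0\0" signature,
    # which is followed directly by the 16-bit Machine field.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0" and len(data) >= pe_offset + 6:
            (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
            return "PE / " + PE_MACHINES.get(machine, "other")
    return "unknown"
```

Feeding it the first few kilobytes of a Linux program and of a Windows .exe built for the same processor would typically give "ELF / x86-64" and "PE / x86-64": the same instruction set wrapped in containers that the other operating system's loader does not accept.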

A sane compiler author will use assembly language as the output of the compiler; then the compiler driver, or the user in their makefile, calls the assembler, which creates the object. This is how gcc works. It is sort of how clang works too, although llc can now produce objects directly, not just assembly that then gets assembled.
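As a rough sketch of those separate steps, the helper below builds the three command lines by hand. It assumes a GNU toolchain (`gcc` and `as`) on the PATH; `pipeline_commands` and `run_pipeline` are names invented for this example, and in practice the gcc driver runs these stages for you.

```python
import subprocess
from typing import List


def pipeline_commands(source: str) -> List[List[str]]:
    """Return the compile, assemble and link commands for one C file."""
    stem = source.rsplit(".", 1)[0]
    return [
        ["gcc", "-S", source, "-o", stem + ".s"],  # compile: C -> assembly text
        ["as", stem + ".s", "-o", stem + ".o"],    # assemble: assembly -> object
        ["gcc", stem + ".o", "-o", stem],          # link: object + C library -> binary
    ]


def run_pipeline(source: str) -> None:
    """Execute each stage in order, stopping at the first failure."""
    for command in pipeline_commands(source):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the commands; call run_pipeline("hello.c") to execute them.
    for command in pipeline_commands("hello.c"):
        print(" ".join(command))
```

The intermediate `hello.s` file is plain text, which is what makes this route easy to inspect and debug.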

It makes far more sense to generate debuggable assembly language than to produce raw machine code. You really need a good reason, like JIT, to skip that step. I would avoid toolchains that go straight to machine code just because they can; they are harder to maintain and more likely to have bugs, or to take longer to get bugs fixed.

If the architecture is the same, there is no reason why you can't have a generic toolchain generate code for incompatible operating systems; the GNU tools, for example, can do this. Operating system differences are not, by definition, at the machine code level. Most are at the high-level language level: the C libraries that you call to create GUI windows, etc., have nothing to do with the machine code or the processor architecture. For some operating systems, the same operating-system-specific C code can be used on MIPS, ARM, PowerPC, or x86. Where things become architecture specific is the mechanism by which actual system calls are invoked. A specific instruction is often used. Machine code is eventually used, yes, but there is no reason why this can't be coded in real or inline assembly.
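To illustrate how the system call mechanism differs per architecture, here is a small lookup table for the Linux `write` call on three architectures. The table and helper names (`LINUX_WRITE`, `describe_write_call`) are invented for this sketch, and it covers only these three targets; other operating systems use different numbers and conventions.

```python
from typing import NamedTuple


class SyscallConvention(NamedTuple):
    instruction: str      # the trapping instruction
    write_number: int     # system call number for write
    number_register: str  # register that carries the call number


# Linux conventions for three architectures (illustrative subset).
LINUX_WRITE = {
    "x86-64": SyscallConvention("syscall", 1, "rax"),
    "i386": SyscallConvention("int 0x80", 4, "eax"),
    "aarch64": SyscallConvention("svc #0", 64, "x8"),
}


def describe_write_call(arch: str) -> str:
    """Summarise the last-mile step for write() on the given architecture."""
    conv = LINUX_WRITE[arch]
    return (f"load {conv.write_number} into {conv.number_register}, "
            f"then execute `{conv.instruction}`")


if __name__ == "__main__":
    for arch in LINUX_WRITE:
        print(arch + ":", describe_write_call(arch))
```

The C source that calls `write()` is identical on all three; only this small wrapper inside the library changes.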

This leads to libraries. Even fopen and printf, which are generic C calls, eventually have to make a system call. Much of the library support code can be written in a high-level language that is compatible across systems, but there needs to be a system- and architecture-specific bit of code for the last mile. You can see this in the glibc sources or, in other library solutions, in the hooks into newlib, for example.
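The same layering can be imitated in a few lines. This is only an analogy, not how glibc is written: `format_line`, `write_all` and `print_line` are hypothetical names, with the formatting layer touching nothing OS-specific and `os.write` standing in for the last mile.

```python
import os


def format_line(template: str, *args: object) -> bytes:
    """Portable layer: pure string formatting, no OS involvement."""
    return (template % args + "\n").encode("utf-8")


def write_all(fd: int, data: bytes) -> int:
    """Last mile: hand the bytes to the operating system.

    os.write wraps the platform's low-level write call, which may accept
    fewer bytes than requested, hence the loop.
    """
    total = 0
    while total < len(data):
        total += os.write(fd, data[total:])
    return total


def print_line(fd: int, template: str, *args: object) -> int:
    """printf-like helper built from the two layers above."""
    return write_all(fd, format_line(template, *args))


if __name__ == "__main__":
    print_line(1, "%s has %d layers", "printf", 2)  # fd 1 is standard output
```

Porting this to another system would leave `format_line` untouched and only replace what sits behind `write_all`.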

The same is true for other languages like C++ as it is for C. Interpreted languages have additional layers, but their virtual machines are just programs that sit on similar layers.
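As a toy illustration of a virtual machine being just another program, here is a tiny stack machine. The instruction set (`push`, `add`, `mul`) and the `run` function are invented for this sketch and do not correspond to any real VM.

```python
def run(program):
    """Execute a list of (opcode, operand...) tuples on a stack machine."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: " + op)
    return stack.pop()


if __name__ == "__main__":
    # Computes (2 + 3) * 4
    program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
    print(run(program))  # prints 20
```

The "bytecode" here never reaches the processor; the interpreter loop is an ordinary program that was itself compiled and linked for a specific operating system.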

Low-level programming doesn't mean machine or assembly language; it just means that whatever programming language you are using accesses things at a lower level: below the application, below the operating system, etc.
