How could one possibly bootstrap a C compiler (from source)?


Question

I was looking into compiler bootstrapping, and I looked at how Golang implements bootstrapping from source, i.e., by building the last version of Golang implemented in C and using the generated executable to compile newer Go releases. This made me curious as to how the same could be done with C. Can you construct a C compiler on a computer with literally nothing present on it? If not, then how can I trust that the binary of the compiler I use doesn't automatically fill the binaries it compiles with spyware?

Related question: since the first C compiler was written in B and B was written in BCPL, what was BCPL written in?

Recommended answer

Can you construct a C compiler on a computer with literally nothing present on it?

The main issue is how (in 2021) you would write a program for that computer, and how you would input it.

In the 1970s, computers (like IBM 360 mainframes) had many mechanical switches for entering some initial program. In the 1960s, they had even more, e.g. the IBM 1620.

Today, how would you input that initial program? Did you consider using some Arduino? Even oscilloscopes today contain microprocessors with programs.

A few years ago, some hobbyists designed and built (and spent a lot of money on) computers made of mechanical relays. These are probably thousands of times slower than the cheapest laptop computer you could buy (or the micro-controller inside your computer mouse, and your mouse contains some software too).

You could also buy many discrete transistors (e.g. thousands of 2N2222) and make a computer by soldering them together.

Even a cheap motherboard (e.g. MSI A320M A-PRO) today has some firmware program called UEFI or BIOS. It is shipped with that program, which is rumored to be mostly written in C (several tens of thousands of statements).

In some ways, computer chips are "software" coded in VHDL, SystemC, etc.

Here is a hypothetical tale.

Imagine that today you have a laptop running a small Linux distribution on some isolated island (à la Robinson Crusoe), without any Internet connection, but with books (including Modern C, some book about x86-64 assembly and instruction set architecture, and many other books in paper form), pencils, paper, food and a lot of time to spend. Imagine that the system does not have any C compiler (e.g. because you removed the gcc package from some Debian distribution by mistake), but just GNU binutils (that is, the linker ld and the assembler gas), some editor in binary form (e.g. GNU emacs or vim), GNU bash and GNU make as binary packages. We assume you are motivated enough to spend months writing a C compiler. We also assume you have access to man pages in some paper form (notably elf(5) and ld(1)). We have to assume you can inspect a file in binary form with od(1) and less(1).
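To give an idea of what inspecting a binary with od(1) and elf(5) in hand amounts to, here is a minimal sketch in C of the check you would do by eye on the first bytes of a file. The function name elf_class_bits is made up for this illustration; the byte values (the 0x7f 'E' 'L' 'F' magic and the class byte at offset 4) are the ones documented in elf(5). On the island you would of course do this by reading the od output, since you have no C compiler yet.

```c
#include <assert.h>
#include <stddef.h>

/* Offsets into the e_ident array at the start of an ELF file (see elf(5)). */
enum { EI_CLASS_OFFSET = 4, IDENT_BYTES_NEEDED = 5 };

/*
 * Hypothetical helper: look at the first bytes of a file image and report
 * 64 for a 64-bit ELF file, 32 for a 32-bit one, and 0 for anything else.
 */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < IDENT_BYTES_NEEDED)
        return 0;                       /* too short to hold the header start */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                       /* magic number does not match */
    if (buf[EI_CLASS_OFFSET] == 2)
        return 64;                      /* ELFCLASS64 */
    if (buf[EI_CLASS_OFFSET] == 1)
        return 32;                      /* ELFCLASS32 */
    return 0;                           /* ELFCLASSNONE or garbage */
}
```

You would feed it the first few bytes read from an executable; the same bytes are what `od -c` shows at the top of its dump.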

Then you could design on paper a subset µC of the C language in EBNF notation. With months of effort, you can write a small assembler program, directly doing syscalls(2) (see the Linux Assembly HowTo) and interpreting that µC language (since writing an interpreter is easier than writing a compiler; read for example the Dragon book, Queinnec's Lisp In Small Pieces, and Scott's Programming Language Pragmatics).
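To make the "EBNF first, interpreter second" idea concrete, here is a minimal sketch of a recursive-descent interpreter for a tiny expression grammar. It is only an illustration: the grammar is a made-up fragment far smaller than any real µC, the names (Parser, uc_eval) are invented here, and it is written in C for readability, whereas in the tale you would have to write the equivalent by hand in assembler.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Illustrative grammar (EBNF), one parsing function per rule:
 *   expr   = term   { ("+" | "-") term } ;
 *   term   = factor { ("*" | "/") factor } ;
 *   factor = number | "(" expr ")" | "-" factor ;
 *   number = digit { digit } ;
 */
typedef struct {
    const char *p;  /* cursor into the source text */
    int ok;         /* cleared on the first error */
} Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

static long parse_factor(Parser *ps)
{
    skip_spaces(ps);
    if (*ps->p == '-') {                /* unary minus */
        ps->p++;
        return -parse_factor(ps);
    }
    if (*ps->p == '(') {                /* parenthesized sub-expression */
        ps->p++;
        long v = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->p == ')')
            ps->p++;
        else
            ps->ok = 0;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) {
        ps->ok = 0;                     /* neither a number nor "(" nor "-" */
        return 0;
    }
    long v = 0;
    while (isdigit((unsigned char)*ps->p))
        v = v * 10 + (*ps->p++ - '0');
    return v;
}

static long parse_term(Parser *ps)
{
    long v = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->p;
        if (op != '*' && op != '/')
            return v;
        ps->p++;
        long rhs = parse_factor(ps);
        if (op == '*')
            v *= rhs;
        else if (rhs != 0)
            v /= rhs;
        else
            ps->ok = 0;                 /* division by zero is an error */
    }
}

static long parse_expr(Parser *ps)
{
    long v = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->p;
        if (op != '+' && op != '-')
            return v;
        ps->p++;
        long rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Evaluate src; returns 1 and stores the value in *out, or 0 on error. */
int uc_eval(const char *src, long *out)
{
    assert(src != NULL && out != NULL);
    Parser ps = { src, 1 };
    long v = parse_expr(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.p != '\0')
        return 0;
    *out = v;
    return 1;
}
```

The interpreter computes values while it parses, so there is no code generation and no object file format to get right. That is the main reason it is the easier first step.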

Once you have your tiny µC interpreter, you can write a naive µC compiler in µC (after all, Fabrice Bellard was able to write his tinyC compiler).
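As a rough illustration of what "naive" can mean here, the sketch below compiles a small expression language straight to x86-64 assembly text (AT&T syntax, as accepted by gas), using the machine stack as an evaluation stack and doing no optimization at all. Everything in it is an assumption made for the example: the names (Compiler, uc_compile), the restriction to "+", "-", "*" and parentheses, and the limit on literals so that they fit the 32-bit immediate of pushq. It is not how tinyC or any real compiler works.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *p;  /* cursor into the source text */
    char *out;      /* caller-provided output buffer */
    size_t cap;     /* capacity of out, including the final NUL */
    size_t len;     /* bytes written so far */
    int ok;         /* cleared on syntax error or buffer overflow */
} Compiler;

/* Append one instruction line to the output buffer. */
static void emit(Compiler *c, const char *insn)
{
    int n = snprintf(c->out + c->len, c->cap - c->len, "\t%s\n", insn);
    if (n < 0 || (size_t)n >= c->cap - c->len)
        c->ok = 0;
    else
        c->len += (size_t)n;
}

static void skip_spaces(Compiler *c)
{
    while (isspace((unsigned char)*c->p))
        c->p++;
}

static void compile_expr(Compiler *c);

static void compile_factor(Compiler *c)
{
    skip_spaces(c);
    if (*c->p == '(') {
        c->p++;
        compile_expr(c);
        skip_spaces(c);
        if (*c->p == ')')
            c->p++;
        else
            c->ok = 0;
        return;
    }
    if (!isdigit((unsigned char)*c->p)) {
        c->ok = 0;
        return;
    }
    long v = 0;
    while (isdigit((unsigned char)*c->p)) {
        v = v * 10 + (*c->p++ - '0');
        if (v > 2147483647L) {          /* literal too large for pushq's imm32 */
            c->ok = 0;
            v = 0;
        }
    }
    char insn[32];
    snprintf(insn, sizeof insn, "pushq $%ld", v);
    emit(c, insn);
}

/* Pop two operands, apply op, push the result. */
static void emit_binop(Compiler *c, char op)
{
    emit(c, "popq %rcx");               /* right operand */
    emit(c, "popq %rax");               /* left operand */
    emit(c, op == '+' ? "addq %rcx, %rax" :
            op == '-' ? "subq %rcx, %rax" : "imulq %rcx, %rax");
    emit(c, "pushq %rax");
}

static void compile_term(Compiler *c)
{
    compile_factor(c);
    for (;;) {
        skip_spaces(c);
        if (*c->p != '*')
            return;
        c->p++;
        compile_factor(c);
        emit_binop(c, '*');
    }
}

static void compile_expr(Compiler *c)
{
    compile_term(c);
    for (;;) {
        skip_spaces(c);
        char op = *c->p;
        if (op != '+' && op != '-')
            return;
        c->p++;
        compile_term(c);
        emit_binop(c, op);
    }
}

/* Compile src into assembly text in out; returns 1 on success, 0 on error. */
int uc_compile(const char *src, char *out, size_t cap)
{
    assert(src != NULL && out != NULL && cap > 0);
    Compiler c = { src, out, cap, 0, 1 };
    out[0] = '\0';
    compile_expr(&c);
    skip_spaces(&c);
    if (*c.p != '\0')
        c.ok = 0;
    emit(&c, "popq %rax");              /* leave the final value in %rax */
    return c.ok;
}
```

The parser has the same shape as an interpreter for the same grammar would have. The difference is that each rule emits instructions instead of computing a value, and the emitted text is then handed to the assembler and linker you already have.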

Once you have debugged that µC compiler, you can extend it to accept all the syntax and semantics of C.

Once you have a full C compiler, you could improve it to optimize better, maybe extend it to accept a small subset of C++, and you might also write a static C code analyzer inspired by Frama-C.

PS. Bootstrapping can be generalized a lot: see Pitrat's blog on bootstrapping artificial intelligence (Jacques Pitrat, born in 1934, died in October 2019) and the RefPerSys project.
