Build AST from C code

Question

How can I build an AST (abstract syntax tree) from C code compiled with gcc, in order to make some modifications, such as converting some int variables to float, and then regenerate C source code from it afterwards?

Actually, for the moment, the only functionality I really need is to extract a table of variables and their types from a C program consisting of a few lines. I assume there is a simple parser that does this.

I have some variables, for example:

int  var_bss ;           
float var_f_bss;            
int var_data = 4;        
float var_f_data = 5;  

And a function:

int Foo(){
   /* some local variables */
}

The code is in a single C file.

I want to present all the variables to the end user and let them choose a source type in a specific memory segment, e.g. int variables in .data. The user can then convert those variables to float. Finally, I generate the same code for the user, but with the new variable types they have chosen.

Recommended answer

First, it is a difficult task, because the abstract syntax tree of C is much more complex than you might think. Read the C11 standard n1570 for details, and see this website. Look also into tinyCC or nwcc (at least for inspiration).
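For the narrow case described in the question (a few file-scope int and float declarations in a single file), a hand-written scanner may be enough to build the variable table. The C sketch below is only an illustration under that assumption. It does not build an AST, and every name in it (VarEntry, parse_simple_decl, collect_globals, segment_guess, retype) is hypothetical. It deliberately ignores pointers, arrays, qualifiers, typedefs, several declarators per line, comments and the preprocessor, which is where the real complexity of C comes from. The .data/.bss mapping is also a rough guess, since actual placement depends on the compiler and target.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* One row of the variable table. Every name in this sketch is hypothetical. */
typedef struct {
    char type[16];         /* only "int" or "float" are recognized */
    char name[64];         /* identifier */
    int  has_initializer;  /* 1 if the declaration has "= ..." */
} VarEntry;

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* Copy one identifier into buf; return the position after it, or NULL. */
static const char *read_ident(const char *p, char *buf, size_t cap)
{
    size_t n = 0;
    if (!(isalpha((unsigned char)*p) || *p == '_')) return NULL;
    while (isalnum((unsigned char)*p) || *p == '_') {
        if (n + 1 >= cap) return NULL;      /* identifier too long for buf */
        buf[n++] = *p++;
    }
    buf[n] = '\0';
    return p;
}

/* Accept only "int name;" or "float name = value;" on a single line. */
int parse_simple_decl(const char *line, VarEntry *out)
{
    const char *p = read_ident(skip_ws(line), out->type, sizeof out->type);
    if (!p || !isspace((unsigned char)*p)) return 0;
    if (strcmp(out->type, "int") != 0 && strcmp(out->type, "float") != 0) return 0;
    p = read_ident(skip_ws(p), out->name, sizeof out->name);
    if (!p) return 0;
    p = skip_ws(p);
    if (*p != ';' && *p != '=') return 0;   /* e.g. '(' means a function */
    out->has_initializer = (*p == '=');
    return strchr(p, ';') != NULL;
}

/* Scan src line by line and record declarations seen at brace depth 0. */
int collect_globals(const char *src, VarEntry *tab, int cap)
{
    int count = 0, depth = 0;
    assert(src != NULL && tab != NULL && cap > 0);
    while (*src) {
        char line[256];
        size_t len = strcspn(src, "\n");
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;
        memcpy(line, src, n);
        line[n] = '\0';
        if (depth == 0 && count < cap && parse_simple_decl(line, &tab[count]))
            count++;
        for (const char *c = line; *c; c++) {   /* skip function bodies */
            if (*c == '{') depth++;
            else if (*c == '}' && depth > 0) depth--;
        }
        src += len;
        if (*src == '\n') src++;
    }
    return count;
}

/* Assumption: initialized globals usually land in .data, the rest in .bss. */
const char *segment_guess(const VarEntry *v)
{
    return v->has_initializer ? ".data" : ".bss";
}

/* Change the type of every entry matching (from_type, segment). */
int retype(VarEntry *tab, int n, const char *from_type,
           const char *segment, const char *to_type)
{
    int changed = 0;
    for (int i = 0; i < n; i++) {
        if (strcmp(tab[i].type, from_type) == 0 &&
            strcmp(segment_guess(&tab[i]), segment) == 0) {
            snprintf(tab[i].type, sizeof tab[i].type, "%s", to_type);
            changed++;
        }
    }
    return changed;
}
```

Calling collect_globals on the text of the file fills the table, and retype(tab, n, "int", ".data", "float") would then mark the initialized int globals as float. Writing the modified source back out is not covered here; it needs source positions for each declaration, and as soon as the input goes beyond these trivial declarations a real C front end becomes the safer choice.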

Then, if you are using a recent GCC (e.g. 4.7 or 4.8), I strongly suggest customizing GCC, e.g. with a MELT extension (or your own GCC plugin).

I don't claim it is a simple task, because you will very probably need to understand the details of GCC's internal representations (at least GIMPLE).

BTW, MELT is (was) a domain-specific language to extend GCC, designed exactly for the kind of task you are dreaming about. With MELT you would be able to transform the internal GCC representations (Gimple and Tree-s). Today, in 2020, MELT is no longer worked on because of a lack of funding.

The advantage of working inside GCC (or inside some other compiler like Clang/LLVM) is that you don't have to spit back some C code (which is actually much more difficult than you think); you just transform the internal compiler representation and, perhaps most importantly, you get "gratis" the many things a compiler always does: all kinds of optimizations like constant folding, inlining, common-subexpression elimination, etc.
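To give a feel for why emitting C again is harder than it looks, here is a toy sketch of an expression tree printed back as C text. It is an assumption-laden illustration, not how GCC or Clang represent code: the Expr type and the prec and emit_expr functions are hypothetical, and only integer literals, variable names, + and * are covered. Even this tiny case has to reason about precedence and about which side of an operator a subtree sits on, so that the printed text parses back into the same tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A toy expression node; a real C AST has far more node kinds. */
typedef enum { EX_NUM, EX_VAR, EX_ADD, EX_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;                      /* used by EX_NUM */
    const char *name;               /* used by EX_VAR */
    const struct Expr *lhs, *rhs;   /* used by EX_ADD and EX_MUL */
} Expr;

/* Binding strength: higher binds tighter. Leaves never need parentheses. */
static int prec(const Expr *e)
{
    switch (e->kind) {
    case EX_ADD: return 1;
    case EX_MUL: return 2;
    default:     return 3;
    }
}

/* Append the C text of e to the NUL-terminated string in out (capacity cap).
 * Parentheses are added only where omitting them would change how the
 * text parses. Call with parent_prec = 0 and is_rhs = 0 for the root. */
void emit_expr(const Expr *e, int parent_prec, int is_rhs, char *out, size_t cap)
{
    size_t len = strlen(out);
    assert(len < cap);
    if (e->kind == EX_NUM) { snprintf(out + len, cap - len, "%d", e->value); return; }
    if (e->kind == EX_VAR) { snprintf(out + len, cap - len, "%s", e->name); return; }

    int p = prec(e);
    /* A right operand of equal precedence keeps its parentheses so that
     * a + (b + c) is not silently printed as a + b + c. */
    int need_parens = p < parent_prec || (is_rhs && p == parent_prec);
    if (need_parens) strncat(out, "(", cap - strlen(out) - 1);
    emit_expr(e->lhs, p, 0, out, cap);
    strncat(out, e->kind == EX_ADD ? " + " : " * ", cap - strlen(out) - 1);
    emit_expr(e->rhs, p, 1, out, cap);
    if (need_parens) strncat(out, ")", cap - strlen(out) - 1);
}
```

Real C adds declarators, casts, many more operators, comments and preprocessor directives on top of this, which is one reason transforming the compiler's internal representation can be less work than regenerating source text.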

In 2020, you could also consider using the libgccjit framework inside the recent GCC 10, and read this draft report (related to Bismon; but see also RefPerSys, which shares some ideas but no code with Bismon). Perhaps also try the Clang static analyzer and/or Frama-C.
