ANTLR: generating an AST for C and parsing the AST


Question

I am doing static analysis on C programs. I searched the ANTLR website, and there seems to be no suitable grammar file that produces an AST for C programs. Does that mean I have to write one myself from scratch, or is there a quicker way? I also need a tree parser that can traverse the AST created by the parser.

Recommended answer

You indicated you want to do static analysis to detect buffer overflows.

First, writing a grammar for C is harder than it looks. There's everything in the standard, and then there's what real compilers actually accept. You also have to decide what to do about the preprocessor (and it varies from compiler to compiler!). If you don't get the grammar and preprocessing exactly right, you won't be able to parse real programs. (If you want to do toy languages, that's fine, but then you don't need a C grammar.)
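To make the preprocessor and grammar point concrete, here is a small sketch of legal C that a grammar applied to the raw source text cannot handle. The macro and function names (`BEGIN`, `END`, `FOREVER`, `sum_to`, `read_through_pointer`) are made up for illustration and do not come from any real code base.

```c
/* Macros that only the preprocessor can expand. */
#define BEGIN {
#define END   }
#define FOREVER for (;;)

/* Before macro expansion, this function body is not valid C grammar:
   a parser that sees the raw text finds no braces and no loop header. */
int sum_to(int n)
BEGIN
    int total = 0;
    int i = 1;
    FOREVER BEGIN
        if (i > n) break;
        total += i;
        i++;
    END
    return total;
END

typedef int T;

/* "T * p;" is a declaration only because T names a type in this scope.
   If T were a variable, the same tokens would be a multiplication, so
   the parser needs symbol information to choose the right parse. */
int read_through_pointer(int value)
BEGIN
    T * p;
    p = &value;
    return *p;
END
```

A compiler accepts this file without complaint. A parser built from a plain C grammar has to either run a preprocessor first (and then lose the macro structure) or model the macros itself, and in both cases it has to track typedef names while parsing.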

To do the analysis, you'll need far more machinery than an AST. You'll need symbol tables, control and data flow analysis, likely local and global points-to analysis, call graph extraction, and some type of range analysis.
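As a rough illustration of what "some type of range analysis" means for buffer overflow detection, the sketch below tracks the interval of values an index may hold and checks it against an array length. It is a toy under stated assumptions, not how any particular tool works: the `Interval` type and every function name are hypothetical, and a real analysis would derive these intervals from the control and data flow of the program rather than from hand-written calls.

```c
#include <stdbool.h>

/* Hypothetical interval domain: all values a variable may hold. */
typedef struct { long lo; long hi; } Interval;

Interval interval_const(long v) {
    return (Interval){ v, v };
}

/* Abstract addition: add the bounds pairwise. */
Interval interval_add(Interval a, Interval b) {
    return (Interval){ a.lo + b.lo, a.hi + b.hi };
}

/* Join: smallest interval covering both inputs, used where two
   control-flow paths merge (for example after an if/else). */
Interval interval_join(Interval a, Interval b) {
    return (Interval){ a.lo < b.lo ? a.lo : b.lo,
                       a.hi > b.hi ? a.hi : b.hi };
}

/* True only if every possible index lies inside [0, len). */
bool access_is_safe(Interval index, long len) {
    return index.lo >= 0 && index.hi < len;
}

/* Models this fragment by hand:
       char buf[8];
       int i = flag ? 2 : 6;
       buf[i + 3] = 0;
   The index interval is [5, 9], which exceeds the last valid index 7. */
bool example_write_is_safe(void) {
    Interval i = interval_join(interval_const(2), interval_const(6));
    Interval index = interval_add(i, interval_const(3));
    return access_is_safe(index, 8);
}
```

Nothing in the AST of `buf[i + 3] = 0` says whether the write is safe. The answer depends on which values reach `i`, which is why the flow, points-to, and range machinery listed above is needed on top of the parser.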

People don't seem to understand this:

**GETTING A PARSER IS A LONG WAY FROM DOING ANYTHING USEFUL WITH REAL LANGUAGES**

I'm shouting because I see this over, and over, and over.

If you want to get on with a specific program analysis or transformation task, unless you want to die of old age before you start your task, you had better find a foundation that already has most of what you need. A parser generator with a creaky grammar is not a foundation. (Don't get me wrong: ANTLR, YACC, and JavaCC are all fine parser generators, and they're great for building a parser for a new language. They're great for implementing production parsers for real languages when the investment gets made. But they produce parsers, and mostly people don't do the production part. And they don't provide the additional machinery by a long shot.)

Our DMS Software Reengineering Toolkit contains all the above machinery because it is almost always needed, and it is a royal headache to implement. (My team has 15 years invested so far.)

We've also instantiated that machinery in forms specifically useful for COBOL, Java, C, and C++ (the last to a somewhat lesser extent, since the language is really hard), in a variety of dialects, so that others don't have to repeat this long process.

GCC and Clang are pretty mature alternatives for C and C++.
