How do I make a TreeParser in ANTLR3?

Question

I'm attempting to learn language parsing for fun...

I've created an ANTLR grammar which I believe will match a simple language I am hoping to implement. It will have the following syntax:

<FunctionName> ( <OptionalArguments>+) {
     <OptionalChildFunctions>+
 }

A real example:

ForEach(in:[1,2,3,4,5] as:"nextNumber") {
   Print(message:{nextNumber})
}

I believe I have the grammar working correctly to match this construct, and now I am attempting to build an Abstract Syntax Tree for the language.

Firstly, I must admit I'm not exactly sure how this tree should look. Secondly, I'm at a complete loss as to how to do this in my ANTLR grammar... I've been trying without much success for hours.

This is the current idea I'm going with for the tree:

                   FunctionName
                  /          \
           Attributes         \
               / \          /  \ 
            ID    /\    ChildFunctions
           / \   ID etc
          /   \
  Attribute  AttributeValue
        Type

This is my current ANTLR grammar file:

grammar Test;

options {output=AST;ASTLabelType=CommonTree;}

program : function ;
function : ID (OPEN_BRACKET (attribute (COMMA? attribute)*)? CLOSE_BRACKET)? (OPEN_BRACE function* CLOSE_BRACE)?;

attribute : ID COLON datatype;

datatype : NUMBER | STRING | BOOLEAN | array | lookup ;
array  :  OPEN_BOX (datatype (COMMA datatype)* )? CLOSE_BOX ;
lookup  : OPEN_BRACE (ID (PERIOD ID)*) CLOSE_BRACE;

NUMBER
 : ('+' | '-')? (INTEGER | FLOAT)
 ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

BOOLEAN
 : 'true' | 'TRUE' | 'false' | 'FALSE'
 ;

ID  : (LETTER|'_') (LETTER | INTEGER |'_')*
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WHITESPACE : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;

COLON : ':' ;
COMMA : ',' ;
PERIOD  :  '.' ;

OPEN_BRACKET : '(' ;
CLOSE_BRACKET : ')' ;

OPEN_BRACE : '{' ; 
CLOSE_BRACE : '}' ;

OPEN_BOX : '[' ;
CLOSE_BOX : ']' ;

fragment
LETTER
 : 'a'..'z' | 'A'..'Z' 
 ;

fragment
INTEGER
 : '0'..'9'+
 ;

fragment
FLOAT
 : INTEGER+ '.' INTEGER*
 ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    ;

Any help or advice would be great. I've tried reading dozens of tutorials, and nothing about the AST generation seems to stick :(

Recommended answer

Step 1 is to make the tree look like the little graph that you posted. Right now, you don't have any tree construction operators, so you're going to end up with a flat list.

See the tree construction documentation on the antlr.org website.

You can use ANTLRWorks to see what you're getting for a parse tree and AST. Start adding tree construction operators and watch how things change.

Edit / additional information:

Here's a process you can follow to give you a rough idea of how to do it:

  1. Download ANTLRWorks and use its graphing facilities. You will definitely want to see the parse tree and the AST before and after you make changes. Once you understand how everything works, you can use any IDE or editor you want.
  2. There are two basic operators for tree construction: the exclamation mark !, which tells ANTLR not to place the node in the AST, and the caret ^, which tells ANTLR to make something the root node. Start by going through each non-terminal rule and deciding which elements don't need to be in the AST. For example, you don't need commas or parentheses. Once you have all the information, you can populate a structure (or create your own AST structure) that provides it. Commas don't help any more, so add a ! to them. For example:

function: ID (OPEN_BRACKET! (attribute (COMMA!? attribute)*)? CLOSE_BRACKET!)? (OPEN_BRACE! function* CLOSE_BRACE!)?;

Take a look at the AST in ANTLRWorks before and after, and compare the two.
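To make the effect of ! concrete outside ANTLRWorks, here is a minimal sketch in plain Python. It is not ANTLR and does not call any ANTLR API: the token list for Print(message:{nextNumber}) is written by hand, and flat_ast and SUPPRESSED are hypothetical names introduced for illustration. It also drops tokens by type everywhere, which is a simplification, since in a real grammar ! only applies at the places where it is written in a rule.

```python
# Toy model only: this is not ANTLR, just an illustration of what "!" does.
# Hand-written (token type, text) pairs for: Print(message:{nextNumber})
TOKENS = [
    ("ID", "Print"),
    ("OPEN_BRACKET", "("),
    ("ID", "message"),
    ("COLON", ":"),
    ("OPEN_BRACE", "{"),
    ("ID", "nextNumber"),
    ("CLOSE_BRACE", "}"),
    ("CLOSE_BRACKET", ")"),
]

# Token types we pretend carry a "!" suffix (assumption: suppressed everywhere).
SUPPRESSED = frozenset({
    "OPEN_BRACKET", "CLOSE_BRACKET",
    "OPEN_BRACE", "CLOSE_BRACE",
    "COLON", "COMMA",
})


def flat_ast(tokens, suppressed=frozenset()):
    """Return the token texts that would remain as sibling nodes in a flat AST."""
    return [text for kind, text in tokens if kind not in suppressed]


before = flat_ast(TOKENS)             # no tree operators: every token is kept
after = flat_ast(TOKENS, SUPPRESSED)  # punctuation marked with "!" is dropped

print(before)
print(after)
```

The second list is still flat; suppressing punctuation only removes noise. Nesting comes from the ^ operator.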

Here are a few changes that bring it closer to what I think you want:

program : function ;
function : ID^ (OPEN_BRACKET! attributeList? CLOSE_BRACKET!)? (OPEN_BRACE! function* CLOSE_BRACE!)?;
attributeList:  (attribute (COMMA!? attribute)*);
attribute : ID COLON! datatype;
datatype : NUMBER | STRING | BOOLEAN | array | lookup ;
array  :  OPEN_BOX! (datatype^ (COMMA! datatype)* )? CLOSE_BOX!;
lookup  : OPEN_BRACE! (ID (PERIOD! ID)*) CLOSE_BRACE!;
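As a rough picture of what ^ adds, the sketch below hand-builds the tree one would expect these rules to produce for the ForEach example and walks it recursively, which is conceptually what a tree parser does. It is plain Python, not ANTLR output: the Node class, the to_string_tree and walk helpers, and the exact tree shape are all assumptions made for illustration, so verify the real AST in ANTLRWorks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy AST node: a token text plus child nodes (not ANTLR's CommonTree)."""
    text: str
    children: list = field(default_factory=list)

    def to_string_tree(self):
        """LISP-style rendering: a leaf is its text, a subtree is (root child ...)."""
        if not self.children:
            return self.text
        kids = " ".join(child.to_string_tree() for child in self.children)
        return f"({self.text} {kids})"


def walk(node, depth=0):
    """Depth-first, pre-order traversal yielding (depth, text) pairs."""
    yield depth, node.text
    for child in node.children:
        yield from walk(child, depth + 1)


# Assumed shape for:
#   ForEach(in:[1,2,3,4,5] as:"nextNumber") { Print(message:{nextNumber}) }
# "ID^" makes each function name a root; "datatype^" makes the first
# array element the root of the remaining elements.
array = Node("1", [Node("2"), Node("3"), Node("4"), Node("5")])
print_call = Node("Print", [Node("message"), Node("nextNumber")])
tree = Node(
    "ForEach",
    [Node("in"), array, Node("as"), Node('"nextNumber"'), print_call],
)

rendered = tree.to_string_tree()
print(rendered)
for depth, text in walk(tree):
    print("  " * depth + text)
```

In this assumed shape, attribute names, attribute values, and child functions all end up as siblings under the function name, because the attribute and attributeList rules have no root of their own. That is worth checking against the real tree before writing a tree grammar for it.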

With that under your belt, now go look at some of the tutorials.
