How to implement JJTree on grammar

Question

I have an assignment to use JavaCC to make a top-down parser with semantic analysis for a language supplied by the lecturer. I have the production rules written out and no errors. I'm completely stuck on how to use JJTree for my code, and my hours of scouring the internet for tutorials haven't gotten me anywhere. Just wondering, could anyone take some time out to explain how to implement JJTree in the code? Or if there's a hidden step-by-step tutorial out there somewhere, that would be a great help!

Here are some of my production rules in case they help. Thanks in advance!

void program() : {}
{
  (decl())* (function())* main_prog()
}

void decl() #void : {}
{
  (
    var_decl() | const_decl()
   )
}

void var_decl() #void : {}
{
  <VAR> ident_list() <COLON> type()
 (<COMMA> ident_list() <COLON> type())* <SEMIC>
}

void const_decl()  #void : {}
{
  <CONSTANT> identifier() <COLON> type() <EQUAL> expression()
 ( <COMMA> identifier() <COLON> type() <EQUAL > expression())* <SEMIC>
} 

void function() #void : {}
{
  type() identifier() <LBR> param_list() <RBR>
  <CBL>
  (decl())*
  (statement() <SEMIC> )*
  returnRule() (expression() | {} )<SEMIC>
  <CBR>
}

Recommended answer

Creating an AST using JavaCC looks a lot like creating a "normal" parser (defined in a jj file). If you already have a working grammar, it's (relatively) easy :)

Here are the steps needed to create an AST:

  1. rename your jj grammar file to jjt
  2. decorate it with root-labels (the italic words are my own terminology...)
  3. invoke jjtree on your jjt grammar, which will generate a jj file for you
  4. invoke javacc on your generated jj grammar
  5. compile the generated java source files
  6. test it

Here's a quick step-by-step tutorial, assuming you're using macOS or *nix, have the javacc.jar file in the same directory as your grammar file(s), and have java and javac on your system's PATH:

1

Assuming your jj grammar file is called TestParser.jj, rename it:

mv TestParser.jj TestParser.jjt

2

Now the tricky part: decorating your grammar so that the proper AST structure is created. You decorate a production rule by adding a # followed by an identifier after it (and before the :); that identifier becomes the type of the node the rule creates. In your original question, you have #void on many productions. In JJTree, #void means that no node is created for that production at all, so those rules add nothing of their own to the tree: this is not what you want.

If you don't decorate your production, the name of the production is used as the type of the node (so, you can remove the #void):

void decl() :
{}
{
     var_decl()
  |  const_decl()
}

Now the rule creates a node of type decl whose only child is whatever node the rule var_decl() or const_decl() produced.
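To make the difference between an undecorated rule and a #void rule concrete, here is a minimal Java sketch that imitates the idea of a node stack: a rule that creates a node opens a scope, and when the scope closes, every node pushed since then becomes its child. The class NodeScopeSketch and its Node type are hypothetical stand-ins written for illustration; they are not the classes JJTree generates, and the real generated code may differ in detail.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class NodeScopeSketch {
    // Stand-in tree node; NOT the SimpleNode class that JJTree generates.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        // Renders the subtree, one node per line, indented one space per level.
        String dump(String prefix) {
            StringBuilder sb = new StringBuilder(prefix).append(name).append('\n');
            for (Node child : children) {
                sb.append(child.dump(prefix + " "));
            }
            return sb.toString();
        }
    }

    private final Deque<Node> stack = new ArrayDeque<>();    // finished nodes, newest on top
    private final Deque<Integer> marks = new ArrayDeque<>(); // stack size when each scope opened

    // Called when a node-creating rule starts.
    void open() { marks.push(stack.size()); }

    // Called when that rule ends: adopt everything pushed since open() as children.
    void close(String name) {
        Node node = new Node(name);
        int mark = marks.pop();
        while (stack.size() > mark) {
            node.children.add(0, stack.pop()); // prepend to keep source order
        }
        stack.push(node);
    }

    void leaf(String name) { open(); close(name); }

    // Simulates parsing one variable declaration inside program().
    // declIsVoid = true models "void decl() #void", false models an undecorated decl().
    static Node parse(boolean declIsVoid) {
        NodeScopeSketch s = new NodeScopeSketch();
        s.open();                      // program() #PROGRAM
        if (!declIsVoid) s.open();     // undecorated decl() creates its own node
        s.open();                      // var_decl() #VAR
        s.leaf("ID");
        s.leaf("ID");
        s.leaf("EXPR");
        s.close("VAR");
        if (!declIsVoid) s.close("decl");
        s.close("PROGRAM");
        return s.stack.pop();
    }

    public static void main(String[] args) {
        System.out.print(parse(false).dump("")); // PROGRAM > decl > VAR > ...
        System.out.print(parse(true).dump(""));  // PROGRAM > VAR > ...
    }
}
```

With the undecorated decl(), VAR sits under an extra decl node, which matches the dump output shown later in this answer. With #void on decl() only, VAR would hang directly under PROGRAM, which can be handy for pure "choice" rules like this one.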

Let's now have a look at the (simplified) var_decl rule:

void var_decl() #VAR :
{}
{
  <VAR> id() <COL> id() <EQ> expr() <SCOL>
}

void id() #ID :
{}
{
  <ID>
}

void expr() #EXPR :
{}
{
  <ID>
}

which I decorated with the #VAR type. This now means that this rule will return the following tree structure:

    VAR
   / | \
  /  |  \
ID  ID  EXPR

As you can see, the terminals are discarded from the AST! This also means that the id and expr rules lose the text their <ID> terminal matched. Of course, this is not what you want. For the rules that need to keep the inner text the terminal matched, you need to explicitly set the .value of the tree to the .image of the matched terminal:

void id() #ID :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

void expr() #EXPR :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

causing the input "var x : int = i;" to look like this:

       VAR
        |
    .---+------.
   /    |       \
  /     |        \
ID["x"] ID["int"] EXPR["i"]

This is how you create a proper structure for your AST. Below follows a small grammar that is a very simple version of your own grammar including a small main method to test it all:

// TestParser.jjt
PARSER_BEGIN(TestParser)

public class TestParser {
  public static void main(String[] args) throws ParseException {
    TestParser parser = new TestParser(new java.io.StringReader(args[0]));
    SimpleNode root = parser.program();
    root.dump("");
  }
}

PARSER_END(TestParser)

TOKEN :
{
   < OPAR  : "(" > 
 | < CPAR  : ")" >
 | < OBR   : "{" >
 | < CBR   : "}" >
 | < COL   : ":" >
 | < SCOL  : ";" >
 | < COMMA : "," >
 | < VAR   : "var" >
 | < EQ    : "=" > 
 | < CONST : "const" >
 | < ID    : ("_" | <LETTER>) ("_" | <ALPHANUM>)* >
}

TOKEN :
{
   < #DIGIT    : ["0"-"9"] >
 | < #LETTER   : ["a"-"z","A"-"Z"] >
 | < #ALPHANUM : <LETTER> | <DIGIT> >
}

SKIP : { " " | "\t" | "\r" | "\n" }

SimpleNode program() #PROGRAM :
{}
{
  (decl())* (function())* <EOF> {return jjtThis;}
}

void decl() :
{}
{
     var_decl()
  |  const_decl()
}

void var_decl() #VAR :
{}
{
  <VAR> id() <COL> id() <EQ> expr() <SCOL>
}

void const_decl() #CONST :
{}
{
  <CONST> id() <COL> id() <EQ> expr() <SCOL>
}


void function() #FUNCTION :
{}
{
  type() id() <OPAR> params() <CPAR> <OBR> /* ... */ <CBR>
}

void type() #TYPE :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

void id() #ID :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

void params() #PARAMS :
{}
{
  (param() (<COMMA> param())*)?
}

void param() #PARAM :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

void expr() #EXPR :
{Token t;}
{
  t=<ID> {jjtThis.value = t.image;}
}

3

Let the jjtree class (included in javacc.jar) create a jj file for you:

java -cp javacc.jar jjtree TestParser.jjt

4

The previous step has created the file TestParser.jj (if everything went okay). Let javacc (also present in javacc.jar) process it:

java -cp javacc.jar javacc TestParser.jj

5

To compile all source files, do:

javac -cp .:javacc.jar *.java

(On Windows, do: javac -cp .;javacc.jar *.java)

6

The moment of truth has arrived: let's see if everything actually works! To let the parser process the input:

var n : int = I; 

const x : bool = B; 

double f(a,b,c) 
{ 
}

do the following:

java -cp . TestParser "var n : int = I; const x : bool = B; double f(a,b,c) { }"

and you should see the following being printed on your console:

PROGRAM
 decl
  VAR
   ID
   ID
   EXPR
 decl
  CONST
   ID
   ID
   EXPR
 FUNCTION
  TYPE
  ID
  PARAMS
   PARAM
   PARAM
   PARAM

Note that you don't see the text the ID's matched, but believe me, they're there. The method dump() simply does not show it.
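If you want to see the stored text, one option is to walk the tree yourself and print each node's value next to its name. The sketch below shows the idea on a hypothetical stand-in class, MiniNode, that only borrows the accessor names jjtGetNumChildren(), jjtGetChild() and jjtGetValue(). Whether your generated SimpleNode offers jjtGetValue() depends on your JavaCC version, so treat that as an assumption and check SimpleNode.java (otherwise, read the value field directly).

```java
import java.util.ArrayList;
import java.util.List;

public class DumpWithValues {
    // Hypothetical stand-in that mimics the accessor names of a JJTree node.
    // It is NOT the generated SimpleNode; swap it for SimpleNode in a real parser.
    static class MiniNode {
        private final String name;
        private final Object value;
        private final List<MiniNode> children = new ArrayList<>();

        MiniNode(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        MiniNode add(MiniNode child) {
            children.add(child);
            return this;
        }

        int jjtGetNumChildren() { return children.size(); }
        MiniNode jjtGetChild(int i) { return children.get(i); }
        Object jjtGetValue() { return value; }

        @Override
        public String toString() { return name; }
    }

    // Like dump(), but appends ["value"] to every node that carries a value.
    static String dump(MiniNode node, String prefix) {
        StringBuilder sb = new StringBuilder(prefix).append(node);
        if (node.jjtGetValue() != null) {
            sb.append("[\"").append(node.jjtGetValue()).append("\"]");
        }
        sb.append('\n');
        for (int i = 0; i < node.jjtGetNumChildren(); i++) {
            sb.append(dump(node.jjtGetChild(i), prefix + " "));
        }
        return sb.toString();
    }

    // Hand-built tree for the input "var x : int = i;".
    static MiniNode sample() {
        return new MiniNode("VAR", null)
                .add(new MiniNode("ID", "x"))
                .add(new MiniNode("ID", "int"))
                .add(new MiniNode("EXPR", "i"));
    }

    public static void main(String[] args) {
        System.out.print(dump(sample(), ""));
    }
}
```

Running main prints VAR on the first line, followed by ID["x"], ID["int"] and EXPR["i"], each indented by one space.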

HTH

For a working grammar including expressions, you could have a look at the following expression evaluator of mine: https://github.com/bkiers/Curta (the grammar is in src/grammar). You might want to have a look at how to create root-nodes in case of binary expressions.
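As a rough illustration of the root-node idea for binary expressions: JJTree lets you write a node descriptor with a fixed child count, such as #Add(2), which takes the two most recently built nodes and makes them children of a new Add node. The Java sketch below imitates that with a plain stack for a toy input of operands separated by +. The class BinaryRootSketch, its Node type and the Add label are hypothetical names for illustration only, and this is not necessarily how the linked grammar is written.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BinaryRootSketch {
    // Stand-in node: either a leaf (operand) or a binary root with two children.
    static final class Node {
        final String label;
        final Node left;
        final Node right;

        Node(String label) { this(label, null, null); }

        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        // Prefix rendering, e.g. (Add a b).
        String show() {
            if (left == null) {
                return label;
            }
            return "(" + label + " " + left.show() + " " + right.show() + ")";
        }
    }

    // Builds a left-associative tree for input such as "a + b + c".
    static Node parse(String input) {
        Deque<Node> stack = new ArrayDeque<>();
        String[] operands = input.split("\\+");
        stack.push(new Node(operands[0].trim()));
        for (int i = 1; i < operands.length; i++) {
            stack.push(new Node(operands[i].trim()));
            // Same spirit as #Add(2): pop the two newest nodes, push their new root.
            Node right = stack.pop();
            Node left = stack.pop();
            stack.push(new Node("Add", left, right));
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(parse("a + b + c").show()); // (Add (Add a b) c)
    }
}
```

Because the root is created after each right-hand operand, earlier additions end up deeper in the tree, which gives the usual left-to-right evaluation order.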
