Dynamically create lexer rule

Question

Here is a simple rule:

NAME : 'name1' | 'name2' | 'name3';

Is it possible to supply the alternatives for such a rule dynamically, using an array that contains the strings?

Recommended answer

Yes, as long as the dynamic tokens also match the identifier rule (Id in the grammars below).

In that case, let Id match completely, then check whether the matched text is in a predefined collection. If it is in the collection (a Set in this example), change the type of the token.
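Stripped of the grammar, the check described above is a set lookup on the text that was just matched. The following is only a minimal sketch in plain Java, without ANTLR; the class `RetagSketch`, the enum `Type`, and the method `classify` are hypothetical names used for illustration and are not part of the answer or of any ANTLR API.

```java
import java.util.HashSet;
import java.util.Set;

public class RetagSketch {
    // Hypothetical token types mirroring the Id/Special split in the grammar.
    enum Type { ID, SPECIAL }

    // Called once the whole identifier has been matched: the set lookup
    // decides whether the token keeps type ID or is retagged as SPECIAL.
    static Type classify(String matchedText, Set<String> special) {
        return special.contains(matchedText) ? Type.SPECIAL : Type.ID;
    }

    public static void main(String[] args) {
        Set<String> special = new HashSet<>();
        special.add("Mu");
        special.add("bar");

        // Splitting on spaces stands in for the lexer's identifier matching.
        for (String word : "foo bar baz Mu".split(" ")) {
            System.out.println(classify(word, special) + " '" + word + "'");
        }
    }
}
```

Because the lookup happens only after the full identifier is matched, a name such as `barn` is not mistaken for `bar`; it is simply not in the set and stays an ordinary identifier.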

A small demo (ANTLR 3):

grammar T;

@lexer::members {
  private java.util.Set<String> special;

  public TLexer(ANTLRStringStream input, java.util.Set<String> special) {
    super(input);
    this.special = special;
  }

}

parse
 : (t=. {System.out.printf("\%-10s'\%s'\n", tokenNames[$t.type], $t.text);})* EOF
 ;

Id
 : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
   {if(special.contains($text)) $type=Special;}
 ;

Int
 : '0'..'9'+
 ;

Space
 : (' ' | '\t' | '\r' | '\n') {skip();}
 ;

fragment Special : ;

If you now run the following demo:

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source = "foo bar baz Mu";
    java.util.Set<String> set = new java.util.HashSet<String>();
    set.add("Mu");
    set.add("bar");
    TLexer lexer = new TLexer(new ANTLRStringStream(source), set);
    TParser parser = new TParser(new CommonTokenStream(lexer));
    parser.parse();
  }
}

you will see the following printed:

Id        'foo'
Special   'bar'
Id        'baz'
Special   'Mu'

ANTLR4

For ANTLR4, you can do something like this:

grammar T;

@lexer::members {
  private java.util.Set<String> special = new java.util.HashSet<>();

  public TLexer(CharStream input, java.util.Set<String> special) {
    this(input);
    this.special = special;
  }
}

tokens {
  Special
}

parse
 : .*? EOF
 ;

Id
 : [a-zA-Z_] [a-zA-Z_0-9]* {if(special.contains(getText())) setType(TParser.Special);}
 ;

Int
 : [0-9]+
 ;

Space
 : [ \t\r\n] -> skip
 ;

Test it with the class:

import org.antlr.v4.runtime.*;
import java.util.HashSet;
import java.util.Set;

public class Main {

  public static void main(String[] args) {

    String source = "foo bar baz Mu";
    Set<String> set = new HashSet<String>(){{
      add("Mu");
      add("bar");
    }};

    TLexer lexer = new TLexer(CharStreams.fromString(source), set);
    CommonTokenStream tokenStream = new CommonTokenStream(lexer);
    tokenStream.fill();

    for (Token t : tokenStream.getTokens()) {
      System.out.printf("%-10s '%s'\n", TParser.VOCABULARY.getSymbolicName(t.getType()), t.getText());
    }
  }
}

which will print:

Id         'foo'
Special    'bar'
Id         'baz'
Special    'Mu'
EOF        '<EOF>'
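The question asks about an array of strings, while both lexers above take a Set. One way to bridge the two is to copy the array into a set before constructing the lexer. This is only a sketch: `SpecialNames` and `fromArray` are hypothetical names, not part of ANTLR or of the answer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class SpecialNames {
    // Hypothetical helper: copies the array into a read-only set so the
    // lexer's contains() check becomes a hash lookup. Duplicates collapse.
    static Set<String> fromArray(String... names) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(names)));
    }

    public static void main(String[] args) {
        String[] names = {"name1", "name2", "name3"};
        Set<String> special = fromArray(names);

        // With the generated ANTLR4 classes on the classpath, this set could be
        // passed as the second constructor argument of TLexer shown above.
        System.out.println(special.contains("name2")); // true
        System.out.println(special.contains("other")); // false
    }
}
```

Since the set is copied when the lexer is built, later changes to the array would not affect tokenization; to pick up new names, build a fresh set and a fresh lexer.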
