What is a 'semantic predicate' in ANTLR?

Views: 94 Published: 2020/9/2 22:19:32 Tags: antlr antlr3 antlr4

Question

What is a semantic predicate in ANTLR?

Recommended answer

ANTLR 4

For predicates in ANTLR 4, check out these Stack Overflow Q&As:

Syntax of semantic predicates in Antlr4
Semantic predicates in ANTLR4?

ANTLR 3

A semantic predicate is a way to enforce extra (semantic) rules upon grammar actions using plain code.

There are 3 types of semantic predicates:

validating semantic predicates;
gated semantic predicates;
disambiguating semantic predicates.

Let's say you have a block of text consisting only of numbers separated by commas, ignoring any whitespace. You would like to parse this input, making sure that the numbers are at most 3 digits long (at most 999). The following grammar (Numbers.g) does that:

grammar Numbers;

// entry point of this parser: it parses an input string consisting of at least
// one number, followed by zero or more commas and numbers
parse
  :  number (',' number)* EOF
  ;

// matches a number that is between 1 and 3 digits long
number
  :  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

// matches a single digit
Digit
  :  '0'..'9'
  ;

// ignore spaces
WhiteSpace
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

Testing

The grammar can be tested with the following class:

import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("123, 456, 7   , 89");
        NumbersLexer lexer = new NumbersLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        NumbersParser parser = new NumbersParser(tokens);
        parser.parse();
    }
}

Test it by generating the lexer and parser, compiling all .java files and running the Main class:


java -cp antlr-3.2.jar org.antlr.Tool Numbers.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar Main

When doing so, nothing is printed to the console, which indicates that nothing went wrong. Try changing:

ANTLRStringStream in = new ANTLRStringStream("123, 456, 7   , 89");

into:

ANTLRStringStream in = new ANTLRStringStream("123, 456, 7777   , 89");

and do the test again: you will see an error appearing on the console right after the string 777.

This brings us to the semantic predicates. Let's say you want to parse numbers between 1 and 10 digits long. A rule like:

number
  :  Digit Digit Digit Digit Digit Digit Digit Digit Digit Digit
  |  Digit Digit Digit Digit Digit Digit Digit Digit Digit
     /* ... */
  |  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

would become cumbersome. Semantic predicates can help simplify this type of rule.

1. Validating Semantic Predicates

A validating semantic predicate is nothing more than a block of code followed by a question mark:

RULE { /* a boolean expression in here */ }?

To solve the problem above using a validating semantic predicate, change the number rule in the grammar into:

number
@init { int N = 0; }
  :  (Digit { N++; } )+ { N <= 10 }?
  ;

The parts { int N = 0; } and { N++; } are plain Java statements; the first is executed when the parser "enters" the number rule. The actual predicate is { N <= 10 }?, which causes the parser to throw a FailedPredicateException whenever a number is more than 10 digits long.

Test it by using the following ANTLRStringStream:

// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890");

which produces no exception, while the following does throw an exception:

// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");
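
To make the order of events concrete, here is a minimal plain-Java sketch of what the validating predicate does. It is not the code ANTLR generates: the class name, the matchNumber helper and the PredicateFailed exception are hypothetical stand-ins, and whitespace skipping is left out. The point it illustrates is that all digits are consumed first and the check runs only afterwards.

```java
// Plain-Java sketch of the validating predicate's behavior; not ANTLR-generated code.
final class ValidatingPredicateSketch {

    /** Hypothetical stand-in for ANTLR's FailedPredicateException. */
    static final class PredicateFailed extends RuntimeException {
        PredicateFailed(String message) {
            super(message);
        }
    }

    /** Mirrors the lexer rule Digit : '0'..'9' ; */
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /**
     * Mirrors: number @init { int N = 0; } : (Digit { N++; })+ { N <= 10 }? ;
     * Consumes every leading digit, then evaluates the predicate.
     * Returns the index just past the matched number.
     */
    static int matchNumber(String input, int start) {
        int n = 0;                       // @init { int N = 0; }
        int pos = start;
        while (pos < input.length() && isDigit(input.charAt(pos))) {
            pos++;                       // Digit
            n++;                         // { N++; }
        }
        if (n == 0) {
            throw new IllegalArgumentException("expected a digit at index " + start);
        }
        if (!(n <= 10)) {                // { N <= 10 }? is checked after the loop
            throw new PredicateFailed("number has " + n + " digits");
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(matchNumber("1234567890", 0));
        try {
            matchNumber("12345678901", 0);
        } catch (PredicateFailed e) {
            System.out.println("predicate failed: " + e.getMessage());
        }
    }
}
```

Because the check sits after the loop, the 11th digit has already been consumed by the time the predicate fails, which is why the failure is reported as a failed predicate rather than as unexpected input.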

2. Gated Semantic Predicates

A gated semantic predicate is similar to a validating semantic predicate, only the gated version produces a syntax error instead of a FailedPredicateException.

The syntax of a gated semantic predicate is:

{ /* a boolean expression in here */ }?=> RULE

To solve the same problem (matching numbers up to 10 digits long) with a gated predicate instead, you would write:

number
@init { int N = 1; }
  :  ( { N <= 10 }?=> Digit { N++; } )+
  ;

Test it again with both:

// all equal or less than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,1234567890");

and:

// '12345678901' is more than 10 digits
ANTLRStringStream in = new ANTLRStringStream("1,23,12345678901");

and you will see that the last one produces an error.
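
The difference from the validating version can be sketched in plain Java as well. Again this is only an illustration under assumptions, not ANTLR's generated code: GatedPredicateSketch, matchNumber and parse are hypothetical names, and the input is assumed to contain no whitespace. Here the gate is tested before each digit, so the loop simply stops after 10 digits and the leftover digit no longer fits the parse rule.

```java
// Plain-Java sketch of the gated predicate's behavior; not ANTLR-generated code.
final class GatedPredicateSketch {

    /** Mirrors the lexer rule Digit : '0'..'9' ; */
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /**
     * Mirrors: number @init { int N = 1; } : ( { N <= 10 }?=> Digit { N++; } )+ ;
     * Returns the index just past the matched digits, or start if none matched.
     */
    static int matchNumber(String input, int start) {
        int n = 1;                       // @init { int N = 1; }
        int pos = start;
        // The gate { N <= 10 }?=> is evaluated before every Digit.
        while (n <= 10 && pos < input.length() && isDigit(input.charAt(pos))) {
            pos++;                       // Digit
            n++;                         // { N++; }
        }
        return pos;
    }

    /** Mirrors: parse : number (',' number)* EOF ; returns true if the whole input matches. */
    static boolean parse(String input) {
        int pos = matchNumber(input, 0);
        if (pos == 0) {
            return false;                // no number at the start
        }
        while (pos < input.length() && input.charAt(pos) == ',') {
            int next = matchNumber(input, pos + 1);
            if (next == pos + 1) {
                return false;            // comma not followed by a number
            }
            pos = next;
        }
        return pos == input.length();    // EOF expected here
    }

    public static void main(String[] args) {
        System.out.println(parse("1,23,1234567890"));
        System.out.println(parse("1,23,12345678901"));
    }
}
```

In this sketch the second input is rejected because an 11th digit remains where a comma or the end of input is expected, which corresponds to a syntax error rather than a failed predicate.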

3. Disambiguating Semantic Predicates

The final type of predicate is a disambiguating semantic predicate, which looks a bit like a validating predicate ({boolean-expression}?), but acts more like a gated semantic predicate (no exception is thrown when the boolean expression evaluates to false). You can use it at the start of a rule to check some property of a rule and let the parser match said rule or not.

Let's say the example grammar creates Number tokens (a lexer rule instead of a parser rule) that will match numbers in the range of 0..999. Now in the parser, you'd like to make a distinction between low and high numbers (low: 0..500, high: 501..999). This could be done using a disambiguating semantic predicate where you inspect the next token in the stream (input.LT(1)) to check whether it is low or high.

A demo:

grammar Numbers;

parse
  :  atom (',' atom)* EOF
  ;

atom
  :  low  {System.out.println("low  = " + $low.text);}
  |  high {System.out.println("high = " + $high.text);}
  ;

low
  :  {Integer.valueOf(input.LT(1).getText()) <= 500}? Number
  ;

high
  :  Number
  ;

Number
  :  Digit Digit Digit
  |  Digit Digit
  |  Digit
  ;

fragment Digit
  :  '0'..'9'
  ;

WhiteSpace
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

If you now parse the string "123, 999, 456, 700, 89, 0", you'd see the following output:

low  = 123
high = 999
low  = 456
high = 700
low  = 89
low  = 0
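
The decision the predicate makes for each token can be reproduced in a few lines of plain Java. This is a sketch for illustration only: LowHighSketch, isLow and classify are hypothetical names, and splitting on commas stands in for the lexer. Only the comparison against 500 is taken from the grammar above.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the low/high decision; not ANTLR-generated code.
final class LowHighSketch {

    /** Mirrors the predicate in rule low: Integer.valueOf(input.LT(1).getText()) <= 500 */
    static boolean isLow(String tokenText) {
        return Integer.parseInt(tokenText) <= 500;
    }

    /** Classifies each comma-separated number the way the atom rule would. */
    static List<String> classify(String input) {
        List<String> lines = new ArrayList<>();
        for (String part : input.split(",")) {
            String text = part.trim();   // the grammar skips whitespace
            // atom : low | high ; the predicate decides which alternative applies
            lines.add(isLow(text) ? "low  = " + text : "high = " + text);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : classify("123, 999, 456, 700, 89, 0")) {
            System.out.println(line);
        }
    }
}
```

When the predicate is false, no error is raised; the sketch just falls through to the high branch, which is the behavior the atom rule relies on.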
