Custom ANTLR grammar not working for every input


Question

I am trying to write a grammar for our custom rule engine, which uses ANTLR for parsing and Pentaho Kettle for executing the rules.

Valid inputs for the parser have the form:
(<Attribute_name> <Relational_Operator> <Value>) AND/OR (<Attribute_name> <Relational_Operator> <Value>)
for example, PERSON_PHONE = 123456789

Here is my grammar:

grammar RuleGrammar;
options{
language=Java;
}

prog                : condition;

condition
                                :  LHSOPERAND RELATIONOPERATOR RHSOPERAND
                                ;

LHSOPERAND
                                :  STRINGVALUE
                                ;

RHSOPERAND
                                :  NUMBERVALUE    |
                                   STRINGVALUE
                                ;


RELATIONOPERATOR
                                :   '>'    |
                                     '=>'  |
                                     '<'   |
                                     '<='  |
                                     '='   |
                                     '<>'
                                ;

fragment NUMBERVALUE
                              : '0'..'9'+
                              ;

fragment STRINGVALUE
                              :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_')*
                              ;


fragment LOGICALOPERATOR
                              :  'AND' |
                                 'OR'  |
                                 'NOT'
                              ;

The issue I am facing is comparing against a string value: PERSON_NAME=1 is accepted by the grammar, but PERSON_NAME=BATMAN is not. I am using ANTLRWorks, and when debugging PERSON_NAME=BATMAN I get a MismatchTokenException for the RHS value.

Can anyone please point out where I am going wrong?

Recommended answer

BATMAN is being tokenized as an LHSOPERAND token. You must realize that the lexer does not take into account what the parser "needs" at a particular time. The lexer simply tries to match as many characters as possible, and when two (or more) rules match the same number of characters (LHSOPERAND and RHSOPERAND in your case), the rule defined first "wins", which here is the LHSOPERAND rule. The parser therefore receives an LHSOPERAND token where the condition rule expects an RHSOPERAND, which is the mismatch you are seeing.
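The tie-breaking behaviour described above can be imitated in a few lines of plain Java. This is only a simplified model of "longest match, earliest rule wins on a tie", not ANTLR's actual lexer implementation; the class name `LexerSketch` and the regular expressions standing in for the question's three lexer rules are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexerSketch {
    // Rules in definition order, approximating the question's lexer rules.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("LHSOPERAND", Pattern.compile("[a-zA-Z_]+"));
        RULES.put("RHSOPERAND", Pattern.compile("[0-9]+|[a-zA-Z_]+"));
        // Two-character operators first, because regex alternation is ordered.
        RULES.put("RELATIONOPERATOR", Pattern.compile("=>|<=|<>|>|<|="));
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input).region(pos, input.length());
                // Strictly longer matches replace the current best, so on a
                // tie the rule that was defined earlier keeps the token.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestName = rule.getKey();
                    bestLen = m.end() - pos;
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("No rule matches at index " + pos);
            }
            tokens.add(bestName + "(" + input.substring(pos, pos + bestLen) + ")");
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("PERSON_NAME=1"));
        System.out.println(tokenize("PERSON_NAME=BATMAN"));
    }
}
```

In this model the digit `1` can only be an RHSOPERAND, while `BATMAN` fits both operand rules equally well and so ends up as an LHSOPERAND, the token the parser does not expect on the right-hand side.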

EDIT

Look at it like this: first the lexer receives the character stream and converts it into a stream of tokens. After all tokens have been created, the parser receives these tokens and tries to make sense of them. Tokens are not created during parsing (in parser rules), but before it.

A quick demo of how you could do it:

grammar RuleGrammar;

prog
 : condition EOF
 ;

condition
 : logical
 ;

logical
 : relational ((AND | OR) relational)*
 ;

relational
 : STRINGVALUE ((GT | GTEQ | LT | LTEQ | EQ | NEQ) term)?
 ;

term
 : STRINGVALUE
 | NUMBERVALUE
 | '(' condition ')'
 ;

GT          : '>';
GTEQ        : '>=';
LT          : '<';
LTEQ        : '<=';
EQ          : '=';
NEQ         : '<>';
NUMBERVALUE : '0'..'9'+;
AND         : 'AND';
OR          : 'OR';
STRINGVALUE : ('a'..'z' | 'A'..'Z' | '_')+;
SPACE       : ' ' {skip();};

(Note that EQ and NEQ aren't really relational operators; strictly speaking they are equality operators.)

Parsing input such as:

PERSON_NAME = BATMAN OR age <> 42

will now result in a successful parse (the parse tree image from the original answer is not reproduced here).
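To get a feel for the shape of that parse without running ANTLR, the demo grammar can be mirrored by a small hand-written recursive-descent parser. This is a sketch under assumptions, not the code ANTLR generates: the class name `RuleParserSketch`, the S-expression output format, and the left-nested grouping of AND/OR chains are all choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleParserSketch {
    // Two-character operators come first so ">=" is not read as ">" then "=".
    private static final Pattern TOKEN =
            Pattern.compile(" *(>=|<=|<>|>|<|=|\\(|\\)|[0-9]+|[a-zA-Z_]+)");
    private static final Set<String> COMPARISONS = Set.of(">", ">=", "<", "<=", "=", "<>");
    private static final Set<String> LOGICAL = Set.of("AND", "OR");

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private RuleParserSketch(String input) {
        // Phase 1: the whole input is tokenized before any parsing starts.
        String text = input.trim();
        Matcher m = TOKEN.matcher(text);
        for (int at = 0; at < text.length(); at = m.end()) {
            m.region(at, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + at);
            }
            tokens.add(m.group(1));
        }
    }

    // Phase 2: parse the token list and return the tree as an S-expression.
    static String parse(String input) {
        RuleParserSketch p = new RuleParserSketch(input);
        String tree = p.logical();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }
    private String next() { return tokens.get(pos++); }

    // AND and OR are keywords, so they never count as STRINGVALUE here.
    private static boolean isName(String t) {
        return t.matches("[a-zA-Z_]+") && !LOGICAL.contains(t);
    }

    // logical : relational ((AND | OR) relational)*
    private String logical() {
        String left = relational();
        while (LOGICAL.contains(peek())) {
            String op = next();
            left = "(" + op + " " + left + " " + relational() + ")";
        }
        return left;
    }

    // relational : STRINGVALUE ((GT | GTEQ | LT | LTEQ | EQ | NEQ) term)?
    private String relational() {
        if (!isName(peek())) {
            throw new IllegalArgumentException("Expected a name but found " + peek());
        }
        String name = next();
        if (!COMPARISONS.contains(peek())) return name;
        String op = next();
        return "(" + op + " " + name + " " + term() + ")";
    }

    // term : STRINGVALUE | NUMBERVALUE | '(' condition ')'
    private String term() {
        String t = peek();
        if (t.equals("(")) {
            next();
            String inner = logical();
            if (!peek().equals(")")) {
                throw new IllegalArgumentException("Expected ) but found " + peek());
            }
            next();
            return inner;
        }
        if (isName(t) || t.matches("[0-9]+")) return next();
        throw new IllegalArgumentException("Expected a value but found " + t);
    }

    public static void main(String[] args) {
        System.out.println(parse("PERSON_NAME = BATMAN OR age <> 42"));
    }
}
```

Because both string and number operands are decided by the parser rather than by competing lexer rules, BATMAN is accepted on the right-hand side here just as 42 is.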
