ANTLR4:只匹配所有输入选项一次 [英] ANTLR4: Matching all input alternatives exaclty once

查看:26
本文介绍了ANTLR4:只匹配所有输入选项一次的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何在 ANTLR 中制定规则以任何顺序仅匹配所有替代项一次?

How can I make a rule to match all of its alternatives only once in any order, in ANTLR?

rule: ('example\r\n' | 'example2\r\n') nextRule

我希望 'example' 和 'example2' 在进入下一条规则之前只匹配一次.

I would like 'example' and 'example2' to match only once before moving onto the next rule.

应该匹配输入:

example
example2

example2
example

但不是输入:

example
example
example2

推荐答案

如何在 ANTLR 中制定规则以任何顺序仅匹配所有替代项一次?

How can I make a rule to match all of its alternatives only once in any order, in ANTLR?

它的所有替代方案仅一次"只是规则:altA altB altC ....这是容易的部分.要求 rule 接受 altAaltBaltC 等在 any 安排意味着适应每一个安排.手动处理小数字很容易(rule: (altA altB | altB altA);).但是我知道没有自动快捷方式可以自动为您处理所有排列.

"All of its alternatives only once" is simply rule: altA altB altC .... That's the easy part. Asking rule to accept all alternatives altA, altB, altC, etc. in any arrangement means accommodating every arrangement. That's easy enough to handle manually for small numbers (rule: (altA altB | altB altA);). But there's no automatic shortcut that I'm aware of to handling all the permutations for you automatically.

这里有一些方法,以防没有内置方法并假设您不能放宽您的要求.警告:我不知道你的问题的全部范围;我不知道你的语法;我不知道你为什么想要你所要求的;我不知道您更喜欢哪种类型的解决方案,但您可能会喜欢它比任何这些选项都更容易.

Here are some approaches in case there isn't a built-in way and assuming you can't relax your requirements. Caveats: I don't know the full scope of your problem; I don't know your grammar; I don't know why you want what you're asking for; I don't know what class of solution you prefer, other than you'd probably like it easier than any of these options.

首先,您可以硬着头皮自己生成匹配的所有排列,无论是手动还是运行排列生成器.然后 ANTLR 就会以它理解的方式得到你想要的东西.它粗略但高效:它是直接的 ANTLR 语法,因此没有像下面的选项中那样涉及外部代码.

First, you could bite the bullet and produce all the permutations of matches yourself, either by hand or by running a permutation generator. Then ANTLR would have what you want in a manner that it understands. It's crude but efficient: it's straight ANTLR syntax so there's no external code involved as there is in the options below.

例如,假设您有一个规则 field 处理像 "public static final x" 这样的输入,其中包含所有三个修饰符,但没有特定的顺序.排列看起来像这样:

For example, assume you have a rule field that processes input like "public static final x", with all three modifiers expected but in no particular order. The permutations would look like so:

field : modifiers ID EOF;

modifiers
    : PUBLIC STATIC FINAL //ABC
    | PUBLIC FINAL STATIC //ACB
    | STATIC PUBLIC FINAL //BAC
    | STATIC FINAL PUBLIC //BCA
    | FINAL PUBLIC STATIC //CAB
    | FINAL STATIC PUBLIC //CBA
    ;

到此结束.没有外部代码,没有谓词,什么都没有.

That's the end of it. No external code, no predicates, no nothing.

其次,您可以在语法中使用语义谓词来确保提供所有匹配项并且匹配没有重复项.有多种编写谓词本身的方法,但归结为跟踪进行了哪些匹配(以防止重复),然后测试规则是否与它期望的所有部分匹配.这是一个基本示例,遵循与上一个相同的要求:

Second, you could use semantic predicates in the grammar to ensure that all matches are provided and matched without duplicates. There are various ways of writing the predicates themselves, but it boils down to tracking what matches have been made (to prevent duplicates) and then testing whether the rule has matched all the parts that it expects. Here is a basic example, following the same requirements as the previous one:

field 
    locals [java.util.HashSet<String> names = new java.util.HashSet<String>();]
    : modifiers ID EOF;

modifiers
    //Ensure that the full number of modifiers have been provided
    : {$field::names.size() < 3}? modifier modifiers
    | {$field::names.size() == 3}? //match nothing once we have (any) three modifiers
    ;

modifier
    //Ensure that no duplicates have been provided
    : {!$field::names.contains("public")}? PUBLIC {$field::names.add("public");}
    | {!$field::names.contains("static")}? STATIC {$field::names.add("static");}
    | {!$field::names.contains("final")}? FINAL {$field::names.add("final");}
    ;

此处规则 field 跟踪局部变量 names 中的修饰符名称.规则 modifiers 调用规则 modifier 直到 names 包含三个值.规则 modifier 匹配任何在 names 中没有对应键的名字.请注意,这些值是手动添加到 names 的.它们可以是任意值,只要 modifier 的替代项在它匹配的标记的两侧添加相同的值即可.

Here rule field tracks the modifier names in local variable names. Rule modifiers calls rule modifier until names contains three values. Rule modifier matches any name that doesn't have a corresponding key in names. Note that the values are manually added to names. They could be any arbitrary value as long as modifier's alternatives add the same value to both sides of the token it's matching.

我的实现有点粗糙,因为修饰符最终嵌套在生成的解析树中(因为 modifiers 包含一个 modifier 和一个 modifiers包含一个 modifier 和一个 modifiers ......),但我希望你能明白.

My implementation is a little crude because the modifiers end up nesting in the produced parse tree (since modifiers contains one modifier and one modifiers that contains one modifier and one modifiers that...), but I hope you get the idea.

第三,您可以不理会糟糕的解析器并测试调用代码的完整性.这可以在使用解析器侦听器解析期间完成,也可以在使用解析器生成的 ParserRuleContext 对象解析之后完成.这将问题分为两部分:让解析器解决任意顺序的任何 X、Y、Z",让调用代码解决所有且仅 X、Y、Z".

Third, you could leave the poor parser alone and test for completeness in the calling code. This can be done during parsing using a parser listener or done after parsing using the ParserRuleContext object produced by the parser. This breaks the problem into two parts: let the parser solve "any X, Y, Z in any order" and let the calling code solve "all and only just X, Y, Z."

以下是使用侦听器方法的示例:

Here is an example using the listener approach:

//partial grammar

field : modifier* ID EOF; //accept any modifiers in any order

modifier  
    : PUBLIC
    | STATIC
    | FINAL
    ;

 

//snippet of calling code
//initialize lexer and parser

parser.addParseListener(new MyGrammarBaseListener() {
    @Override
    public void exitField(FieldContext ctx) {
        // The field rule has finished. Let's verify that no modifiers
        // were duplicated.

        //TODO check for duplicates, ensure all modifiers are specified.
        //TODO log errors accordingly.

    }
});

//Call the parser.
parser.field();

语法保持干净.修饰符可以在 ID 之前任意出现在输入中,以任意数量和任意顺序出现.调用代码通过它选择的任何方式执行测试,记录它想要的任何错误.

The grammar is kept clean. Modifiers can appear in the input arbitrarily before ID, in any number and in any order. The calling code performs the tests by whatever means it chooses, logging whatever errors it wants.

这是一个综合了我提到的三个选项的超级示例,以便更清楚地了解我在说什么.

Here is an uberexample that pulls together the three options I mentioned, to give a clearer idea of what I'm talking about.

grammar Modifiers;

//Hard-coded version : all the permutations are specified //
permutationField : permutationModifiers ID EOF;

permutationModifiers
    : PUBLIC STATIC FINAL //ABC
    | PUBLIC FINAL STATIC //ACB
    | STATIC PUBLIC FINAL //BAC
    | STATIC FINAL PUBLIC //BCA
    | FINAL PUBLIC STATIC //CAB
    | FINAL STATIC PUBLIC //CBA
    ;

// Predicate version : use semantic predicates to prevent duplicates and ensure all the modifiers are provided //

predicateField 
    locals [java.util.HashSet<String> names = new java.util.HashSet<String>();]
    : predicateModifiers ID EOF;

predicateModifiers
    //Ensure that the full number of modifiers have been provided
    : {$predicateField::names.size() < 3}? predicateModifier predicateModifiers
    | {$predicateField::names.size() == 3}? //match nothing once we have (any) three modifiers
    ;

predicateModifier
    //Ensure that no duplicates have been provided
    : {!$predicateField::names.contains("public")}? PUBLIC {$predicateField::names.add("public");}
    | {!$predicateField::names.contains("static")}? STATIC {$predicateField::names.add("static");}
    | {!$predicateField::names.contains("final")}? FINAL {$predicateField::names.add("final");}
    ;

//Listener version : test everything when the parser calls the listener //

listenerField : listenerModifier* ID EOF;

listenerModifier  
    : PUBLIC
    | STATIC
    | FINAL
    ;


PUBLIC : 'public';
STATIC : 'static';
FINAL  : 'final';
FOO    : 'foo';
ID     : [a-zA-Z]+;
WS     : [ \r\n\t]+ -> skip; 

ModifiersTest.java

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.misc.Nullable;

public class ModifiersTest {

    public static void main(String[] args) {

        run("public static final x", "ok");
        run("final static public x", "ok");
        run("static public static final x", "too many modifiers");
        run("static x", "missing modifiers");
        run("final final x", "missing & duplicated modifiers");
    }

    private static void run(String code, String title) {
        System.out.printf("%n---%n**Input : %s**%n%n\t%s%n%n", title, code);

        System.out.println("**Permutation Output**\n");
        runPermutationTest(code);
        System.out.println();

        System.out.println("**Predicate Output**\n");
        runPredicateTest(code);
        System.out.println();

        System.out.println("**Listener Output**\n");
        runListenerTest(code);
        System.out.println();

    }
    private static void runPermutationTest(String code) {
        ModifiersParser parser = createParser(code);

        parser.permutationField();
        System.out.println("\t(done)");
    }

    private static void runPredicateTest(String code) {
        ModifiersParser parser = createParser(code);

        parser.predicateField();
        System.out.println("\t(done)");
    }

    private static void runListenerTest(String code) {
        ModifiersParser parser = createParser(code);

        parser.addParseListener(new ModifiersBaseListener() {
            @Override
            public void exitListenerField(ModifiersParser.ListenerFieldContext ctx) {
                // The field rule has finished. Let's verify that no modifiers
                // were duplicated.

                HashSet<String> uniqueNames = new HashSet<String>();
                ArrayList<String> allNames = new ArrayList<String>();
                HashSet<String> expectedNames = new HashSet<String>();
                expectedNames.add("public");
                expectedNames.add("static");
                expectedNames.add("final");

                if (ctx.listenerModifier() != null && !ctx.listenerModifier().isEmpty()) {
                    List<ModifiersParser.ListenerModifierContext> modifiers = ctx.listenerModifier();

                    // Collect all the modifier names in a set.
                    for (ModifiersParser.ListenerModifierContext modifier : modifiers) {
                        uniqueNames.add(modifier.getText());
                        allNames.add(modifier.getText());
                    }
                }

                // Is the number of unique modifiers less than the number of
                // all given modifiers? If so, then there must be duplicates.
                if (uniqueNames.size() < allNames.size()) {
                    ArrayList<String> names = new ArrayList<String>(allNames);
                    for (String name : uniqueNames){
                        names.remove(name);
                    }
                    System.out.println("\tDetected duplicate modifiers : " + names);
                } else {
                    System.out.println("\t(No duplicate modifiers detected)");
                }

                //Are we missing any expected modifiers?
                if (!uniqueNames.containsAll(expectedNames)) {
                    ArrayList<String> names = new ArrayList<String>(expectedNames);
                    names.removeAll(uniqueNames);
                    System.out.println("\tDetected missing modifiers : " + names);
                } else {
                    System.out.println("\t(No missing modifiers detected)");
                }
            }
        });

        parser.listenerField();

        System.out.println("\t(done)");

    }

    private static ModifiersParser createParser(String code) {
        ANTLRInputStream input = new ANTLRInputStream(code);

        ModifiersLexer lexer = new ModifiersLexer(input);

        ModifiersParser parser = new ModifiersParser(new CommonTokenStream(lexer));

        BaseErrorListener errorListener = createErrorListener();

        lexer.addErrorListener(errorListener);
        parser.addErrorListener(errorListener);
        return parser;
    }

    private static BaseErrorListener createErrorListener() {
        BaseErrorListener errorListener = new BaseErrorListener() {

            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, @Nullable Object offendingSymbol, int line,
                    int charPositionInLine, String msg, @Nullable RecognitionException e) {
                //Print the syntax error 
                System.out.printf("\t%s at (%d, %d)%n", msg, line, charPositionInLine);
            }
        };
        return errorListener;
    }
}

测试场景(上面代码的输出)

<小时>

输入:好的

public static final x

置换输出

(done)

谓词输出

(done)

监听器输出

(No duplicate modifiers detected)
(No missing modifiers detected)
(done)

<小时>

输入:好的

final static public x

置换输出

(done)

谓词输出

(done)

监听器输出

(No duplicate modifiers detected)
(No missing modifiers detected)
(done)

<小时>

输入:修饰符过多

static public static final x

置换输出

extraneous input 'static' expecting 'final' at (1, 14)
(done)

谓词输出

no viable alternative at input 'static' at (1, 14)
(done)

监听器输出

Detected duplicate modifiers : [static]
(No missing modifiers detected)
(done)

<小时>

输入:缺少修饰符

static x

置换输出

no viable alternative at input 'staticx' at (1, 7)
(done)

谓词输出

no viable alternative at input 'x' at (1, 7)
(done)

监听器输出

(No duplicate modifiers detected)
Detected missing modifiers : [final, public]
(done)

<小时>

输入:缺少 &重复修饰符

final final x

置换输出

no viable alternative at input 'finalfinal' at (1, 6)
(done)

谓词输出

no viable alternative at input 'final' at (1, 6)
(done)

监听器输出

Detected duplicate modifiers : [final]
Detected missing modifiers : [static, public]
(done)

这篇关于ANTLR4:只匹配所有输入选项一次的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆