Using ANTLR to parse a log file


Question

I'm just getting started with ANTLR and am trying to parse some patterns out of a log file.

For example, the log file:

7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=["red","yellow"]){}

7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=["Rocket"]){}

7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=["blue","yellow"]){}

7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=["Speech"]){}

Now I have to parse this file to find only 'Evaluation.Input.Function1' and its values 'red' and 'yellow', and 'Evaluation.Output.Function2' and its value 'Rocket', ignoring everything else. The same goes for the other two input and output functions, 3 and 4. There are many such input and output functions, and I have to find these sets of input/output functions. Below is my attempted grammar, which is not working. Any help would be appreciated. As this is my first attempt at writing a grammar and using ANTLR, it is becoming quite daunting.

grammar test;

tag : inputtag+ outputtag+ ;

// Input tag consists of at least one input function with one or more values
inputtag : INPUTFUNCTIONS INPUTVALUES+;

// Output tag consists of at least one output function with one or more output values
outputtag : OUTPUTFUNCTIONS OUTPUTVALUES+;

INPUTFUNCTIONS 
 : INFUNCTION1 | INFUNCTION2;

OUTPUTFUNCTIONS
 :OUTFUNCTION1 | OUTFUNCTION2;

// Possible input functions in the log file
fragment INFUNCTION1
 :'Evaluation.Input.Function1';

fragment INFUNCTION2
 :'Evaluation.Input.Function3';

//Possible values in the input functions
INPUTVALUES
 : 'red' | 'yellow' | 'blue';

// Possible output functions in the log file 
fragment OUTFUNCTION1
 :'Evaluation.Output.Function2';

fragment OUTFUNCTION2
 :'Evaluation.Output.Function4';

// Possible output values in the output functions
fragment OUTPUTVALUES
 : 'Rocket' | 'Speech';

Recommended answer

When you're only interested in part of the file you're parsing, you don't need a parser, and you don't need to write a grammar for the file's entire format. A lexer grammar with ANTLR's options{filter=true;} is enough. That way, the lexer only produces the tokens you defined in your grammar and ignores the rest of the file.
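To make the idea of filter mode concrete, the following plain-Java sketch mimics it with java.util.regex rather than ANTLR: it scans the input, keeps only the substrings that match a pattern, and silently skips everything else. The class name FilterSketch, the method grab, and the regular expression are illustrative assumptions, not part of ANTLR or of this answer; it is only an analogy for what a filtering lexer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical analogue of a filtering lexer, built on java.util.regex (not ANTLR).
final class FilterSketch {

    // Assumed shape of the interesting text:
    // Evaluation.Input|Output.Function<digits>(selected=["...","..."])
    private static final Pattern CALL = Pattern.compile(
        "Evaluation\\.(?:Input|Output)\\.Function\\d+\\(selected=\\[\"[^\"]*\"(?:,\"[^\"]*\")*\\]\\)");

    // Returns every match in order of appearance; unmatched text is skipped.
    static List<String> grab(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CALL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : "
            + "uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}";
        for (String s : grab(line)) {
            System.out.println(s);
        }
    }
}
```

The timestamp, thread name, and package prefix never appear in the result because nothing in the pattern matches them, which is the same effect filter=true has on input that no lexer rule matches.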

Here is a quick demo:

lexer grammar TestLexer;

options{filter=true;}

@lexer::members {
  public static void main(String[] args) throws Exception {
    String text = 
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}\n"+
        "\n"+
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=[\"blue\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=[\"Speech\"]){}";
    // Feed the sample text to the generated lexer.
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // In filter mode, only text matching the Input or Output rule becomes a token.
    for(Object obj : tokens.getTokens()) {
        Token token = (Token)obj;
        System.out.println("> token.getText() = "+token.getText());
    }
  }
}

Input
  :  'Evaluation.Input.Function' '0'..'9'+ Params   
  ;

Output
  :  'Evaluation.Output.Function' '0'..'9'+ Params
  ;

fragment
Params
  :  '(selected=[' String ( ',' String )* '])'
  ;

fragment
String
  :  '"' ( ~'"' )* '"'
  ;

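Now generate the lexer from the grammar (saved as TestLexer.g), compile it, and run it:

java -cp antlr-3.2.jar org.antlr.Tool TestLexer.g
javac -cp antlr-3.2.jar TestLexer.java
java -cp .:antlr-3.2.jar TestLexer

On Windows, use ; instead of : as the classpath separator: java -cp .;antlr-3.2.jar TestLexer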

You'll see the following printed to the console:

> token.getText() = Evaluation.Input.Function1(selected=["red","yellow"])
> token.getText() = Evaluation.Output.Function2(selected=["Rocket"])
> token.getText() = Evaluation.Input.Function3(selected=["blue","yellow"])
> token.getText() = Evaluation.Output.Function4(selected=["Speech"])
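
The token texts printed above still bundle each function name with its values. A minimal sketch of one way to separate them, and to pair each input with the output that follows it, is given below in plain Java. The names TokenSplitter, functionName, values, and pairs are hypothetical helpers introduced for illustration. The code assumes every token text has exactly the shape shown in the output above, and that inputs and outputs strictly alternate, which holds for the sample log but may not hold for a real one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical post-processing of the token texts produced by the lexer.
final class TokenSplitter {

    // Text before the first '(' is the function name, e.g. Evaluation.Input.Function1.
    static String functionName(String tokenText) {
        return tokenText.substring(0, tokenText.indexOf('('));
    }

    // Values are the quoted strings between '[' and the last ']'.
    // Assumes well-formed input: quotes are balanced and never escaped.
    static List<String> values(String tokenText) {
        String inner = tokenText.substring(tokenText.indexOf('[') + 1, tokenText.lastIndexOf(']'));
        List<String> out = new ArrayList<>();
        int start = inner.indexOf('"');
        while (start >= 0) {
            int end = inner.indexOf('"', start + 1);
            out.add(inner.substring(start + 1, end));
            start = inner.indexOf('"', end + 1);
        }
        return out;
    }

    // Pairs token i (input) with token i+1 (output), assuming strict alternation.
    static List<String> pairs(List<String> tokenTexts) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokenTexts.size(); i += 2) {
            String in = tokenTexts.get(i);
            String res = tokenTexts.get(i + 1);
            out.add(functionName(in) + values(in) + " -> " + functionName(res) + values(res));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokenTexts = Arrays.asList(
            "Evaluation.Input.Function1(selected=[\"red\",\"yellow\"])",
            "Evaluation.Output.Function2(selected=[\"Rocket\"])",
            "Evaluation.Input.Function3(selected=[\"blue\",\"yellow\"])",
            "Evaluation.Output.Function4(selected=[\"Speech\"])");
        for (String pair : pairs(tokenTexts)) {
            System.out.println(pair);
        }
    }
}
```

If inputs and outputs can appear out of order, checking each token's type in the lexer loop (Input versus Output) before pairing would be a safer basis than position alone.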
