Using ANTLR to parse a log file

Question

I'm just getting started with ANTLR and am trying to parse some patterns out of a log file.

For example, the log file:

7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=["red","yellow"]){}

7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=["Rocket"]){}

7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=["blue","yellow"]){}

7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=["Speech"]){}

Now I have to parse this file to find only 'Evaluation.Input.Function1' with its values 'red' and 'yellow', and 'Evaluation.Output.Function2' with its value 'Rocket', ignoring everything else. The same goes for the other two input and output functions (3 and 4) below them. There are many such input and output functions, and I have to find these input/output sets. Below is my attempted grammar, which is not working. Any help would be appreciated. As this is my first attempt at writing a grammar and using ANTLR, it is becoming quite daunting.

grammar test;

tag : inputtag+ outputtag+ ;

// An input tag consists of at least one input function with one or more values
inputtag : INPUTFUNCTIONS INPUTVALUES+ ;

// An output tag consists of at least one output function with one or more output values
outputtag : OUTPUTFUNCTIONS OUTPUTVALUES+ ;

INPUTFUNCTIONS 
 : INFUNCTION1 | INFUNCTION2;

OUTPUTFUNCTIONS
 :OUTFUNCTION1 | OUTFUNCTION2;

// Possible input functions in the log file
fragment INFUNCTION1
 :'Evaluation.Input.Function1';

fragment INFUNCTION2
 :'Evaluation.Input.Function3';

//Possible values in the input functions
INPUTVALUES
 : 'red' | 'yellow' | 'blue';

// Possible output functions in the log file 
fragment OUTFUNCTION1
 :'Evaluation.Output.Function2';

fragment OUTFUNCTION2
 :'Evaluation.Output.Function4';

// Possible output values in the output functions
fragment OUTPUTVALUES
 : 'Rocket' | 'Speech';

Recommended answer

When you're only interested in part of the file you're parsing, you don't need a parser, and you don't need to write a grammar for the file's entire format. A lexer grammar with ANTLR's options{filter=true;} is enough. That way, the lexer produces only the tokens you defined in your grammar and ignores the rest of the file.
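As a rough analogy for what "grab the matches, ignore the rest" means, the sketch below does the same kind of scan in plain Java with java.util.regex. It is not how ANTLR implements filtering; the class name FilterScanSketch and the pattern are assumptions made for illustration, modelled on the log lines in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java analogy of a filtering lexer. Hypothetical helper, not part of ANTLR.
final class FilterScanSketch {

    // Assumed shape of the interesting text: an Input or Output function name
    // followed by (selected=["...","..."]), as seen in the sample log.
    private static final Pattern INTERESTING = Pattern.compile(
        "Evaluation\\.(?:Input|Output)\\.Function[0-9]+"
        + "\\(selected=\\[\"[^\"]*\"(?:,\"[^\"]*\")*\\]\\)");

    // Returns every matching fragment, in order of appearance.
    static List<String> scan(String log) {
        List<String> found = new ArrayList<>();
        Matcher m = INTERESTING.matcher(log);
        // find() steps past text that does not match, so timestamps, thread
        // names, and the trailing {} never reach the result list.
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String log = "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : "
            + "uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}";
        System.out.println(scan(log));
    }
}
```

A lexer grammar keeps the advantage once the token shapes grow beyond what is comfortable in a single regular expression.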

Here's a quick demo:

lexer grammar TestLexer;

options{filter=true;}

@lexer::members {
  public static void main(String[] args) throws Exception {
    String text = 
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}\n"+
        "\n"+
        "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=[\"blue\",\"yellow\"]){}\n"+
        "\n"+
        "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=[\"Speech\"]){}";
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    for(Object obj : tokens.getTokens()) {
        Token token = (Token)obj;
        System.out.println("> token.getText() = "+token.getText());
    }
  }
}

Input
  :  'Evaluation.Input.Function' '0'..'9'+ Params   
  ;

Output
  :  'Evaluation.Output.Function' '0'..'9'+ Params
  ;

fragment
Params
  :  '(selected=[' String ( ',' String )* '])'
  ;

fragment
String
  :  '"' ( ~'"' )* '"'
  ;

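Now generate the lexer from the grammar (assuming it is saved as TestLexer.g), then compile and run it:

java -cp antlr-3.2.jar org.antlr.Tool TestLexer.g
javac -cp antlr-3.2.jar TestLexer.java
java -cp .:antlr-3.2.jar TestLexer    # on Windows: java -cp .;antlr-3.2.jar TestLexer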

You'll see the following printed to the console:

> token.getText() = Evaluation.Input.Function1(selected=["red","yellow"])
> token.getText() = Evaluation.Output.Function2(selected=["Rocket"])
> token.getText() = Evaluation.Input.Function3(selected=["blue","yellow"])
> token.getText() = Evaluation.Output.Function4(selected=["Speech"])
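
Each token's text still bundles the function name with its values. If you need them separately, as the question asks, one possible follow-up is to split the text in plain Java. The sketch below assumes the token text always has the shape printed above; TokenTextParser and its methods are hypothetical helpers, not part of ANTLR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits one token's text into a function name and its values.
final class TokenTextParser {

    // Group 1: everything before "(selected=["; group 2: the raw list between the brackets.
    private static final Pattern CALL =
        Pattern.compile("^(.+?)\\(selected=\\[(.*)\\]\\)$", Pattern.DOTALL);

    // One double-quoted value, mirroring the String rule: '"' ( ~'"' )* '"'
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    private static Matcher match(String tokenText) {
        Matcher m = CALL.matcher(tokenText);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected token text: " + tokenText);
        }
        return m;
    }

    // e.g. "Evaluation.Input.Function1"
    static String functionName(String tokenText) {
        return match(tokenText).group(1);
    }

    // The values with their surrounding quotes removed, in source order.
    static List<String> values(String tokenText) {
        List<String> result = new ArrayList<>();
        Matcher q = QUOTED.matcher(match(tokenText).group(2));
        while (q.find()) {
            result.add(q.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Evaluation.Input.Function1(selected=[\"red\",\"yellow\"])";
        // prints: Evaluation.Input.Function1 -> [red, yellow]
        System.out.println(functionName(text) + " -> " + values(text));
    }
}
```

In the demo's loop, you would pass token.getText() to these methods instead of printing it.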
