How to split input according to the grammar

Problem description

We are trying to build a parser for the log files generated by a router. We have built it successfully and can write the lines that are valid according to the grammar to a particular file.

However, if an input line is not valid according to the grammar, we want to write it to a different file. We tried something, but it is not working properly. Can you suggest a way to do this, and, if possible, give a working example?

Here is what we have tried.

We are not using any specific IDE, just a text editor. ANTLR version: 4.5.

Our input (input.txt):

Dec 24 15:38:13 103.199.144.14 firewall,info NAT: src-nat2 srcnat: in:(none) out:ether1-WAN, proto TCP (SYN), 10.20.114.212:59559->86.96.88.147:6882, len 52
Dec 24 15:38:13 103.199.144.14 firewall,info src-nat2: forward: in:<pppoe-PDR242> out:ether1-WAN, proto TCP (SYN), 10.20.124.8:50055->111.111.111.111:80, len 52

The first line is invalid according to the grammar. It should not be accepted and must therefore be written to failure.txt, but instead it is partially written to success.txt.

The second line is valid and is written correctly to success.txt, as the output file below shows.

The output we are getting (success.txt):

Dec 24 15:38:13, 103.199.144.14, .20.114.212, len, 52, , null
Dec 24 15:38:13, 103.199.144.14, pppoe-PDR242, TCP, 10.20.124.8:50055, 111.111.111.111:80, null

The grammar we are using (sys.g):

grammar sys;

r: IDENT NUM time ip x+ user xout proto xuser ipfull xtra ipfull1 xtra1 (xipfull xtra ipfull2 xtra2 xipfull xtra3)*; 
time: NUM COLN NUM COLN NUM;
ip: NUM DOT NUM DOT NUM DOT NUM ;
ipfull: NUM DOT NUM DOT NUM DOT NUM COLN NUM ;
ipfull1: NUM DOT NUM DOT NUM DOT NUM COLN NUM ;
ipfull2: NUM DOT NUM DOT NUM DOT NUM COLN NUM ;
xipfull: NUM DOT NUM DOT NUM DOT NUM COLN NUM ;

x: (IDENT | COMMA | COLN | BRAC | HYPHN | NUM)+ LTHAN;
user: (IDENT | HYPHN | DOT | NUM)+ ;
xout: GTHAN IDENT+ COLN IDENT+ HYPHN IDENT+ (DOT IDENT)* COMMA IDENT;
proto: IDENT ;
xuser: (IDENT | BRAC | COMMA)+ ;
xtra: HYPHN GTHAN ;
xtra1: COMMA IDENT (BRAC | NUM);
xtra2: BRAC xtra;
xtra3: COMMA IDENT NUM;

IDENT: ('a'..'z' | 'A'..'Z')('a'..'z' | 'A'..'Z' | '0'..'9')* ;
NUM: ('0'..'9')+ ;
LTHAN: '<' ;
GTHAN: '>' ;
COLN: ':';
COMMA: ',';
BRAC: '(' | ')' ;
HYPHN: '-';
DOT: '.';
WS : (' ' | '\t' | '\r' | '\n')+ -> skip ;

Our main class, which uses the parser and lexer generated from the grammar:

import org.antlr.v4.runtime.ANTLRFileStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import java.io.*;
import org.antlr.v4.runtime.*;

public class SysLogCheck {
    public static void main(String[] args) throws Exception {

        long startTime = System.currentTimeMillis();

        BufferedReader br = new BufferedReader(new FileReader("test123.txt"));
        String s = null;
        //FileWriter out = new FileWriter("abc.txt");
        PrintWriter success = new PrintWriter(new FileWriter("success.csv"));
        PrintWriter failure = new PrintWriter(new FileWriter("failure.csv"));
        while((s=br.readLine())!=null)
        {
            ANTLRInputStream input = new ANTLRInputStream(s);
            sysLexer lexer = new sysLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            sysParser parser = new sysParser(tokens);
            ParseTree tree = parser.r();
            EvalVisitor visitor = new EvalVisitor();
            if((visitor.visit(tree)).equals("failure")) // here visit method of EvalVisitor class returns "failure" then the content should be written 
                                                        //in failure file and else it should be written in success file 
                                                        // but this is not working
            {
                failure.println(s);
            }
            else
            {
                success.println(visitor.visit(tree));
            }
        }
        failure.flush();
        failure.close();
        success.flush();
        success.close();

        long stopTime = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println(elapsedTime);
    }
}

Our EvalVisitor (main visitor class) code:

import org.antlr.v4.runtime.tree.ParseTree;
import java.io.*;

public class EvalVisitor extends sysBaseVisitor
{
        class LogEntry {
        String ident1;
        String dayNum;
        String time;
        String ip;
        String ipfull;
        String user;
        String proto;
        String ipfull1;
        String ipfull2;
        String x;

      }


      static LogEntry logEntry;

      @Override
      public Object visit(ParseTree tree) {
        /* Setup logentry used by all visitors (this case, there is only a single visitor...)*/
        logEntry = new LogEntry();

        final Object o = super.visit(tree);

        //our logic to check whether our input contains "<" or not
        if((logEntry.x).contains("<") )
        {
            return logEntry.ident1 +" " + logEntry.dayNum + " " + logEntry.time+ ", " + logEntry.ip+ ", " + logEntry.user+ ", " + logEntry.proto+ ", " + logEntry.ipfull+ ", " + logEntry.ipfull1+ ", " + logEntry.ipfull2;
        }       
            return "failure"; //else return failure
      }

      StringBuilder stringBuilder;



      @Override
      public Object visitR(sysParser.RContext ctx) {
        logEntry.ident1 = ctx.IDENT().getText();
        logEntry.dayNum = ctx.NUM().getText();
        return super.visitR(ctx);
      }

      @Override
      public Object visitTime(sysParser.TimeContext ctx) {
        logEntry.time = ctx.getText();
        return super.visitTime(ctx);
      }

      @Override
      public Object visitIp(sysParser.IpContext ctx) {
        logEntry.ip = ctx.getText();
        return super.visitIp(ctx);
      }

      @Override
      public Object visitIpfull(sysParser.IpfullContext ctx) {
        logEntry.ipfull = ctx.getText();
        return super.visitIpfull(ctx);
      }

      @Override
      public Object visitIpfull1(sysParser.Ipfull1Context ctx) {
        logEntry.ipfull1 = ctx.getText();
        return super.visitIpfull1(ctx);
      }

      @Override
      public Object visitIpfull2(sysParser.Ipfull2Context ctx) {
        logEntry.ipfull2 = ctx.getText();
        return super.visitIpfull2(ctx);
      }

      @Override
      public Object visitXipfull(sysParser.XipfullContext ctx) {
        return super.visitXipfull(ctx);
      }

      @Override
      public Object visitX(sysParser.XContext ctx) {
        logEntry.x = ctx.getText();
        return super.visitX(ctx);
      }

      @Override
      public Object visitUser(sysParser.UserContext ctx) {
        logEntry.user = ctx.getText();
        return super.visitUser(ctx);
      }

      @Override
      public Object visitXuser(sysParser.XuserContext ctx) {
        return super.visitXuser(ctx);
      }

      @Override
      public Object visitXout(sysParser.XoutContext ctx) {
        return super.visitXout(ctx);
      }

      @Override
      public Object visitProto(sysParser.ProtoContext ctx) {
        logEntry.proto = ctx.getText();
        return super.visitProto(ctx);
      }

      @Override
      public Object visitXtra(sysParser.XtraContext ctx) {
        return super.visitXtra(ctx);
      }

      @Override
      public Object visitXtra1(sysParser.Xtra1Context ctx) {
        return super.visitXtra1(ctx);
      }

      @Override
      public Object visitXtra2(sysParser.Xtra2Context ctx) {
        return super.visitXtra2(ctx);
      }

      @Override
      public Object visitXtra3(sysParser.Xtra3Context ctx) {
        return super.visitXtra3(ctx);
      }   

 }

Recommended answer

If all you're trying to do is create a file with data from the lines you consider valid, then ANTLR is probably overkill (I mentioned this in the mailing list thread). I'll assume here that you may want to do more with the parsed results (or that you just really want to use ANTLR for this).

I see that you're already parsing each input line individually.

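It appears that your 'r' parser rule recognizes valid as well as "invalid" lines. I'd suggest tightening up the grammar so that it defines exactly what you consider to be a valid line. If your grammar only accepts (i.e. "recognizes") valid lines, then any invalid line will raise a RecognitionException. (Note that ANTLR 4's default error strategy reports such an error and tries to recover rather than propagating the exception to the caller, so in practice you would check parser.getNumberOfSyntaxErrors() after parsing, or install a BailErrorStrategy, to detect the failure.)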

You don't mention what makes line 2 valid and line 1 invalid, so I can't really make a recommendation on how to correct your 'r' rule.

(There's a lot to critique about your grammar, and it indicates that you're trying to learn "just enough" ANTLR to get by. I don't think you're asking for a full critique of your grammar, so I'll skip the details.)

After examining your code, it appears that you just want to identify log lines of a particular type and capture data from those lines. If that's what you're trying to accomplish, then look into Java regular expressions and capture groups. It'll be a lot simpler than using ANTLR (and I'm a pretty big fan of ANTLR).
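To make the regular-expression suggestion concrete, here is a minimal sketch using only java.util.regex. It rests on an assumption the question never states outright: a line counts as valid when it has an "in:<user>" part, followed by "proto NAME" and a "source->destination" address pair (this mirrors the "<" check in the asker's visitor). The class name LogLineSplitter, the method toCsv and the pattern itself are hypothetical, made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the class, method and pattern are illustrative only.
final class LogLineSplitter {
    // Dotted-quad address, e.g. 10.20.124.8 (octet ranges are not checked).
    private static final String IP = "\\d{1,3}(?:\\.\\d{1,3}){3}";

    // Assumption: a valid line has in:<user>, then proto NAME, then src->dst.
    private static final Pattern VALID_LINE = Pattern.compile(
        "^(\\w{3}\\s+\\d{1,2}\\s+\\d{2}:\\d{2}:\\d{2})\\s+" // 1: timestamp
        + "(" + IP + ")\\s+"                                // 2: router address
        + ".*?\\bin:<([^>]+)>"                              // 3: user inside <...>
        + ".*?\\bproto\\s+(\\w+)"                           // 4: protocol
        + ".*?,\\s*(" + IP + ":\\d+)"                       // 5: source ip:port
        + "->(" + IP + ":\\d+)"                             // 6: destination ip:port
        + ".*$");

    // Returns the comma-separated fields, or empty if the line does not match.
    static Optional<String> toCsv(String line) {
        Matcher m = VALID_LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(String.join(", ",
            m.group(1), m.group(2), m.group(3),
            m.group(4), m.group(5), m.group(6)));
    }

    public static void main(String[] args) {
        String[] lines = {
            "Dec 24 15:38:13 103.199.144.14 firewall,info NAT: src-nat2 srcnat: in:(none) out:ether1-WAN, proto TCP (SYN), 10.20.114.212:59559->86.96.88.147:6882, len 52",
            "Dec 24 15:38:13 103.199.144.14 firewall,info src-nat2: forward: in:<pppoe-PDR242> out:ether1-WAN, proto TCP (SYN), 10.20.124.8:50055->111.111.111.111:80, len 52"
        };
        for (String line : lines) {
            Optional<String> csv = toCsv(line);
            if (csv.isPresent()) {
                System.out.println("success: " + csv.get());
            } else {
                System.out.println("failure: " + line);
            }
        }
    }
}
```

In the asker's read loop, the result of toCsv would decide which PrintWriter receives the line: the joined fields go to the success file, and the untouched input line goes to the failure file when the result is empty. If the real validity rule is different, only the pattern needs to change.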
