How to multi-thread an ANTLR parser in Java

Question

My program is proving slow at reading a file and then parsing it with an ANTLR grammar. To improve performance, I would like to multi-thread the parsing.

Reading the file:

    LogParser pa = new LogParser();
    LogData logrow;
    String inputLine;
    int a = 0;
    try {
        // feed the file to the parser line by line
        FileReader fr = new FileReader(jFileChooser1.getSelectedFile());
        BufferedReader reader = new BufferedReader(fr);
        while ((inputLine = reader.readLine()) != null) {
            try {
                a++;
                jProgressBar.setValue(a);
                pa.parse(inputLine);  // decode the line
            } catch (Exception e) {
                // elided in the original: catches errors and sends them to the logger
            } finally {
                logrow = new LogData(pa, a);
                mLogTable.addRow(logrow);  // store the decoded line
            }
        }
        reader.close();
    } catch (Exception e) {
        // elided in the original: catches errors and sends them to the logger
    }

The code parses each line with pa.parse(inputLine);, which feeds the input line into an ANTLRStringStream and then a CharStream before it is parsed. Next, logrow=new LogData(pa,a); gets the decoded values, which will be stored in my table.

My profiling shows that the hotspots are in the lexer and parser classes generated from my grammar (i.e. LogGrammarLexer and LogGrammarParser). I hope that is clear enough.

Parsing:

// Build a lexer and parser for this input line
LogGrammarLexer lexer = new LogGrammarLexer(inStream);
CommonTokenStream tokens = new CommonTokenStream(lexer);
decoded = new LogGrammarParser(tokens);
try {
    failurePosition = "";
    decoded.logLine();  // run the logLine rule of the grammar
} catch (RecognitionException e) {
    failurePosition = Integer.toString(e.charPositionInLine);
} catch (Exception e) {
    failurePosition = "-3";
    throw e;
} finally {
    // Caution: returning from finally discards the exception rethrown above
    return decoded; // TODO: see if return is necessary as decoded is now a field in class
}

I have also been reading up and now know that multi-threading the file I/O is useless. In any case, what I need to improve is the performance inside the classes generated from my grammar, which do complex decoding/parsing of each line in my file.

So my question is how to make it multi-threaded.

Cheers

Recommended answer

If each file line is a separate parseable entity, then you can use multiple threads as follows:

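  • One thread (most likely the main thread) reads lines from the file and stores them in a work queue
  • A pool of threads, each of which pulls a file line from that work queue and parses the data
  • The queue of file lines shared between the threads must be synchronized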
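
The three points above can be sketched roughly as follows. This is only an illustration, not the asker's code: `QueueParsingSketch`, `parseAll`, and the `parseLine` function are hypothetical names, and `parseLine` stands in for whatever builds a fresh lexer and parser for one line. The sketch assumes that function keeps no shared state; the question's single `LogParser` instance stores `decoded` and `failurePosition` in fields, so it would need one instance per thread or per line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class QueueParsingSketch {
    // Sentinel ("poison pill") that tells a worker to stop; compared by identity.
    private static final String END_OF_INPUT = new String("END");

    /** Parses every line on a pool of workers; the order of the results is NOT guaranteed. */
    static <T> List<T> parseAll(Iterable<String> lines, Function<String, T> parseLine, int workers) {
        // Bounded, thread-safe work queue: the reader blocks instead of running far ahead.
        BlockingQueue<String> workQueue = new LinkedBlockingQueue<>(1024);
        Queue<T> results = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Consumers: each worker pulls lines until it sees the sentinel.
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    String line;
                    while ((line = workQueue.take()) != END_OF_INPUT) {
                        try {
                            results.add(parseLine.apply(line));
                        } catch (RuntimeException e) {
                            // A real program would log the failed line here.
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer: the calling thread feeds the queue, then one sentinel per worker.
        try {
            for (String line : lines) {
                workQueue.put(line);
            }
            for (int i = 0; i < workers; i++) {
                workQueue.put(END_OF_INPUT);
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return new ArrayList<>(results);
    }
}
```

A call such as `QueueParsingSketch.parseAll(lines, line -> decode(line), 4)` would parse on four workers, where `decode` is a placeholder for the per-line ANTLR call. The `java.util.concurrent` queue does the synchronization, so no explicit locking is needed around it.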

This will only improve performance if it runs on a multi-core CPU.
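
One way to act on this is to size the pool from the number of cores the JVM reports. The snippet below is a minimal sketch; `PoolSizing` and `workerCount` are illustrative names, and using one worker per core is an assumption that suits CPU-bound parsing rather than a measured optimum.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {
    /** One worker per core reported by the JVM (the reported value is always at least 1). */
    static int workerCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(workerCount());
        System.out.println("Parsing with " + workerCount() + " worker threads");
        pool.shutdown();
    }
}
```

If `workerCount()` returns 1, the pool has a single worker and the threaded version should not be expected to beat the sequential loop.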

Additionally, this only works if each file line is a separate parseable entity, as mentioned before. If a parseable entity spans multiple lines, or is the entire file, then threading will not help. Also, if the order of the lines in the file is important, multi-threading may cause issues, since the lines may be parsed out of order.
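
If the order matters, one possible workaround is to let the lines be parsed in any order but collect the results in submission order. The sketch below does this with one `Future` per line; `OrderedParsingSketch`, `parseInOrder`, and `parseLine` are hypothetical names, and the approach assumes the lines fit in memory and that `parseLine` is safe to call from several threads at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class OrderedParsingSketch {
    /** Parses lines in parallel but returns the results in the original line order. */
    static <T> List<T> parseInOrder(List<String> lines, Function<String, T> parseLine, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Submit one task per line; the list remembers the submission order.
            List<Future<T>> pending = new ArrayList<>(lines.size());
            for (String line : lines) {
                Callable<T> task = () -> parseLine.apply(line);
                pending.add(pool.submit(task));
            }
            // Reading the futures in list order restores the file order,
            // whatever order the workers actually finished in.
            List<T> results = new ArrayList<>(pending.size());
            for (Future<T> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for parse results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A line failed to parse", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape the rows can still be added to a table in file order from a single thread after the parsing is done, at the cost of holding all pending results in memory.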

This is a standard producer/consumer problem; here are some useful links:

  • Java Thread Pools
  • Thread pools and work queues
  • ThreadpoolExecutor programming examples
