ANTLR 4 parsing large C file takes forever
Question
I have a large C source file (>9000 LoC) and am attempting to parse it using this grammar:
https://github.com/antlr/grammars-v4/blob/master/c/C.g4
I waited for over an hour before aborting. The machine is a Core 2 Duo L9400 with 4 GB of RAM, and the maximum JVM heap size is set to 2 GB. The parser does not report any parse errors; it simply never finishes.
After doing some research, I set the prediction mode to SLL, which produces a "no viable alternative at input" error within seconds.
Next, I set the prediction mode to LL_EXACT_AMBIG_DETECTION and attached a DiagnosticErrorListener to the parser. This produces a lot of ambiguity reports, mainly concerning declarations and declaration specifiers. I assume this forces the parser to backtrack extremely often, which is probably the explanation for the long parsing time?
Is there anything I can do to improve performance other than attempting to rewrite the grammar?
Any help is appreciated ;)
Recommended answer
First of all, it's important to note that ANTLR 4 never backtracks during parsing.
The grammar you are referring to is not SLL, which prevents you from using ANTLR 4 in its fastest mode. However, during our experiments we were able to identify a single rule which we altered to make the grammar SLL. You may be able to obtain the altered grammar via the antlr-interest mailing list. I'm at a concert now so I don't have access to it.
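The answer does not describe a workaround for a grammar that is not SLL, but a commonly used pattern is to try the fast SLL mode first and re-parse in full LL mode only when SLL gives up. The following is a minimal sketch of that control flow in plain Java, without the ANTLR runtime. The names `TwoStageParse`, `FastStageFailed`, and `parse` are hypothetical and exist only for illustration, and the two suppliers stand in for real parser invocations.

```java
import java.util.function.Supplier;

/** Sketch of a two-stage parse: try a fast strategy first, fall back to a slower one. */
final class TwoStageParse {

    /**
     * Hypothetical marker for "the fast stage gave up". With ANTLR this role is
     * played by ParseCancellationException, which BailErrorStrategy throws.
     */
    static final class FastStageFailed extends RuntimeException {
        FastStageFailed(String message) {
            super(message);
        }
    }

    /**
     * Runs fastStage; if it throws FastStageFailed, runs fullStage instead.
     * Any other exception propagates unchanged.
     */
    static <T> T parse(Supplier<T> fastStage, Supplier<T> fullStage) {
        try {
            return fastStage.get();
        } catch (FastStageFailed e) {
            return fullStage.get();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for real parser invocations (assumption: each returns a parse tree).
        String ok = parse(() -> "tree from SLL", () -> "tree from LL");
        String retried = parse(
                () -> { throw new FastStageFailed("no viable alternative"); },
                () -> "tree from LL");
        System.out.println(ok);
        System.out.println(retried);
    }
}
```

With the ANTLR 4 Java runtime, the fast stage would typically call `parser.getInterpreter().setPredictionMode(PredictionMode.SLL)` and install `new BailErrorStrategy()` via `parser.setErrorHandler(...)`. The fallback would rewind the token stream with `tokens.seek(0)`, call `parser.reset()`, restore `new DefaultErrorStrategy()`, and switch to `PredictionMode.LL` before invoking the start rule again. Whether this helps depends on how much of the input the SLL stage can handle, so treat it as something to measure rather than a guaranteed fix.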