Antlr解析器StackOverflowException [英] Antlr parser StackOverflowException

查看:94
本文介绍了Antlr解析器StackOverflowException的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于我的Xamarin.ios C#项目,我在Antlr解析器语法中具有以下内容:

I have the following in my Antlr parser grammar for my Xamarin.ios C# project:

mathToken  
    : DIGIT             #Digit
    | NULL              #Null
    | LESSTHAN          #LessThan
    | GREATERTHAN       #GreaterThan
    | anyLessThanOrEqual   #LessThanOrEqual
    // about 30 more options here

mathTokenList
    :   mathToken mathTokenList   #CompoundMathTokens
    |   mathToken                 #SingleMathToken
    ;

这对于10个令牌,100个甚至1000个令牌的列表非常有用.但是一旦列表足够长,就会导致StackOverflowException,因为生成的MathTokenList递归调用自身,并在顶部带有一些侦听器代码:

This works great for a list of 10 tokens, 100, or even 1000. But once the list gets long enough, it leads to a StackOverflowException, as the generated MathTokenList recursively calls itself, with some listener code at the top:

MyNamespace.HandleToken(MyTokenClass parserToken, List<MyOtherTokenClass> buildingList) in MyNamespace.ManualFileParser.cs:58
MyNamespace.CustomStringReaderParseListener.VisitDefault(Antlr4.Runtime.Tree.TerminalNodeImpl node) in MyNamespace.CustomStringReaderParseListener.cs:228
MyNamespace.CustomStringReaderParseListener.VisitTerminal(Antlr4.Runtime.Tree.TerminalNodeImpl node) in MyNamespace.CustomStringReaderParseListener.cs:94
Antlr4.Runtime.Parser.Consume()
Antlr4.Runtime.Parser.Match(int ttype)  
Antlr.StringReaderParser.mathToken() 
Antlr.StringReaderParser.mathTokenList() // lots of calls here . . . seems to be
Antlr.StringReaderParser.mathTokenList() // one for each token.
Antlr.StringReaderParser.mathTokenList() // ( in antlr generated code)

是否可以重组解析器语法以避免此类问题?还是我需要做更多的事情,以使解析器永远看不到一长串的mathTokens?

Is it possible to restructure the parser grammar to avoid this kind of problem? Or do I need to do something more involved so that the parser never sees a really long list of mathTokens?

我可以通过在解析器看到数字列表之前组合数字列表来解决这个问题.但是它最终可能会与其他一些令牌类型一起重现.

I could put a band-aid on the problem by combining lists of digits before the parser sees them. But it would likely recur eventually with some other token type.

推荐答案

您不能完全避免此问题.实际上,每个规则调用都是在生成的解析器中的函数调用(递归下降解析器原理).如果您的输入仅够复杂,您肯定会达到可用的堆栈限制.在大多数(所有?)编译器中,您可以增加应用程序的堆栈空间,但这也只能在一定程度上有所帮助.

You cannot entirely avoid this problem. Each rule invocation is actually a function call in the generated parser (recursive descent parser principle). If your input is only complex enough you will certainly hit the available stack limit. In most (all?) compilers you can increase the stackspace for your app, but this also helps only up to a certain extent.

但是,正如@BartKiers所建议的那样,您可以使用循环而不是递归(编程中经常使用的原理)来减少调用次数. mathTokenList 规则非常类似于您在yacc/bison中定义列表的方式.在ANTLR中,您可以改用循环运算符,这使它更易于阅读且占用更少的资源:

However, as @BartKiers suggested you can lower the invocation count by using loops instead of recursions (a principle often used in programming). The mathTokenList rule looks very much like how you would define a list in yacc/bison. In ANTLR you can use loop operators instead, which make this better to read and less resource intensive:

mathTokenList: mathToken+;

是前往此处的方式.这将最终导致在您的 mathTokenList 方法中执行一个循环(查看生成的解析器代码,有时会很有启发性).

is the way to go here. This will end up in a loop being executed in your mathTokenList method (look at the generated parser code, sometimes quite enlightening).

这篇关于Antlr解析器StackOverflowException的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆