Best parser generator for parsing many small texts in C++?
Question
For performance reasons, I am porting a C# library to C++. During normal operation, this library needs, among other things, to parse about 150'000 math expressions (think Excel formulas) with an average length of less than 150 characters.
In the C# version, I used the GOLD parser to generate the parsing code. It can parse all 150'000 expressions in under one second.
Because we were thinking about extending our language, I figured the move to C++ might be a good chance to switch to ANTLR. I have ported the (simple) grammar to ANTLR and generated C code from it. Parsing the 150'000 expressions takes over 12 seconds because, for each expression, I need to create a new ANTLR3_INPUT_STREAM, token stream, lexer and parser; there is, at least in version 3.4, no way to reuse them.
I'd be grateful if someone could recommend what to use instead. GOLD is of course an option, though generating C++ or C code with it seems a lot more complicated than the C# variety. My grammar is LALR and LL(1) compatible. The paramount concern is parsing performance on small inputs.
Recommended answer
I would try boost::spirit. It is often extremely fast; even for something as simple as parsing an integer it can be faster than the C function atoi (see http://alexott.blogspot.com/2010/01/boostspirit2-vs-atoi.html).
It has some nice properties: it is header-only, so no dependency hell, and it has a liberal licence.
However, be warned that the learning curve is steep. It's modern C++ (no pointers, but a lot of templates and very frustrating compiler errors), so coming from C or C#, you might not be very comfortable.