Best parser generator for parsing many small texts in C++?


Question

For performance reasons, I am porting a C# library to C++. During normal operation, this library needs, among other things, to parse about 150'000 math expressions (think Excel formulas) with an average length of less than 150 characters.

In the C# version, I used the GOLD parser to generate the parsing code. It can parse all 150'000 expressions in under one second.

Because we were thinking about extending our language, I figured the move to C++ might be a good chance to switch to ANTLR. I have ported the (simple) grammar to ANTLR and generated C code from it. Parsing the 150'000 expressions takes over 12 seconds, because for each expression I need to create a new ANTLR3_INPUT_STREAM, token stream, lexer and parser. At least in version 3.4, there is no way to reuse them.

I'd be grateful if someone could recommend what to use instead. GOLD is of course an option, though generating C++ or C code seems a lot more complicated than the C# variety. My grammar is LALR and LL(1) compatible. The paramount concern is parsing performance on small inputs.

Recommended answer

I would try boost::spirit. It is often extremely fast (even for parsing simple things like an integer, it can be faster than the C function atoi: http://alexott.blogspot.com/2010/01/boostspirit2-vs-atoi.html).

http://boost-spirit.com/home/

It has some nice properties: it is header-only, so no dependency hell, and it has a liberal licence.

However, be warned that the learning curve is steep. It's modern C++ (no pointers, but a lot of templates and very frustrating compile errors), so coming from C or C#, you might not be very comfortable with it.
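As a point of comparison, and not something the answer above proposes: because the question states that the grammar is LL(1), a small hand-written recursive-descent parser is another way to avoid per-expression setup cost. The sketch below is not Boost.Spirit and not GOLD or ANTLR output. The grammar in the comment, the class name `ExprParser` and its `parse` method are all hypothetical, and the code evaluates the expression directly instead of building a syntax tree, which the real library may well need.

```cpp
#include <string>

// Hypothetical grammar (an assumption, not the poster's actual language):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')' | '-' factor
class ExprParser {
public:
    // Parses and evaluates one expression. Returns false on a syntax error.
    // The parser state is only a cursor and a flag, so one object can be
    // reused for every input without allocating anything per expression.
    bool parse(const std::string& text, double& result) {
        cur_ = text.data();
        end_ = text.data() + text.size();
        ok_ = true;
        double value = expr();
        skip_spaces();
        if (!ok_ || cur_ != end_) return false;  // error or trailing input
        result = value;
        return true;
    }

private:
    const char* cur_ = nullptr;
    const char* end_ = nullptr;
    bool ok_ = true;

    void skip_spaces() {
        while (cur_ != end_ && (*cur_ == ' ' || *cur_ == '\t')) ++cur_;
    }

    // Consumes the character c if it is next in the input.
    bool accept(char c) {
        if (cur_ != end_ && *cur_ == c) { ++cur_; return true; }
        return false;
    }

    double expr() {
        double value = term();
        for (;;) {
            skip_spaces();
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    double term() {
        double value = factor();
        for (;;) {
            skip_spaces();
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    double factor() {
        skip_spaces();
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expr();
            skip_spaces();
            if (!accept(')')) ok_ = false;  // missing closing parenthesis
            return value;
        }
        return number();
    }

    // Reads digits with an optional fractional part, e.g. "42" or "3.25".
    double number() {
        double value = 0.0;
        int digits = 0;
        while (cur_ != end_ && *cur_ >= '0' && *cur_ <= '9') {
            value = value * 10.0 + (*cur_++ - '0');
            ++digits;
        }
        if (cur_ != end_ && *cur_ == '.') {
            ++cur_;
            double scale = 0.1;
            while (cur_ != end_ && *cur_ >= '0' && *cur_ <= '9') {
                value += (*cur_++ - '0') * scale;
                scale *= 0.1;
                ++digits;
            }
        }
        if (digits == 0) ok_ = false;  // a number was expected here
        return value;
    }
};
```

To use it, create a single `ExprParser` and call `parse` once per expression in a loop. Whether this beats a generated parser on the real grammar would have to be measured.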
