C++ boost::spirit parsing embedded languages

Question

My question is quite simple, in fact. I'm currently working on a language parser that can parse a meta language with embedded DSLs. This is quite interesting to me because it could parse websites with HTML and embedded JavaScript / CSS. I want to design a similar system with minimal DSLs for a specific use case.

Is boost::spirit capable of doing something similar? I just don't know how boost::spirit handles lexer generation or if it even is a scannerless parser.

Thanks in advance!

Recommended answer

Spirit Qi can be used with a scanner (Spirit Lex) or without.

In my humble opinion, Spirit shines when using it scanner-less, though. The reason is mainly that Spirit shines when you avoid complexity, and using Spirit Lex acts like a complexity multiplier for your Spirit Qi grammar definition.

Regardless,

  • Yes, you can switch to different embedded grammars¹. The Nabialek trick is actually a famous way to achieve such a switch.
  • Technically, it's also possible to switch lexer states to achieve the same switch when using Spirit Lex, but you have to bear in mind the limitations of this method (lexer state cannot be manipulated depending on conditions in the parser tier, contrary perhaps to what the presence of undocumented parser directives in this area might suggest).
  • Your question doesn't seem to talk about ad-hoc/on-the-fly grammars, but since "DSLs" suggests this, I'll add a proper warning: Spirit Qi is a parser generator framework that generates PEG parsers at compile time. In its current incarnation, it does not lend itself well to generating rules/grammars at runtime (mainly due to limitations in the Boost Proto/Boost Phoenix libraries that underlie it). Spirit X3 may lift many of these limitations, but that is still in the future.

That said, I strongly suggest looking at ready-made parsers/tokenizers for the purpose. My stance is usually summarized as: use Spirit for rapid development and ad-hoc parsing.

As soon as your grammar becomes complex enough and you know the grammar is fixed/stable, I believe you can achieve the best results with a handwritten parser or with one of the more tedious parser generators like ANTLR, CoCo/R, Flex/Bison, etc., which come with a higher setup cost.

¹ Side note: I don't think "DSLs" is an appropriate term for the case of scripts inside HTML. The "embedded" nature is only tangentially related, and e.g. ECMAScript is hardly "Domain Specific", so I'll stick to "Embedded Grammar" here.
