指定的移位C ++ 11语法动作功能/缩小分析器发电机? [英] Specifying C++11 Grammar Action Functions in Shift/Reduce Parser Generator?

查看:158
本文介绍了指定的移位C ++ 11语法动作功能/缩小分析器发电机?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在转变/减少C ++ 11解析器发生器,我不知道如何指定输入生产和减少操作功能,例如,他们将举行我希望把在信息的接口类型它们。

I'm working on a shift/reduce parser generator in C++11 and I am not sure how to specify the interface type of the input productions and reduction action functions such that they will hold the information I want to put in them.

我想静态,但使用C ++类型(而不是一个单独的构建工具)指定的语法。

I want to specify the grammar statically but using C++ types (not a separate build tool).

有关的每个符号(端子和非终端)的用户提供了一个字符串名称和类型。

For each symbol (terminals and non-terminals) the user provides a string name and a type.

然后每个生产指定头符号名和一个或多个身体符号名。

Then each production specifies a head symbol name and one or more body symbol names.

有关每一个返回头非终结类型并具有与(其对应类型)对应于生产体符号参数生产的动作功能是由用户(硬部分)提供的。

For each production an action function is provided by the user (the hard part) that returns the head nonterminal type and has parameters corresponding to the production body symbols (of their corresponding types).

的主要问题是静态绑定的参数类型,返回的这些操作功能类型对应的符号类型

因此​​,例如:

假设我们有终结符号 X A B C

Suppose we have nonterminals X, A B C

他们的名字/类型可能是:

Their names/types might be:

"X" Foo
"A" string
"B" string
"C" int

和文法中有可能是一个生产:

And in the grammar there might be a production:

X -> A B C

和将有使生产所提供的用户的动作的功能

And there will be an action function provided by the user for that production:

Foo f(string A, string B, int C)

如果该产量数据下调了比函数f应该被称为与生产车身参数。用f返回的值然后当存储时,符号被用在一个较高的向上减少对

If that production is reduced than the function f should be called with the production body parameters. The value returned by f is then stored for when that symbol is used in a higher up reduction.

因此​​,要指定语法解析器生成需提供本人是这样的:

So to specify the grammar to the parser generator I need to provide something like:

(我知道下面是无效的)

(I know the following is invalid)

struct Symbol
{
    string name;
    type T;
}

struct Production
{
    string head;
    vector<string> body;

    function<head.T(body[0].T, body[1].T, ..., body[n].T)> action;
}

struct Grammar
{
    vector<Symbol> symbols;

    vector<Production> productions;
}

和指定前面的例子是:

Grammar example =
{
    // symbols
    {
        { "X", Foo },
        { "A", string },
        { "B", string },
        { "C", int }
    },

    // productions
    {
        { 
            "X",
            { "A", "B", "C" },
            [](string A, string B, int C) { ... return Foo(...); }
        }
    }
}

当然,这是行不通的,你不能混用类型参数与运行参数之类的。

This won't work of course, you can't mix type parameters with runtime parameters like that.

一个解决办法是有一些通用的基础:

One solution would be to have some generic base:

struct SymbolBase
{
    ...
}

template<class SymbolType>
struct SymbolDerived<SymbolType> : SymbolBase
{
    SymbolType value;
}

再作类型的所有操作功能:

and then make all action functions of type:

typedef function<SymbolBase(vector<SymbolBase>)> ActionFunction;

和梳理出来在运行时。但是这使得使用更困难,而所有的铸造速度很慢。我宁愿函数签名在编译时检查,并保持对用户隐藏的机制。

and sort it out at runtime. But this makes usage more difficult, and all the casting is slow. I'd rather have the function signatures checked at compile-time and keep the mechanics hidden from the user.

我怎样才能调整符号,生产和语法类型来进行,我想在法律C ++ 11要传达的信息?

How can I restructure the Symbol, Production and Grammar types to carry the information I am trying to convey in legal C++11?

(是的,我已经看过升压精神和朋友,这是一个很好的框架,但它是递归下降这样的语言,它可以处理在一个单一的传球比一个LALR解析器较少而且由于它采用回溯的减排行动将得到多次调用,等等,等等)

(Yes I have looked at Boost Spirit and friends, it is a fine framework but it is recursive descent so the languages it can handle in a single pass are fewer than a LALR parser and because it uses backtracking the reduction actions will get called multiple times, etc, etc)

推荐答案

我一直在玩弄precisely这个问题。一旦可能性,我一直在看,它看起来像它应该工作,是使用堆栈变量的对象,也许提振::变种或升压::所有。由于每减少知道它是从堆栈期待,访问的是类型安全的;不幸的是,该类型检查将在运行时,但它应该是很便宜的。这有发现bug的优势:),它也将正确销毁对象,因为它们可以从堆栈中弹出。

I've been playing around with precisely this problem. Once possibility I've been looking at, which looks like it should work, is to use a stack of variant objects, perhaps boost::variant or boost::any. Since each reduction knows what it's expecting from the stack, the access will be type-safe; unfortunately, the type-check will be at run-time, but it should be very cheap. This has the advantage of catching bugs :) and it will also correctly destruct objects as they're popped from the stack.

我扔在一起的一些示例code作为的PoC,可根据要求。基本样式书写的减少规则是这样的:

I threw together some sample code as a PoC, available upon request. The basic style for writing a reduction rule is something like this:

parse.reduce<Expression(Expression, _, Expression)>
  ( [](Expression left, Expression right){
      return BinaryOperation(Operator::Times, left, right);
  });

这相当于规则:

which corresponds to the rule:

expression: expression TIMES expression

下面, BinaryOperation 是AST节点类型,并且必须转换为防爆pression ;模板参数防爆pression(出pression,_,防爆pression)正是左手侧,右手侧生产,EX pressed的类型。 (因为第二RHS类型_,模板也懒得价值喂到​​减速规则:用适当的解析器生成器,有实际上是没有理由连推标点符号记号到堆栈中的第一名。)我实现这两个标记工会防爆pression 和解析器堆栈使用的boost ::变种的标签类型。如果你试试这个,这是值得知道用变体的另一种变体的选项类型之一没有真正发挥作用。最后,这是最简单的包裹小工会作为结构。你也真的要了解递归类型的部分。

Here, BinaryOperation is the AST node-type, and must be convertible to Expression; the template argument Expression(Expression, _, Expression) is exactly the left-hand-side and right-hand-side of the production, expressed as types. (Because the second RHS type is _, the templates don't bother feeding the value to the reduction rule: with a proper parser generator, there would actually be no reason to even push punctuation tokens onto the stack in the first place.) I implemented both the tagged union Expression and the tagged type of the parser stack using boost::variant. In case you try this, it's worth knowing that using a variant as one of the option types of another variant doesn't really work. In the end, it was easiest to wrap the smaller union as a struct. You also really have to read the section about recursive types.

这篇关于指定的移位C ++ 11语法动作功能/缩小分析器发电机?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆