Boost Spirit parse rule is not applied


Question

I can't see my error here. This rule parses some input fine, but not the last two samples. Could somebody please give me a hint?

The goal is a parser that can identify member property access and member function calls, which may also be chained:

 a()
 a(para)
 x.a()
 x.a(para)
 x.a(para).g(para).j()
 x.y
 x.y.z
 x.y.z()    <---fail
 y.z.z(para) <--- fail

  lvalue =
         iter_pos >> name[_val = _1]
          >> *(lit('(') > paralistopt  > lit(')') >> iter_pos)[_val = construct<common_node>(type_cmd_fnc_call, LOCATION_NODE_ITER(_val, _2), key_this, construct<common_node>(_val), key_parameter, construct<std::vector<common_node> >(_1))]        
       >> *(lit('.') >> name_pure >> lit('(') > paralistopt > lit(')') >> iter_pos)[_val = construct<common_node>(type_cmd_fnc_call, LOCATION_NODE_ITER(_val, _3), key_this, construct<common_node>(_val), key_callname, construct<std::wstring>(_1), key_parameter, construct<std::vector<common_node> >(_2))]
       >> *(lit('.') >> name_pure >> iter_pos)[_val = construct<common_node>(type_cmd_dot_call, LOCATION_NODE_ITER(_val, _2), key_this, construct<common_node>(_val), key_propname, construct<std::wstring>(_1))]
    ;

Thank you. Markus

Recommended answer

You provide very little information to go on. Let me humor you with my entry into this guessing game:

Let's assume you want to parse a simple "language" that merely allows member expressions and function invocations, but chained.

Now, your grammar says nothing about the parameters (though it's clear the param list can be empty), so let me go the next mile and assume that you want to accept the same kind of expressions there (so foo(a) is okay, but also bar(foo(a)) or bar(b.foo(a))).

Since you accept chaining of function calls, it appears that functions are first-class objects (and functions can return functions), so foo(a)(b, c, d) should be accepted as well.

You didn't mention it, but parameters often include literals (sqrt(9) comes to mind, or println("hello world")).

Other items:

  • You didn't say, but you likely want to ignore whitespace in certain spots.
  • From the iter_pos (ab)use, it seems you're interested in tracking the original source location inside the resulting AST.

We should keep it as simple as possible:

namespace Ast {
    using Identifier = boost::iterator_range<It>;

    struct MemberExpression;
    struct FunctionCall;

    using Expression = boost::variant<
                double,       // some literal types
                std::string,
                // non-literals
                Identifier,
                boost::recursive_wrapper<MemberExpression>,
                boost::recursive_wrapper<FunctionCall>
            >;

    struct MemberExpression {
        Expression object; // antecedent
        Identifier member; // function or field
    };

    using Parameter  = Expression;
    using Parameters = std::vector<Parameter>;

    struct FunctionCall {
        Expression function; // could be a member function
        Parameters parameters;
    };
}

NOTE We're not going to focus on showing source locations, but already made one provision, storing identifiers as an iterator-range.

NOTE Fusion-adapting the only types not directly supported by Spirit:

BOOST_FUSION_ADAPT_STRUCT(Ast::MemberExpression, object, member)
BOOST_FUSION_ADAPT_STRUCT(Ast::FunctionCall, function, parameters)

We will find that we don't use these, because Semantic Actions are more convenient here.

2. Matching Grammar

Grammar() : Grammar::base_type(start) {
    using namespace qi;
    start = skip(space) [expression];

    identifier = raw [ (alpha|'_') >> *(alnum|'_') ];
    parameters = -(expression % ',');

    expression 
        = literal 
        | identifier  >> *(
                    ('.' >> identifier)        
                  | ('(' >> parameters >> ')') 
                );

    literal = double_ | string_;
    string_ = '"' >> *('\\' >> char_ | ~char_('"')) >> '"';

    BOOST_SPIRIT_DEBUG_NODES(
            (identifier)(start)(parameters)(expression)(literal)(string_)
        );
}
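
For comparison, a likely reason the rule in the question stops short on x.y.z() is its shape: it sequences three separate repetitions (plain call suffixes, then member calls, then property accesses). Once the last repetition has consumed .y.z, nothing can go back and match the trailing (). The sketch below imitates both shapes with plain standard-library C++ rather than Spirit. The helper names (ident, call_suffix, member_suffix, consumed_sequential, consumed_interleaved) are made up for illustration, the parameter list is reduced to an optional single identifier, and Spirit's expectation points (>) are modeled as ordinary failure.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Matches an identifier at pos; advances pos only on success.
bool ident(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    auto is = [&](int (*pred)(int)) {
        return p < s.size() && (pred(static_cast<unsigned char>(s[p])) || s[p] == '_');
    };
    if (!is(std::isalpha)) return false;
    ++p;
    while (is(std::isalnum)) ++p;
    pos = p;
    return true;
}

// Matches '(' [identifier] ')'; a simplified stand-in for the parameter list.
bool call_suffix(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    if (p >= s.size() || s[p] != '(') return false;
    ++p;
    ident(s, p);  // the single parameter is optional
    if (p >= s.size() || s[p] != ')') return false;
    pos = p + 1;
    return true;
}

// Matches '.' identifier.
bool member_suffix(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    if (p >= s.size() || s[p] != '.') return false;
    ++p;
    if (!ident(s, p)) return false;
    pos = p;
    return true;
}

// Shape of the rule in the question: three repetitions in a fixed order.
// Returns the number of characters consumed.
std::size_t consumed_sequential(const std::string& s) {
    std::size_t pos = 0;
    if (!ident(s, pos)) return 0;
    while (call_suffix(s, pos)) {}                  // *( '(' ... ')' )
    for (;;) {                                      // *( '.' name '(' ... ')' )
        std::size_t p = pos;
        if (member_suffix(s, p) && call_suffix(s, p)) pos = p; else break;
    }
    while (member_suffix(s, pos)) {}                // *( '.' name )
    return pos;
}

// Shape of the grammar above: one repetition over both alternatives.
std::size_t consumed_interleaved(const std::string& s) {
    std::size_t pos = 0;
    if (!ident(s, pos)) return 0;
    while (member_suffix(s, pos) || call_suffix(s, pos)) {}
    return pos;
}
```

With these helpers, consumed_sequential("x.y.z()") stops after x.y.z and leaves () unparsed, while consumed_interleaved consumes the whole input, because each suffix may be followed by either kind of suffix.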

In this skeleton most rules benefit from automatic attribute propagation. The one that doesn't is expression:

qi::rule<It, Expression()> start;

using Skipper = qi::space_type;
qi::rule<It, Expression(), Skipper> expression, literal;
qi::rule<It, Parameters(), Skipper> parameters;
// lexemes
qi::rule<It, Identifier()> identifier;
qi::rule<It, std::string()> string_;

So, let's create some helpers for the semantic actions.

NOTE An important take-away here is to create your own higher-level building blocks instead of toiling away with boost::phoenix::construct<> etc.

Define two simple construction functions:

struct mme_f { MemberExpression operator()(Expression lhs, Identifier rhs) const { return { lhs, rhs }; } };
struct mfc_f { FunctionCall operator()(Expression f, Parameters params) const { return { f, params }; } };
phx::function<mme_f> make_member_expression;
phx::function<mfc_f> make_function_call;

Then use them:

expression 
    = literal [_val=_1]
    | identifier [_val=_1] >> *(
                ('.' >> identifier)        [ _val = make_member_expression(_val, _1)]
              | ('(' >> parameters >> ')') [ _val = make_function_call(_val, _1) ]
            );
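
The semantic actions above fold the chain to the left: _val always holds the expression built so far, and each suffix wraps it in a new node. To make that nesting visible without Spirit, here is a rough hand-written equivalent using only the standard library. Everything in it (Node, parse_expression, describe, parse_and_describe) is a hypothetical stand-in for the Ast types and the rule, it skips no whitespace, and it handles identifiers only, not literals.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
    enum Kind { Name, Member, Call } kind;
    std::string name;           // Name: the identifier; Member: the member name
    NodePtr target;             // Member: the object; Call: the callee
    std::vector<NodePtr> args;  // Call: the arguments
};

// Reads [A-Za-z0-9_]* at pos (a simplification: a leading digit is not rejected).
std::string read_identifier(const std::string& s, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_'))
        ++pos;
    return s.substr(start, pos - start);
}

// identifier >> *( '.' identifier | '(' [expr % ','] ')' ), folding to the left.
NodePtr parse_expression(const std::string& s, std::size_t& pos) {
    std::string id = read_identifier(s, pos);
    if (id.empty()) return nullptr;
    NodePtr value = std::make_shared<Node>(Node{Node::Name, id, nullptr, {}});
    for (;;) {
        if (pos < s.size() && s[pos] == '.') {
            ++pos;
            std::string member = read_identifier(s, pos);
            if (member.empty()) return nullptr;
            // Like _val = make_member_expression(_val, _1)
            value = std::make_shared<Node>(Node{Node::Member, member, value, {}});
        } else if (pos < s.size() && s[pos] == '(') {
            ++pos;
            std::vector<NodePtr> args;
            if (pos < s.size() && s[pos] != ')') {
                for (;;) {
                    NodePtr arg = parse_expression(s, pos);
                    if (!arg) return nullptr;
                    args.push_back(arg);
                    if (pos < s.size() && s[pos] == ',') { ++pos; continue; }
                    break;
                }
            }
            if (pos >= s.size() || s[pos] != ')') return nullptr;
            ++pos;
            // Like _val = make_function_call(_val, _1)
            value = std::make_shared<Node>(Node{Node::Call, "", value, args});
        } else {
            break;
        }
    }
    return value;
}

// Renders the nesting explicitly, e.g. member(x, y) or call(f, a).
std::string describe(const NodePtr& n) {
    assert(n != nullptr);
    switch (n->kind) {
    case Node::Name:   return n->name;
    case Node::Member: return "member(" + describe(n->target) + ", " + n->name + ")";
    case Node::Call: {
        std::string out = "call(" + describe(n->target);
        for (const NodePtr& a : n->args) out += ", " + describe(a);
        return out + ")";
    }
    }
    return "";
}

// Parses the whole input; returns "<error>" on failure or leftover input.
std::string parse_and_describe(const std::string& s) {
    std::size_t pos = 0;
    NodePtr n = parse_expression(s, pos);
    return (n && pos == s.size()) ? describe(n) : "<error>";
}
```

For the input x.y.z() this yields call(member(member(x, y), z)): the innermost node is the first identifier, and each later suffix becomes the new root.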

That's all. We're ready to roll!

Live On Coliru

I created a test bed looking like this:

int main() {
    using It = std::string::const_iterator;
    Parser::Grammar<It> const g;

    for (std::string const input : {
             "a()", "a(para)", "x.a()", "x.a(para)", "x.a(para).g(para).j()", "x.y", "x.y.z",
             "x.y.z()",
             "y.z.z(para)",
             // now let's add some funkyness that you didn't mention
             "bar(foo(a))",
             "bar(b.foo(a))",
             "foo(a)(b, c, d)", // first class functions
             "sqrt(9)",
             "println(\"hello world\")",
             "allocate(strlen(\"aaaaa\"))",
             "3.14",
             "object.rotate(180)",
             "object.rotate(event.getAngle(), \"torque\")",
             "app.mainwindow().find_child(\"InputBox\").font().size(12)",
             "app.mainwindow().find_child(\"InputBox\").font(config().preferences.baseFont(style.PROPORTIONAL))"
         }) {
        std::cout << " =========== '" << input << "' ========================\n";
        It f(input.begin()), l(input.end());

        Ast::Expression parsed;
        bool ok = parse(f, l, g, parsed);
        if (ok) {
            std::cout << "Parsed: " << parsed << "\n";
        }
        else
            std::cout << "Parse failed\n";

        if (f != l)
            std::cout << "Remaining unparsed input: '" << std::string(f, l) << "'\n";
    }
}

Incredible as it may appear, this already parses all the test cases and prints:

 =========== 'a()' ========================
Parsed: a()
 =========== 'a(para)' ========================
Parsed: a(para)
 =========== 'x.a()' ========================
Parsed: x.a()
 =========== 'x.a(para)' ========================
Parsed: x.a(para)
 =========== 'x.a(para).g(para).j()' ========================
Parsed: x.a(para).g(para).j()
 =========== 'x.y' ========================
Parsed: x.y
 =========== 'x.y.z' ========================
Parsed: x.y.z
 =========== 'x.y.z()' ========================
Parsed: x.y.z()
 =========== 'y.z.z(para)' ========================
Parsed: y.z.z(para)
 =========== 'bar(foo(a))' ========================
Parsed: bar(foo(a))
 =========== 'bar(b.foo(a))' ========================
Parsed: bar(b.foo(a))
 =========== 'foo(a)(b, c, d)' ========================
Parsed: foo(a)(b, c, d)
 =========== 'sqrt(9)' ========================
Parsed: sqrt(9)
 =========== 'println("hello world")' ========================
Parsed: println(hello world)
 =========== 'allocate(strlen("aaaaa"))' ========================
Parsed: allocate(strlen(aaaaa))
 =========== '3.14' ========================
Parsed: 3.14
 =========== 'object.rotate(180)' ========================
Parsed: object.rotate(180)
 =========== 'object.rotate(event.getAngle(), "torque")' ========================
Parsed: object.rotate(event.getAngle(), torque)
 =========== 'app.mainwindow().find_child("InputBox").font().size(12)' ========================
Parsed: app.mainwindow().find_child(InputBox).font().size(12)
 =========== 'app.mainwindow().find_child("InputBox").font(config().preferences.baseFont(style.PROPORTIONAL))' ========================
Parsed: app.mainwindow().find_child(InputBox).font(config().preferences.baseFont(style.PROPORTIONAL))

4. Too Good To Be True?

You're right. I cheated. I didn't show you this code required to debug print the parsed AST:

namespace Ast {
    static inline std::ostream& operator<<(std::ostream& os, MemberExpression const& me) {
        return os << me.object << "." << me.member;
    }

    static inline std::ostream& operator<<(std::ostream& os, FunctionCall const& fc) {
        os << fc.function << "(";
        bool first = true;
        for (auto& p : fc.parameters) { if (!first) os << ", "; first = false; os << p; }
        return os << ")";
    }
}

It's only debug printing, as string literals aren't correctly roundtripped. But it's only 10 lines of code, that's a bonus.
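
If round-tripping string literals matters, one option (not part of the answer's code) is to re-quote strings when printing. The string_ rule accepts backslash escapes, which matches the default escape character of std::quoted from <iomanip> (C++14). The helper names quote_literal and unquote_literal below are hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Wraps raw in double quotes, escaping embedded '"' and '\' with a backslash.
std::string quote_literal(const std::string& raw) {
    std::ostringstream os;
    os << std::quoted(raw);
    return os.str();
}

// Reverses quote_literal: strips the quotes and resolves the escapes.
std::string unquote_literal(const std::string& quoted) {
    std::istringstream is(quoted);
    std::string raw;
    is >> std::quoted(raw);
    return raw;
}
```

An operator<< overload for the std::string alternative cannot be added cleanly, so in practice this would probably live in a dedicated printing visitor rather than in the stream operators shown above.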

Source location tracking had your interest, so let's show it working. Let's add a simple loop to print the locations of all identifiers:

using IOManip::showpos;

for (auto& id : all_identifiers(parsed)) {
    std::cout << " - " << id << " at " << showpos(id, input) << "\n";
}

Of course, this raises the question: what are showpos and all_identifiers?

namespace IOManip {
    struct showpos_t {
        boost::iterator_range<It> fragment;
        std::string const& source;

        friend std::ostream& operator<<(std::ostream& os, showpos_t const& manip) {
            auto ofs = [&](It it) { return it - manip.source.begin(); };
            return os << "[" << ofs(manip.fragment.begin()) << ".." << ofs(manip.fragment.end()) << ")";
        }
    };

    showpos_t showpos(boost::iterator_range<It> fragment, std::string const& source) {
        return {fragment, source};
    }
}

As for the identifier extraction:

std::vector<Identifier> all_identifiers(Expression const& expr) {
    std::vector<Identifier> result;
    struct Harvest {
        using result_type = void;
        std::back_insert_iterator<std::vector<Identifier> > out;
        void operator()(Identifier const& id)       { *out++ = id; }
        void operator()(MemberExpression const& me) { apply_visitor(*this, me.object); *out++ = me.member; }
        void operator()(FunctionCall const& fc)     {
            apply_visitor(*this, fc.function); 
            for (auto& p : fc.parameters) apply_visitor(*this, p);
        }
        // non-identifier expressions
        void operator()(std::string const&) { }
        void operator()(double) { }
    } harvest { back_inserter(result) };
    boost::apply_visitor(harvest, expr);

    return result;
}

That's a tree visitor that harvests all identifiers recursively, inserting them into the back of a container.
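
The same visitor pattern can be tried without Boost. The sketch below swaps in std::variant and std::visit (C++17) for boost::variant and boost::apply_visitor, uses std::shared_ptr in place of boost::recursive_wrapper, and stores identifier names as plain strings instead of iterator ranges. Those substitutions, and the helper sample_expression, are assumptions made so the example stands alone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <variant>
#include <vector>

struct Identifier { std::string name; };
struct MemberExpression;
struct FunctionCall;

// shared_ptr provides the indirection that recursive_wrapper gives in Boost.
using Expression = std::variant<double, std::string, Identifier,
                                std::shared_ptr<MemberExpression>,
                                std::shared_ptr<FunctionCall>>;

struct MemberExpression { Expression object; Identifier member; };
struct FunctionCall { Expression function; std::vector<Expression> parameters; };

// Visitor that appends every identifier name it encounters, depth first.
struct Harvest {
    std::vector<std::string>& out;

    void operator()(double) const {}              // literals carry no identifiers
    void operator()(const std::string&) const {}
    void operator()(const Identifier& id) const { out.push_back(id.name); }
    void operator()(const std::shared_ptr<MemberExpression>& me) const {
        assert(me);
        std::visit(*this, me->object);            // recurse into the object first
        out.push_back(me->member.name);
    }
    void operator()(const std::shared_ptr<FunctionCall>& fc) const {
        assert(fc);
        std::visit(*this, fc->function);
        for (const Expression& p : fc->parameters) std::visit(*this, p);
    }
};

std::vector<std::string> all_identifier_names(const Expression& expr) {
    std::vector<std::string> result;
    std::visit(Harvest{result}, expr);
    return result;
}

// Hand-built AST for the expression x.f(3.0, y).
Expression sample_expression() {
    auto callee = std::make_shared<MemberExpression>(
        MemberExpression{Identifier{"x"}, Identifier{"f"}});
    std::vector<Expression> args{Expression{3.0}, Expression{Identifier{"y"}}};
    return std::make_shared<FunctionCall>(FunctionCall{callee, args});
}
```

Calling all_identifier_names(sample_expression()) collects x, f and y in source order, because the visitor descends into the object or callee before recording the member name or the arguments.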

Live On Coliru

The output looks like this (excerpt):

 =========== 'app.mainwindow().find_child("InputBox").font(config().preferences.baseFont(style.PROPORTIONAL))' ========================
Parsed: app.mainwindow().find_child(InputBox).font(config().preferences.baseFont(style.PROPORTIONAL))
 - app at [0..3)
 - mainwindow at [4..14)
 - find_child at [17..27)
 - font at [40..44)
 - config at [45..51)
 - preferences at [54..65)
 - baseFont at [66..74)
 - style at [75..80)
 - PROPORTIONAL at [81..93)
