Parsing nested key value pairs in Boost Spirit


Question

I am having trouble writing what I think should be a simple parser using Boost.Spirit. (I'm using Spirit rather than plain string functions because this is partly a learning exercise for me.)

The data to parse takes the form of key-value pairs, where a value can itself be a set of key-value pairs. Keys are alphanumeric (underscores allowed, no digit as the first character); values are alphanumeric plus .-_ so that, in addition to plain old alphanumeric strings, values can be dates in the format DD-MMM-YYYY (e.g. 01-Jan-2015) and floating point numbers like 3.1415. Keys and values are separated with =; pairs are separated with ;; structured values are delimited with {...}. At the moment I am erasing all spaces from the user input before passing it to Spirit.

Example input:

Key1 = Value; Key2 = {NestedKey1 = Alan; NestedKey2 = 43.1232; }; Key3 = 15-Jul-1974;

I would then strip all spaces to give

Key1=Value;Key2={NestedKey1=Alan;NestedKey2=43.1232;};Key3=15-Jul-1974;

and then I actually pass it to Spirit.

What I currently have works just dandy when values are simply values. When I start encoding structured values in the input, Spirit stops after the first structured value. If there is only one structured value, a workaround is to put it at the end of the input, but I will need two or more structured values on occasion.

The code below compiles in VS2013 and illustrates the errors:

#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/pair.hpp>
#include <boost/fusion/adapted.hpp>
#include <map>
#include <string>
#include <iostream>

typedef std::map<std::string, std::string> ARGTYPE;

#define BOOST_SPIRIT_DEBUG

namespace qi = boost::spirit::qi;
namespace fusion = boost::fusion;

template < typename It, typename Skipper>
struct NestedGrammar : qi::grammar < It, ARGTYPE(), Skipper >
{
    NestedGrammar() : NestedGrammar::base_type(Sequence)
    {
        using namespace qi;
        KeyName = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z0-9_");
        Value = +qi::char_("-.a-zA-Z_0-9");

        Pair = KeyName >> -(
            '=' >> ('{' >> raw[Sequence] >> '}' | Value)
            );

        Sequence = Pair >> *((qi::lit(';') | '&') >> Pair);

        BOOST_SPIRIT_DEBUG_NODE(KeyName);
        BOOST_SPIRIT_DEBUG_NODE(Value);
        BOOST_SPIRIT_DEBUG_NODE(Pair);
        BOOST_SPIRIT_DEBUG_NODE(Sequence);
    }
private:
    qi::rule<It, ARGTYPE(), Skipper> Sequence;
    qi::rule<It, std::string()> KeyName;
    qi::rule<It, std::string(), Skipper> Value;
    qi::rule<It, std::pair < std::string, std::string>(), Skipper> Pair;
};


template <typename Iterator>
ARGTYPE Parse2(Iterator begin, Iterator end)
{
    NestedGrammar<Iterator, qi::space_type> p;
    ARGTYPE data;
    qi::phrase_parse(begin, end, p, qi::space, data);
    return data;
}


// ARGTYPE is std::map<std::string,std::string>
void NestedParse(std::string Input, ARGTYPE& Output)
{
    Input.erase(std::remove_if(Input.begin(), Input.end(), isspace), Input.end());
    Output = Parse2(Input.begin(), Input.end());
}

int main(int argc, char** argv)
{
    std::string Example1, Example2, Example3;
    ARGTYPE Out;

    Example1 = "Key1=Value1 ; Key2 = 01-Jan-2015; Key3 = 2.7181; Key4 = Johnny";
    Example2 = "Key1 = Value1; Key2 = {InnerK1 = one; IK2 = 11-Nov-2011;};";
    Example3 = "K1 = V1; K2 = {IK1=IV1; IK2=IV2;}; K3=V3; K4 = {JK1=JV1; JK2=JV2;};";

    NestedParse(Example1, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;
    std::cout << "=====" << std::endl;

    /* get the following, as expected:
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
    */

    NestedParse(Example2, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;
    std::cout << "=====" << std::endl;

    /* get the following, as expected:
    Key1|Value1
    Key2|InnerK1=one;IK2=11-Nov-2011
    */

    NestedParse(Example3, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;

    /* Only get the first two lines of the expected output:
    K1|V1
    K2|IK1=IV1;IK2=IV2
    K3|V3
    K4|JK1=JV1;JK2=JV2
    */

    return 0;

}

I'm not sure if the problem is down to my ignorance of BNF, my ignorance of Spirit, or perhaps my ignorance of both at this point.

Any help appreciated. I've read e.g. "Spirit Qi sequence parsing issues" and the links therein, but I still can't figure out what I am doing wrong.

Recommended answer

Indeed, this is precisely the kind of simple grammar that Spirit excels at.

Moreover, there is absolutely no need to strip whitespace up front: Spirit has skippers built in for that purpose.

To your explicit question, though: the Sequence rule is overcomplicated. You could just use the list operator (%):

Sequence = Pair % char_(";&");

Now your problem is that you end the sequence with a trailing ; that the grammar doesn't expect, so both Sequence and Value eventually fail the parse. This isn't very clear unless you #define BOOST_SPIRIT_DEBUG¹ and inspect the debug output.

So, to fix it, use:

Sequence = Pair % char_(";&") >> -omit[char_(";&")];

The fixed version (originally demonstrated live on Coliru, optionally with debug info) prints:

Key1|Value1
Key2|01-Jan-2015
Key3|2.7181
Key4|Johnny
=====
Key1|Value1
Key2|InnerK1=one;IK2=11-Nov-2011;
=====
K1|V1
K2|IK1=IV1;IK2=IV2;
K3|V3
K4|JK1=JV1;JK2=JV2;


Bonus cleanup

Actually, that was simple. Just remove the redundant line that strips the whitespace: the skipper is already qi::space.

(Note, though, that the skipper doesn't apply to your Value rule, so values cannot contain whitespace, but the parser will not silently skip it either; I suppose this is likely what you want. Just be aware of it.)

You would actually want to have a recursive AST, instead of parsing into a flat map.

Boost recursive variants (see the Boost.Variant tutorial at http://www.boost.org/doc/libs/1_58_0/doc/html/variant/tutorial.html#variant.tutorial.recursive) make this a breeze:

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

To make the grammar produce this AST, you just change the declared attribute types of the rules:

qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
qi::rule<It, std::string(),                        Skipper> String;
qi::rule<It, std::string()>                                 KeyName;

The rules themselves don't have to change at all. You will need to write a little visitor to stream the AST:

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

    private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

Now it prints:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|InnerK1 = one; IK2 = 11-Nov-2011;
}

=====
map {
    K1|V1
    K2|IK1=IV1; IK2=IV2;
    K3|V3
    K4|JK1=JV1; JK2=JV2;
}

Of course, the clincher is when you change raw[Sequence] to just Sequence. Now it prints:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|map {
        IK2|11-Nov-2011
        InnerK1|one
    }

}

=====
map {
    K1|V1
    K2|map {
        IK1|IV1
        IK2|IV2
    }

    K3|V3
    K4|map {
        JK1|JV1
        JK2|JV2
    }

}

Full demo code (originally live on Coliru):

//#define BOOST_SPIRIT_DEBUG
#include <boost/variant.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iostream>
#include <string>
#include <map>

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

namespace qi = boost::spirit::qi;

template <typename It, typename Skipper>
struct NestedGrammar : qi::grammar <It, ast::Sequence(), Skipper>
{
    NestedGrammar() : NestedGrammar::base_type(Sequence)
    {
        using namespace qi;
        KeyName = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z0-9_");
        String = +qi::char_("-.a-zA-Z_0-9");

        Pair = KeyName >> -(
                '=' >> ('{' >> Sequence >> '}' | String)
            );

        Sequence = Pair % char_(";&") >> -omit[char_(";&")];

        BOOST_SPIRIT_DEBUG_NODES((KeyName) (String) (Pair) (Sequence))
    }
private:
    qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
    qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
    qi::rule<It, std::string(),                        Skipper> String;
    qi::rule<It, std::string()>                                 KeyName;
};


template <typename Iterator>
ast::Sequence DoParse(Iterator begin, Iterator end)
{
    NestedGrammar<Iterator, qi::space_type> p;
    ast::Sequence data;
    qi::phrase_parse(begin, end, p, qi::space, data);
    return data;
}

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

      private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

int main()
{
    std::string const Example1 = "Key1=Value1 ; Key2 = 01-Jan-2015; Key3 = 2.7181; Key4 = Johnny";
    std::string const Example2 = "Key1 = Value1; Key2 = {InnerK1 = one; IK2 = 11-Nov-2011;};";
    std::string const Example3 = "K1 = V1; K2 = {IK1=IV1; IK2=IV2;}; K3=V3; K4 = {JK1=JV1; JK2=JV2;};";

    std::cout << DoParse(Example1.begin(), Example1.end()) << "\n";
    std::cout << DoParse(Example2.begin(), Example2.end()) << "\n";
    std::cout << DoParse(Example3.begin(), Example3.end()) << "\n";
}


¹ You "had" it, but not in the right place! It should go before any Boost includes.
