Parsing heterogeneous data using Boost::Spirit

Views: 164 Published: 2016/8/12 18:50:21 Tags: c++ boost boost-spirit boost-spirit-qi

Problem description

I'm trying to figure out how to approach the following problem.

I have a structure of the following format:

struct Data
{
     time_t timestamp;
     string id;
     boost::optional<int> data1;
     boost::optional<string> data2;
     // etc...
};

This should be parsed out of a single line string in the following format:

human_readable_timestamp;id;key1=value1 key2=value2.....

Of course the ordering of the keys does not have to match the order of elements in the structure.

Is Boost::Spirit suitable for this type of data? How do I approach this? I have gone through the examples, but I can't manage to get from the examples to code that fits my requirements.

Update

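In response to the comments, I made the parser more resilient. I ended up moving away from the permutation parser and going with the first numbered approach (the Kleene star with semantic actions approach).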

    id     = lexeme [ *~char_(';') ];

    auto data1 = bind(&Data::Fields::data1, _val);
    auto data2 = bind(&Data::Fields::data2, _val);

    other  = lexeme [ +(graph-'=') ] >> '=' >> (real_|int_|text);

    fields = *(
                ("key1" >> lit('=') >> int_) [ data1 = _1 ]
              | ("key2" >> lit('=') >> text) [ data2 = _1 ]
              | other
              );

    start  = timestamp >> ';' >> id >> -(';' >> fields);

This changes the following aspects:

In order to be able to skip "other" fields, I needed to come up with a reasonable grammar for "other" fields:

other  = lexeme [ +(graph-'=') ] >> '=' >> (real_|int_|text);

(allows a key consisting of anything non-whitespace except =, followed by the =, followed by either something numeric (eager), or text).

I've extended the notion of text to support popular quoting/escaping schemes:

text   = lexeme [ 
            '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
          | "'" >> *('\\' >> char_ | ~char_("'")) >> "'"
          | *graph 
       ];

It allows repeating the same key (in which case it retains the last valid value seen).

If you wanted to disallow invalid values, replace >> int_ or >> text by > int_ or > text (the expectation parser, see http://www.boost.org/doc/libs/1_57_0/libs/spirit/doc/html/spirit/qi/reference/operator/expect.html).

I've extended the test cases with some challenging cases:

    2015-Jan-26 00:00:00;id
    2015-Jan-26 14:59:24;id;key2="value"
    2015-Jan-26 14:59:24;id;key2="value" key1=42
    2015-Jan-26 14:59:24;id;key2="value" key1=42 something=awful __=4.74e-10 blarg;{blo;bloop='whatever \'ignor\'ed' key2="new} \"value\""
    2015-Jan-26 14:59:24.123;id;key1=42 key2="value"

And it now prints:

----------------------------------------
Parsing '2015-Jan-26 00:00:00;id'
Parsing success
2015-Jan-26 00:00:00    id
data1: --
data2: --
----------------------------------------
Parsing '2015-Jan-26 14:59:24;id;key2="value"'
Parsing success
2015-Jan-26 14:59:24    id
data1: --
data2:  value
----------------------------------------
Parsing '2015-Jan-26 14:59:24;id;key2="value" key1=42'
Parsing success
2015-Jan-26 14:59:24    id
data1:  42
data2:  value
----------------------------------------
Parsing '2015-Jan-26 14:59:24;id;key2="value" key1=42 something=awful __=4.74e-10 blarg;{blo;bloop='whatever \'ignor\'ed' key2="new} \"value\""'
Parsing success
2015-Jan-26 14:59:24    id
data1:  42
data2:  new} "value"
----------------------------------------
Parsing '2015-Jan-26 14:59:24.123;id;key1=42 key2="value" '
Parsing success
2015-Jan-26 14:59:24.123000 id
data1:  42
data2:  value

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/optional/optional_io.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/date_time/posix_time/posix_time_io.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;

struct Data
{
    boost::posix_time::ptime timestamp;
    std::string id;
    struct Fields {
        boost::optional<int> data1;
        boost::optional<std::string> data2;
    } fields;
};

BOOST_FUSION_ADAPT_STRUCT(Data::Fields,
        (boost::optional<int>, data1)
        (boost::optional<std::string>, data2)
    )

BOOST_FUSION_ADAPT_STRUCT(Data,
        (boost::posix_time::ptime, timestamp)
        (std::string, id)
        (Data::Fields, fields)
    )

template <typename It, typename Skipper = qi::space_type>
struct grammar : qi::grammar<It, Data(), Skipper> {
    grammar() : grammar::base_type(start) {
        using namespace qi;
        timestamp = stream;

        real_parser<double, strict_real_policies<double> > real_;

        text   = lexeme [ 
                    '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
                  | "'" >> *('\\' >> char_ | ~char_("'")) >> "'"
                  | *graph 
               ];

        id     = lexeme [ *~char_(';') ];

        auto data1 = bind(&Data::Fields::data1, _val);
        auto data2 = bind(&Data::Fields::data2, _val);

        other  = lexeme [ +(graph-'=') ] >> '=' >> (real_|int_|text);

        fields = *(
                    ("key1" >> lit('=') >> int_) [ data1 = _1 ]
                  | ("key2" >> lit('=') >> text) [ data2 = _1 ]
                  | other
                  );

        start  = timestamp >> ';' >> id >> -(';' >> fields);

        BOOST_SPIRIT_DEBUG_NODES((timestamp)(id)(start)(text)(other)(fields))
    }
  private:
    qi::rule<It,                                 Skipper> other;
    qi::rule<It, std::string(),                  Skipper> text, id;
    qi::rule<It, boost::posix_time::ptime(),     Skipper> timestamp;
    qi::rule<It, Data::Fields(),                 Skipper> fields;
    qi::rule<It, Data(),                         Skipper> start;
};

int main() {
    using It = std::string::const_iterator;
    for (std::string const input : {
            "2015-Jan-26 00:00:00;id",
            "2015-Jan-26 14:59:24;id;key2=\"value\"",
            "2015-Jan-26 14:59:24;id;key2=\"value\" key1=42",
            "2015-Jan-26 14:59:24;id;key2=\"value\" key1=42 something=awful __=4.74e-10 blarg;{blo;bloop='whatever \\'ignor\\'ed' key2=\"new} \\\"value\\\"\"",
            "2015-Jan-26 14:59:24.123;id;key1=42 key2=\"value\" ",
            })
    {
        std::cout << "----------------------------------------\nParsing '" << input << "'\n";
        It f(input.begin()), l(input.end());
        Data parsed;
        bool ok = qi::phrase_parse(f,l,grammar<It>(),qi::space,parsed);

        if (ok) {
            std::cout << "Parsing success\n";
            std::cout << parsed.timestamp << "\t" << parsed.id << "\n";
            std::cout << "data1: " << parsed.fields.data1 << "\n";
            std::cout << "data2: " << parsed.fields.data2 << "\n";
        } else {
            std::cout << "Parsing failed\n";
        }

        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}
