Read empty values with boost::spirit

Views: 173 Published: 2016/8/12 17:47:17 Tags: c++ boost boost-spirit

Problem description

I want to read a CSV into a struct:
struct data 
{
   std::string a;
   std::string b;
   std::string c;
};
However, I want to read even empty strings, to ensure all values end up in their proper place. I adapted the struct as a Boost.Fusion sequence, so the following works:
// Our parser (using a custom skipper to skip comments and empty lines )
template <typename Iterator, typename skipper = comment_skipper<Iterator> >
  struct google_parser : qi::grammar<Iterator, addressbook(), skipper>
{
  google_parser() : google_parser::base_type(contacts, "contacts")
  {
    using qi::eol;
    using qi::eps;
    using qi::_1;
    using qi::_val;
    using qi::repeat;
    using standard_wide::char_;
    using phoenix::at_c;
    using phoenix::val;

    value = *(char_ - ',' - eol) [_val += _1];

    // This works but only for small structs
    entry %= value >> ',' >> value >> ',' >> value >> eol;
  }

  qi::rule<Iterator, std::string()> value;
  qi::rule<Iterator, data()> entry;
};
Unfortunately, repeat stores all non-empty values in a vector, so the values of the attributes may get mixed together (i.e. if the field for b is empty, it may contain the content from c):
    entry %= repeat(2)[ value >> ','] >> value >> eol;
I would like to use a short rule similar to repeat, as my struct has 60 attributes in practice! Not only is writing 60 rules tedious, but it seems Boost does not like long rules...
Solution
You just want to make sure you parse a value for "empty" strings too.
value = +(char_ - ',' - eol) | attr("(unspecified)");
entry = value >> ',' >> value >> ',' >> value >> eol;
See the demo:

Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

struct data {
    std::string a;
    std::string b;
    std::string c;
};

BOOST_FUSION_ADAPT_STRUCT(data, (std::string, a)(std::string, b)(std::string, c))

template <typename Iterator, typename skipper = qi::blank_type>
struct google_parser : qi::grammar<Iterator, data(), skipper> {
    google_parser() : google_parser::base_type(entry, "contacts") {
        using namespace qi;

        value = +(char_ - ',' - eol) | attr("(unspecified)");
        entry = value >> ',' >> value >> ',' >> value >> eol;

        BOOST_SPIRIT_DEBUG_NODES((value)(entry))
    }
  private:
    qi::rule<Iterator, std::string()> value;
    qi::rule<Iterator, data(), skipper> entry;
};

int main() {
    using It = std::string::const_iterator;
    google_parser<It> p;

    for (std::string input : { 
            "something, awful, is\n",
            "fine,,just\n",
            "like something missing: ,,\n",
        })
    {
        It f = input.begin(), l = input.end();

        data parsed;
        bool ok = qi::phrase_parse(f,l,p,qi::blank,parsed);

        if (ok)
            std::cout << "Parsed: '" << parsed.a << "', '" << parsed.b << "', '" << parsed.c << "'\n";
        else
            std::cout << "Parse failed\n";

        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}
Prints:
Parsed: 'something', 'awful', 'is'
Parsed: 'fine', '(unspecified)', 'just'
Parsed: 'like something missing: ', '(unspecified)', '(unspecified)'
However, you have a bigger problem. The assumption that qi::repeat(2) [ value ] will parse into 2 strings doesn't work.

repeat, like operator*, operator+ and operator%, parses into a container attribute. In this case the container attribute (string) will receive the input from the second value as well:

Live On Coliru
Parsed: 'somethingawful', 'is', ''
Parsed: 'fine(unspecified)', 'just', ''
Parsed: 'like something missing: (unspecified)', '(unspecified)', ''
Since this is not what you want, reconsider your data types:

either don't adapt the struct but instead write a customization trait to assign the fields (see http://www.boost.org/doc/libs/1_57_0/libs/spirit/doc/html/spirit/advanced/customize.html)

change the struct to contain a vector of std::string to match the exposed attributes

or create an auto-parser generator:

The auto_ approach:

If you teach Qi how to extract a single value, you can use a simple rule like
entry = skip(skipper() | ',') [auto_] >> eol;
This way, Spirit itself will generate the correct number of value extractions for the given Fusion sequence!

Here's a quick and dirty approach:

CAVEAT Specializing for std::string directly like this might not be the best idea (it might not always be appropriate and might interact badly with other parsers). However, by default create_parser<std::string> is not defined (because, what would it do?) so I seized the opportunity for the purpose of this demonstration:
namespace boost { namespace spirit { namespace traits {
    template <> struct create_parser<std::string> {
        typedef proto::result_of::deep_copy<
            BOOST_TYPEOF(
                qi::lexeme [+(qi::char_ - ',' - qi::eol)] | qi::attr("(unspecified)")
            )
        >::type type;

        static type call() {
            return proto::deep_copy(
                qi::lexeme [+(qi::char_ - ',' - qi::eol)] | qi::attr("(unspecified)")
            );
        }
    };
}}}
Again, see the demo output:

Live On Coliru
Parsed: 'something', 'awful', 'is'
Parsed: 'fine', 'just', '(unspecified)'
Parsed: 'like something missing: ', '(unspecified)', '(unspecified)'
NOTE There was some advanced sorcery to get the skipper to work "just right" (see skip()[] and lexeme[]). Some general explanations can be found here: Boost spirit skipper issues (http://stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965)

UPDATE

The Container Approach

There's a subtlety to that. Two actually. So here's a demo:

Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

struct data {
    std::vector<std::string> parts;
};

BOOST_FUSION_ADAPT_STRUCT(data, (std::vector<std::string>, parts))

template <typename Iterator, typename skipper = qi::blank_type>
struct google_parser : qi::grammar<Iterator, data(), skipper> {
    google_parser() : google_parser::base_type(entry, "contacts") {
        using namespace qi;
        qi::as<std::vector<std::string> > strings;

        value = +(char_ - ',' - eol) | attr("(unspecified)");
        entry = strings [ repeat(2) [ value >> ',' ] >> value ] >> eol;

        BOOST_SPIRIT_DEBUG_NODES((value)(entry))
    }
  private:
    qi::rule<Iterator, std::string()> value;
    qi::rule<Iterator, data(), skipper> entry;
};

int main() {
    using It = std::string::const_iterator;
    google_parser<It> p;

    for (std::string input : { 
            "something, awful, is\n",
            "fine,,just\n",
            "like something missing: ,,\n",
        })
    {
        It f = input.begin(), l = input.end();

        data parsed;
        bool ok = qi::phrase_parse(f,l,p,qi::blank,parsed);

        if (ok) {
            std::cout << "Parsed: ";
            for (auto& part : parsed.parts) 
                std::cout << "'" << part << "' ";
            std::cout << "\n";
        }
        else
            std::cout << "Parse failed\n";

        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}
The subtleties are:

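adapting a single-element sequence hits edge cases with automatic attribute handling: Spirit Qi attribute propagation issue with single-member struct (http://stackoverflow.com/questions/19823413/spirit-qi-attribute-propagation-issue-with-single-member-struct/19824426#19824426)

Spirit needs hand-holding in this particular case to treat the repeat[...]>>value as synthesizing a single container /atomically/. The as<T> directive (http://www.boost.org/doc/libs/1_57_0/libs/spirit/doc/html/spirit/qi/reference/directive/as.html) solves that here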
