Why does parsing a blank line with Spirit produce an empty key value pair in map?


Problem description


I'm trying to use Spirit.Qi to parse a simple file format that has key value pairs separated with an equals sign. The file also supports comments and blank lines, as well as quoted values.

I can get nearly all of this to work as expected, however, any blank lines or comments cause an empty key value pair to be added to the map. When the map is traded for a vector, no blank entries are produced.

Example Program:

#include <fstream> 
#include <iostream> 
#include <string> 
#include <map> 

#include "boost/spirit/include/qi.hpp" 
#include "boost/spirit/include/karma.hpp" 
#include "boost/fusion/include/std_pair.hpp" 

using namespace boost::spirit; 
using namespace boost::spirit::qi; 

//////////////////////////////////////////////////////////////////////////////// 
int main(int argc, char** argv) 
{ 
   std::ifstream ifs("file"); 
   ifs >> std::noskipws; 

   std::map< std::string, std::string > vars; 

   auto value = as_string[*print]; 
   auto quoted_value = as_string[lexeme['"' >> *(print-'"') >> '"']]; 
   auto key = as_string[alpha >> *(alnum | char_('_'))]; 
   auto kvp = key >> '=' >> (quoted_value | value); 

   phrase_parse( 
      istream_iterator(ifs), 
      istream_iterator(), 
      -kvp % eol, 
      ('#' >> *(char_-eol)) | blank, 
      vars); 

   std::cout << "vars[" << vars.size() << "]:" << std::endl; 
   std::cout << karma::format(*(karma::string << " -> " << karma::string << karma::eol), vars); 

   return 0; 
}

Input File:

one=two
three=four

# Comment
five=six

Output:

vars[4]:
 ->
one -> two
three -> four
five -> six

Where is the empty key value pair coming from? And how can I prevent it from being generated?

Solution

Firstly, your program has undefined behaviour (and indeed it crashes on my system). The reason is you can't use auto expressions to store stateful parser expressions.

See "Assigning parsers to auto variables", "boost spirit V2 qi bug associated with optimization level", and others. See e.g. these answers for useful strategies to get around this limitation.

Secondly, the empty line is because of the grammar.

There's a difference between

  (-kvp) % qi::eol

and

  -(kvp % qi::eol)

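The first (which is how -kvp % eol parses, because unary minus binds tighter than % in C++) will result in "optionally parsing a kvp" followed by "push the result into the attribute container". The push happens even when no kvp was matched, so a blank or comment-only line contributes an empty pair.

The latter will optionally "parse 1 or more kvp into a container". Note that this won't push the empty value if it wasn't matched.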

Fixed/demo

I suggest

  • making key and value lexemes as well (just by dropping the Skipper on the rule declarations, really). You probably didn't want "key name 1=value 1" to parse as "keyname1" -> "value1". You probably didn't want to allow "key # no value\n" either.
  • using BOOST_SPIRIT_DEBUG to see what's going on
  • not blanket using namespace boost::spirit. It's a bad idea. Trust me :/
  • rule declarations may appear to be verbose, but they do reduce the cruft in the rule definitions
  • using +eol instead of eol allows for the empty lines, which appears to be what you want

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include "boost/spirit/include/qi.hpp" 
#include "boost/spirit/include/karma.hpp" 
#include "boost/fusion/include/std_pair.hpp" 
#include <fstream>
#include <iostream>
#include <map>
#include <string>

namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;

template <typename It, typename Skipper, typename Data>
struct kvp_grammar : qi::grammar<It, Data(), Skipper> {
    kvp_grammar() : kvp_grammar::base_type(start) {
        using namespace qi;

        value        = raw [*print];
        quoted_value = '"' >> *~char_('"') >> '"';
        key          = raw [ alpha >> *(alnum | '_') ];

        kvp          = key >> '=' >> (quoted_value | value);
        start        = -(kvp % +eol);

        BOOST_SPIRIT_DEBUG_NODES((value)(quoted_value)(key)(kvp))
    }
  private:
    using Pair = std::pair<std::string, std::string>;
    qi::rule<It, std::string(), Skipper> value;
    qi::rule<It, Pair(),        Skipper> kvp;
    qi::rule<It, Data(),        Skipper> start;
    // lexeme:
    qi::rule<It, std::string()> quoted_value, key;
};

template <typename Map>
bool parse_vars(std::istream& is, Map& data) {
    using It = boost::spirit::istream_iterator;
    using Skipper = qi::rule<It>;

    kvp_grammar<It, Skipper, Map> grammar;
    It f(is >> std::noskipws), l;

    Skipper skipper = ('#' >> *(qi::char_-qi::eol)) | qi::blank;
    return qi::phrase_parse(f, l, grammar, skipper, data); 
}

int main() { 
    std::ifstream ifs("input.txt"); 

    std::map<std::string, std::string> vars; 

    if (parse_vars(ifs, vars)) {
        std::cout << "vars[" << vars.size() << "]:" << std::endl; 
        std::cout << karma::format(*(karma::string << " -> " << karma::string << karma::eol), vars); 
    }
}

Output (currently broken on Coliru):

vars[3]:
five -> six
one -> two
three -> four

With debug info:

<kvp>
  <try>one=two\nthree=four\n\n</try>
  <key>
    <try>one=two\nthree=four\n\n</try>
    <success>=two\nthree=four\n\n# C</success>
    <attributes>[[o, n, e]]</attributes>
  </key>
  <quoted_value>
    <try>two\nthree=four\n\n# Co</try>
    <fail/>
  </quoted_value>
  <value>
    <try>two\nthree=four\n\n# Co</try>
    <success>\nthree=four\n\n# Comme</success>
    <attributes>[[t, w, o]]</attributes>
  </value>
  <success>\nthree=four\n\n# Comme</success>
  <attributes>[[[o, n, e], [t, w, o]]]</attributes>
</kvp>
<kvp>
  <try>three=four\n\n# Commen</try>
  <key>
    <try>three=four\n\n# Commen</try>
    <success>=four\n\n# Comment\nfiv</success>
    <attributes>[[t, h, r, e, e]]</attributes>
  </key>
  <quoted_value>
    <try>four\n\n# Comment\nfive</try>
    <fail/>
  </quoted_value>
  <value>
    <try>four\n\n# Comment\nfive</try>
    <success>\n\n# Comment\nfive=six</success>
    <attributes>[[f, o, u, r]]</attributes>
  </value>
  <success>\n\n# Comment\nfive=six</success>
  <attributes>[[[t, h, r, e, e], [f, o, u, r]]]</attributes>
</kvp>
<kvp>
  <try>five=six\n</try>
  <key>
    <try>five=six\n</try>
    <success>=six\n</success>
    <attributes>[[f, i, v, e]]</attributes>
  </key>
  <quoted_value>
    <try>six\n</try>
    <fail/>
  </quoted_value>
  <value>
    <try>six\n</try>
    <success>\n</success>
    <attributes>[[s, i, x]]</attributes>
  </value>
  <success>\n</success>
  <attributes>[[[f, i, v, e], [s, i, x]]]</attributes>
</kvp>
<kvp>
  <try></try>
  <key>
    <try></try>
    <fail/>
  </key>
  <fail/>
</kvp>
