Boost Spirit parsing string with leading and trailing whitespace


Problem description

    I am still new to Boost Spirit.

    I am trying to parse a string that may contain leading, trailing, and intermediate whitespace. I want to do the following with the string:

    1. Remove any trailing and leading whitespace
    2. Collapse the whitespace between words to a single space

    For example

    "(  my   test1  ) (my  test2)"
    

    gets parsed as two terms -

    "my test1" 
    "my test2"
    

    I have used the following logic

    using namespace boost::spirit::qi;
    struct Parser : grammar<Iterator, attribType(), space_type>
    {
       public:
         Parser() : Parser::base_type(term)
         {
             group  %= '(' >> (group | names) >> ')';
             names %= no_skip[alnum][_val=_1];
         }
    
      private:
        typedef boost::spirit::qi::rule<Iterator, attribType(), space_type> Rule;
        Rule group;
        Rule names;
    };
    

    This preserves the spaces in between. Unfortunately, it also keeps leading and trailing whitespace and multiple intermediate spaces. I would like to find a better approach.

    I did see references to using a custom skipper with boost::spirit::qi::skip online, but I haven't come across a useful example for spaces. Does anyone else have experience with it?

    Solution

    I'd suggest doing the trimming/normalization after (not during) parsing.
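As a rough illustration of that post-parse step, here is a minimal sketch using only the standard library. It assumes the grammar hands back the raw text between the parentheses; the helper name normalize_ws is hypothetical and not part of Boost or of the answer's code.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the original answer): trims both ends
// and collapses every run of whitespace into a single space.
std::string normalize_ws(std::string const& in)
{
    std::string out;
    bool pending_space = false; // whitespace seen since the last word character

    for (unsigned char ch : in) {
        if (std::isspace(ch)) {
            pending_space = true;
        } else {
            // Emit one separator only between words, never at the front.
            if (pending_space && !out.empty())
                out += ' ';
            out += static_cast<char>(ch);
            pending_space = false;
        }
    }
    // Trailing whitespace only sets pending_space, so it is never emitted.
    return out;
}
```

Applied to each parsed term, a call such as normalize_ws("  my   test1  ") yields my test1.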

    That said, you could hack it like this:

    name   %= lexeme [ +alnum ];
    names  %= +(name >> (&lit(')') | attr(' ')));
    group  %= '(' >> (group | names) >> ')';
    

    See it Live On Coliru

    Output:

    Parse success
    Term: 'my test1'
    Term: 'my test2'
    

    I introduced the name rule only for readability. Note that (&lit(')') | attr(' ')) is a fancy way of saying:

    If the next character matches ')' do nothing, otherwise, append ' ' to the synthesized attribute
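To make the lookahead concrete, the same idea can be written by hand without Spirit. This is only a sketch under stated assumptions: parse_terms is a hypothetical name, it handles a flat sequence of groups (no nesting), and it simply stops at the first unexpected character instead of reporting an error.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical hand-written counterpart of
//   names = +(name >> (&lit(')') | attr(' ')))
// for inputs such as "(  my   test1  ) (my  test2)".
std::vector<std::string> parse_terms(std::string const& input)
{
    std::vector<std::string> terms;
    std::size_t i = 0;

    // Plays the role of the qi::space skipper.
    auto skip_ws = [&] {
        while (i < input.size() && std::isspace(static_cast<unsigned char>(input[i])))
            ++i;
    };

    for (skip_ws(); i < input.size() && input[i] == '('; skip_ws()) {
        ++i; // consume '('
        std::string term;
        for (skip_ws(); i < input.size() && input[i] != ')'; ) {
            // name: a run of alphanumerics, no skipping inside (like lexeme)
            std::size_t start = i;
            while (i < input.size() && std::isalnum(static_cast<unsigned char>(input[i])))
                ++i;
            if (i == start)
                return terms; // unexpected character: stop
            term.append(input, start, i - start);

            skip_ws();
            // Lookahead: before ')' do nothing, otherwise append one space.
            if (i < input.size() && input[i] != ')')
                term += ' ';
        }
        if (i == input.size())
            return terms; // missing ')'
        ++i; // consume ')'
        terms.push_back(term);
    }
    return terms;
}
```

Because the space is appended only when another name follows, no trailing space ends up in a term, and the skipping before each name removes the leading and repeated spaces.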

    Full code:

    #define BOOST_SPIRIT_DEBUG
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <iostream>
    #include <string>
    #include <vector>
    
    namespace qi = boost::spirit::qi;
    namespace phx = boost::phoenix;
    
    using Iterator = std::string::const_iterator;
    
    using attribType = std::string;
    
    struct Parser : qi::grammar<Iterator, attribType(), qi::space_type>
    {
       public:
         Parser() : Parser::base_type(group)
         {
             using namespace qi;
    
             name   %= lexeme [ +alnum ];
             names  %= +(name >> (&lit(')') | eps [ phx::push_back(_val, ' ') ]));
             group  %= '(' >> (group | names) >> ')';
    
             BOOST_SPIRIT_DEBUG_NODES((name)(names)(group))
         }
    
      private:
        typedef boost::spirit::qi::rule<Iterator, attribType(), qi::space_type> Rule;
        Rule group, names, name;
    };
    
    
    int main()
    {
        std::string const input = "(  my   test1  ) (my  test2)";
    
        auto f(input.begin()), l(input.end());
    
        Parser p;
    
        std::vector<attribType> data;
        bool ok = qi::phrase_parse(f, l, *p, qi::space, data);
    
        if (ok)
        {
            std::cout << "Parse success\n";
            for(auto const& term : data)
                std::cout << "Term: '" << term << "'\n";
        }
        else
        {
            std::cout << "Parse failed\n";
        }
    
        if (f!=l)
            std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
    }
    
