Boost spirit parsing string with leading and trailing whitespace
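Problem description
I am still new to Boost Spirit.
I am trying to parse a string that may contain leading and trailing whitespace as well as whitespace between words. I want to do the following with the string:
- Remove any trailing and leading whitespace
- Limit the in-between word spaces to one whitespace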
For example,
"( my test1 ) (my test2)"
gets parsed as two terms:
"my test1"
"my test2"
I have used the following logic:

    using namespace boost::spirit::qi;

    struct Parser : grammar<Iterator, attribType(), space_type>
    {
    public:
        Parser() : Parser::base_type(group)
        {
            group %= '(' >> (group | names) >> ')';
            names %= no_skip[alnum][_val = _1];
        }
    private:
        typedef boost::spirit::qi::rule<Iterator, attribType(), space_type> Rule;
        Rule group;
        Rule names;
    };

This preserves the spaces between words. Unfortunately, it also keeps leading and trailing whitespace and runs of multiple spaces between words. I want to find better logic for that.
I did see references online to using a custom skipper with boost::spirit::qi::skip, but I haven't come across a useful example for spaces. Does anyone have experience with it?
I'd suggest doing the trimming/normalization after (not during) parsing.
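As a rough illustration of what normalizing after parsing could look like, here is a minimal sketch using only the standard library. The helper name `normalize_spaces` is hypothetical and not part of the original answer; it assumes each term has already been captured as a raw `std::string` with its whitespace intact.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (an assumption, not from the original answer):
// trims leading and trailing whitespace and collapses every internal
// run of whitespace into a single space.
std::string normalize_spaces(std::string const& raw)
{
    std::string out;
    bool pending_space = false; // whitespace seen after at least one word

    for (unsigned char ch : raw) {
        if (std::isspace(ch)) {
            // Remember the gap, but only once some text has been emitted,
            // so leading whitespace is dropped.
            pending_space = !out.empty();
        } else {
            if (pending_space)
                out += ' ';     // emit exactly one space per gap
            pending_space = false;
            out += static_cast<char>(ch);
        }
    }
    // A trailing gap is never flushed, so trailing whitespace is dropped too.
    return out;
}
```

Calling `normalize_spaces` on each parsed term (for instance inside the loop that prints the results) keeps the grammar simple and moves the whitespace policy into one small, separately testable function.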
That said, you could hack it like this:

    name  %= lexeme [ +alnum ];
    names %= +(name >> (&lit(')') | attr(' ')));
    group %= '(' >> (group | names) >> ')';

Output:
Parse success
Term: 'my test1'
Term: 'my test2'
I introduced the name rule only for readability. Note that (&lit(')') | attr(' ')) is a fancy way of saying:
If the next character matches ')', do nothing; otherwise, append ' ' to the synthesized attribute.
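To make the lookahead in (&lit(')') | attr(' ')) more concrete, the sketch below spells out the same logic as a hand-written loop with no Spirit involved. It is an illustration only: the function name `parse_groups` is hypothetical, it handles flat (non-nested) groups, and it is not how Spirit itself is implemented.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical hand-rolled analogue of the grammar above (flat groups only).
std::vector<std::string> parse_groups(std::string const& in)
{
    std::vector<std::string> terms;
    std::size_t i = 0;

    // Plays the role of the qi::space skipper.
    auto skip_ws = [&] {
        while (i < in.size() && std::isspace(static_cast<unsigned char>(in[i])))
            ++i;
    };

    for (;;) {
        skip_ws();
        if (i >= in.size() || in[i] != '(')
            break;                       // no further group
        ++i;                             // consume '('

        std::string term;
        for (;;) {
            skip_ws();
            std::size_t const start = i;
            // name: one or more alphanumerics, with no skipping inside the word
            while (i < in.size() && std::isalnum(static_cast<unsigned char>(in[i])))
                ++i;
            if (i == start)
                break;                   // no more names
            term.append(in, start, i - start);

            skip_ws();
            // Lookahead: peek at the next character without consuming it.
            if (i < in.size() && in[i] == ')')
                break;                   // ')' follows: append nothing
            term += ' ';                 // otherwise append a single space
        }

        if (term.empty() || i >= in.size() || in[i] != ')')
            break;                       // malformed group: stop parsing
        ++i;                             // consume ')'
        terms.push_back(term);
    }
    return terms;
}
```

For the input "( my test1 ) (my test2)" this yields the two terms "my test1" and "my test2". The important detail is that the check for ')' only peeks: the closing parenthesis is consumed later by the enclosing group, just as the and-predicate in the grammar matches without consuming input.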
Full code:

    #define BOOST_SPIRIT_DEBUG // enables rule tracing for the nodes registered below
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <iostream>
    #include <string>
    #include <vector>

    namespace qi  = boost::spirit::qi;
    namespace phx = boost::phoenix;

    using Iterator   = std::string::const_iterator;
    using attribType = std::string;

    struct Parser : qi::grammar<Iterator, attribType(), qi::space_type>
    {
    public:
        Parser() : Parser::base_type(group)
        {
            using namespace qi;

            // A single word; lexeme disables skipping inside it.
            name  %= lexeme [ +alnum ];
            // One or more words; a space is appended after each word
            // unless the next character is ')'.
            names %= +(name >> (&lit(')') | eps [ phx::push_back(_val, ' ') ]));
            // A parenthesized group, possibly nested.
            group %= '(' >> (group | names) >> ')';

            BOOST_SPIRIT_DEBUG_NODES((name)(names)(group))
        }
    private:
        typedef boost::spirit::qi::rule<Iterator, attribType(), qi::space_type> Rule;
        Rule group, names, name;
    };

    int main()
    {
        std::string const input = "( my test1 ) (my test2)";
        auto f(input.begin()), l(input.end());

        Parser p;
        std::vector<attribType> data;

        // *p parses zero or more groups; each group becomes one element of data.
        bool ok = qi::phrase_parse(f, l, *p, qi::space, data);

        if (ok)
        {
            std::cout << "Parse success\n";
            for (auto const& term : data)
                std::cout << "Term: '" << term << "'\n";
        }
        else
        {
            std::cout << "Parse failed\n";
        }

        if (f != l)
            std::cout << "Remaining unparsed input: '" << std::string(f, l) << "'\n";
    }