boost::spirit::qi Expectation Parser and parser grouping unexpected behaviour
Problem description
I'm hoping someone can shine a light through my ignorance of using the > and >> operators in spirit parsing.
I have a working grammar, where the top-level rule looks like
test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string >> conditionRule;
It relies on attributes to automatically assign parsed values to a fusion-adapted struct (i.e. a boost tuple).
However, I know that once we match the operationRule, we must continue or fail (i.e. we don't want to allow backtracking to try other rules that begin with identifier).
test = identifier >>
operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule;
This causes a cryptic compiler error ('boost::Container' : use of class template requires template argument list). After futzing around a bit, the following compiles:
test = identifier >>
(operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule);
but the attribute setting no longer works - my data structure contains garbage after parsing. This can be fixed by adding actions like [at_c<0>(_val) = _1], but that seems a little clunky - as well as making things slower according to the boost docs.
So, my questions are:
- Is it worth preventing back-tracking?
- Why do I need the grouping operator ()?
- Does my last example above really stop back-tracking after operationRule is matched (I suspect not; it seems that if the entire parser inside the (...) fails, backtracking will be allowed)?
- If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operation is /not/ matched, but does not allow backtracking once operation /is/ matched?
- Why does the grouping operator destroy the attribute grammar - requiring actions?
I realise this is quite a broad question - any hints that point in the right direction will be greatly appreciated!
Is it worth preventing back-tracking?
Absolutely. Preventing back tracking in general is a proven way to improve parser performance.
- reduce the use of (negative) lookahead (operator !, operator - and some operator &)
- order branches (operator |, operator ||, operator^ and some operator */-/+) such that the most frequent/likely branch is ordered first, or that the most costly branch is tried last
Using expectation points (>) does not essentially reduce backtracking: it just disallows it. This enables targeted error messages and prevents useless 'parsing-into-the-unknown'.
Why do I need the grouping operator ()?
I'm not sure. I had a check using my what_is_the_attr helpers from here (http://stackoverflow.com/questions/9404189/detecting-the-parameter-types-in-a-spirit-semantic-action/9405265#9405265):
ident >> op >> repeat(1,3)[any] >> "->" >> any
synthesizes attribute: fusion::vector4<string, string, vector<string>, string>
ident >> op > repeat(1,3)[any] > "->" > any
synthesizes attribute: fusion::vector3<fusion::vector2<string, string>, vector<string>, string>
I haven't found the need to group subexpressions using parentheses (things compile), but obviously DataT needs to be modified to match the changed layout:
typedef boost::tuple<
    boost::tuple<std::string, std::string>,
    std::vector<std::string>,
    std::string
> DataT;
The full code below shows how I'd prefer to do that, using adapted structs.
Does my above example really stop back-tracking after operationRule is matched (I suspect not, it seems that if the entire parser inside the (...) fails backtracking will be allowed)?
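Absolutely. If the expectation(s) is not met, a qi::expectation_failure<> exception is thrown. This by default aborts the parse. You could use qi::on_error to retry, fail, accept or rethrow. The MiniXML example (http://www.boost.org/doc/libs/1_41_0/libs/spirit/doc/html/spirit/qi/tutorials/mini_xml___error_handling.html#spirit.qi.tutorials.mini_xml___error_handling.on_error) has very good examples of using expectation points with qi::on_error.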
If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operation is /not/ matched, but does not allow backtracking once operation /is/ matched?
(Not applicable, since the answer to the previous question is yes.)
Why does the grouping operator destroy the attribute grammar - requiring actions?
It doesn't destroy the attribute grammar, it just changes the exposed type. So, if you bind an appropriate attribute reference to the rule/grammar, you won't need semantic actions. I felt there should be ways to go without the grouping, and indeed I have found no such need. I've added a full example to help you see what is happening in my testing, without using semantic actions.
Full Code
The full code shows 5 scenarios:
OPTION 1: Original without expectations
(no relevant changes)
OPTION 2: with expectations
Using the modified typedef for DataT (as shown above)
OPTION 3: adapted struct, without expectations
Using a userdefined struct with BOOST_FUSION_ADAPT_STRUCT
OPTION 4: adapted struct, with expectations
Modifying the adapted struct from OPTION 3
OPTION 5: lookahead hack
This one leverages a 'clever' (?) hack, by making all >> into expectations, and detecting the presence of an operationRule match beforehand. This is of course suboptimal, but allows you to keep DataT unmodified, and without using semantic actions.
Obviously, define OPTION to the desired value before compiling.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <iostream>

namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;

#ifndef OPTION
#define OPTION 5
#endif

#if OPTION == 1 || OPTION == 5 // original without expectations (OR lookahead hack)
    typedef boost::tuple<std::string, std::string, std::vector<std::string>, std::string> DataT;
#elif OPTION == 2 // with expectations
    typedef boost::tuple<boost::tuple<std::string, std::string>, std::vector<std::string>, std::string> DataT;
#elif OPTION == 3 // adapted struct, without expectations
    struct DataT
    {
        std::string identifier, operation;
        std::vector<std::string> values;
        std::string destination;
    };

    BOOST_FUSION_ADAPT_STRUCT(DataT, (std::string, identifier)(std::string, operation)(std::vector<std::string>, values)(std::string, destination));
#elif OPTION == 4 // adapted struct, with expectations
    struct IdOpT
    {
        std::string identifier, operation;
    };
    struct DataT
    {
        IdOpT idop;
        std::vector<std::string> values;
        std::string destination;
    };

    BOOST_FUSION_ADAPT_STRUCT(IdOpT, (std::string, identifier)(std::string, operation));
    BOOST_FUSION_ADAPT_STRUCT(DataT, (IdOpT, idop)(std::vector<std::string>, values)(std::string, destination));
#endif

template <typename Iterator>
struct test_parser : qi::grammar<Iterator, DataT(), qi::space_type, qi::locals<char> >
{
    test_parser() : test_parser::base_type(test, "test")
    {
        using namespace qi;

        quoted_string =
               omit    [ char_("'\"") [_a =_1] ]
            >> no_skip [ *(char_ - char_(_a))  ]
             > lit(_a);

        any_string = quoted_string | +qi::alnum;

        identifier = lexeme [ alnum >> *graph ];

        operationRule = string("add") | "sub";
        arrow = "->";

#if OPTION == 1 || OPTION == 3   // without expectations
        test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string;
#elif OPTION == 2 || OPTION == 4 // with expectations
        test = identifier >> operationRule > repeat(1,3)[any_string] > arrow > any_string;
#elif OPTION == 5                // lookahead hack
        test = &(identifier >> operationRule) > identifier > operationRule > repeat(1,3)[any_string] > arrow > any_string;
#endif
    }

    qi::rule<Iterator, qi::space_type/*, qi::locals<char> */> arrow;
    qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> operationRule;
    qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> identifier;
    qi::rule<Iterator, std::string(), qi::space_type, qi::locals<char> > quoted_string, any_string;
    qi::rule<Iterator, DataT(), qi::space_type, qi::locals<char> > test;
};

int main()
{
    std::string str("addx001 add 'str1'   \"str2\"       ->  \"str3\"");
    test_parser<std::string::const_iterator> grammar;
    std::string::const_iterator iter = str.begin();
    std::string::const_iterator end  = str.end();

    DataT data;
    bool r = phrase_parse(iter, end, grammar, qi::space, data);

    if (r)
    {
        using namespace karma;
        std::cout << "OPTION " << OPTION << ": " << str << " --> ";
#if OPTION == 1 || OPTION == 3 || OPTION == 5 // without expectations (OR lookahead hack)
        std::cout << format(delimit[auto_ << auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#elif OPTION == 2 || OPTION == 4 // with expectations
        std::cout << format(delimit[auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
#endif
    }
    if (iter!=end)
        std::cout << "Remaining: " << std::string(iter,end) << "\n";
}
Output for all OPTIONS:
for a in 1 2 3 4 5; do g++ -DOPTION=$a -I ~/custom/boost/ test.cpp -o test$a && ./test$a; done
OPTION 1: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 2: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 3: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 4: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
OPTION 5: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3