Inconsistent behavior of Boost Spirit grammar
Problem description
I have a small grammar that I want to use for a work project. A minimal executable example is:
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wunused-local-typedefs"
    #pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
    #pragma GCC diagnostic ignored "-Wunused-variable"
    #include <boost/spirit/include/karma.hpp>
    #include <boost/spirit/include/qi_grammar.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/support_istream_iterator.hpp>
    #pragma GCC diagnostic pop // pops
    #include <iostream>
    #include <string>
    #include <vector>

    int main()
    {
        typedef unsigned long long ull;
        std::string curline = "1;2;;3,4;5";
        std::cout << "parsing: " << curline << "\n";

        namespace qi = boost::spirit::qi;
        auto ids = -qi::ulong_long % ','; // '-' allows for empty vecs.
        auto match_type_res = ids % ';' ;

        std::vector<std::vector<ull> > r;
        qi::parse(curline.begin(), curline.end(), match_type_res, r);

        std::cout << "got: ";
        for (auto v: r){
            for (auto i : v)
                std::cout << i << ",";
            std::cout << ";";
        }
        std::cout << "\n";
    }
On my personal machine this produces the correct output:
    parsing: 1;2;;3,4;5
    got: 1,;2,;;3,4,;5,;
But at work it produces:
    parsing: 1;2;;3,4;5
    got: 1,;2,;;3,
In other words, it fails to parse the vector of long integers as soon as there is more than one element in it.
I have established that the work system uses Boost 1.56, while my private computer uses 1.57. Is this the cause?
Knowing we have some real Spirit experts here on Stack Overflow, I was hoping someone might know where this issue comes from, or could at least narrow down the number of things I need to check.
If the Boost version is the problem, I can probably convince the company to upgrade, but a workaround would be welcome in any case.
Recommended answer
You're invoking Undefined Behaviour (UB) in your code.
Specifically, where you use auto to store a parser expression. The expression template contains references to temporaries that become dangling at the end of the containing full-expression¹.
UB implies that anything can happen. Both compilers are right! And the best part is, you will probably see varying behaviour depending on the compiler flags used.
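To make the dangling-reference problem concrete without pulling in Spirit, here is a minimal sketch using toy types. Literal, SeqRef, SeqCopy, deep_copy and make_safe are hypothetical names invented for this illustration; they are not Spirit's or Proto's real types, and only the safe path is actually executed.

```cpp
#include <string>

// Toy stand-ins for expression-template nodes; these are NOT Spirit's real types.
struct Literal {
    std::string text;
};

// A sequence node that keeps only references to its operands,
// much like an expression template built from temporaries.
struct SeqRef {
    const Literal& lhs;
    const Literal& rhs;
    SeqRef(const Literal& l, const Literal& r) : lhs(l), rhs(r) {}
};

// The same node, but holding its operands by value.
struct SeqCopy {
    Literal lhs;
    Literal rhs;
    std::string str() const { return lhs.text + rhs.text; }
};

// Building the expression only records references to the operands.
inline SeqRef operator>>(const Literal& l, const Literal& r) { return SeqRef(l, r); }

// Hypothetical analogue of a deep copy: turn the references into owned values.
inline SeqCopy deep_copy(const SeqRef& e) { return SeqCopy{e.lhs, e.rhs}; }

inline SeqCopy make_safe()
{
    // DON'T: auto e = Literal{"a"} >> Literal{"b"};
    // Both temporaries would be destroyed at the semicolon,
    // leaving e.lhs and e.rhs dangling.

    // OK: the copy is taken before the end of the full-expression.
    return deep_copy(Literal{"a"} >> Literal{"b"});
}
```

Calling make_safe().str() concatenates the two operands from owned copies, so nothing refers to a destroyed temporary.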
Fix it using one of the following:
- qi::copy (or boost::proto::deep_copy before v1.55, IIRC)
- BOOST_SPIRIT_AUTO instead of BOOST_AUTO (mostly helpful if you also support C++03)
- qi::rule<> and qi::grammar<> (the non-terminals) to type-erase the expressions. This has a performance impact too, but also gives more features, like:
  - recursive rules
  - locals and inherited attributes
  - declared skippers (handy because rules can be implicitly lexeme[])
  - better code organization
Note also that Spirit X3 promises to drop these restrictions on the use of auto. It is basically a whole lot more lightweight due to the use of C++14 features. Keep in mind that, at the time of writing, it is not stable yet.
GCC with -O2 shows the undefined results for the original code.
The fixed version:
    //#pragma GCC diagnostic push
    //#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
    //#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
    //#pragma GCC diagnostic ignored "-Wunused-variable"
    #include <boost/spirit/include/karma.hpp>
    #include <boost/spirit/include/qi.hpp>
    //#pragma GCC diagnostic pop // pops
    #include <iostream>
    #include <string>
    #include <vector>

    int main()
    {
        typedef unsigned long long ull;
        std::string const curline = "1;2;;3,4;5";
        std::cout << "parsing: '" << curline << "'\n";

        namespace qi = boost::spirit::qi;
    #if 0
        // THIS IS UNDEFINED BEHAVIOUR:
        auto ids     = -qi::ulong_long % ','; // '-' allows for empty vecs.
        auto grammar = ids % ';';
    #else
        // THIS IS CORRECT:
        auto ids     = qi::copy(-qi::ulong_long % ','); // '-' allows for empty vecs.
        auto grammar = qi::copy(ids % ';');
    #endif

        std::vector<std::vector<ull> > r;
        qi::parse(curline.begin(), curline.end(), grammar, r);

        std::cout << "got: ";
        for (auto v: r){
            for (auto i : v)
                std::cout << i << ",";
            std::cout << ";";
        }
        std::cout << "\n";
    }
Printing (also with GCC -O2!):
    parsing: '1;2;;3,4;5'
    got: 1,;2,;;3,4,;5,;
¹ In this code that is basically "at the next semicolon"; "full-expression" is the standardese term.