Using Boost Spirit to parse a text file while skipping large parts of it
Problem description
I have the following std::string:
<lots of text not including "label A" or "label B">
label A: 34
<lots of text not including "label A" or "label B">
label B: 45
<lots of text not including "label A" or "label B">
...
I want to extract the single integer that follows each occurrence of label A or label B and place it in the corresponding vector<int> a, b. A simple but inelegant way of doing this is to call find("label A") and find("label B") and parse whichever comes first. Is there a succinct way of expressing this using Spirit? How do you skip everything except label A or label B?
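For comparison, the find-based approach described above could look roughly like the sketch below. It uses only the standard library. The name `extract_labels` is hypothetical (it does not appear in the original post), and as a simplification it searches for the shared prefix "label " once instead of calling find separately for each label.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post: scans `text` for
// "label A:" / "label B:" and appends the integer that follows to a or b.
void extract_labels(const std::string& text,
                    std::vector<int>& a, std::vector<int>& b)
{
    const std::string prefix = "label ";
    std::size_t pos = 0;
    while ((pos = text.find(prefix, pos)) != std::string::npos) {
        pos += prefix.size();                 // now at the label letter
        if (pos + 1 >= text.size() || text[pos + 1] != ':')
            continue;                         // e.g. a quoted "label A" without ':'
        const char which = text[pos];
        if (which != 'A' && which != 'B')
            continue;                         // some other label, ignore it
        const char* start = text.c_str() + pos + 2;
        char* end = nullptr;
        const long value = std::strtol(start, &end, 10);
        if (end == start)
            continue;                         // no integer after the colon
        (which == 'A' ? a : b).push_back(static_cast<int>(value));
        pos = static_cast<std::size_t>(end - text.c_str());
    }
}
```

Calling `extract_labels(text, a, b)` on the sample string fills `a` and `b` in one pass, but the label handling is hard-coded, which is the inelegance the question is trying to avoid.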
Recommended answer
You can just use:
omit [ eol >> *(char_ - "\nlabel A:") ] >> eol
Example: Live On Coliru
There's also the seek[] directive in the Spirit repository. The following is equivalent to the above:
repo::seek [ eol >> &lit("int main") ]
Here's an example that parses your original sample:
*repo::seek [ eol >> "label" >> char_("A-Z") >> ':' >> int_ ]
This will parse into a std::vector<std::pair<char, int> > and nothing else.
On Coliru Too:
#if 0
<lots of text not including "label A" or "label B">
label A: 34
<lots of text not including "label A" or "label B">
label B: 45
<lots of text not including "label A" or "label B">
...
#endif
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <fstream>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
namespace qi = boost::spirit::qi;
namespace repo = boost::spirit::repository::qi;
int main()
{
    std::ifstream ifs("main.cpp");
    ifs >> std::noskipws;

    boost::spirit::istream_iterator f(ifs), l;

    std::vector<std::pair<char, int> > parsed;
    using namespace qi;
    bool ok = phrase_parse(
            f, l,
            *repo::seek [ eol >> "label" >> char_("A-Z") >> ':' >> int_ ],
            blank,
            parsed
        );

    if (ok)
    {
        std::cout << "Found:\n";
        for (auto& p : parsed)
            std::cout << "'" << p.first << "' has value " << p.second << "\n";
    }
    else
        std::cout << "Fail at: '" << std::string(f,l) << "'\n";
}
Notes:
- seek does expose the attribute matched, which is pretty powerful:
repo::seek [ eol >> "label" >> char_("ABCD") >> ':' ]
will 'eat' the label, but expose the label letter ('A', 'B', 'C' or 'D') as the attribute.
- Performance when skipping can be pretty surprising; read the warning in the documentation: http://www.boost.org/doc/libs/1_55_0/libs/spirit/repository/doc/html/spirit_repository/qi_components/directives/seek.html
The output is:
Found:
'A' has value 34
'B' has value 45
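The question asked for two separate vectors a and b, whereas the parser above yields a single vector of (label, value) pairs. One possible way to bridge that gap is sketched below. The helper name `split_by_label` is hypothetical and not part of the original answer, and the sketch assumes that only labels 'A' and 'B' are of interest.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper, not from the original answer: distributes the
// (label, value) pairs produced by the parser into two vectors.
void split_by_label(const std::vector<std::pair<char, int> >& parsed,
                    std::vector<int>& a, std::vector<int>& b)
{
    for (const auto& p : parsed) {
        if (p.first == 'A')
            a.push_back(p.second);
        else if (p.first == 'B')
            b.push_back(p.second);
        // any other label letter is ignored (an assumption of this sketch)
    }
}
```

After a successful parse, `split_by_label(parsed, a, b)` would leave 34 in `a` and 45 in `b` for the sample input.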