Boost Spirit semantic action using non-void function objects

Problem description

In my semantic action I'm not just trying to print what has been parsed. The semantic action function is supposed to create a new object, which in turn should become the value produced by the parser.

Let's assume the following task: the parser should be passed the address of (or a reference to) an object of type

typedef std::pair<
    std::map<std::string, std::size_t>,
    std::map<std::size_t, std::string>
> mapping;

and the parser should convert every non-whitespace string into a std::size_t and, on EOI, return a std::vector<std::size_t> representing all the strings. Of course, this means creating/finding entries in the mapping above.

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/qi_char_class.hpp>
#include <vector>
#include <map>
#include <string>
#include <iostream>

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
typedef std::pair<
    std::map<std::string, std::size_t>,
    std::map<std::size_t, std::string>
> mapping;

struct convert
{   mapping &m_r;
    convert(mapping &_r)
        :m_r(_r)
    {
    }
    std::size_t operator()(const std::string &_r, const boost::fusion::unused_type&, const boost::fusion::unused_type&) const
    {   const auto pFind = m_r.first.find(_r);
        if (pFind != m_r.first.end())
            return pFind->second;
        else
        {   const auto iID = m_r.first.size();
            m_r.second.insert(std::make_pair(iID, _r));
            m_r.first.insert(std::make_pair(_r, iID));
            return iID;
        }
    }
};
template<typename Iterator>
struct parser:qi::grammar<Iterator, std::vector<std::size_t>(), ascii::space_type>
{   qi::rule<Iterator, std::vector<std::size_t>(), ascii::space_type> start;
    qi::rule<Iterator, std::size_t(), ascii::space_type> name;
    parser(mapping &_r)
        :parser::base_type(start)
    {   name = (+qi::char_)[convert(_r)];
        start %= *name;
    }
};



int main(int, char**)
{   const auto sVector = std::vector<char>(std::istream_iterator<char>(std::cin), std::istream_iterator<char>());
    mapping sMapping;
    parser<std::vector<char>::const_iterator> sParser(sMapping);
    qi::phrase_parse(sVector.begin(), sVector.end(), sParser, ascii::space_type());

}

Recommended answer

What you want the parser to produce is just the sequence of "word ids" (let's call them atoms).

Only the functor that will fuel your semantic action needs to "know about" the mappings.

I'm going to simplify your data structure a bit here:

using AtomId = size_t;
using Atom = std::string_view; // or boost::string_view

struct mapping {
    std::map<Atom, AtomId> by_word;
    std::map<AtomId, Atom> by_id;
};

About That Semantic Action

If you want to use the synthesized, local, exposed or inherited attributes, you will have to decode the context parameter. The best treatment of this is still this answer: boost spirit semantic action parameters

However, if you've looked at it, you'll find it's not very convenient. Instead, I'd suggest staying in the Phoenix domain (where things like _1, _val, _pass, _r1 and _a magically have the intended meanings, without you having to know how to address them in the context).

In that case, you will want your function to be like:

struct convert_f {
    mapping &m_ref;

    using Range = boost::iterator_range<It>;

    AtomId operator()(Range const& text) const {
        Atom atom{&*text.begin(), text.size()};
        auto& left  = m_ref.by_word;
        auto& right = m_ref.by_id;

        auto it = left.find(atom);

        if (it != left.end())
            return it->second;
        else {
            const auto iID = left.size();
            left.emplace (atom, iID);
            right.emplace(iID, atom);
            return iID;
        }
    }
};

boost::phoenix::function<convert_f> convert;

You could have made Range just std::string, but I was thinking ahead, and since you read the full file into a vector, you can use a string_view based on the raw source iterator range, to avoid copying anything. This also removes the creepy redundancy of storing the same std::string inside two maps¹.

¹ but see the new "Bonus" section

  1. BUG: if you expect +char_ to match only contiguous chars, make sure you wrap it in a lexeme[] (so it cannot silently skip whitespace) OR, of course, make the rule implicitly lexeme (see Boost spirit skipper issues).
  2. BUG: don't use +char_ unless you mean to parse /anything/. In your case, you want contiguous stretches of non-space, so at least make it +qi::graph
  3. BUG: when reading the data from std::cin you already skip whitespace, so all input will become one big word again. Use std::noskipws first OR use std::istreambuf_iterator instead of std::istream_iterator. Subtle, I know.
  4. Don't expose your skipper unless you mean for the caller to change it

I probably forgot some more steps, but for now, let's forget about that and just drop a demo:

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string_view> // or <boost/utility/string_view.hpp>
#include <iostream>
#include <iterator>    // std::istreambuf_iterator
#include <map>
#include <string>
#include <vector>

using AtomId = size_t;
using Atom   = std::string_view; // or boost::string_view
using Atoms  = std::vector<AtomId>;

struct mapping {
    std::map<Atom, AtomId> by_word;
    std::map<AtomId, Atom> by_id;
};

namespace qi = boost::spirit::qi;

template <typename It>
struct parser : qi::grammar<It, Atoms()> {
    parser(mapping &r) : parser::base_type(start), convert({r}) {
        using namespace qi;

        // we don't expose the skipper anymore, so we specify it at toplevel
        start = skip(ascii::space)[ *name ]; 

        name  = raw[ +graph ] [_val = convert(_1)];
    }
  private:
    qi::rule<It, Atoms()> start;
    qi::rule<It, AtomId()> name;

    struct convert_f {
        mapping &m_ref;

        using Range = boost::iterator_range<It>;

        AtomId operator()(Range const& text) const {
            Atom atom{&*text.begin(), text.size()};
            auto& left  = m_ref.by_word;
            auto& right = m_ref.by_id;

            auto it = left.find(atom);

            if (it != left.end())
                return it->second;
            else {
                const auto iID = left.size();
                left.emplace (atom, iID);
                right.emplace(iID, atom);
                return iID;
            }
        }
    };

    boost::phoenix::function<convert_f> convert;
};

int main() {
    using It = std::string::const_iterator;
    std::string const input { std::istreambuf_iterator<char>(std::cin), {} };

    mapping sMapping;
    parser<It> const sParser(sMapping);

    if (qi::parse(input.begin(), input.end(), sParser)) {
        std::cout << "Parsed " << sMapping.by_id.size() << " unique atoms\n";
        for (auto& [atom, id] : sMapping.by_word) {
            std::cout << atom << "(" << id << ")\n";
        }
        std::cout << "\n";
    } else {
        std::cout << "Parse failed\n";
        return 1;
    }
}

Prints (for the current post text):

Parsed 282 unique atoms
!=(153)
"know(34)
"word(19)
##(63)
&m_ref;(135)
(atom,(161)
(it(152)
(let's(21)
(see(230)
(so(220)
(where(111)
**<kbd>[Live(279)
,(78)
//(50)
/anything/(236)
0.(208)
=(46)
About(64)
Action(67)
Actions](http://boost-spirit.com/home/2010/03/03/the-anatomy-of-semantic-actions-in-qi/).(75)
Atom(48)
Atom>(60)
AtomId(45)
AtomId>(57)
BUG:(209)
Coliru]()</kbd>**(281)
DEMO(278)
However,(92)
I(174)
I'd(105)
I'm(37)
If(76)
In(129)
Instead,(104)
OR(225)
Of(73)
On(280)
Only(25)
Phoenix(109)
Points(207)
Problem(206)
Range(136)
Semantic(66)
Some(204)
Spirit(74)
Still(86)
Subtle,(261)
That(65)
There(0)
This(193)
Use(255)
Varied(205)
What(11)
You(68)
[Anatomy(72)
`+char_`(211)
`+qi::graph`(241)
`Range`(171)
`_1`,(114)
`_a`(119)
`_pass`,(116)
`_r1`(117)
`_val`,(115)
`lexeme[]`(219)
`std::cin`(246)
`std::istream_iterator`.(260)
`std::istreambuf_iterator`(258)
`std::noskipws`(256)
`std::string`(200)
`std::string`,(172)
`string_view`(183)
a(40)
about(71)
about"(35)
action(32)
address(127)
again.(254)
ahead,(177)
all(249)
already(247)
also(194)
and(118)
answer:(90)
anything.(192)
at(96)
atom);(164)
atoms).(24)
atom{&*text.begin(),(142)
attribute(9)
attributes,(81)
auto(149)
auto&(144)
avoid(190)
based(184)
be(132)
become(251)
best(87)
big(252)
binding(4)
bit(41)
boost::iterator_range<It>;(137)
boost::phoenix::function<convert_f>(167)
boost::string_view(52)
but(173)
by_id;(61)
by_word;(58)
call(22)
caller(266)
can(69)
cannot(221)
case,(130)
change(267)
chars,(215)
complexity(42)
const(141)
const&(139)
context(84)
context).(128)
contiguous(214)
convenient.(103)
convert;(168)
convert_f(134)
copying(191)
could(169)
course(226)
creepy(196)
data(244)
decode(83)
demo:(277)
domain(110)
don't(232)
drop(276)
else(157)
expect(210)
expose(263)
exposed(8)
file(180)
find(99)
first(257)
for(265)
forget(275)
forgot(269)
from(245)
fuel(29)
full(179)
function(131)
functor(26)
going(38)
have(82)
having(124)
here:(43)
how(126)
https://stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965).(231)
https://stackoverflow.com/questions/3066701/boost-spirit-semantic-action-parameters/3067881#3067881(91)
iID(158)
iID);(162)
iID;(165)
ids"(20)
if(93)
implicitly(228)
in(108)
inherited(80)
input(250)
inside(201)
instead(259)
intended(121)
into(181)
is(1)
it(150)
it's(100)
it,(97)
it->second;(156)
iterator(188)
just(16)
know(125)
know.(262)
least(240)
left(145)
left.emplace(160)
left.end())(154)
left.find(atom);(151)
left.size();(159)
let's(274)
lexeme(229)
like(113)
like:(133)
little(2)
local,(79)
looked(95)
m_ref.by_id;(148)
m_ref.by_word;(146)
made(170)
magically(120)
make(216)
map(6)
mapping(54)
mappings.(36)
maps.(203)
match(212)
mean(234)
meanings,(122)
more(271)
needs(33)
non-space,(238)
not(101)
now,(273)
of(18)
on(185)
only(213)
operator()(Range(138)
or(51)
parameter.(85)
parse(235)
parser(14)
probably(268)
produce(15)
range,(189)
raw(186)
read(70)
reading(243)
redundancy(197)
removes(195)
return(155)
right(147)
right.emplace(iID,(163)
rule(227)
same(199)
semantic(31)
sequence(17)
sidestep(39)
silently)(224)
since(178)
size_t;(47)
skip(222)
skipper(264)
so(239)
some(270)
source(187)
stay(107)
std::map<Atom,(56)
std::map<AtomId,(59)
std::string_view;(49)
steps,(272)
storing(198)
stretches(237)
struct(53)
suggest(106)
sure(217)
synthesized(77)
text)(140)
text.size()};(143)
that(27)
the(5)
them(23)
things(112)
thinking(176)
this(89)
to(7)
treatment(88)
two(202)
type.(10)
unless(233)
use(3)
using(44)
vector,(182)
very(102)
want(13)
was(175)
when(242)
whitespace,(248)
whitespaces(223)
will(28)
without(123)
word(253)
wrap(218)
you(12)
you'll(98)
you've(94)
your(30)
{(55)
}(166)
};(62)

Oh, I forgot to actually store the Atoms:

Live On Coliru

Atoms idlist;
if (qi::parse(input.begin(), input.end(), sParser, idlist)) {
    std::cout << "Parsed " << sMapping.by_id.size() << " unique atoms\n";
    for (AtomId id : idlist) {
        std::cout << "'" << sMapping.by_id.at(id) << "' ";
    }
    std::cout << "\n";
} else {
    // ...

Prints something starting like:

Parsed 282 unique atoms
'There' 'is' 'little' 'use' 'binding' 'the' 'map' 'to' 'the' 'exposed' 'attribute' 'type.' 'What' 'you' 'want' 'the' ...

Bonus

  • Use Boost Bimap instead of hand-rolling the two maps. This keeps things in sync at all times and is about 15 lines of code shorter:
  • Live On Coliru

    using mapping = boost::bimap<Atom, AtomId>;
    
    // ...
    
    AtomId convert_f::operator()(Range const& text) const {
        Atom atom{&*text.begin(), text.size()};
        return m_ref.left.insert({atom, m_ref.size()}).first->second;
    }
    

    And then in the usage:

    std::cout << "Parsed " << sMapping.size() << " unique atoms\n";
    for (AtomId id : idlist) {
        std::cout << "'" << sMapping.right.at(id) << "' ";
    }
    
