Boost Spirit parsing with no skipper


Question

Think about a preprocessor that reads raw text (no significant whitespace or tokens).

There are 3 rules:

  • resolve_para_entry should resolve the argument inside a call. The top-level text is returned as a string.
  • resolve_para should resolve the whole parameter list and put all the top-level parameters in a string list.
  • resolve is the entry point.

Along the way I track the iterator and get the text portion.

Samples:

  • sometext(para) → expect para in the string list
  • sometext(para1,para2) → expect para1 and para2 in the string list
  • sometext(call(a)) → expect call(a) in the string list
  • sometext(call(a,b)) ← here it fails; it seems that the "!lit(',')" won't let the parser step outside ..
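To make the expected results concrete, here is a minimal hand-rolled sketch in plain C++ (deliberately not Spirit) that splits an argument list at top-level commas by tracking parenthesis depth. The helper name split_top_level is hypothetical, and the sketch assumes the closing parenthesis of the call is the last character of the input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects the top-level arguments of "name(...)".
// Commas inside nested parentheses stay part of the enclosing argument.
// Returns false when the parentheses are unbalanced or text follows the call.
bool split_top_level(std::string const& call, std::vector<std::string>& out) {
    out.clear();
    std::size_t const open = call.find('(');
    if (open == std::string::npos) return false;

    int depth = 0;
    std::string current;
    for (std::size_t i = open; i < call.size(); ++i) {
        char const c = call[i];
        if (c == '(') {
            if (depth++ > 0) current += c;       // keep nested '(' verbatim
        } else if (c == ')') {
            if (--depth == 0) {
                // "()" yields an empty list; otherwise store the last argument
                if (!current.empty() || !out.empty()) out.push_back(current);
                return i + 1 == call.size();     // nothing may follow the call
            }
            current += c;                        // keep nested ')' verbatim
        } else if (c == ',' && depth == 1) {
            out.push_back(current);              // top-level separator
            current.clear();
        } else {
            current += c;
        }
    }
    return false;                                // input ended before the final ')'
}
```

Called with "sometext(call(a,b))", this fills the list with the single entry call(a,b), which is the behaviour the fourth sample asks for.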

Rules:

resolve_para_entry = +(  
     (iter_pos >> lit('(') >> (resolve_para_entry | eps) >> lit(')') >> iter_pos) [_val=  phoenix::bind(&appendString, _val, _1,_3)]
     | (!lit(',') >> !lit(')') >> !lit('(') >> (wide::char_ | wide::space))         [_val = phoenix::bind(&appendChar, _val, _1)]
    );

resolve_para = (lit('(') >> lit(')'))[_val = std::vector<std::wstring>()]  // empty para -> old style
    | (lit('(') >> resolve_para_entry >> *(lit(',') >> resolve_para_entry) > lit(')'))[_val = phoenix::bind(&appendStringList, _val, _1, _2)]
    | eps;
  ;

resolve = (iter_pos >> name_valid >> iter_pos >> resolve_para >> iter_pos);

In the end, this doesn't seem very elegant. Maybe there is a better way to parse such input without a skipper?

Recommended answer

Indeed, this should be a lot simpler.

First off, I fail to see why the absence of a skipper is at all relevant.

Second, exposing the raw input is best done using qi::raw[] instead of dancing with iter_pos and clumsy semantic actions¹.

Among the other observations I see:

  • negating a charset is done with ~, so e.g. ~char_(",()")
  • (p|eps) would be better spelled -p
  • (lit('(') >> lit(')')) could be just "()" (after all, there's no skipper, right)
  • p >> *(',' >> p) is equivalent to p % ','
  • With the above, resolve_para simplifies to this:

resolve_para = '(' >> -(resolve_para_entry % ',') >> ')';

  • resolve_para_entry seems weird to me. It appears that any nested parentheses are simply swallowed. Why not actually parse a recursive grammar so that you detect syntax errors?

    Here's my take on it:

    I prefer to define the AST as the first step because it helps me think about the parser productions:

    namespace Ast {
    
        using ArgList = std::list<std::string>;
    
        struct Resolve {
            std::string name;
            ArgList arglist;
        };
    
        using Resolves = std::vector<Resolve>;
    }
    

    Creating the grammar rules

    qi::rule<It, Ast::Resolves()> start;
    qi::rule<It, Ast::Resolve()>  resolve;
    qi::rule<It, Ast::ArgList()>  arglist;
    qi::rule<It, std::string()>   arg, identifier;
    

    And their definitions:

    identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
    
    arg        = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
    arglist    = '(' >> -(arg % ',') >> ')';
    resolve    = identifier >> arglist;
    
    start      = *qr::seek[hold[resolve]];
    

    Notes:

    • No more semantic actions
    • No more eps
    • No more iter_pos
    • I've opted to make arglist non-optional. If you really want it optional, change it back:

    resolve    = identifier >> -arglist;
    

    But in our sample that would generate a lot of noisy output.

    Of course, your entry point (start) will be different. I just did the simplest thing that could possibly work, using another handy parser directive from the Spirit Repository (like the iter_pos you were already using): seek[]

    Live On Coliru

    #include <boost/fusion/include/adapt_struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/repository/include/qi_seek.hpp>
    #include <list>
    #include <string>
    #include <vector>
    
    namespace Ast {
    
        using ArgList = std::list<std::string>;
    
        struct Resolve {
            std::string name;
            ArgList arglist;
        };
    
        using Resolves = std::vector<Resolve>;
    }
    
    BOOST_FUSION_ADAPT_STRUCT(Ast::Resolve, name, arglist)
    
    namespace qi = boost::spirit::qi;
    namespace qr = boost::spirit::repository::qi;
    
    template <typename It>
    struct Parser : qi::grammar<It, Ast::Resolves()>
    {
        Parser() : Parser::base_type(start) {
            using namespace qi;
    
            identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
    
            arg        = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
            arglist    = '(' >> -(arg % ',') >> ')';
            resolve    = identifier >> arglist;
    
            start      = *qr::seek[hold[resolve]];
        }
      private:
        qi::rule<It, Ast::Resolves()> start;
        qi::rule<It, Ast::Resolve()>  resolve;
        qi::rule<It, Ast::ArgList()>  arglist;
        qi::rule<It, std::string()>   arg, identifier;
    };
    
    #include <iostream>
    
    int main() {
        using It = std::string::const_iterator;
        std::string const samples = R"--(
    Samples:
    
    sometext(para)        → expect para in the string list
    sometext(para1,para2) → expect para1 and para2 in string list
    sometext(call(a))     → expect call(a) in the string list
    sometext(call(a,b))   ← here it fails; it seams that the "!lit(',')" wont make the parser step outside
    )--";
        It f = samples.begin(), l = samples.end();
    
        Ast::Resolves data;
        if (parse(f, l, Parser<It>{}, data)) {
            std::cout << "Parsed " << data.size() << " resolves\n";
    
        } else {
            std::cout << "Parsing failed\n";
        }
    
        for (auto& resolve: data) {
            std::cout << " - " << resolve.name << "\n   (\n";
            for (auto& arg : resolve.arglist) {
                std::cout << "       " << arg << "\n";
            }
            std::cout << "   )\n";
        }
    }
    

    Prints

    Parsed 6 resolves
     - sometext
       (
           para
       )
     - sometext
       (
           para1
           para2
       )
     - sometext
       (
           call(a)
       )
     - call
       (
           a
       )
     - call
       (
           a
           b
       )
     - lit
       (
           '
           '
       )
    

    More Ideas

    That last output shows a problem with your current grammar: lit(',') should obviously not be seen as a call with two parameters.

    I recently wrote an answer on extracting (nested) function calls with parameters, which does things more neatly:

    • Boost spirit parse rule is not applied
    • or this one boost spirit reporting semantic error

    Bonus version that uses string_view and also shows the exact line/column information of all extracted words.

    Note that it still doesn't require any Phoenix or semantic actions. Instead, it simply defines the necessary trait to assign to boost::string_view from an iterator range.

    Live On Coliru

    #include <boost/fusion/include/adapt_struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/repository/include/qi_seek.hpp>
    #include <boost/utility/string_view.hpp>
    #include <algorithm>
    #include <list>
    #include <string>
    #include <vector>
    
    namespace Ast {
    
        using Source  = boost::string_view;
        using ArgList = std::list<Source>;
    
        struct Resolve {
            Source name;
            ArgList arglist;
        };
    
        using Resolves = std::vector<Resolve>;
    }
    
    BOOST_FUSION_ADAPT_STRUCT(Ast::Resolve, name, arglist)
    
    namespace boost { namespace spirit { namespace traits {
        template <typename It>
        struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
            static void call(It f, It l, boost::string_view& attr) { 
                attr = boost::string_view { f.base(), size_t(std::distance(f.base(),l.base())) };
            }
        };
    } } }
    
    namespace qi = boost::spirit::qi;
    namespace qr = boost::spirit::repository::qi;
    
    template <typename It>
    struct Parser : qi::grammar<It, Ast::Resolves()>
    {
        Parser() : Parser::base_type(start) {
            using namespace qi;
    
            identifier = raw [ char_("a-zA-Z_") >> *char_("a-zA-Z0-9_") ];
    
            arg        = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
            arglist    = '(' >> -(arg % ',') >> ')';
            resolve    = identifier >> arglist;
    
            start      = *qr::seek[hold[resolve]];
        }
      private:
        qi::rule<It, Ast::Resolves()> start;
        qi::rule<It, Ast::Resolve()>  resolve;
        qi::rule<It, Ast::ArgList()>  arglist;
        qi::rule<It, Ast::Source()>   arg, identifier;
    };
    
    #include <iostream>
    
    struct Annotator {
        using Ref = boost::string_view;
    
        struct Manip {
            Ref fragment, context;
    
            friend std::ostream& operator<<(std::ostream& os, Manip const& m) {
                return os << "[" << m.fragment << " at line:" << m.line() << " col:" << m.column() << "]";
            }
    
            size_t line() const {
                return 1 + std::count(context.begin(), fragment.begin(), '\n');
            }
            size_t column() const {
                return 1 + (fragment.begin() - start_of_line().begin());
            }
            Ref start_of_line() const {
                return context.substr(context.substr(0, fragment.begin()-context.begin()).find_last_of('\n') + 1);
            }
        };
    
        Ref context;
        Manip operator()(Ref what) const { return {what, context}; }
    };
    
    int main() {
        using It = std::string::const_iterator;
        std::string const samples = R"--(Samples:
    
    sometext(para)        → expect para in the string list
    sometext(para1,para2) → expect para1 and para2 in string list
    sometext(call(a))     → expect call(a) in the string list
    sometext(call(a,b))   ← here it fails; it seams that the "!lit(',')" wont make the parser step outside
    )--";
        It f = samples.begin(), l = samples.end();
    
        Ast::Resolves data;
        if (parse(f, l, Parser<It>{}, data)) {
            std::cout << "Parsed " << data.size() << " resolves\n";
    
        } else {
            std::cout << "Parsing failed\n";
        }
    
        Annotator annotate{samples};
    
        for (auto& resolve: data) {
            std::cout << " - " << annotate(resolve.name) << "\n   (\n";
            for (auto& arg : resolve.arglist) {
                std::cout << "       " << annotate(arg) << "\n";
            }
            std::cout << "   )\n";
        }
    }
    

    Prints

    Parsed 6 resolves
     - [sometext at line:3 col:1]
       (
           [para at line:3 col:10]
       )
     - [sometext at line:4 col:1]
       (
           [para1 at line:4 col:10]
           [para2 at line:4 col:16]
       )
     - [sometext at line:5 col:1]
       (
           [call(a) at line:5 col:10]
       )
     - [call at line:5 col:34]
       (
           [a at line:5 col:39]
       )
     - [call at line:6 col:10]
       (
           [a at line:6 col:15]
           [b at line:6 col:17]
       )
     - [lit at line:6 col:62]
       (
           [' at line:6 col:66]
           [' at line:6 col:68]
       )
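The line and column figures in this output follow from a byte offset alone: the line is one plus the number of newlines before the offset, and the column is one plus the distance to the start of that line. A standalone sketch of that arithmetic, using std::string and a hypothetical line_col helper instead of the Annotator above, and assuming the offset does not exceed the text length:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns {line, column}, both 1-based, for a byte
// offset into text. Assumes offset <= text.size().
std::pair<std::size_t, std::size_t> line_col(std::string const& text, std::size_t offset) {
    // line: count the newlines that precede the offset
    auto const upto = text.begin() + static_cast<std::ptrdiff_t>(offset);
    std::size_t const line =
        1 + static_cast<std::size_t>(std::count(text.begin(), upto, '\n'));

    // start of the current line: one past the last newline before the offset
    std::size_t line_start = 0;
    if (offset > 0) {
        std::size_t const nl = text.rfind('\n', offset - 1);
        if (nl != std::string::npos) line_start = nl + 1;
    }
    return {line, 1 + (offset - line_start)};
}
```

Columns are counted in bytes, so a multi-byte UTF-8 character such as the arrow in the samples advances the column by more than one.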
    


    ¹ Boost Spirit: "Semantic actions are evil"?
