boost::spirit: access position iterator from semantic actions


Question

Let's say I have code like this (line numbers for reference):

1:
2:function FuncName_1 {
3:    var Var_1 = 3;
4:    var  Var_2 = 4;
5:    ...

I want to write a grammar that parses such text and puts information about all identifiers (function and variable names) into a tree (utree?). Each node should preserve line_num, column_num and the symbol value. Example:

root: FuncName_1 (line:2,col:10)
  children[0]: Var_1 (line:3, col:8)
  children[1]: Var_2 (line:4, col:9)

I want to put it into a tree because I plan to traverse that tree, and for each node I must know its 'context' (all parent nodes of the current node).

E.g., while processing the node for Var_1, I must know that this is the name of a local variable of function FuncName_1 (which is itself processed as a node, one level up).
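The 'context' requirement described above does not depend on Spirit or utree. As a minimal sketch, assuming a hypothetical `Node` struct and helper names (`collect_context`, `context_paths`) that are not part of the question's code, a depth-first walk can carry a stack of parent symbols so that every node sees all of its ancestors:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type (an assumption, not from the question):
// a symbol plus the position where it was found.
struct Node {
    std::string symbol;
    unsigned line;
    unsigned column;
    std::vector<Node> children;
};

// Depth-first walk. `parents` holds the symbols of all ancestors of the
// node currently being visited; `out` receives one "a/b/c" path per node.
void collect_context(Node const& node,
                     std::vector<std::string>& parents,
                     std::vector<std::string>& out)
{
    std::string path;
    for (std::size_t i = 0; i < parents.size(); ++i)
        path += parents[i] + "/";
    out.push_back(path + node.symbol);

    parents.push_back(node.symbol);   // this node is context for its children
    for (std::size_t i = 0; i < node.children.size(); ++i)
        collect_context(node.children[i], parents, out);
    parents.pop_back();               // leave this scope again
}

std::vector<std::string> context_paths(Node const& root)
{
    std::vector<std::string> parents, out;
    collect_context(root, parents, out);
    assert(parents.empty());          // the walk leaves the stack balanced
    return out;
}
```

Calling `context_paths` on a root `FuncName_1` with children `Var_1` and `Var_2` yields the paths `FuncName_1`, `FuncName_1/Var_1` and `FuncName_1/Var_2`. The same idea should carry over to whatever AST type the parser ends up producing.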

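I can't figure out a few things:

  1. Can this be done in Spirit with semantic actions and utree? Or should I use variant<> trees?
  2. How do I pass those three pieces of information (column, line, symbol_name) to the node at the same time? I know I must use pos_iterator as the iterator type for the grammar, but how do I access that information in a semantic action?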

I'm a newbie in Boost, so I have read the Spirit documentation over and over and tried to google my problems, but somehow I cannot put all the pieces together to find a solution. It seems nobody has had a use case like mine before (or I'm just not able to find it). The only solutions I found that use a position iterator are about parse error handling, which is not the case I'm interested in. The code below only parses the text I was talking about; I don't know how to move forward with it.

  #include <boost/spirit/include/qi.hpp>
  #include <boost/spirit/include/support_line_pos_iterator.hpp>
  #include <iostream>
  #include <string>

  namespace qi = boost::spirit::qi;
  typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;

  template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
  struct ParseGrammar: public qi::grammar<Iterator, Skipper>
  {
        ParseGrammar():ParseGrammar::base_type(SourceCode)
        {
           using namespace qi;
           KeywordFunction = lit("function");
           KeywordVar    = lit("var");
           SemiColon     = lit(';');

           Identifier = lexeme [alpha >> *(alnum | '_')];
           VarAssignemnt = KeywordVar >> Identifier >> char_('=') >> int_ >> SemiColon;
           SourceCode = KeywordFunction >> Identifier >> '{' >> *VarAssignemnt >> '}';
        }

        qi::rule<Iterator, Skipper> SourceCode;
        qi::rule<Iterator > KeywordFunction;
        qi::rule<Iterator,  Skipper> VarAssignemnt;
        qi::rule<Iterator> KeywordVar;
        qi::rule<Iterator> SemiColon;
        qi::rule<Iterator > Identifier;
  };

  int main()
  {
     std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 = 4; }";

     pos_iterator_t first(content.begin()), iter = first, last(content.end());
     ParseGrammar<pos_iterator_t> resolver;    //  Our parser
     bool ok = phrase_parse(iter,
                            last,
                            resolver,
                            qi::space);

     std::cout << std::boolalpha;
     std::cout << "\nok : " << ok << std::endl;
     std::cout << "full   : " << (iter == last) << std::endl;
     if(ok && iter == last)
     {
        std::cout << "OK: Parsing fully succeeded\n\n";
     }
     else
     {
        int line   = get_line(iter);
        int column = get_column(first, iter);
        std::cout << "-------------------------\n";
        std::cout << "ERROR: Parsing failed or not complete\n";
        std::cout << "stopped at: " << line  << ":" << column << "\n";
        std::cout << "remaining: '" << std::string(iter, last) << "'\n";
        std::cout << "-------------------------\n";
     }
     return 0;
  }

Recommended answer

This has been a fun exercise, in which I finally put together a working demo of on_success[1] to annotate AST nodes.

Let's assume we want an AST like:

namespace ast
{
struct LocationInfo {
    unsigned line, column, length;
};

struct Identifier     : LocationInfo {
    std::string name;
};

struct VarAssignment  : LocationInfo {
    Identifier id;
    int value;
};

struct SourceCode     : LocationInfo {
    Identifier function;
    std::vector<VarAssignment> assignments;
};
}

I know, 'location information' is probably overkill for the SourceCode node, but you know... Anyway, to make it easy to assign attributes to these nodes without requiring semantic actions or lots of specially crafted constructors:

#include <boost/fusion/adapted/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(ast::Identifier,    (std::string, name))
BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode,    (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))

There. Now we can declare the rules to expose these attributes:

qi::rule<Iterator, ast::SourceCode(),    Skipper> SourceCode;
qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
qi::rule<Iterator, ast::Identifier()>         Identifier;
// no skipper, no attributes:
qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;

We don't (essentially) modify the grammar at all: attribute propagation is "just automatic"[2]:

KeywordFunction = lit("function");
KeywordVar      = lit("var");
SemiColon       = lit(';');

Identifier      = as_string [ alpha >> *(alnum | char_("_")) ];
VarAssignment   = KeywordVar >> Identifier >> '=' >> int_ >> SemiColon; 
SourceCode      = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';

The magic

How do we get the source location information attached to our nodes?

auto set_location_info = annotate(_val, _1, _3);
on_success(Identifier,    set_location_info);
on_success(VarAssignment, set_location_info);
on_success(SourceCode,    set_location_info);
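Note that `annotate(_val, _1, _3)` does not call anything when the grammar is constructed; it builds a deferred ("lazy") call that runs later with whatever arguments the success handler supplies. As far as I understand Spirit's success handler, `_1` is the start of the match, `_2` the end of the input and `_3` the position just past the match, which is why `_2` is skipped. The standard library offers a rough analogy in `std::bind` with placeholders. The sketch below is only that analogy, not Phoenix; `record_range` and `demo_lazy` are made-up names:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for the annotation functor: records the range it was called with.
struct record_range {
    void operator()(std::string& val, int first, int last) const {
        val = std::to_string(first) + ".." + std::to_string(last);
    }
};

std::string demo_lazy()
{
    using namespace std::placeholders;
    std::string target = "untouched";

    // Build the deferred call: bind `target` now, and take the 1st and 3rd
    // argument supplied at call time (the 2nd is ignored, like _2 above).
    auto lazy = std::bind(record_range(), std::ref(target), _1, _3);
    assert(target == "untouched");   // nothing has run yet

    lazy(10, 99, 15);                // start = 10, end of input = 99, end of match = 15
    return target;
}
```

Here `demo_lazy()` returns `"10..15"`: the functor only runs when `lazy` is invoked, and the middle argument never reaches it.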

Now, annotate is just a lazy version of a callable that is defined as:

template<typename It>
struct annotation_f {
    typedef void result_type;

    annotation_f(It first) : first(first) {}
    It const first;

    template<typename Val, typename First, typename Last>
    void operator()(Val& v, First f, Last l) const {
        do_annotate(v, f, l, first);
    }
  private:
    void static do_annotate(ast::LocationInfo& li, It f, It l, It first) {
        using std::distance;
        li.line   = get_line(f);
        li.column = get_column(first, f);
        li.length = distance(f, l);
    }
    static void do_annotate(...) { }
};

Because of the way get_column works, the functor is stateful (it remembers the start iterator)[3]. As you can see, do_annotate only does work for things that derive from LocationInfo; for any other attribute type the catch-all overload is selected and nothing happens.
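The "only for types deriving from LocationInfo" selection can be spelled in more than one way. The answer uses an ellipsis overload as the fallback; the sketch below shows an alternative based on tag dispatch with `std::is_base_of`, purely for illustration. It uses plain unsigned values instead of iterators, and the names `annotate_impl` and `annotate_if_located` are assumptions, not part of the answer:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct LocationInfo { unsigned line = 0, column = 0, length = 0; };
struct Identifier : LocationInfo { std::string name; };

// Selected when T derives from LocationInfo: fill in the position fields.
inline bool annotate_impl(LocationInfo& li, unsigned line, unsigned column,
                          unsigned length, std::true_type)
{
    li.line = line;
    li.column = column;
    li.length = length;
    return true;
}

// Selected for every other attribute type: deliberately a no-op.
template <typename T>
bool annotate_impl(T&, unsigned, unsigned, unsigned, std::false_type)
{
    return false;
}

// Entry point: the trait picks one of the two overloads at compile time.
template <typename T>
bool annotate_if_located(T& value, unsigned line, unsigned column, unsigned length)
{
    return annotate_impl(value, line, column, length,
                         std::is_base_of<LocationInfo, T>());
}
```

With this, `annotate_if_located(id, 3, 9, 5)` on an `Identifier` sets its three fields and returns true, while the same call on a plain `int` compiles and returns false. The ellipsis version in the answer is shorter; the tag-dispatch version avoids passing objects through a C-style variadic parameter.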

Now, the proof of the pudding:

std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 = 4; }";

pos_iterator_t first(content.begin()), iter = first, last(content.end());
ParseGrammar<pos_iterator_t> resolver(first);    //  Our parser

ast::SourceCode program;
bool ok = phrase_parse(iter,
        last,
        resolver,
        qi::space,
        program);

std::cout << std::boolalpha;
std::cout << "ok  : " << ok << std::endl;
std::cout << "full: " << (iter == last) << std::endl;
if(ok && iter == last)
{
    std::cout << "OK: Parsing fully succeeded\n\n";

    std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
    for (auto const& va : program.assignments)
        std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
}
else
{
    int line   = get_line(iter);
    int column = get_column(first, iter);
    std::cout << "-------------------------\n";
    std::cout << "ERROR: Parsing failed or not complete\n";
    std::cout << "stopped at: " << line  << ":" << column << "\n";
    std::cout << "remaining: '" << std::string(iter, last) << "'\n";
    std::cout << "-------------------------\n";
}

Prints:

ok  : true
full: true
OK: Parsing fully succeeded

Function name: FuncName_1 (see L1:1:56)
variable Var_1 assigned value 3 at L2:3:14
variable Var_2 assigned value 4 at L3:3:15

Full demo program

See it live on Coliru

It also shows:

  • error handling, e.g.:

Error: expecting "=" in line 3: 

var  Var_2 - 4; }
           ^---- here
ok  : false
full: false
-------------------------
ERROR: Parsing failed or not complete
stopped at: 1:1
remaining: 'function FuncName_1 {
var Var_1 = 3;
var  Var_2 - 4; }'
-------------------------

  • BOOST_SPIRIT_DEBUG macros

    //#define BOOST_SPIRIT_DEBUG
    #define BOOST_SPIRIT_USE_PHOENIX_V3
    #include <boost/fusion/adapted/struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/spirit/include/support_line_pos_iterator.hpp>
    #include <iomanip>
    #include <iostream>
    #include <string>
    #include <vector>
    
    namespace qi = boost::spirit::qi;
    namespace phx= boost::phoenix;
    
    typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;
    
    namespace ast
    {
        namespace manip { struct LocationInfoPrinter; }
    
        struct LocationInfo {
            unsigned line, column, length;
            manip::LocationInfoPrinter printLoc() const;
        };
    
        struct Identifier     : LocationInfo {
            std::string name;
        };
    
        struct VarAssignment  : LocationInfo {
            Identifier id;
            int value;
        };
    
        struct SourceCode     : LocationInfo {
            Identifier function;
            std::vector<VarAssignment> assignments;
        };
    
        ///////////////////////////////////////////////////////////////////////////
        // Completely unnecessary tweak to get a "poor man's" io manipulator going
        // so we can do `std::cout << x.printLoc()` on types of `x` deriving from
        // LocationInfo
        namespace manip {
            struct LocationInfoPrinter {
                LocationInfoPrinter(LocationInfo const& ref) : ref(ref) {}
                LocationInfo const& ref;
                friend std::ostream& operator<<(std::ostream& os, LocationInfoPrinter const& lip) {
                    return os << lip.ref.line << ':' << lip.ref.column << ':' << lip.ref.length;
                }
            };
        }
    
        manip::LocationInfoPrinter LocationInfo::printLoc() const { return { *this }; }
        // feel free to disregard this hack
        ///////////////////////////////////////////////////////////////////////////
    }
    
    BOOST_FUSION_ADAPT_STRUCT(ast::Identifier,    (std::string, name))
    BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
    BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode,    (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))
    
    struct error_handler_f {
        typedef qi::error_handler_result result_type;
        template<typename T1, typename T2, typename T3, typename T4>
            qi::error_handler_result operator()(T1 b, T2 e, T3 where, T4 const& what) const {
                std::cerr << "Error: expecting " << what << " in line " << get_line(where) << ": \n"
                    << std::string(b,e) << "\n"
                    << std::setw(std::distance(b, where)) << '^' << "---- here\n";
                return qi::fail;
            }
    };
    
    template<typename It>
    struct annotation_f {
        typedef void result_type;
    
        annotation_f(It first) : first(first) {}
        It const first;
    
        template<typename Val, typename First, typename Last>
        void operator()(Val& v, First f, Last l) const {
            do_annotate(v, f, l, first);
        }
      private:
        void static do_annotate(ast::LocationInfo& li, It f, It l, It first) {
            using std::distance;
            li.line   = get_line(f);
            li.column = get_column(first, f);
            li.length = distance(f, l);
        }
        static void do_annotate(...) {}
    };
    
    template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
    struct ParseGrammar: public qi::grammar<Iterator, ast::SourceCode(), Skipper>
    {
        ParseGrammar(Iterator first) : 
            ParseGrammar::base_type(SourceCode),
            annotate(first)
        {
            using namespace qi;
            KeywordFunction = lit("function");
            KeywordVar      = lit("var");
            SemiColon       = lit(';');
    
            Identifier      = as_string [ alpha >> *(alnum | char_("_")) ];
            VarAssignment   = KeywordVar > Identifier > '=' > int_ > SemiColon; // note: expectation points
            SourceCode      = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';
    
            on_error<fail>(VarAssignment, handler(_1, _2, _3, _4));
            on_error<fail>(SourceCode, handler(_1, _2, _3, _4));
    
            auto set_location_info = annotate(_val, _1, _3);
            on_success(Identifier,    set_location_info);
            on_success(VarAssignment, set_location_info);
            on_success(SourceCode,    set_location_info);
    
            BOOST_SPIRIT_DEBUG_NODES((KeywordFunction)(KeywordVar)(SemiColon)(Identifier)(VarAssignment)(SourceCode))
        }
    
        phx::function<error_handler_f> handler;
        phx::function<annotation_f<Iterator>> annotate;
    
        qi::rule<Iterator, ast::SourceCode(),    Skipper> SourceCode;
        qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
        qi::rule<Iterator, ast::Identifier()>             Identifier;
        // no skipper, no attributes:
        qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;
    };
    
    int main()
    {
        std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 - 4; }";
    
        pos_iterator_t first(content.begin()), iter = first, last(content.end());
        ParseGrammar<pos_iterator_t> resolver(first);    //  Our parser
    
        ast::SourceCode program;
        bool ok = phrase_parse(iter,
                last,
                resolver,
                qi::space,
                program);
    
        std::cout << std::boolalpha;
        std::cout << "ok  : " << ok << std::endl;
        std::cout << "full: " << (iter == last) << std::endl;
        if(ok && iter == last)
        {
            std::cout << "OK: Parsing fully succeeded\n\n";
    
            std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
            for (auto const& va : program.assignments)
                std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
        }
        else
        {
            int line   = get_line(iter);
            int column = get_column(first, iter);
            std::cout << "-------------------------\n";
            std::cout << "ERROR: Parsing failed or not complete\n";
            std::cout << "stopped at: " << line  << ":" << column << "\n";
            std::cout << "remaining: '" << std::string(iter, last) << "'\n";
            std::cout << "-------------------------\n";
        }
        return 0;
    }
    


    [1] sadly un(der)documented, except for the conjure sample(s)

    [2] well, I used as_string to get proper assignment to Identifier without too much work

    [3] There could be smarter ways to do this in terms of performance, but for now, let's keep it simple
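As one possible direction for footnote [3] (an assumption on my part, not something the answer implements): instead of walking back through the input for every column lookup, the line starts can be indexed once and each offset located by binary search. The sketch works on plain byte offsets rather than Spirit iterators, treats only '\n' as a line break, counts a tab as one column, and all names in it (`Position`, `index_line_starts`, `locate`) are made up:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Position { std::size_t line, column; };   // both 1-based

// Records the offset at which every line starts. Offset 0 starts line 1.
std::vector<std::size_t> index_line_starts(std::string const& text)
{
    std::vector<std::size_t> starts(1, 0);
    for (std::size_t i = 0; i < text.size(); ++i)
        if (text[i] == '\n')
            starts.push_back(i + 1);
    return starts;
}

// Maps a byte offset to line/column via binary search over the line starts.
Position locate(std::vector<std::size_t> const& starts, std::size_t offset)
{
    // First line start greater than `offset`; the line containing `offset`
    // is the one just before it.
    std::vector<std::size_t>::const_iterator it =
        std::upper_bound(starts.begin(), starts.end(), offset);
    assert(it != starts.begin());                // starts[0] == 0 <= offset
    std::size_t line_index = static_cast<std::size_t>(it - starts.begin()) - 1;
    Position p = { line_index + 1, offset - starts[line_index] + 1 };
    return p;
}
```

For the sample input used above, `locate` reports FuncName_1 at 1:10, Var_1 at 2:6 and Var_2 at 3:7. Building the index is linear in the input size and each lookup is logarithmic in the number of lines.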
