boost::spirit access position iterator from semantic actions


Problem description

Let's say I have code like this (line numbers for reference):

1:
2:function FuncName_1 {
3:    var Var_1 = 3;
4:    var  Var_2 = 4;
5:    ...

I want to write a grammar that parses such text and puts information about all identifiers (function and variable names) into a tree (utree?). Each node should preserve line_num, column_num and the symbol value. Example:

root: FuncName_1 (line:2,col:10)
  children[0]: Var_1 (line:3, col:8)
  children[1]: Var_2 (line:4, col:9)

I want to put it into a tree because I plan to traverse that tree, and for each node I must know the 'context' (all parent nodes of the current node).

E.g., while processing the node for Var_1, I must know that this is the name of a local variable of function FuncName_1 (which is itself being processed as a node, one level up).

I cannot figure out a few things:

  1. Can this be done in Spirit with semantic actions and utrees? Or should I use variant<> trees?
  2. How do I pass those three pieces of information (column, line, symbol_name) to the node at the same time? I know I must use pos_iterator as the iterator type for the grammar, but how do I access that information in a semantic action?

I'm a newbie in Boost, so I have read the Spirit documentation over and over and tried to google my problems, but I somehow cannot put all the pieces together to find the solution. It seems nobody has had a use case like mine before (or I'm just not able to find it). The only solutions I found that use a position iterator deal with parse error handling, which is not the case I'm interested in. The code that only parses the text I was talking about is below, but I don't know how to move forward with it.

  #include <boost/spirit/include/qi.hpp>
  #include <boost/spirit/include/support_line_pos_iterator.hpp>
  #include <iostream>
  #include <string>

  namespace qi = boost::spirit::qi;
  typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;

  template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
  struct ParseGrammar: public qi::grammar<Iterator, Skipper>
  {
        ParseGrammar():ParseGrammar::base_type(SourceCode)
        {
           using namespace qi;
           KeywordFunction = lit("function");
           KeywordVar    = lit("var");
           SemiColon     = lit(';');

           Identifier = lexeme [alpha >> *(alnum | '_')];
           VarAssignemnt = KeywordVar >> Identifier >> char_('=') >> int_ >> SemiColon;
           SourceCode = KeywordFunction >> Identifier >> '{' >> *VarAssignemnt >> '}';
        }

        qi::rule<Iterator, Skipper> SourceCode;
        qi::rule<Iterator > KeywordFunction;
        qi::rule<Iterator,  Skipper> VarAssignemnt;
        qi::rule<Iterator> KeywordVar;
        qi::rule<Iterator> SemiColon;
        qi::rule<Iterator > Identifier;
  };

  int main()
  {
     std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 = 4; }";

     pos_iterator_t first(content.begin()), iter = first, last(content.end());
     ParseGrammar<pos_iterator_t> resolver;    //  Our parser
     bool ok = phrase_parse(iter,
                            last,
                            resolver,
                            qi::space);

     std::cout << std::boolalpha;
     std::cout << "\nok : " << ok << std::endl;
     std::cout << "full   : " << (iter == last) << std::endl;
     if(ok && iter == last)
     {
        std::cout << "OK: Parsing fully succeeded\n\n";
     }
     else
     {
        int line   = get_line(iter);
        int column = get_column(first, iter);
        std::cout << "-------------------------\n";
        std::cout << "ERROR: Parsing failed or not complete\n";
        std::cout << "stopped at: " << line  << ":" << column << "\n";
        std::cout << "remaining: '" << std::string(iter, last) << "'\n";
        std::cout << "-------------------------\n";
     }
     return 0;
  }

Solution

This has been a fun exercise, where I finally put together a working demo of on_success[1] to annotate AST nodes.

Let's assume we want an AST like:

namespace ast
{
struct LocationInfo {
    unsigned line, column, length;
};

struct Identifier     : LocationInfo {
    std::string name;
};

struct VarAssignment  : LocationInfo {
    Identifier id;
    int value;
};

struct SourceCode     : LocationInfo {
    Identifier function;
    std::vector<VarAssignment> assignments;
};
}

I know, 'location information' is probably overkill for the SourceCode node, but you know... Anyways, to make it easy to assign attributes to these nodes without requiring semantic actions or lots of specifically crafted constructors:

#include <boost/fusion/adapted/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(ast::Identifier,    (std::string, name))
BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode,    (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))

There. Now we can declare the rules to expose these attributes:

qi::rule<Iterator, ast::SourceCode(),    Skipper> SourceCode;
qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
qi::rule<Iterator, ast::Identifier()>         Identifier;
// no skipper, no attributes:
qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;

We don't (essentially) modify the grammar, at all: attribute propagation is "just automatic"[2] :

KeywordFunction = lit("function");
KeywordVar      = lit("var");
SemiColon       = lit(';');

Identifier      = as_string [ alpha >> *(alnum | char_("_")) ];
VarAssignment   = KeywordVar >> Identifier >> '=' >> int_ >> SemiColon; 
SourceCode      = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';

The magic

How do we get the source location information attached to our nodes?

auto set_location_info = annotate(_val, _1, _3);
on_success(Identifier,    set_location_info);
on_success(VarAssignment, set_location_info);
on_success(SourceCode,    set_location_info);

Now, annotate is just a lazy version of a callable that is defined as:

template<typename It>
struct annotation_f {
    typedef void result_type;

    annotation_f(It first) : first(first) {}
    It const first;

    template<typename Val, typename First, typename Last>
    void operator()(Val& v, First f, Last l) const {
        do_annotate(v, f, l, first);
    }
  private:
    static void do_annotate(ast::LocationInfo& li, It f, It l, It first) {
        using std::distance;
        li.line   = get_line(f);
        li.column = get_column(first, f);
        li.length = distance(f, l);
    }
    static void do_annotate(...) { }
};
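
The two do_annotate overloads above rely on ordinary overload resolution: a C-style variadic parameter list is the worst possible match, so it is chosen only when the LocationInfo& overload is not viable. The following is a minimal standalone sketch of that dispatch, assuming simplified stand-in types; annotate_sketch and its bool return value are hypothetical and exist only to make the chosen overload observable.

```cpp
#include <string>

// Hypothetical stand-ins for the AST types; only the inheritance matters here.
struct LocationInfo { unsigned line = 0, column = 0, length = 0; };
struct Identifier : LocationInfo { std::string name; };

// Selected whenever the first argument is, or derives from, LocationInfo.
inline bool annotate_sketch(LocationInfo& li, unsigned line, unsigned column) {
    li.line = line;
    li.column = column;
    return true;   // the node was annotated
}

// Variadic fallback: ranks below every other conversion, so it is picked
// only when the overload above cannot be called. Note that passing
// non-trivial class types through `...` is only conditionally supported,
// so this sketch exercises the fallback with a plain int.
inline bool annotate_sketch(...) {
    return false;  // nothing to annotate
}
```

Calling annotate_sketch with an Identifier fills in the base-class fields, while calling it with an int compiles and does nothing, which mirrors how the empty do_annotate(...) overload swallows attributes that carry no location information.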

Due to the way in which get_column works, the functor is stateful (as it remembers the start iterator)[3]. As you can see, do_annotate just accepts anything that derives from LocationInfo.
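
To see why a start iterator has to be remembered, it may help to look at what computing a column involves: an iterator into the text does not know where its line began, so something has to scan, and the start of the input bounds that scan. The sketch below is a standard-library-only illustration, not Spirit's implementation; Position and locate are hypothetical names, and it ignores tab expansion.

```cpp
#include <string>

struct Position { unsigned line, column; };

// Hypothetical helper: derive a 1-based line and column for `pos`
// by walking forward from the start of the input.
inline Position locate(std::string::const_iterator first,
                       std::string::const_iterator pos) {
    Position p{1, 1};
    for (auto it = first; it != pos; ++it) {
        if (*it == '\n') {
            ++p.line;       // a newline starts the next line...
            p.column = 1;   // ...and resets the column
        } else {
            ++p.column;
        }
    }
    return p;
}
```

For the sample input, the first var keyword sits on line 2 after one leading space, so locate reports line 2, column 2 for it. Without `first` there would be nothing to measure against, which is the state the functor carries around.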

Now, the proof of the pudding:

std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 = 4; }";

pos_iterator_t first(content.begin()), iter = first, last(content.end());
ParseGrammar<pos_iterator_t> resolver(first);    //  Our parser

ast::SourceCode program;
bool ok = phrase_parse(iter,
        last,
        resolver,
        qi::space,
        program);

std::cout << std::boolalpha;
std::cout << "ok  : " << ok << std::endl;
std::cout << "full: " << (iter == last) << std::endl;
if(ok && iter == last)
{
    std::cout << "OK: Parsing fully succeeded\n\n";

    std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
    for (auto const& va : program.assignments)
        std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
}
else
{
    int line   = get_line(iter);
    int column = get_column(first, iter);
    std::cout << "-------------------------\n";
    std::cout << "ERROR: Parsing failed or not complete\n";
    std::cout << "stopped at: " << line  << ":" << column << "\n";
    std::cout << "remaining: '" << std::string(iter, last) << "'\n";
    std::cout << "-------------------------\n";
}

This prints:

ok  : true
full: true
OK: Parsing fully succeeded

Function name: FuncName_1 (see L1:1:56)
variable Var_1 assigned value 3 at L2:3:14
variable Var_2 assigned value 4 at L3:3:15
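
Once the AST is populated, the 'context' the question asks about falls out of the structure: every VarAssignment is reached through its enclosing SourceCode node, and each Identifier should carry its own location because on_success is attached to the Identifier rule as well. The following is a minimal sketch of such a traversal, with the AST restated without Boost so that it stands alone; describe_locals is a hypothetical helper, not part of the demo program.

```cpp
#include <string>
#include <vector>

// Same shape as the AST above, restated so the sketch is self-contained.
struct LocationInfo { unsigned line = 0, column = 0, length = 0; };
struct Identifier : LocationInfo { std::string name; };
struct VarAssignment : LocationInfo { Identifier id; int value = 0; };
struct SourceCode : LocationInfo {
    Identifier function;
    std::vector<VarAssignment> assignments;
};

// Hypothetical helper: describe each variable together with the function
// that encloses it, using the identifier's own line and column.
inline std::vector<std::string> describe_locals(SourceCode const& program) {
    std::vector<std::string> out;
    for (auto const& va : program.assignments) {
        out.push_back(va.id.name + " is local to " + program.function.name +
                      " (line " + std::to_string(va.id.line) +
                      ", column " + std::to_string(va.id.column) + ")");
    }
    return out;
}
```

For deeper nesting, the same idea generalizes to passing a stack of ancestor nodes down a recursive visit function.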

Full Demo Program

See it Live On Coliru

Also showing:

  • error handling, e.g.:

    Error: expecting "=" in line 3: 
    
    var  Var_2 - 4; }
               ^---- here
    ok  : false
    full: false
    -------------------------
    ERROR: Parsing failed or not complete
    stopped at: 1:1
    remaining: 'function FuncName_1 {
    var Var_1 = 3;
    var  Var_2 - 4; }'
    -------------------------
    

  • BOOST_SPIRIT_DEBUG macros

  • A bit of a hacky way to conveniently stream the LocationInfo part of any AST node, sorry :)

//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;

typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;

namespace ast
{
    namespace manip { struct LocationInfoPrinter; }

    struct LocationInfo {
        unsigned line, column, length;
        manip::LocationInfoPrinter printLoc() const;
    };

    struct Identifier     : LocationInfo {
        std::string name;
    };

    struct VarAssignment  : LocationInfo {
        Identifier id;
        int value;
    };

    struct SourceCode     : LocationInfo {
        Identifier function;
        std::vector<VarAssignment> assignments;
    };

    ///////////////////////////////////////////////////////////////////////////
    // Completely unnecessary tweak to get a "poor man's" io manipulator going
    // so we can do `std::cout << x.printLoc()` on types of `x` deriving from
    // LocationInfo
    namespace manip {
        struct LocationInfoPrinter {
            LocationInfoPrinter(LocationInfo const& ref) : ref(ref) {}
            LocationInfo const& ref;
            friend std::ostream& operator<<(std::ostream& os, LocationInfoPrinter const& lip) {
                return os << lip.ref.line << ':' << lip.ref.column << ':' << lip.ref.length;
            }
        };
    }

    manip::LocationInfoPrinter LocationInfo::printLoc() const { return { *this }; }
    // feel free to disregard this hack
    ///////////////////////////////////////////////////////////////////////////
}

BOOST_FUSION_ADAPT_STRUCT(ast::Identifier,    (std::string, name))
BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode,    (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))

struct error_handler_f {
    typedef qi::error_handler_result result_type;
    template<typename T1, typename T2, typename T3, typename T4>
        qi::error_handler_result operator()(T1 b, T2 e, T3 where, T4 const& what) const {
            std::cerr << "Error: expecting " << what << " in line " << get_line(where) << ": \n" 
                << std::string(b,e) << "\n"
                << std::setw(std::distance(b, where)) << '^' << "---- here\n";
            return qi::fail;
        }
};

template<typename It>
struct annotation_f {
    typedef void result_type;

    annotation_f(It first) : first(first) {}
    It const first;

    template<typename Val, typename First, typename Last>
    void operator()(Val& v, First f, Last l) const {
        do_annotate(v, f, l, first);
    }
  private:
    static void do_annotate(ast::LocationInfo& li, It f, It l, It first) {
        using std::distance;
        li.line   = get_line(f);
        li.column = get_column(first, f);
        li.length = distance(f, l);
    }
    static void do_annotate(...) {}
};

template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, ast::SourceCode(), Skipper>
{
    ParseGrammar(Iterator first) : 
        ParseGrammar::base_type(SourceCode),
        annotate(first)
    {
        using namespace qi;
        KeywordFunction = lit("function");
        KeywordVar      = lit("var");
        SemiColon       = lit(';');

        Identifier      = as_string [ alpha >> *(alnum | char_("_")) ];
        VarAssignment   = KeywordVar > Identifier > '=' > int_ > SemiColon; // note: expectation points
        SourceCode      = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';

        on_error<fail>(VarAssignment, handler(_1, _2, _3, _4));
        on_error<fail>(SourceCode, handler(_1, _2, _3, _4));

        auto set_location_info = annotate(_val, _1, _3);
        on_success(Identifier,    set_location_info);
        on_success(VarAssignment, set_location_info);
        on_success(SourceCode,    set_location_info);

        BOOST_SPIRIT_DEBUG_NODES((KeywordFunction)(KeywordVar)(SemiColon)(Identifier)(VarAssignment)(SourceCode))
    }

    phx::function<error_handler_f> handler;
    phx::function<annotation_f<Iterator>> annotate;

    qi::rule<Iterator, ast::SourceCode(),    Skipper> SourceCode;
    qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
    qi::rule<Iterator, ast::Identifier()>             Identifier;
    // no skipper, no attributes:
    qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;
};

int main()
{
    std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var  Var_2 - 4; }";

    pos_iterator_t first(content.begin()), iter = first, last(content.end());
    ParseGrammar<pos_iterator_t> resolver(first);    //  Our parser

    ast::SourceCode program;
    bool ok = phrase_parse(iter,
            last,
            resolver,
            qi::space,
            program);

    std::cout << std::boolalpha;
    std::cout << "ok  : " << ok << std::endl;
    std::cout << "full: " << (iter == last) << std::endl;
    if(ok && iter == last)
    {
        std::cout << "OK: Parsing fully succeeded\n\n";

        std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
        for (auto const& va : program.assignments)
            std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
    }
    else
    {
        int line   = get_line(iter);
        int column = get_column(first, iter);
        std::cout << "-------------------------\n";
        std::cout << "ERROR: Parsing failed or not complete\n";
        std::cout << "stopped at: " << line  << ":" << column << "\n";
        std::cout << "remaining: '" << std::string(iter, last) << "'\n";
        std::cout << "-------------------------\n";
    }
    return 0;
}
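
The caret line printed by error_handler_f relies on std::setw right-aligning the next item in a field of the given width, so the '^' character lands in that column. Here is a small standalone sketch of just that trick; caret_line is a hypothetical helper that returns the marker line as a string instead of writing to std::cerr.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: build the marker line for a 1-based column.
inline std::string caret_line(int column) {
    std::ostringstream os;
    // setw(n) pads the next formatted item on the left to width n,
    // which places the single caret character in column n.
    os << std::setw(column) << '^' << "---- here";
    return os.str();
}
```

Note that the width applies only to the very next item, which is why the "---- here" suffix is not padded.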


[1] sadly un(der)documented, except for the conjure sample(s)

[2] well, I used as_string to get proper assignment to Identifier without too much work

[3] There could be smarter ways about this in terms of performance, but for now, let's keep it simple
