优化boost :: spirit :: qi解析器 [英] Optimizing a boost::spirit::qi parser

查看:81
本文介绍了优化boost :: spirit :: qi解析器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个解析器,它基本上以给定表达式的运算符优先级来打印出堆栈计算机的动作.我的目标是尽可能优化速度.我已经阅读了有关气管优化的文章,该文章提供了此示例代码.我了解主要文章中描述的优化要点,但是我不清楚如何将其集成到我的代码中.

I have a parser that basically prints out the actions of a stack machine with my operator precedence given some expression. My goal is to optimize for speed as much as possible. I have read an article concerning qi optimizations that provides this example code. I understand the gist of the optimizations described in the main article, however I am unclear how to integrate this into my code.

这是我的解析器的以下工作示例.我已经尝试通过使用raw[]提供基本迭代器对其进行某种程度的优化.必须给phoenix操作调用字符串或迭代器,以便它们可以创建字符串.这些函数的实际版本并非微不足道,并且它们的功能尚无法在解析时进行评估:

Here is a following working example of my parser. I have already tried to optimize it somewhat by using raw[] to provide base iterators. The phoenix action calls must be given strings or iterators by which they can create strings; the real versions of these functions are not trivial and their functionality can not yet be evaluated in parsing-time:

#include <iostream>
#include <vector>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_char.hpp>
#include <boost/spirit/include/qi_parse.hpp>
#include <boost/spirit/include/phoenix_bind.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
using std::endl;
using std::cout;
using std::string;
using std::vector;

void fPushOp(const char* op){
  cout << "PushOp: " << op << endl;
}

void fPushInt(const boost::iterator_range<string::const_iterator>& my_str){
  cout << "PushInt: " << my_str << endl;
}

template<typename Iterator, typename Skipper = qi::space_type>
struct Calculator : public qi::grammar<Iterator, Skipper> {

  qi::rule<Iterator, Skipper>  
    expression, logical_or_expression, logical_and_expression, negate_expression, series_expression,
    single_expression, inclusive_or_expression, exclusive_or_expression, and_expression, equality_expression, 
    relational_expression, shift_expression, additive_expression, multiplicative_expression, 
    term, complement_factor, factor, result, integer, variable, variable_combo, word, prefix;

  qi::rule<Iterator> number;
  Calculator() : Calculator::base_type(result)
  {
    number = 
        qi::raw[
            ("0x" >> +qi::char_("0-9a-fA-F"))     
          | ("0b" >> +qi::char_("0-1"))
          | ("0" >>  +qi::char_("0-7"))
          | (+qi::char_("0-9"))
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    integer = 
          number
        | ('-' >> number) [phx::bind(&fPushOp, "OP_UNARY_MINUS")]
        ;

    variable =
          ((qi::alpha | qi::char_('_')) 
              >> *(qi::alnum | qi::char_('_')) 
              >> '['
              >>  (+(qi::alnum | qi::char_('_') | qi::char_(',')) 
                | ('\'' >> *~qi::char_('\'') >> '\'')) 
              >> ']')
        | ((qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_')))
        ;

    variable_combo =
        qi::raw [
          variable >> *(qi::char_('.') >> variable)
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    word = 
        qi::raw[
          variable
        ] [phx::bind(&fPushInt, qi::_1)]
        ;

    factor =
            ("ceil(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_CEIL")]
        |   ("wrap(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_WRAP")]
        |   ("abs(" >> expression >> ')')                                                       [phx::bind(&fPushOp, "OP_ABS")]
        |   ("count1(" >> expression >> ')')                                                    [phx::bind(&fPushOp, "OP_COUNT1")]
        |   ("pick(" >> expression >> ')')                                                      [phx::bind(&fPushOp, "OP_PICK")]
        |   ("defined(" >> expression >> ')')                                                   [phx::bind(&fPushOp, "OP_DEF")]
        |   ("string_equal(" >> word >> ',' >> word >> ')')                                     [phx::bind(&fPushOp, "OP_STREQ")]
        |   ("string_contains(" >> word >> ',' >> word >> ')')                                  [phx::bind(&fPushOp, "OP_STRCON")]
        |   ("lsl(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_LSL")]
        |   ("lsr(" >> single_expression >> ',' >> single_expression >> ')')                    [phx::bind(&fPushOp, "OP_LSR")]
        |   ("asr(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_ASR")]
        |   ("ror(" >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')   [phx::bind(&fPushOp, "OP_ROR")]
        |   ("rrx(" >> single_expression >> ',' >> single_expression >> ',' >> single_expression >> ',' >> number >> ')')[phx::bind(&fPushOp, "OP_RRX")]
        |   ('(' >> expression >> ')')
        |   variable_combo
        |   integer
        ;
    complement_factor = factor
        | ('~' >> factor) [phx::bind(&fPushOp, "OP_COMPLEMENT")]
        ;
    term = complement_factor
      >> *( (".." >> complement_factor) [phx::bind(&fPushOp, "OP_LEGER")]
          | ('\\' >> complement_factor) [phx::bind(&fPushOp, "OP_MASK")]
          ); 
    multiplicative_expression = term
      >> *( ('/' >> term) [phx::bind(&fPushOp, "OP_DIV")]
          | ('%' >> term) [phx::bind(&fPushOp, "OP_MOD")]
          | ('*' >> term) [phx::bind(&fPushOp, "OP_MUL")]
          );
    additive_expression = multiplicative_expression
      >> *( ('+' >> multiplicative_expression)  [phx::bind(&fPushOp, "OP_ADD")]
          | ('-' >> multiplicative_expression)  [phx::bind(&fPushOp, "OP_SUB")]
          );
    shift_expression = additive_expression
      >> *( (">>" >> additive_expression) [phx::bind(&fPushOp, "OP_SRL")]
          | ("<<" >> additive_expression) [phx::bind(&fPushOp, "OP_SLL")]
          );
    relational_expression = shift_expression
      >> *( ('<' >> shift_expression) [phx::bind(&fPushOp, "OP_LT")]
          | ('>' >> shift_expression) [phx::bind(&fPushOp, "OP_GT")]
          | ("<=" >> shift_expression)[phx::bind(&fPushOp, "OP_LET")]
          | (">=" >> shift_expression)[phx::bind(&fPushOp, "OP_GET")]
          );
    equality_expression = relational_expression 
      >> *( ("==" >> relational_expression)[phx::bind(&fPushOp, "OP_EQ")]
          | ("!=" >> relational_expression)[phx::bind(&fPushOp, "OP_NEQ")] 
          );
    and_expression = equality_expression 
      >> *(('&' >> equality_expression)     [phx::bind(&fPushOp, "OP_AND")]); 
    exclusive_or_expression = and_expression 
      >> *(('^' >> and_expression)          [phx::bind(&fPushOp, "OP_XOR")]); 
    inclusive_or_expression = exclusive_or_expression 
      >> *(('|' >> exclusive_or_expression) [phx::bind(&fPushOp, "OP_OR")]); 
    single_expression = inclusive_or_expression;
    series_expression = inclusive_or_expression 
      >> *((',' >> inclusive_or_expression) [phx::bind(&fPushOp, "OP_SERIES")]);
    negate_expression = series_expression
        | ('!' >> series_expression)        [phx::bind(&fPushOp, "OP_NEGATE")];
    logical_and_expression = negate_expression
      >> *(("&&" >> negate_expression)      [phx::bind(&fPushOp, "OP_LOGICAL_AND")]); 
    logical_or_expression = logical_and_expression 
      >> *(("||" >> logical_and_expression) [phx::bind(&fPushOp, "OP_LOGICAL_OR")]);
    expression = logical_or_expression;

    result = expression;
  }
};

int main(){
  Calculator<string::const_iterator> calc;
  const string expr("0xff0000 >> 3 && 3 + (!9) | (0,200)");
  cout << "Expression: " << expr << endl;

  string::const_iterator it = expr.begin();
  phrase_parse(it, expr.end(), calc, qi::space);

  cout << "Remaining: " << (string(it,expr.end())) << endl;
  return 0;
}

此外,我阅读了此pdf中与utree有关的幻灯片,并试图弄清楚如果可能的话,如何使用utree输出而不是语义动作,因为这样的事情显然是邪恶的.有人可以提供一个简单的示例或向我指出如何构造一个utree,然后将该utree送入堆栈计算机以按顺序打印出操作吗?

Additionally, I read the slides from this pdf concerning utree and am trying to figure out how, if possible, to use the utree output instead of semantic actions since apparently such things are evil. Can someone provide or point me to a simple example on how to construct a utree that can then be fed to a stack machine to print out operations in order?

推荐答案

优化取决于您要实现的目标.因此,我认为您过早地进行了优化.

The optimizations depend on what you want to achieve. Therefore, I think you're optimizing prematurely.

例如如果您以后要解释符号,则将variable_combo解析为raw[]输入序列没有任何意义(因为您必须再次解析变量combo ,甚至必须预料到那里的空格:"foo . bar .tux"是有效的变量组合).

E.g. parsing a variable_combo as a raw[] input sequence does not make any sense if you want to interpret the symbols later (because you'll have to parse the variable combo again, and you'll even have to anticipate whitespace in there: "foo . bar .tux" is a valid variable combo here).

我有很多关于优化Boost Spirit的文章(请在此处开始例如).快速观察:

I have quite a lot of posts in general dealing with optimizing Boost Spirit (start here e.g.). Quick observations here:

  • 考虑回溯下的正确性;语法分析为'ceil(3.7')时,您将获得:

  • consider correctness under backtracking; with your grammar parsing 'ceil(3.7') you'll get:

Expression: ceil(3.7)
PushInt: 3
PushInt: ceil
Remaining: (3.7)

请注意,如果解析失败,它将如何发出操作码.还要注意,它发出错误操作码

Note how this emits opcodes when parsing failed. Note also, it emits the wrong opcodes

  • 它按3而不是3.7
  • 是否将ceil作为PushInt推送?
  • it pushes 3 instead of 3.7
  • it pushes ceil as an PushInt?

因此,它不仅会检测到解析太晚的失败,还会忽略括号,无法发现函数调用并解析错误的数字.

So not only does it detect failure to parse too late, it just ignores the parentheses, fails to spot the function call and parses the number wrong.

关于过早的评估,我要指出一个流行的答案: Boost精神:语义行为是邪恶的"?

Regarding the premature evaluation, I'm going to point to this popular answer: Boost Spirit: "Semantic actions are evil"?

除此之外,我只是在怀疑您正在做过早的优化.考虑做

Other than that, I'm just confirming my suspicion that you're doing premature optimization. Consider doing

#define BOOST_SPIRIT_DEBUG

然后在语法构造函数中:

and then later, in the grammar constructor:

BOOST_SPIRIT_DEBUG_NODES(
        (expression)(logical_or_expression)(logical_and_expression)(negate_expression)(series_expression)(single_expression)
        (inclusive_or_expression)(exclusive_or_expression)(and_expression)(equality_expression)(relational_expression)
        (shift_expression)(additive_expression)(multiplicative_expression)(term)(complement_factor)(factor)(result)(integer)
        (variable)(variable_combo)(word)(prefix)

真正了解解析器的行为.

To really see how your parser behaves.

考虑qi ::符号,例如:

consider qi::symbols e.g.:

qi::symbols<char,const char*> unary_function;

unary_function.add
    ("ceil",    "OP_CEIL")
    ("wrap",    "OP_WRAP")
    ("abs",     "OP_ABS")
    ("count1",  "OP_COUNT1")
    ("pick",    "OP_PICK")
    ("defined", "OP_DEF");

unary_call = (unary_function >> "(" >> expression >> ')') [phx::bind(&fPushOp, qi::_1)];

  • 特征可能会给编译器内联后优化留下更多的潜力(与语义操作相反,因为模板实例化的许多级别可以掩盖某些情况,尤其是涉及到bind的情况)

  • traits might leave more potential for the compiler to optimize after inlining (as opposed to semantic actions, since the many levels of template instantiation can obscure some cases, especially when bind is involved)

    您可能希望在此处驱动运算符优先级表,如一些主调样本所示.使用规则层次结构来执行优先级的传统方法使语法变得复杂.这有两个主要缺点:

    You may want to make operator precedence table driven here, as some of the spirit samples show. The traditional way to use rule-hierarchy to enforce precedence that you've taken complicates the grammar. This has two key downsides:

    • 每个规则都引入了虚拟调度(Spirit X3将来可能不再具有此限制)
    • 您的语法太复杂了,您已经失去了概览(请参阅第一个项目符号)

    我建议

    1. 随着语义动作变得笨拙而在解析过程中远离评估,面对(后期)回溯(甚至解析器失败;很容易检测到后者),非常(非常)棘手.但是回溯也可能是良性的,并且很难纠正语义动作何时会产生副作用).

    1. moving away from evaluating during parsing as the semantic actions grow unwieldy, and are very (very) tricky to get right in the face of (late) backtracking (or even parser failures; the latter could be detected easily, but backtracking can also be benign and very hard to correct for when semantic actions have side effects).

    从最简单的规则开始构建语法,然后在添加测试用例时逐步构建语法

    start building the grammar from the simplest rule, gradually building it as you add test cases

    这篇关于优化boost :: spirit :: qi解析器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆