如何使用Boost Spirit Qi处理gor解析bnf语法的多行规则 [英] How to handle multi-line rules for gor parsing bnf grammar using boost spirit qi

查看:58
本文介绍了如何使用Boost Spirit Qi处理gor解析bnf语法的多行规则的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有这样的BNF语法

Assuming I have a BNF grammar like this

<code>   ::=  <letter><digit> | <letter><digit><code>
<letter> ::= a | b | c | d | e
             | f | g | h | i
<digit>  ::= 0 | 1 | 2 | 3 |
             4

如果您查看< letter> 规则,则其延续从 | 开始,但以&digit; 规则的延续开始从生产开始,在前一行的末尾出现 | .我也不想使用特定的符号来表示规则的结尾.

If you look at the <letter> rule, its continuation starts with the | but that of the <digit> rule starts with the production with | appearing at the end of the previous line. I also don't want to use a particular symbol to represent the end of a rule.

如何检查规则是否使用Boost Spirit Qi实施?

How do check if a rule as ended using the Boost Spirit Qi for implementation.

我刚刚浏览了增强页面上的教程,想知道我将如何处理它.

I have just gone through the tutorial on the boost page and wondering how I am going to handle this.

推荐答案

维基百科

BNF语法只能在一行中表示一个规则,而在EBNF中,终止字符是分号;".标志着规则的结束.

BNF syntax can only represent a rule in one line, whereas in EBNF a terminating character, the semicolon character ";" marks the end of a rule.

所以简单的答案是:输入的不是BNF.

So the simple answer is: the input isn't BNF.

无论如何,如果您想要支持它(后果自负 :),则必须这样做.因此,让我们写一个简单的BFN语法,从字面上映射 Wikipedia BNF

Iff you want to support it anyways (at your own peril :)) you'll have to make it so. So, let's write a simplistic BFN grammar, literally mapping from Wikipedia BNF

<syntax>         ::= <rule> | <rule> <syntax>
<rule>           ::= <opt-whitespace> "<" <rule-name> ">" <opt-whitespace> "::=" <opt-whitespace> <expression> <line-end>
<opt-whitespace> ::= " " <opt-whitespace> | ""
<expression>     ::= <list> | <list> <opt-whitespace> "|" <opt-whitespace> <expression>
<line-end>       ::= <opt-whitespace> <EOL> | <line-end> <line-end>
<list>           ::= <term> | <term> <opt-whitespace> <list>
<term>           ::= <literal> | "<" <rule-name> ">"
<literal>        ::= '"' <text1> '"' | "'" <text2> "'"
<text1>          ::= "" | <character1> <text1>
<text2>          ::= '' | <character2> <text2>
<character>      ::= <letter> | <digit> | <symbol>
<letter>         ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
<digit>          ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
<symbol>         ::=  "|" | " " | "!" | "#" | "$" | "%" | "&" | "(" | ")" | "*" | "+" | "," | "-" | "." | "/" | ":" | ";" | ">" | "=" | "<" | "?" | "@" | "[" | "\" | "]" | "^" | "_" | "`" | "{" | "}" | "~"
<character1>     ::= <character> | "'"
<character2>     ::= <character> | '"'
<rule-name>      ::= <letter> | <rule-name> <rule-char>
<rule-char>      ::= <letter> | <digit> | "-"

它可能看起来像这样:

template <typename Iterator>
struct BNF: qi::grammar<Iterator, Ast::Syntax()> {
    BNF(): BNF::base_type(start) {
        using namespace qi;
        start = skip(blank) [ _rule % +eol ];

        _rule       = _rule_name >> "::=" >> _expression;
        _expression = _list % '|';
        _list       = +_term;
        _term       = _literal | _rule_name;
        _literal    = '"' >> *(_character - '"') >> '"'
                    | "'" >> *(_character - "'") >> "'";
        _character  = alnum | char_("\"'| !#$%&()*+,./:;>=<?@]\\^_`{}~[-");
        _rule_name  = '<' >> (alpha >> *(alnum | char_('-'))) >> '>';

        BOOST_SPIRIT_DEBUG_NODES(
            (_rule)(_expression)(_list)(_term)
            (_literal)(_character)
            (_rule_name))
    }

  private:
    qi::rule<Iterator, Ast::Syntax()>     start;
    qi::rule<Iterator, Ast::Rule(),       qi::blank_type> _rule;
    qi::rule<Iterator, Ast::Expression(), qi::blank_type> _expression;
    qi::rule<Iterator, Ast::List(),       qi::blank_type> _list;
    // lexemes
    qi::rule<Iterator, Ast::Term()>       _term;
    qi::rule<Iterator, Ast::Name()>       _rule_name;
    qi::rule<Iterator, std::string()>     _literal;
    qi::rule<Iterator, char()>            _character;
};

现在它将解析您的样本(更正为BNF):

Now it will parse your sample (corrected to be BNF):

    std::string const input = R"(<code>   ::=  <letter><digit> | <letter><digit><code>
<letter> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i"
<digit>  ::= "0" | "1" | "2" | "3" | "4"
    )";

在编译器资源管理器中实时运行

Live On Compiler Explorer

打印:

code ::= {<letter>, <digit>} | {<letter>, <digit>, <code>}
letter ::= {a} | {b} | {c} | {d} | {e} | {f} | {g} | {h} | {i}
digit ::= {0} | {1} | {2} | {3} | {4}
Remaining: "
    "

支持换行规则

最好的方法是不接受它们-因为语法不是像它那样设计的EBNF.

Support Line-Wrapped Rules

The best way is to not accept them - since the grammar wasn't designed for it unlike e.g. EBNF.

您可以通过在船长中进行负前瞻来强制执行此问题:

You can force the issue by doing a negative look-ahead in the skipper:

_skipper = blank | (eol >> !_rule);
start = skip(_skipper) [ _rule % +eol ];

出于技术原因( Boost spirit skipper问题)进行编译,因此我们需要在预读中向其提供占位符船长:

For technical reasons (Boost spirit skipper issues) that doesn't compile, so we need to feed it a placeholder skipper inside the look-ahead:

_blank = blank;
_skipper = blank | (eol >> !skip(_blank.alias()) [ _rule ]);
start = skip(_skipper.alias()) [ _rule % +eol ];

现在它解析相同,但是具有各种换行符:

Now it parses the same but with various line-breaks:

    std::string const input = R"(<code>   ::=  <letter><digit> | <letter><digit><code>
<letter> ::= "a" | "b" | "c" | "d" | "e"
           | "f" | "g" | "h" | "i"
<digit>  ::= "0" | "1" | "2" | "3" |
             "4"
    )";

打印:

code ::= {<letter>, <digit>} | {<letter>, <digit>, <code>}
letter ::= {a} | {b} | {c} | {d} | {e} | {f} | {g} | {h} | {i}   
digit ::= {0} | {1} | {2} | {3} | {4}

完整列表

编译器资源管理器

FULL LISTING

Compiler Explorer

//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted.hpp>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <iomanip>
namespace qi = boost::spirit::qi;

namespace Ast {
    struct Name : std::string {
        using std::string::string;
        using std::string::operator=;

        friend std::ostream& operator<<(std::ostream& os, Name const& n) {
            return os << '<' << n.c_str() << '>';
        }
    };

    using Term = boost::variant<Name, std::string>;
    using List = std::list<Term>;
    using Expression = std::list<List>;

    struct Rule {
        Name name; // lhs
        Expression rhs;
    };

    using Syntax = std::list<Rule>;
}

BOOST_FUSION_ADAPT_STRUCT(Ast::Rule, name, rhs)

namespace Parser {
    template <typename Iterator>
    struct BNF: qi::grammar<Iterator, Ast::Syntax()> {
        BNF(): BNF::base_type(start) {
            using namespace qi;
            _blank = blank;
            _skipper = blank | (eol >> !skip(_blank.alias()) [ _rule ]);
            start = skip(_skipper.alias()) [ _rule % +eol ];

            _rule       = _rule_name >> "::=" >> _expression;
            _expression = _list % '|';
            _list       = +_term;
            _term       = _literal | _rule_name;
            _literal    = '"' >> *(_character - '"') >> '"'
                        | "'" >> *(_character - "'") >> "'";
            _character  = alnum | char_("\"'| !#$%&()*+,./:;>=<?@]\\^_`{}~[-");
            _rule_name  = '<' >> (alpha >> *(alnum | char_('-'))) >> '>';

            BOOST_SPIRIT_DEBUG_NODES(
                (_rule)(_expression)(_list)(_term)
                (_literal)(_character)
                (_rule_name))
        }

      private:
        using Skipper = qi::rule<Iterator>;
        Skipper _skipper, _blank;

        qi::rule<Iterator, Ast::Syntax()>     start;
        qi::rule<Iterator, Ast::Rule(),       Skipper> _rule;
        qi::rule<Iterator, Ast::Expression(), Skipper> _expression;
        qi::rule<Iterator, Ast::List(),       Skipper> _list;
        // lexemes
        qi::rule<Iterator, Ast::Term()>       _term;
        qi::rule<Iterator, Ast::Name()>       _rule_name;
        qi::rule<Iterator, std::string()>     _literal;
        qi::rule<Iterator, char()>            _character;
    };
}

int main() {
    Parser::BNF<std::string::const_iterator> const parser;

    std::string const input = R"(<code>   ::=  <letter><digit> | <letter><digit><code>
<letter> ::= "a" | "b" | "c" | "d" | "e"
           | "f" | "g" | "h" | "i"
<digit>  ::= "0" | "1" | "2" | "3" |
             "4"
    )";

    auto it = input.begin(), itEnd = input.end();

    Ast::Syntax syntax;
    if (parse(it, itEnd, parser, syntax)) {
        for (auto& rule : syntax)
            fmt::print("{} ::= {}\n", rule.name, fmt::join(rule.rhs, " | "));
    } else {
        std::cout << "Failed\n";
    }

    if (it != itEnd)
        std::cout << "Remaining: " << std::quoted(std::string(it, itEnd)) << "\n";
}

在Coliru上直播 (无libfmt)

Also Live On Coliru (without libfmt)

这篇关于如何使用Boost Spirit Qi处理gor解析bnf语法的多行规则的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆