Spirit X3, Is this error handling approach useful?


Problem description

After reading the Spirit X3 tutorial on error handling and doing some experimentation, I was drawn to a conclusion.

I believe there is some room for improvement on the topic of error handling in X3. An important goal from my perspective is to provide a meaningful error message. First and foremost, adding a semantic action that sets the _pass(ctx) member to false wouldn't do it, because X3 will try to match something else. Only throwing an x3::expectation_failure will quit the parse function prematurely, i.e. without trying to match anything else. So what is left are the parser directive expect[a] and the parser operator>, as well as manually throwing x3::expectation_failure from a semantic action. I do believe the vocabulary regarding this error handling is too limited. Please consider the following lines of X3 PEG grammar:

const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;

const auto main_rule__def =
(
 a |
 b |
 c );

Now for expression a I cannot use expect[] or operator>, as other alternatives might be valid. I could be wrong, but I think X3 requires me to spell out alternate wrong expressions that can match, and if they match they can throw x3::expectation_failure, which is cumbersome.

The question is: is there a good way of checking for error conditions in my PEG construct with the ordered alternatives for a, b and c using current X3 facilities?

If the answer is no, I would like to present my idea for a reasonable solution. I believe I would need a new parser directive for that. What should this directive do? It should call the attached semantic action when the parse fails instead. The attribute is obviously unused, but I would need the _where member to be set to the iterator position of the first parsing mismatch. So if a2 fails, _where should be set one position after the end of a1. Let's call the parsing directive neg_sa, meaning "negated semantic action".

pseudocode

// semantic actions
auto a_sa = [&](auto& ctx)
{
  // add _where to vector v
};

auto b_sa = [&](auto& ctx)
{
  // add _where to vector v
};

auto c_sa = [&](auto& ctx)
{
  // add _where to vector v

  // now we know we have a *real* error.
  // find the peak iterator value in the vector v
  // the position tells whether it belongs to a, b or c.
  // now we can formulate an error message like: "cannot make sense of b up to this position."
  // lastly throw x3::expectation_failure
};

// PEG
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;

const auto main_rule__def =
(
 neg_sa[a][a_sa] |
 neg_sa[b][b_sa] |
 neg_sa[c][c_sa] );
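
The bookkeeping in the pseudocode above can be tried out without X3. The following is only a minimal, library-free sketch of the "remember the furthest mismatch" idea: Alternative, match_prefix and furthest_failure are hypothetical names, and each alternative is simplified to a fixed sequence of single-character tokens rather than real parsers.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model: an alternative is a name plus a sequence of
// single-character tokens that must appear in order.
struct Alternative {
    std::string name;
    std::string tokens;
};

// Returns how many leading characters of `input` matched `alt.tokens`.
std::size_t match_prefix(const Alternative& alt, const std::string& input) {
    std::size_t i = 0;
    while (i < alt.tokens.size() && i < input.size() && alt.tokens[i] == input[i])
        ++i;
    return i;
}

struct Failure {
    std::string name;      // alternative that got furthest
    std::size_t position;  // offset of the first mismatch in that alternative
};

// Tries the alternatives in order. Returns std::nullopt as soon as one
// matches all of its tokens; otherwise reports the alternative whose first
// mismatch lies furthest into the input (ties go to the earlier alternative).
std::optional<Failure> furthest_failure(const std::vector<Alternative>& alts,
                                        const std::string& input) {
    assert(!alts.empty());
    Failure best{alts.front().name, 0};
    for (const auto& alt : alts) {
        const std::size_t n = match_prefix(alt, input);
        if (n == alt.tokens.size())
            return std::nullopt;       // ordered choice: first full match wins
        if (n > best.position)
            best = Failure{alt.name, n};  // keep the peak position
    }
    return best;
}
```

With alternatives a = "xyz", b = "pqr" and c = "pqs", the input "pqz" is attributed to b at offset 2, which is the kind of message ("cannot make sense of b up to this position") the proposal is after.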

I hope I presented this idea clearly. Let me know in the comment section if I need to explain something further.

Recommended answer

Okay, at the risk of conflating too many things in one example, here goes:

namespace square::peg {
    using namespace x3;

    const auto quoted_string = lexeme['"' > *(print - '"') > '"'];
    const auto bare_string   = lexeme[alpha > *alnum] > ';';
    const auto two_ints      = int_ > int_;

    const auto main          = quoted_string | bare_string | two_ints;

    const auto entry_point   = skip(space)[ expect[main] > eoi ];
} // namespace square::peg

That should do. The key is that the only things that should be expectation points are the things that make the respective branch fail BEYOND the point where it was unambiguously the right branch. (Otherwise, there would literally not be a hard expectation.)
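
To make the "commit point" idea concrete without any library, here is a hand-written sketch of the bare_string branch (a letter, then alphanumerics, then ';'). It is an illustration under assumptions, not X3 code: expectation_error and parse_bare_string are hypothetical names, and whitespace skipping is left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for x3::expectation_failure: records what was
// expected and the offset at which it was missing.
struct expectation_error : std::runtime_error {
    std::size_t where;
    expectation_error(const std::string& which, std::size_t pos)
        : std::runtime_error(which), where(pos) {}
};

// Parses `alpha alnum* ';'` starting at `pos`.
// Returns false (soft failure) if the first character is not a letter, so the
// caller can still try other alternatives; `pos` is left untouched.
// Throws (hard failure) if a word is present but the ';' is missing.
bool parse_bare_string(const std::string& in, std::size_t& pos) {
    assert(pos <= in.size());
    std::size_t i = pos;
    if (i >= in.size() || !std::isalpha(static_cast<unsigned char>(in[i])))
        return false;  // not this branch: backtrack quietly
    ++i;               // from here on this is unambiguously a bare string
    while (i < in.size() && std::isalnum(static_cast<unsigned char>(in[i])))
        ++i;
    if (i >= in.size() || in[i] != ';')
        throw expectation_error("';'", i);  // committed, so report the error
    pos = i + 1;
    return true;
}
```

An input starting with a digit fails softly and leaves the other branches a chance, while "abc" without the semicolon raises the hard error at offset 3.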

With two minor get_info specializations for prettier messages¹, this can lead to decent error messages even when manually catching the exception:

Live On Coliru

int main() {
    using It = std::string::const_iterator;

    for (std::string const input : {
            "   -89 0038  ",
            "   \"-89 0038\"  ",
            "   something123123      ;",
            // undecidable
            "",
            // violate expectations, no successful parse
            "   -89 oops  ",   // not an integer
            "   \"-89 0038  ", // missing "
            "   bareword ",    // missing ;
            // trailing debris, successful "main"
            "   -89 3.14  ",   // followed by .14
        })
    {
        std::cout << "====== " << std::quoted(input) << "\n";

        It iter = input.begin(), end = input.end();
        try {
            if (parse(iter, end, square::peg::entry_point)) {
                std::cout << "Parsed successfully\n";
            } else {
                std::cout << "Parsing failed\n";
            }
        } catch (x3::expectation_failure<It> const& ef) {
            auto pos = std::distance(input.begin(), ef.where());
            std::cout << "Expect " << ef.which() << " at "
                << "\n\t" << input
                << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
        }
    }
}

Prints

====== "   -89 0038  "
Parsed successfully
====== "   \"-89 0038\"  "
Parsed successfully
====== "   something123123      ;"
Parsed successfully
====== ""
Expect quoted string, bare string or integer number pair at

    ^
====== "   -89 oops  "
Expect integral number at
       -89 oops 
    -------^
====== "   \"-89 0038  "
Expect '"' at
       "-89 0038 
    --------------^
====== "   bareword "
Expect ';' at
       bareword
    ------------^
====== "   -89 3.14  "
Expect eoi at
       -89 3.14 
    --------^
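
The dashes-and-caret line in this output comes from a small stream trick: an empty string written with a field width of pos and a fill character of '-' produces exactly pos dashes. A minimal standalone sketch, where render_caret is a hypothetical helper name and the leading tabs of the original output are omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Builds the two-line diagnostic body: the input, then `pos` dashes
// followed by a caret that points at offset `pos`.
std::string render_caret(const std::string& input, std::size_t pos) {
    assert(pos <= input.size());
    std::ostringstream oss;
    oss << input << '\n'
        // An empty string padded to width `pos` with '-' yields `pos` dashes;
        // a width of 0 yields nothing, so the caret lands in column 0.
        << std::setw(static_cast<int>(pos)) << std::setfill('-') << "" << '^';
    return oss.str();
}
```

For example, render_caret("-89 oops", 4) yields the input line followed by four dashes and a caret under the 'o'.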

This is already beyond what most people expect from their parsers.

We might not be content with reporting just the one expectation and bailing out. Indeed, you can report and continue parsing as if there were just a regular mismatch: this is where on_error comes in.

Let's create a tag base:

struct with_error_handling {
    template<typename It, typename Ctx>
        x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
            std::string s(f,l);
            auto pos = std::distance(f, ef.where());

            std::cout << "Expecting " << ef.which() << " at "
                << "\n\t" << s
                << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";

            return error_handler_result::fail;
        }
};

Now, all we have to do is derive our rule ID from with_error_handling and BAM!, we don't have to write any exception handlers; rules will simply "fail" with the appropriate diagnostics. What's more, some inputs can lead to multiple (hopefully helpful) diagnostics:

auto const eh = [](auto p) {
    struct _ : with_error_handling {};
    return rule<_> {} = p;
};

const auto quoted_string = eh(lexeme['"' > *(print - '"') > '"']);
const auto bare_string   = eh(lexeme[alpha > *alnum] > ';');
const auto two_ints      = eh(int_ > int_);

const auto main          = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;

const auto entry_point   = skip(space)[ eh(expect[main] > eoi) ];

Now, main becomes:

Live On Coliru

for (std::string const input : { 
        "   -89 0038  ",
        "   \"-89 0038\"  ",
        "   something123123      ;",
        // undecidable
        "",
        // violate expectations, no successful parse
        "   -89 oops  ",   // not an integer
        "   \"-89 0038  ", // missing "
        "   bareword ",    // missing ;
        // trailing debris, successful "main"
        "   -89 3.14  ",   // followed by .14
    })
{
    std::cout << "====== " << std::quoted(input) << "\n";

    It iter = input.begin(), end = input.end();
    if (parse(iter, end, square::peg::entry_point)) {
        std::cout << "Parsed successfully\n";
    } else {
        std::cout << "Parsing failed\n";
    }
}

The program prints:

====== "   -89 0038  "
Parsed successfully
====== "   \"-89 0038\"  "
Parsed successfully
====== "   something123123      ;"
Parsed successfully
====== ""
Expecting quoted string, bare string or integer number pair at 

    ^
Parsing failed
====== "   -89 oops  "
Expecting integral number at 
       -89 oops  
    -------^
Expecting quoted string, bare string or integer number pair at 
       -89 oops  
    ^
Parsing failed
====== "   \"-89 0038  "
Expecting '"' at 
       "-89 0038  
    --------------^
Expecting quoted string, bare string or integer number pair at 
       "-89 0038  
    ^
Parsing failed
====== "   bareword "
Expecting ';' at 
       bareword 
    ------------^
Expecting quoted string, bare string or integer number pair at 
       bareword 
    ^
Parsing failed
====== "   -89 3.14  "
Expecting eoi at 
       -89 3.14  
    --------^
Parsing failed
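
The paired diagnostics in this output are what you would expect when a handler turns a hard failure into a soft one: the inner rule reports and fails, the remaining alternatives fail too, and the enclosing expectation then reports again. The following library-free sketch mimics that flow; it is an analogy, not X3's implementation, and Parser, Log, with_error_handling, expect and expectation_error are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a hard (expectation) failure.
struct expectation_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

using Parser = std::function<bool(const std::string&)>;
using Log = std::vector<std::string>;

// Analogue of a rule whose on_error returns `fail`:
// record a diagnostic, then behave like an ordinary mismatch.
Parser with_error_handling(Parser p, Log& log) {
    assert(p && "parser must be callable");
    return [p = std::move(p), &log](const std::string& in) {
        try {
            return p(in);
        } catch (const expectation_error& e) {
            log.push_back(std::string("Expecting ") + e.what());
            return false;
        }
    };
}

// Analogue of expect[p]: an ordinary mismatch of p becomes a hard failure.
Parser expect(Parser p, std::string what) {
    assert(p && "parser must be callable");
    return [p = std::move(p), what = std::move(what)](const std::string& in) {
        if (!p(in))
            throw expectation_error(what);
        return true;
    };
}
```

Wrapping an inner parser that throws on a missing ';' and then wrapping expect(inner, "word") again yields two log entries for the input "w": first the inner "Expecting ';'", then the outer "Expecting word".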

Attribute Propagation, on_success

Parsers aren't very useful when they don't actually parse anything, so let's add some constructive value handling, also showcasing on_success:

Defining some AST types to receive the attributes:

struct quoted : std::string {};
struct bare   : std::string {};
using  two_i  = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;

Making sure we can print Value:

static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
    struct {
        std::ostream& _os;
        void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")";             } 
        void operator()(bare const& v) const   { _os << "bare(" << v << ")";                            } 
        void operator()(two_i const& v) const  { _os << "two_i(" << v.first << ", " << v.second << ")"; } 
    } vis{os};

    boost::apply_visitor(vis, v);
    return os;
}

Now, use the old as<> trick to coerce attribute types, this time with error handling (the as<> helper and the rules that use it appear in the full listing below).

As icing on the cake, let's demonstrate on_success in with_error_handling:

    template<typename It, typename Ctx>
        void on_success(It f, It l, two_i const& v, Ctx const&) const {
            std::cout << "Parsed " << std::quoted(std::string(f,l)) << " as integer pair " << v.first << ", " << v.second << "\n";
        }

Now, with a largely unmodified main program (it just prints the result value as well):

Live On Coliru

    It iter = input.begin(), end = input.end();
    Value v;
    if (parse(iter, end, square::peg::entry_point, v)) {
        std::cout << "Result value: " << v << "\n";
    } else {
        std::cout << "Parsing failed\n";
    }

Prints

====== "   -89 0038  "
Parsed "-89 0038" as integer pair -89, 38
Result value: two_i(-89, 38)
====== "   \"-89 0038\"  "
Result value: quoted("-89 0038")
====== "   something123123      ;"
Result value: bare(something123123)
====== ""
Expecting quoted string, bare string or integer number pair at 

    ^
Parsing failed
====== "   -89 oops  "
Expecting integral number at 
       -89 oops  
    -------^
Expecting quoted string, bare string or integer number pair at 
       -89 oops  
    ^
Parsing failed
====== "   \"-89 0038  "
Expecting '"' at 
       "-89 0038  
    --------------^
Expecting quoted string, bare string or integer number pair at 
       "-89 0038  
    ^
Parsing failed
====== "   bareword "
Expecting ';' at 
       bareword 
    ------------^
Expecting quoted string, bare string or integer number pair at 
       bareword 
    ^
Parsing failed
====== "   -89 3.14  "
Parsed "-89 3" as integer pair -89, 3
Expecting eoi at 
       -89 3.14  
    --------^
Parsing failed

Really Going Over The Top

I don't know about you, but I hate doing that, let alone printing to the console from a parser. Let's use x3::with instead.

We want to append to the diagnostics via the Ctx& argument instead of writing to std::cout in the on_error handler:

struct with_error_handling {
    struct diags;

    template<typename It, typename Ctx>
        x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
            std::string s(f,l);
            auto pos = std::distance(f, ef.where());

            std::ostringstream oss;
            oss << "Expecting " << ef.which() << " at "
                << "\n\t" << s
                << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";

            x3::get<diags>(ctx).push_back(oss.str());

            return error_handler_result::fail;
        }
};

And at the call site, we can pass the context:

std::vector<std::string> diags;

if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
    std::cout << "Result value: " << v;
} else {
    std::cout << "Parsing failed";
}

std::cout << " with " << diags.size() << " diagnostics messages: \n";
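
The mechanism behind passing data through the context can be pictured as a chain of entries, each holding a reference to caller-owned data under a tag type, with lookup by tag resolved at compile time. The sketch below is a simplified, library-free analogue and not X3's actual code; context, empty_context, get_tagged and report are hypothetical names (only the diags tag mirrors the snippet above).

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Terminates the chain of context entries.
struct empty_context {};

// One entry: a reference to caller-owned data, keyed by a tag type.
template <typename Tag, typename T, typename Next = empty_context>
struct context {
    using tag = Tag;
    T& value;
    Next next;
};

// Walks the chain at compile time until the requested tag is found.
// Asking for a tag that is not in the chain fails to compile.
template <typename Tag, typename Ctx>
auto& get_tagged(const Ctx& ctx) {
    if constexpr (std::is_same_v<Tag, typename Ctx::tag>)
        return ctx.value;
    else
        return get_tagged<Tag>(ctx.next);
}

struct diags;  // tag only; never defined

// A handler that appends to the diagnostics instead of printing.
template <typename Ctx>
void report(const Ctx& ctx, const std::string& message) {
    assert(!message.empty());
    get_tagged<diags>(ctx).push_back(message);
}
```

Because the entry stores a reference, messages pushed by report end up in the vector owned by the caller, which is the behaviour the call site above relies on.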

The full program also prints the diagnostics:

Live On Wandbox²

Full listing

//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace x3 = boost::spirit::x3;

struct quoted : std::string {};
struct bare   : std::string {};
using  two_i  = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;

static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
    struct {
        std::ostream& _os;
        void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")";             } 
        void operator()(bare const& v) const   { _os << "bare(" << v << ")";                            } 
        void operator()(two_i const& v) const  { _os << "two_i(" << v.first << ", " << v.second << ")"; } 
    } vis{os};

    boost::apply_visitor(vis, v);
    return os;
}

namespace square::peg {
    using namespace x3;

    struct with_error_handling {
        struct diags;

        template<typename It, typename Ctx>
            x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
                std::string s(f,l);
                auto pos = std::distance(f, ef.where());

                std::ostringstream oss;
                oss << "Expecting " << ef.which() << " at "
                    << "\n\t" << s
                    << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";

                x3::get<diags>(ctx).push_back(oss.str());

                return error_handler_result::fail;
            }
    };

    template <typename T = x3::unused_type> auto const as = [](auto p) {
        struct _ : with_error_handling {};
        return rule<_, T> {} = p;
    };

    const auto quoted_string = as<quoted>(lexeme['"' > *(print - '"') > '"']);
    const auto bare_string   = as<bare>(lexeme[alpha > *alnum] > ';');
    const auto two_ints      = as<two_i>(int_ > int_);

    const auto main          = quoted_string | bare_string | two_ints;
    using main_type = std::remove_cv_t<decltype(main)>;

    const auto entry_point   = skip(space)[ as<Value>(expect[main] > eoi) ];
} // namespace square::peg

namespace boost::spirit::x3 {
    template <> struct get_info<int_type> {
        typedef std::string result_type;
        std::string operator()(int_type const&) const { return "integral number"; }
    };
    template <> struct get_info<square::peg::main_type> {
        typedef std::string result_type;
        std::string operator()(square::peg::main_type const&) const { return "quoted string, bare string or integer number pair"; }
    };
}

int main() {
    using It = std::string::const_iterator;
    using D = square::peg::with_error_handling::diags;

    for (std::string const input : { 
            "   -89 0038  ",
            "   \"-89 0038\"  ",
            "   something123123      ;",
            // undecidable
            "",
            // violate expectations, no successful parse
            "   -89 oops  ",   // not an integer
            "   \"-89 0038  ", // missing "
            "   bareword ",    // missing ;
            // trailing debris, successful "main"
            "   -89 3.14  ",   // followed by .14
        })
    {
        std::cout << "====== " << std::quoted(input) << "\n";

        It iter = input.begin(), end = input.end();
        Value v;
        std::vector<std::string> diags;

        if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
            std::cout << "Result value: " << v;
        } else {
            std::cout << "Parsing failed";
        }

        std::cout << " with " << diags.size() << " diagnostics messages: \n";

        for(auto& msg: diags) {
            std::cout << " - " << msg << "\n";
        }
    }
}


¹ you could use rules with their names instead, obviating this more complex trick

² on older versions of the library you may have to battle to get reference semantics on the with<> data: Live On Coliru
