Recursive macro to parse match arms in Rust


Question

I am trying to write a macro that expands a set of rules into code that performs token matching, but I am unable to generate the proper code without causing macro expansion errors. I know that I could handle this in other ways, but the key question here is not how to parse tokens but how to write a macro that can recursively expand a token tree with match arms.

The idea is that we want to read a token from the string and print it out. More code needs to be added to turn it into something more useful, but this example serves to illustrate the situation:

#[derive(Debug, PartialEq)]
enum Digit {
    One,
    Two,
    Three,
    Ten,
    Eleven,
}

#[test]
fn test1(buf: &str) {
    let buf = "111";
    let token = parse!(buf, {
        '1' => Digit::One,
        '2' => Digit::Two,
        '3' => Digit::Three,
    });
    assert_eq!(token, Some(Digit::One));
}

The code we want to generate from this example is:

fn test1(buf: &str) {
    let token = {
        let mut chars = buf.chars().peekable();
        match chars.peek() {
            Some(&'1') => {
                chars.next().unwrap();
                Some(Digit::One)
            }
            Some(&'2') => {
                chars.next().unwrap();
                Some(Digit::Two)
            }
            Some(&'3') => {
                chars.next().unwrap();
                Some(Digit::Three)
            }
            Some(_) | None => None,
        }
    };
    assert_eq!(token, Some(Digit::One));
}

Ignore the fact that we do not read more tokens from the string, so the chars.next().unwrap() is not very useful yet. It will be useful later.

The macro for generating the above code is straightforward:

macro_rules! parse {
    ($e:expr, { $($p:pat => $t:expr),+ }) => {
        {
            let mut chars = $e.chars().peekable();
            match chars.peek() {
                $(Some(&$p) => {
                    chars.next().unwrap();
                    Some($t)
                },)+
                Some(_) | None => None
            }
        }
    };
}

Let us now extend this example to handle slightly more advanced matching and allow it to read multiple characters with lookahead, consuming the extra characters only if they match certain patterns. If they do not, the extraneous characters should not be read. We create a token tree with match arms in a similar way to the previous example, but here we want to support a recursive structure:

#[test]
fn test2() {
    let buf = "111";
    let token = parse!(buf, {
        '1' => {
            '0' => Digit::Ten,
            '1' => Digit::Eleven,
            _ => Digit::One,
        },
        '2' => Digit::Two,
        '3' => Digit::Three
    });
    assert_eq!(token, Some(Digit::Eleven));
}

The code we want to generate from this example is:

fn test2() {
    let buf = "111";
    let token = {
        let mut chars = buf.chars().peekable();
        match chars.peek() {
            Some(&'1') => {
                chars.next().unwrap();
                match chars.peek() {
                    Some(&'0') => {
                        chars.next().unwrap();
                        Some(Digit::Ten)
                    },
                    Some(&'1') => {
                        chars.next().unwrap();
                        Some(Digit::Eleven)
                    },
                    Some(_) | None => Some(Digit::One)
                }
            },
            Some(&'2') => {
                chars.next().unwrap();
                Some(Digit::Two)
            },
            Some(&'3') => {
                chars.next().unwrap();
                Some(Digit::Three)
            },
            Some(_) | None => None,
        }
    };
    assert_eq!(token, Some(Digit::Eleven));
}

A macro to handle this could look roughly like this:

macro_rules! expand {
    ($t:tt) => {{
        chars.next().unwrap();
        inner!($t)
    }};
    ($e:expr) => {{
        chars.next().unwrap();
        Some($e)
    }};
}

macro_rules! inner {
    ($i:ident, { $($p:pat => ???),+ }) => {
        match $i.peek() {
            $( Some(&$p) => expand!($i, ???), )+
            Some(_) | None => None
        }
    };
}

macro_rules! parse {
    ($e:expr, $t:tt) => {
        {
            let mut chars = $e.chars().peekable();
            inner!(chars, $t)
        }
    };
}

However, I am unable to find something to replace the ??? in the inner! macro that matches either an expression or a token tree.

  • Something like $e:expr cannot match a token tree at this point.

  • Something like $t:tt does not match the enum constant Digit::Two, which is a perfectly valid expression.

  • Something like $($rest:tt)* as a generic matcher fails, since the Kleene-star closure is greedy and will try to match the following comma.

  • A recursive macro matching the items one by one, e.g. with a pattern along the lines of { $p:pat => $t:expr, $($rest:tt)* }, cannot be expanded inside the match statement in the inner! macro, since the parser expects something that syntactically looks like ... => ... there. The expansion gives an error claiming that a => is expected after the macro:

match $e.peek() {
     Some(&$p) => ...$t...,
     inner!($rest)
                   ^ Expect => here
}

This looks like one of the syntactic requirements mentioned in the book.

Changing the syntax of the matching part is not an option if the pat fragment is to be used, since a pat needs to be followed by a => (according to the macro chapter in the book).
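To make the tt observations above concrete, here is a small sketch that counts how many token trees a piece of input consists of. The count_tts! macro is a hypothetical helper written for this illustration and is not part of the question's code. It shows why a single $t:tt cannot capture Digit::Two (three token trees), while a whole brace-delimited block is exactly one.

```rust
// Hypothetical helper: counts the token trees in its input, one at a time.
macro_rules! count_tts {
    // No tokens left.
    () => { 0usize };
    // Peel off one token tree and count the rest.
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

#[test]
fn token_tree_counts() {
    // `Digit`, `::` and `Two` are three separate token trees.
    assert_eq!(count_tts!(Digit::Two), 3);
    // A delimited group is a single token tree, whatever it contains.
    assert_eq!(count_tts!({ '0' => Digit::Ten, _ => Digit::One, }), 1);
    // `'2'`, `=>`, `Digit`, `::`, `Two`.
    assert_eq!(count_tts!('2' => Digit::Two), 5);
}
```

Since the macro only looks at tokens, Digit does not need to be defined for this to compile.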

Answer

When you need to branch based on different matches inside repetitions like this, you need to do incremental parsing.
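As a warm-up, here is a minimal sketch of incremental parsing on its own, separate from the parser. The lookup! macro and its @munch marker are hypothetical names chosen for this illustration. Each recursive step peels one `pattern => value,` item off the front of the input and pushes a finished arm onto an accumulator. The match is emitted only once the input is empty, so no macro invocation ever has to appear in match-arm position.

```rust
// Hypothetical example macro: builds a `match` incrementally from `pattern => value,` items.
macro_rules! lookup {
    // Entry point: start with an empty accumulator `{}`.
    ($key:expr, { $($body:tt)* }) => {
        lookup!(@munch $key, {}, $($body)*)
    };
    // Input exhausted: emit the complete match in one go.
    (@munch $key:expr, { $($arms:tt)* }, ) => {
        match $key {
            $($arms)*
            _ => None,
        }
    };
    // Take one item off the front and append an arm to the accumulator.
    (@munch $key:expr, { $($arms:tt)* }, $p:pat => $v:expr, $($tail:tt)*) => {
        lookup!(@munch $key, { $($arms)* $p => Some($v), }, $($tail)*)
    };
}

#[test]
fn lookup_builds_a_match() {
    assert_eq!(lookup!(2, { 1 => "one", 2 => "two", }), Some("two"));
    assert_eq!(lookup!(9, { 1 => "one", 2 => "two", }), None);
}
```

In this sketch every item needs a trailing comma, which keeps the rules short.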

So:

macro_rules! parse {

This is the entry point for the macro. It sets up the outermost layer and feeds the input into a general parsing rule. We pass down chars so the deeper layers can find it.

    ($buf:expr, {$($body:tt)*}) => {
        {
            let mut chars = $buf.chars().peekable();
            parse! { @parse chars, {}, $($body)* }
        }
    };

Termination rule: once we run out of input (modulo some commas), dump the accumulated match arm code fragments into a match expression, and append the final catch-all arm.

    (@parse $chars:expr, {$($arms:tt)*}, $(,)*) => {
        match $chars.peek() {
            $($arms)*
            _ => None
        }
    };

Alternatively, if a catch-all arm is specified, use that.

    (@parse $chars:expr, {$($arms:tt)*}, _ => $e:expr $(,)*) => {
        match $chars.peek() {
            $($arms)*
            _ => Some($e)
        }
    };

This handles the recursion. If we see a block, we advance $chars and parse the contents of the block with an empty code accumulator. The result of all this is appended to the current accumulator (i.e. $($arms)*).

    (@parse $chars:expr, {$($arms:tt)*}, $p:pat => { $($block:tt)* }, $($tail:tt)*) => {
        parse! {
            @parse
            $chars,
            {
                $($arms)*
                Some(&$p) => {
                    $chars.next().unwrap();
                    parse!(@parse $chars, {}, $($block)*)
                },
            },
            $($tail)*
        }
    };

The non-recursive case.

    (@parse $chars:expr, {$($arms:tt)*}, $p:pat => $e:expr, $($tail:tt)*) => {
        parse! {
            @parse
            $chars,
            {
                $($arms)*
                Some(&$p) => Some($e),
            },
            $($tail)*
        }
    };
}

And, for completeness, the rest of the test code. Note that I had to change test1, as it wasn't a valid test.

#[derive(Debug, PartialEq)]
enum Digit { One, Two, Three, Ten, Eleven }

#[test]
fn test1() {
    let buf = "111";
    let token = parse!(buf, {
        '1' => Digit::One,
        '2' => Digit::Two,
        '3' => Digit::Three,
    });
    assert_eq!(token, Some(Digit::One));
}

#[test]
fn test2() {
    let buf = "111";
    let token = parse!(buf, {
        '1' => {
            '0' => Digit::Ten,
            '1' => Digit::Eleven,
            _ => Digit::One,
        },
        '2' => Digit::Two,
        '3' => Digit::Three,
    });
    assert_eq!(token, Some(Digit::Eleven));
}
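The question mentions that consuming characters "will be useful later". In the macro above, only the block arms advance the iterator, and the iterator is local to each invocation. Below is a sketch of one possible variant, under assumptions that are not in the original answer: the hypothetical parse_from! macro takes an identifier bound to an existing Peekable iterator instead of a string, and its leaf arms also call next(), so repeated calls read successive tokens. The next_digit function is likewise a name invented for this example.

```rust
use std::iter::Peekable;
use std::str::Chars;

#[derive(Debug, PartialEq)]
enum Digit { One, Two, Ten, Eleven }

// Hypothetical variant of the answer's macro: operates on a caller-owned iterator.
macro_rules! parse_from {
    // Entry point: `$chars` names a Peekable iterator (or a `&mut` to one).
    ($chars:ident, { $($body:tt)* }) => {
        parse_from!(@parse $chars, {}, $($body)*)
    };
    // Input exhausted: emit the match with a default catch-all arm.
    (@parse $chars:ident, { $($arms:tt)* }, $(,)*) => {
        match $chars.peek() {
            $($arms)*
            _ => None,
        }
    };
    // Explicit catch-all arm: nothing is consumed for it.
    (@parse $chars:ident, { $($arms:tt)* }, _ => $e:expr $(,)*) => {
        match $chars.peek() {
            $($arms)*
            _ => Some($e),
        }
    };
    // Nested block: consume the character, then parse the block's rules.
    (@parse $chars:ident, { $($arms:tt)* }, $p:pat => { $($block:tt)* }, $($tail:tt)*) => {
        parse_from!(@parse $chars, {
            $($arms)*
            Some(&$p) => {
                $chars.next();
                parse_from!(@parse $chars, {}, $($block)*)
            },
        }, $($tail)*)
    };
    // Leaf arm: unlike the macro above, this also consumes the matched character.
    (@parse $chars:ident, { $($arms:tt)* }, $p:pat => $e:expr, $($tail:tt)*) => {
        parse_from!(@parse $chars, {
            $($arms)*
            Some(&$p) => {
                $chars.next();
                Some($e)
            },
        }, $($tail)*)
    };
}

// Reads one token and leaves the iterator positioned after it.
fn next_digit(chars: &mut Peekable<Chars<'_>>) -> Option<Digit> {
    parse_from!(chars, {
        '1' => {
            '0' => Digit::Ten,
            '1' => Digit::Eleven,
            _ => Digit::One,
        },
        '2' => Digit::Two,
    })
}

#[test]
fn reads_successive_tokens() {
    let mut chars = "112".chars().peekable();
    assert_eq!(next_digit(&mut chars), Some(Digit::Eleven));
    assert_eq!(next_digit(&mut chars), Some(Digit::Two));
    assert_eq!(next_digit(&mut chars), None);
}
```

Because the catch-all arm does not consume anything, the input "12" yields Digit::One and then Digit::Two. The lookahead character is left in place for the next call.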
