Parsing a possibly nested braced item using a grammar

Views: 60 Published: 2020/11/20 4:49:18 Tags: grammar, raku


Problem description

I am starting to write a BibTeX parser. The first thing I would like to do is parse a braced item. A braced item could be, for example, an author field or a title. There might be nested braces within the field. The following code does not handle nested braces:

use v6;

my $str = q:to/END/;
  author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}, 
  END

$str .= chomp;

grammar ExtractBraced {
    rule TOP {
        'author=' <braced-item> .*
    }
    rule braced-item      { '{' <-[}]>* '}' }
}

ExtractBraced.parse( $str ).say;

Output:

｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
 braced-item => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}｣

Now, to make the parser accept nested braces, I would like to keep a counter of the opening braces parsed so far and decrement it whenever a closing brace is encountered. When the counter reaches zero, we assume that the complete item has been parsed.
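The counter idea can be sketched outside a grammar, as a plain scan over the characters. This is only an illustration of the counting logic: the sub name `extract-braced` and its parameters are hypothetical and do not appear in the original code.

```raku
# Hypothetical helper: returns the balanced braced item that starts at
# position $start, or Nil if the text does not start with '{' there or
# the braces never balance.
sub extract-braced(Str $text, Int $start = 0) {
    return Nil unless $text.substr($start, 1) eq '{';
    my $depth = 0;                      # number of currently open braces
    for $start ..^ $text.chars -> $pos {
        given $text.substr($pos, 1) {
            when '{' { $depth++ }
            when '}' {
                $depth--;
                # Counter back at zero: the item is complete.
                return $text.substr($start, $pos - $start + 1) if $depth == 0;
            }
        }
    }
    return Nil;                         # ran out of input: unbalanced braces
}

my $field = ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;
say extract-braced($field);
```

A recursive grammar rule achieves the same effect without an explicit counter, because the recursion depth plays the role of `$depth`.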

To follow this idea, I tried to split up the braced-item regex so that a grammar action can run on each character. (The action method for the braced-item-char regex below would then maintain the brace counter):

grammar ExtractBraced {
    rule TOP {
        'author=' <braced-item> .*
    }
    rule braced-item      { '{' <braced-item-char>* '}' }
    rule braced-item-char { <-[}]> }
}

However, the parse now suddenly fails. It is probably a silly mistake, but I cannot see why it should fail.
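A likely explanation, which the answer does not spell out: `rule` turns on significant whitespace, so the space after `<-[}]>` inside `braced-item-char` becomes an implicit `<.ws>` call. The default `ws` does not match between two word characters, so the rule should fail right after the `B` of `Belayneh`, and because rules do not backtrack the whole parse fails. Declaring the same regexes as `token` avoids the implicit `<.ws>`. The sketch below uses made-up grammar names (`WithRule`, `WithToken`) and a shortened input to isolate the effect.

```raku
# Same regexes, declared once with `rule` and once with `token`.
grammar WithRule {
    rule  TOP  { '{' <char>* '}' }
    rule  char { <-[}]> }       # trailing space becomes an implicit <.ws>
}

grammar WithToken {
    token TOP  { '{' <char>* '}' }
    token char { <-[}]> }       # no implicit <.ws> in a token
}

# 'a' and 'b' are adjacent word characters, so <.ws> cannot match between them.
say WithRule.parse('{ab}').defined;
say WithToken.parse('{ab}').defined;
```

The first attempt in the question is not affected because there the whitespace follows the whole quantified `<-[}]>*`, not each single character.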

Recommended answer

Without knowing how you want the resulting data to look, I would change it to something like this:

my $str = ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;

grammar ExtractBraced {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-item {
       '{' ~ '}'

           [
           || <- [{}] >+
           || <.before '{'> <.braced-item>
           ]*
    }
}

ExtractBraced.parse( $str ).say;

｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
 author => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}｣

If you want a bit more structure, it might look more like this:

my $str = ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;

grammar ExtractBraced {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-part {
        || <- [{}] >+
        || <.before '{'> <braced-item>
    }
    token braced-item {
        '{' ~ '}'
            <braced-part>*
    }
}

class Print {
    method TOP ($/){
        make $<author>.made
    }
    method braced-part ($/){
        make $<braced-item>.?made // ~$/
    }
    method braced-item ($/){
        make [~] @<braced-part>».made
    }
}


my $r = ExtractBraced.parse( $str, :actions(Print) );
say $r;
put();
say $r.made;

｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣
 author => ｢{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}｣
  braced-part => ｢Belayneh, M. and Geiger, S. and Matth｣
  braced-part => ｢{\"{a}}｣
   braced-item => ｢{\"{a}}｣
    braced-part => ｢\"｣
    braced-part => ｢{a}｣
     braced-item => ｢{a}｣
      braced-part => ｢a｣
  braced-part => ｢i, S.K.｣

Belayneh, M. and Geiger, S. and Matth\"ai, S.K.

Note that the + on <-[{}]>+ and the <before '{'> lookahead are both optimizations; either can be omitted and the grammar will still work.
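To illustrate that note, here is a sketch of the first grammar with both optimizations removed. The grammar name `ExtractBracedSimple` is an assumption made for this example; everything else follows the answer's code.

```raku
grammar ExtractBracedSimple {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-item {
        '{' ~ '}'
            [
            || <-[{}]>            # one non-brace character at a time
            || <.braced-item>     # or a nested braced item
            ]*
    }
}

my $str = ｢author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},｣;
say ExtractBracedSimple.parse($str)<author>;
```

Without the + the grammar takes one step per character, and without the lookahead it attempts a recursive call at every closing brace, which fails immediately on the literal '{'. The match itself should be the same.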
