How to handle multiple rules for one token with PLY

Question

I'm working with a jison file and translating it for use with the lex module from Python's PLY.

I've noticed that in this jison file certain tokens have multiple rules associated with them. For example, for the token CONTENT, the file specifies the following three rules:

[^\x00]*?/("{{")                 {
                                   if(yytext.slice(-2) === "\\\\") {
                                     strip(0,1);
                                     this.begin("mu");
                                   } else if(yytext.slice(-1) === "\\") {
                                     strip(0,1);
                                     this.begin("emu");
                                   } else {
                                     this.begin("mu");
                                   }
                                   if(yytext) return 'CONTENT';
                                 }

[^\x00]+                         return 'CONTENT';

// marks CONTENT up to the next mustache or escaped mustache
<emu>[^\x00]{2,}?/("{{"|"\\{{"|"\\\\{{"|<<EOF>>) {
                                   this.popState();
                                   return 'CONTENT';
                                 }

In another case, there are multiple rules for the COMMENT token:

<com>[\s\S]*?"--}}"              strip(0,4); this.popState(); return 'COMMENT';
<mu>"{{!--"                      this.popState(); this.begin('com');
<mu>"{{!"[\s\S]*?"}}"            strip(3,5); this.popState(); return 'COMMENT';

It seems easy enough to distinguish the rules when they apply to different states, but what about when they apply to the same state?

How can I translate this jison to Python rules using ply.lex?

Edit

In case it helps, this jison file is part of the handlebars.js source code. See: https://github.com/wycats/handlebars.js/blob/master/src/handlebars.l

Recommended answer

This question is difficult to answer, because it is really two questions in one.

Jison (that's the language the handlebars parser is written in, not bison) has lexer notation with no direct counterpart in PLY, for example the / trailing-context operator. This makes it difficult to convert the lexical code you have shown from Jison to PLY rule for rule. However, that is not the question you were focussed on. Your base question, how multiple regular expressions can return a single token in PLY, can be answered, but the answer alone will not give you a working implementation of the code you chose as your example!

First, let's address the question you asked. Returning one token for multiple regular expressions can be accomplished with the @TOKEN decorator, as shown in the PLY manual (section 4.11).

For example, we can do the following:

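from ply.lex import TOKEN

# Both patterns come from the CONTENT rules in the question.
# Python's re has no jison-style /("{{") trailing context, so a lookahead is
# used instead; +? rather than *? avoids a zero-length match.
content1 = r'[^\x00]+?(?=\{\{)'
content2 = r'[^\x00]+'
content = r'(' + content1 + r'|' + content2 + r')'

@TOKEN(content)
def t_CONTENT(t):
    ...  # body elided in the original
    return t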
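However, this won't work as-is for the rules you have from jison, because they rely on start conditions (see the Jison manual), a feature inherited from lex/flex rather than one new to jison. There, this.begin enters a named state, and a pattern prefixed with that name applies only while the state is active. This is where <mu>, <emu> and <com> come from. PLY's documentation does describe an equivalent (a states declaration, together with begin, push_state and pop_state on the lexer), so the states themselves carry over. What does not carry over directly is the / trailing-context syntax and the <<EOF>> pattern, which have to be rewritten as Python regular expressions, for example with lookahead assertions.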
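For reference, PLY's documentation describes conditional lexing: a module-level `states` tuple, rule names prefixed with the state name (such as `t_com_...`), and `begin`, `push_state` and `pop_state` on the lexer. A minimal sketch of the COMMENT rules along those lines follows. It is a simplification and not a port of handlebars.l: the `<mu>` state is folded into the default state, and the names `t_open_long_comment` and `tokenize_comments` are invented here.

```python
import ply.lex as lex

tokens = ('CONTENT', 'COMMENT')

# One exclusive state, loosely mirroring jison's <com> start condition.
states = (('com', 'exclusive'),)

def t_open_long_comment(t):
    r'\{\{!--'
    # Like this.begin('com'): switch state and emit nothing.
    t.lexer.push_state('com')

def t_COMMENT(t):
    r'\{\{![\s\S]*?\}\}'
    # Like strip(3,5): drop the leading "{{!" and the trailing "}}".
    t.value = t.value[3:-2]
    return t

def t_com_COMMENT(t):
    r'[\s\S]*?--\}\}'
    # Only active in state 'com'; like strip(0,4) plus this.popState().
    t.value = t.value[:-4]
    t.lexer.pop_state()
    return t

def t_CONTENT(t):
    r'(?:[^{\x00]|\{(?!\{))+'
    # Plain text: anything up to the next "{{".
    return t

def t_error(t):
    t.lexer.skip(1)

def t_com_error(t):
    t.lexer.skip(1)

def tokenize_comments(text):
    """Hypothetical helper: return (type, value) pairs for the input."""
    lexer = lex.lex()
    lexer.input(text)
    return [(tok.type, tok.value) for tok in lexer]
```

Here `t_COMMENT` and `t_com_COMMENT` are two separate rules that both yield COMMENT, one per state. The long-comment opener is defined before `t_COMMENT` because PLY tries function rules in definition order, and `{{!--` would otherwise be consumed by the short-comment pattern.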

To match these lexemes, it is really necessary to go back to the syntax of the handlebars/mustache notation and write new regular expressions. I feel that completely re-implementing the whole of handlebars for you in a Stack Overflow answer is perhaps a step too far.

However, I have identified the steps to a solution for you, and for anyone else who treads this path.
