Distinguishing identifiers from common strings

Views: 115. Published: 2020/4/30 10:12:13. Tags: parsing, bison, yacc, lex


Question

I want to write a parser using Bison/Yacc + Lex that can parse statements like:

VARIABLE_ID = 'STRING'

where:

ID       [a-zA-Z_][a-zA-Z0-9_]*

and:

STRING      [a-zA-Z0-9_]+

So var1 = '123abc' is a valid statement, while 1var = '123abc' is not.

Therefore, a VARIABLE_ID is always a STRING, but a STRING is not always a VARIABLE_ID.

What I would like to know is whether the only way to distinguish between the two is to write a checking procedure at a higher level (i.e. inside the Bison code), or whether I can work it out in the Lex code.

Recommended answer

Your abstract statement syntax is actually:

VARIABLE = STRING

not

VARIABLE = 'STRING'

because the quote delimiters are a lexical detail that we generally want to keep out of the syntax. So the token patterns are actually:

ID       [a-zA-Z_][a-zA-Z0-9_]*
STRING   '[a-zA-Z_0-9]*'

An ID is a letter or underscore, followed by any combination (including none) of letters, digits and underscores.

A STRING is a single quote, followed by a (possibly empty) sequence of letters, digits and underscores, followed by another single quote.

So the ambiguity you are concerned about does not exist. Because a STRING always begins with a quote and an ID never does, an ID is not in fact a STRING, nor vice versa.
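To make the point concrete, here is a minimal hand-written sketch in C that checks a whole lexeme against the two patterns above. It is only an illustration: Lex generates its own matcher from the patterns, and the names token_kind, is_word_char and classify_lexeme are hypothetical, not part of Lex or Bison. It assumes the default "C" locale, where isalnum and isalpha accept only ASCII letters and digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, for illustration only. */
enum token_kind { TOK_INVALID, TOK_ID, TOK_STRING };

/* True for the characters allowed by [a-zA-Z0-9_]. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Classify a whole lexeme against the two patterns:
 *   ID      [a-zA-Z_][a-zA-Z0-9_]*
 *   STRING  '[a-zA-Z_0-9]*'
 */
enum token_kind classify_lexeme(const char *s)
{
    size_t len = strlen(s);
    size_t i;

    if (len == 0)
        return TOK_INVALID;

    if (s[0] == '\'') {
        /* STRING: needs a closing quote and only word characters inside. */
        if (len < 2 || s[len - 1] != '\'')
            return TOK_INVALID;
        for (i = 1; i + 1 < len; i++)
            if (!is_word_char((unsigned char)s[i]))
                return TOK_INVALID;
        return TOK_STRING;
    }

    /* ID: the first character is a letter or underscore, never a quote. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return TOK_INVALID;
    for (i = 1; i < len; i++)
        if (!is_word_char((unsigned char)s[i]))
            return TOK_INVALID;
    return TOK_ID;
}
```

With this sketch, var1 classifies as an ID, '123abc' and '' classify as STRINGs, and 1var matches neither pattern. The first character alone decides which branch is taken, which is why no lexeme can satisfy both patterns.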

Somewhere inside your Bison parser, or possibly in the lexer, you might want to massage the yytext of a STRING match to remove the quotes and keep only the text between them as the string. This could be done in a Bison rule, possibly similar to:

string : STRING
       {
          /* strip_quotes is a helper you write yourself; $1 is the STRING token's value */
          $$ = strip_quotes($1);
       }
       ;
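The answer names strip_quotes but does not define it. One possible sketch in C follows, under these assumptions (none of which come from the original answer): the argument is a NUL-terminated string that still carries both single quotes, the result is a freshly allocated copy that the caller must free, and NULL is returned when the input is not quoted or allocation fails. Returning a copy matters if the input is yytext itself, since the lexer overwrites that buffer on the next match.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of `quoted`
 * without its surrounding single quotes, or NULL if the input is
 * not quoted or memory cannot be allocated. The caller frees it. */
char *strip_quotes(const char *quoted)
{
    size_t len = strlen(quoted);
    char *out;

    /* A valid STRING lexeme has at least the two quote characters. */
    if (len < 2 || quoted[0] != '\'' || quoted[len - 1] != '\'')
        return NULL;

    out = malloc(len - 1);   /* len - 2 characters plus the terminating NUL */
    if (out == NULL)
        return NULL;

    memcpy(out, quoted + 1, len - 2);
    out[len - 2] = '\0';
    return out;
}
```

For example, strip_quotes("'123abc'") yields the string 123abc, and strip_quotes("''") yields an empty string. For the rule above to work, the lexer action for STRING would need to store the matched text (or a copy of it) in yylval so that $1 carries it.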
