Writing a re-entrant lexer with Flex
Question
I'm new to flex. I'm trying to write a simple re-entrant lexer/scanner with flex. The lexer definition is below. I'm stuck on the compilation errors shown after it (a yyg issue):
/* Definitions */
digit [0-9]
letter [a-zA-Z]
alphanum [a-zA-Z0-9]
identifier [a-zA-Z_][a-zA-Z0-9_]+
integer [0-9]+
natural [0-9]*[1-9][0-9]*
decimal ([0-9]+\.|\.[0-9]+|[0-9]+\.[0-9]+)
%{
#include <stdio.h>
#define ECHO fwrite(yytext, yyleng, 1, yyout)
int totalNums = 0;
%}
%option reentrant
%option prefix="simpleit_"
%%
^(.*)\r?\n printf("%d\t%s", yylineno++, yytext);
%%
/* Routines */
int yywrap(yyscan_t yyscanner)
{
    return 1;
}
int main(int argc, char* argv[])
{
    yyscan_t yyscanner;
    if(argc < 2) {
        printf("Usage: %s fileName\n", argv[0]);
        return -1;
    }
    yyin = fopen(argv[1], "rb");
    yylex(yyscanner);
    return 0;
}
Compilation errors:
vietlq@mylappie:~/Desktop/parsers/reentrant$ gcc lex.simpleit_.c
reentrant.l: In function ‘main’:
reentrant.l:44: error: ‘yyg’ undeclared (first use in this function)
reentrant.l:44: error: (Each undeclared identifier is reported only once
reentrant.l:44: error: for each function it appears in.)
Recommended answer
For a reentrant lexer, all communication must include the state, which is contained within the scanner.
Anywhere in your program (e.g. inside main) you can access the state variables via special functions to which you pass your scanner. For example, in your original reentrant.l, you can do this:
yyscan_t scanner;
yylex_init(&scanner);
yyset_in(fopen(argv[1], "rb"), scanner);
yylex(scanner);
yylex_destroy(scanner);
I have renamed the variable to scanner to avoid confusion with yyscanner in the actions. In contrast with general C code, all your actions run inside one giant function called yylex, which receives your scanner under the name yyscanner. Thus, yyscanner is available to all your actions. In addition, yylex has a local variable called yyg that holds the entire state, and most macros conveniently refer to yyg.
While it is true that you can use the yyin macro inside main by defining yyg yourself, as you did in your own Answer, that is not recommended. For a reentrant lexer, the macros are meant for actions only.
To see how this is implemented, you can always look at the generated code:
/* For convenience, these vars
are macros in the reentrant scanner. */
#define yyin yyg->yyin_r
...
/* Holds the entire state of the reentrant scanner. */
struct yyguts_t
...
#define YY_DECL int yylex (yyscan_t yyscanner)
/** The main scanner function which does all the work.
*/
YY_DECL
{
struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;
...
}
There is a lot more on the reentrant option in the flex docs, which include a cleanly compiling example. (Google "flex reentrant", and look for the flex.sourceforge link.) Unlike bison, flex has a fairly straightforward model for reentrancy. I strongly suggest using reentrant flex with Lemon Parser, rather than with yacc/bison.