A weird situation about matching in flex
Question
I am writing a scanner in flex, and I have the following three rules:
%%
"int" printf("JUST_INT");
"int"[ \t\n]+"matrix" printf("MATRIX_INT");
[A-Za-z][A-Za-z0-9]* printf("IDENTIFIER");
%%
When the input to the scanner is int matrixM = 3; the output is MATRIX_INT IDENTIFIER. The scanner reads the input as int matrix M = 3. But this is not actually a matrix: the name of the identifier is matrixM, so the output should be JUST_INT IDENTIFIER. Why is this happening? Is it because of my definitions?
Recommended answer
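It's not a good idea to have rules that scan across spaces, like "int"[ \t\n]+"matrix". Just return "int", "matrix", and IDENTIFIER as three separate tokens and let the parser sort it out. The parser is much better at that than flex is. flex simply takes the longest match at the current position, and when two rules match the same length it takes the one listed first. For int matrixM, the second rule matches the ten characters of int matrix, which beats the three-character match of "int", so MATRIX_INT is printed and the leftover M is then scanned as an IDENTIFIER.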
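The separate-token approach could look roughly like the sketch below. It is only an illustration under a few assumptions: the token name MATRIX, the whitespace-skipping rule, and the catch-all rule for other characters are not in the original question, and the actions print names instead of returning token codes to a parser, to stay close to the original rules.

```lex
%{
/* Sketch only: prints token names instead of returning them to a parser. */
#include <stdio.h>
%}
%option noyywrap
%%
"int"                   { printf("JUST_INT "); }
"matrix"                { printf("MATRIX "); /* hypothetical token name */ }
[A-Za-z][A-Za-z0-9]*    { printf("IDENTIFIER "); }
[ \t\n]+                { /* skip whitespace between tokens */ }
.                       { /* ignore any other character in this sketch */ }
%%
int main(void)
{
    yylex();
    return 0;
}
```

With these rules, matrixM is scanned as a single IDENTIFIER, because the identifier rule matches seven characters while "matrix" matches only six. A standalone matrix still comes out as MATRIX: both rules match six characters, and the keyword rule is listed first. Deciding that int followed by matrix declares a matrix is then left to the parser.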