Ignore tokens in the token characters?
Question
I have the following token definition in my lexer defining a CharacterString (e.g. 'abcd'):
CharacterString:
Apostrophe
(Alphanumeric)*
Apostrophe
;
Is it possible to ignore the two apostrophes, so that I can get the token string without them in the lexer (via $CharacterString.text->chars)?
I tried...
CharacterString:
Apostrophe { $channel = HIDDEN; }
(Alphanumeric)*
Apostrophe { $channel = HIDDEN; }
;
... without success. With this rule my string no longer even matches (e.g. 'oiu' fails in the parser with a Mismatched Set Exception).
Thanks :)
Recommended answer
The inline code {$channel=HIDDEN;} affects the entire CharacterString token, not just the apostrophe it follows, so you can't do it the way you tried.
You will need to add some custom code and remove the quotes yourself. Here's a small C demo:
grammar T;

options {
  language=C;
}

parse
  : (t=. {printf(">\%s<\n", $t.text->chars);})+ EOF
  ;

CharacterString
  : '\'' ~'\''* '\''
    {
      pANTLR3_STRING quoted = GETTEXT();
      SETTEXT(quoted->subString(quoted, 1, quoted->len-1));
    }
  ;

Any
  : .
  ;
And a small test function:
#include <stdio.h>
#include <stdlib.h>
#include "TLexer.h"
#include "TParser.h"

int main(int argc, char *argv[])
{
  /* Open the test input file. */
  pANTLR3_UINT8 fName = (pANTLR3_UINT8)"input.txt";
  pANTLR3_INPUT_STREAM input = antlr3AsciiFileStreamNew(fName);
  if(input == NULL)
  {
    fprintf(stderr, "Failed to open file %s\n", (char *)fName);
    exit(1);
  }

  /* Create the lexer on top of the input stream. */
  pTLexer lexer = TLexerNew(input);
  if(lexer == NULL)
  {
    fprintf(stderr, "Unable to create the lexer due to malloc() failure\n");
    exit(1);
  }

  /* Create the token stream that feeds the parser. */
  pANTLR3_COMMON_TOKEN_STREAM tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lexer));
  if(tstream == NULL)
  {
    fprintf(stderr, "Out of memory trying to allocate token stream\n");
    exit(1);
  }

  /* Create the parser. */
  pTParser parser = TParserNew(tstream);
  if(parser == NULL)
  {
    fprintf(stderr, "Out of memory trying to allocate parser\n");
    exit(ANTLR3_ERR_NOMEM);
  }

  /* Run the start rule, which prints the text of every token. */
  parser->parse(parser);

  /* Free everything in reverse order of creation. */
  parser->free(parser);   parser = NULL;
  tstream->free(tstream); tstream = NULL;
  lexer->free(lexer);     lexer = NULL;
  input->close(input);    input = NULL;

  return 0;
}
And the test input.txt file contains:
'abc'
If you now 1) generate the lexer and parser, 2) compile all .c source files, and 3) run main:
# 1
java -cp antlr-3.3.jar org.antlr.Tool T.g
# 2
gcc -Wall main.c TLexer.c TParser.c -l antlr3c -o main
# 3
./main
you'll see that abc (without the quotes) is printed to the console.
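The index arithmetic in the lexer action (start at 1, stop before the last character) can also be tried outside ANTLR. The following is only a minimal sketch in plain C: strip_quotes is a hypothetical helper invented for illustration, not part of the ANTLR C runtime, and it assumes the caller supplies the output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (NOT an ANTLR runtime function): copies `quoted`
 * into `out` without its first and last character.
 * Returns 0 on success, -1 if the input is not wrapped in apostrophes
 * or if `out` is too small to hold the result. */
int strip_quotes(const char *quoted, char *out, size_t out_size)
{
    assert(quoted != NULL && out != NULL);

    size_t len = strlen(quoted);

    /* The shortest valid input is the empty string literal: '' */
    if (len < 2 || quoted[0] != '\'' || quoted[len - 1] != '\'')
        return -1;

    size_t inner = len - 2;          /* characters between the apostrophes */
    if (inner + 1 > out_size)        /* +1 for the terminating NUL */
        return -1;

    memcpy(out, quoted + 1, inner);  /* skip the opening apostrophe */
    out[inner] = '\0';
    return 0;
}
```

Called as strip_quotes("'abc'", buf, sizeof buf), this leaves abc in buf, which mirrors what the SETTEXT(...) call does to the token text inside the lexer rule.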