Handle strings starting with whitespace
Question
I'm trying to create an ANTLR v4 grammar with the following set of rules:
1. If a line starts with @, it is considered a label:
@label
2. If a line starts with cmd, it is treated as a command:
cmd param1 param2
3. If a line starts with whitespace, it is considered a string. All the text should be extracted. Strings can be multiline, so they end with an empty line:
 A long string with multiline support
and any special characters one can imagine.
<-empty line here->
4. Lastly, if a line starts with anything other than whitespace, @, or cmd, its first word should be considered a heading:
Heading A long string with multiline support
and any special characters one can imagine.
<-empty line here->
It was easy to handle labels and commands, but I am clueless about strings and headings. What is the best way to separate "whitespace word whitespace whatever doubleNewline" and "whatever doubleNewline"? I've seen a lot of samples with whitespace, but none of them works with both random text and newlines. I don't expect you to write actual code for me; suggesting an approach will do.
Recommended answer
Something like this should do the trick:
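lexer grammar DemoLexer;

// '@' followed by one or more letters
LABEL
    : '@' [a-zA-Z]+
    ;

// 'cmd' followed by the rest of the line
CMD
    : 'cmd' ~[\r\n]+
    ;

// a space, then any text up to the first empty line
STRING
    : ' ' .*? NL NL
    ;

// anything not starting with '@', whitespace or 'cmd', up to the first empty line
HEADING
    : ( ~[@ \t\r\nc] | 'c' ~'m' | 'cm' ~'d' ) .*? NL NL
    ;

SPACE
    : [ \t\r\n] -> skip
    ;

// fallback for any character no other rule matches
OTHER
    : .
    ;

fragment NL
    : '\r'? '\n'
    | '\r'
    ;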
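The first group in HEADING, ( ~[@ \t\r\nc] | 'c' ~'m' | 'cm' ~'d' ), is a hand-written way of saying "does not start with @, whitespace, or cmd". As a rough illustration only, the same decision can be sketched in plain Java, outside ANTLR. The class StartClassifier and its classify method are hypothetical names made up for this sketch. It looks only at the opening characters and ignores what the real rules also require (letters after @, text after cmd, the closing empty line, and end-of-input cases).

```java
public class StartClassifier {

    /**
     * Hypothetical helper: returns the name of the lexer rule whose
     * opening characters match the start of the given text.
     * Only the start is examined; the rest of the token is not validated.
     */
    public static String classify(String text) {
        if (text.isEmpty()) {
            return "NONE";
        }
        char first = text.charAt(0);
        if (first == '@') {
            return "LABEL";
        }
        if (text.startsWith("cmd")) {
            return "CMD";
        }
        if (first == ' ') {
            // The grammar's STRING rule opens with a single space character.
            return "STRING";
        }
        if (first == '\t' || first == '\r' || first == '\n') {
            // Other whitespace is consumed by the SPACE rule and skipped.
            return "SPACE";
        }
        // Everything else, including words like "cmake" or "command",
        // which share a prefix with "cmd" but are not "cmd".
        return "HEADING";
    }

    public static void main(String[] args) {
        System.out.println(classify("@label"));                 // LABEL
        System.out.println(classify("cmd param1 param2"));      // CMD
        System.out.println(classify(" A long string\n\n"));     // STRING
        System.out.println(classify("Heading A long string"));  // HEADING
        System.out.println(classify("cmake is not a command")); // HEADING
    }
}
```

The "cmake" case is the reason the grammar spells out 'c' ~'m' and 'cm' ~'d': excluding every word that begins with c would be too broad.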
This does not enforce the "beginning of the line" requirement. If you need that, you'll have to add semantic predicates to your grammar, which ties it to a target language. For Java, that would look like this:
LABEL
    : {getCharPositionInLine() == 0}? '@' [a-zA-Z]+
    ;