Parsing with incomplete grammars
Problem description
Are there any common solutions for parsing with an incomplete grammar? In my case I just want to detect the methods in Delphi (Pascal) files, that is, procedures and functions. The following first attempt works:
methods
  : ( procedure | function | . )+
  ;
But is that a solution at all? Are there better ones? Is it possible to stop parsing from within an action (e.g. after detecting implementation)? Does it make sense to use a preprocessor? And if so, how?
If you're only looking for names, then something as simple as this:
grammar PascalFuncProc;
parse
  : (Procedure | Function)* EOF
  ;
Procedure
  : 'procedure' Spaces Identifier
  ;
Function
  : 'function' Spaces Identifier
  ;
Ignore
  : (StrLiteral | Comment | .) {skip();}
  ;
fragment Spaces     : (' ' | '\t' | '\r' | '\n')+;
fragment Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;
fragment StrLiteral : '\'' ~'\''* '\'';
fragment Comment    : '{' ~'}'* '}';
will do the trick. Note that I am not very familiar with Delphi/Pascal, so I am surely getting StrLiteral and/or Comment wrong, but that is easily fixed.
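As a rough idea of what the fix might involve, here is a minimal Python sketch of more Delphi-like string and comment patterns. It rests on assumptions that are not in the grammar above: a quote inside a string is written as two quotes, and comments come as { ... }, (* ... *) or // to end of line. The names DELPHI_STRING, DELPHI_COMMENT and strip_strings_and_comments are made up for illustration.

```python
import re

# Assumed Delphi lexical rules (not taken from the grammar above):
#  - a quote inside a string is doubled: 'it''s'
#  - comments are { ... }, (* ... *), or // up to the end of the line
DELPHI_STRING = r"'(?:[^'\r\n]|'')*'"
DELPHI_COMMENT = r"\{[^}]*\}|\(\*.*?\*\)|//[^\r\n]*"

# Whichever construct starts first in the text wins, so a quote inside a
# comment (or a brace inside a string) does not confuse the scan.
SKIP = re.compile(f"(?:{DELPHI_STRING})|(?:{DELPHI_COMMENT})", re.DOTALL)

def strip_strings_and_comments(source):
    """Replace every string literal and comment with a single space."""
    return SKIP.sub(" ", source)
```

The same alternatives could be carried over into the StrLiteral and Comment fragments; the sketch only shows which cases they would have to cover.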
The lexer generated from the grammar above produces only two types of tokens (Procedure and Function). The rest of the input (string literals, comments or, if nothing else matches, a single character via the . wildcard) is discarded by the lexer immediately through the skip() call.
For input like this:
some valid source
{
function NotAFunction ...
}
procedure Proc
Begin
...
End;
procedure Func
Begin
s = 'function NotAFunction!!!'
End;
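the resulting parse tree contains only the two Procedure tokens, procedure Proc and procedure Func; the function keywords inside the comment and the string literal are skipped.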
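The lexer's skip-everything-else behaviour can also be imitated outside ANTLR. The following is a minimal Python sketch, not the generated lexer itself: it uses one regular expression whose branches mirror the simplified StrLiteral, Comment, Procedure/Function and "any other character" rules, and keeps only the procedure and function headers. The names TOKEN, find_methods and SAMPLE are made up for illustration.

```python
import re

# One alternation, tried left to right at each position, mirroring the lexer
# rules: string literals and comments are consumed whole, so keywords inside
# them are never seen; any other single character hits the last branch.
TOKEN = re.compile(
    r"""
      (?P<str>'[^']*')              # StrLiteral (simplified, as in the grammar)
    | (?P<comment>\{[^}]*\})        # Comment (simplified, as in the grammar)
    | (?P<kind>procedure|function)\s+(?P<name>[A-Za-z_][A-Za-z_0-9]*)
    | (?P<other>.)                  # anything else: skipped
    """,
    re.VERBOSE | re.DOTALL,
)

def find_methods(source):
    """Return (kind, name) pairs for each procedure/function header found."""
    found = []
    for m in TOKEN.finditer(source):
        if m.group("kind"):
            found.append((m.group("kind"), m.group("name")))
    return found

SAMPLE = """some valid source
{
function NotAFunction ...
}
procedure Proc
Begin
...
End;
procedure Func
Begin
s = 'function NotAFunction!!!'
End;
"""

print(find_methods(SAMPLE))
# prints [('procedure', 'Proc'), ('procedure', 'Func')]
```

Like the grammar, this sketch is case-sensitive and does not check that the keyword starts on a word boundary, so it shares the same limitations.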