Parsing with incomplete grammars

Question

Is there a common approach to parsing with an incomplete grammar? In my case I just want to detect methods in Delphi (Pascal) files, that is, procedures and functions. The following first attempt works:

    methods
      : ( procedure | function | . )+
      ;

But is that a solution at all? Are there better ones? Is it possible to stop parsing from an action (e.g. after detecting implementation)? Does it make sense to use a preprocessor, and if so, how?

Solution

If you're only looking for names, then something as simple as this:

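    grammar PascalFuncProc;

    // The parser only ever sees Procedure and Function tokens.
    parse
      :  (Procedure | Function)* EOF
      ;

    Procedure
      :  'procedure' Spaces Identifier
      ;

    Function
      :  'function' Spaces Identifier
      ;

    // Everything else is consumed and discarded: whole string literals and
    // comments first, otherwise one arbitrary character at a time.
    Ignore
      :  (StrLiteral | Comment | .) {skip();}
      ;

    // Fragments are building blocks for the rules above, not tokens themselves.
    fragment Spaces     : (' ' | '\t' | '\r' | '\n')+;
    fragment Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;
    fragment StrLiteral : '\'' ~'\''* '\'';
    fragment Comment    : '{' ~'}'* '}';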

will do the trick. Note that I am not very familiar with Delphi/Pascal, so I have probably got StrLiteral and/or Comment wrong, but that is easily fixed.
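As a rough illustration of what a fix might involve, here is a sketch in Python regular expressions rather than ANTLR syntax. It assumes the usual Delphi lexical conventions: three comment forms ({ ... }, (* ... *) and // to end of line), single-quoted strings in which '' stands for an embedded quote, and case-insensitive keywords. The names DELPHI_COMMENT, DELPHI_STRING and DELPHI_KEYWORD are made up for this example and are not part of the grammar above.

```python
import re

# Hypothetical patterns covering more of Delphi's comment and string syntax
# than the Comment and StrLiteral fragments in the grammar do.
DELPHI_COMMENT = re.compile(
    r"\{[^}]*\}"          # { brace comment }, may span several lines
    r"|\(\*.*?\*\)"       # (* paren-star comment *), may span several lines
    r"|//[^\r\n]*",       # // line comment, ends at the line break
    re.DOTALL,
)

# A string stays on one line; two consecutive quotes are an escaped quote.
DELPHI_STRING = re.compile(r"'(?:[^'\r\n]|'')*'")

# Keywords are matched as whole words and regardless of case.
DELPHI_KEYWORD = re.compile(r"\b(procedure|function)\b", re.IGNORECASE)
```

The same alternatives could be translated back into extra branches of the Comment and StrLiteral fragments; the regular expressions are only meant to show which cases need covering.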

The lexer generated from the grammar above produces only two types of tokens (Procedure and Function). The rest of the input (string literals, comments or, if nothing else matches, a single character via the .) is discarded by the lexer immediately through the skip() call.
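The same "keep two token types, skip everything else" idea can be sketched without ANTLR. The Python below is only an approximation of the lexer described above, not generated code: the name find_methods and the regex group names are invented for the example, and the patterns copy the grammar's simplified comment and string rules as they stand.

```python
import re

# One alternation, tried left to right at every position, mirroring the lexer
# rules: comments and string literals are consumed whole so keywords inside
# them are never seen; any other character is skipped one at a time.
TOKEN = re.compile(
    r"""
      (?P<comment>\{[^}]*\})        # Comment fragment: braces around non-brace text
    | (?P<string>'[^']*')           # StrLiteral fragment: quotes around non-quote text
    | (?P<kind>procedure|function)[ \t\r\n]+(?P<name>[A-Za-z_][A-Za-z_0-9]*)
    | (?P<other>.)                  # fallback: any single character
    """,
    re.VERBOSE | re.DOTALL,
)


def find_methods(source):
    """Return (kind, name) pairs for every procedure/function header found."""
    found = []
    for match in TOKEN.finditer(source):
        # Only the keyword alternative produces output; the rest is dropped.
        if match.group("kind"):
            found.append((match.group("kind"), match.group("name")))
    return found
```

Calling find_methods on a source string returns the headers in the order they appear. Because comments and strings are listed before the keyword alternative, a keyword inside either is swallowed together with its surroundings, which is the effect the Ignore rule has in the grammar.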

For input like this:

    some valid source
    { 
      function NotAFunction ...
    }

    procedure Proc
    Begin
      ...
    End;

    procedure Func
    Begin
      s = 'function NotAFunction!!!'
    End;

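the resulting parse tree contains just two Procedure tokens (procedure Proc and procedure Func) followed by EOF; the function NotAFunction text inside the comment and inside the string literal is skipped.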
