解析上下文敏感语言 [英] Parsing Context Sensitive Language

查看:224
本文介绍了解析上下文敏感语言的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读Terence Parr的权威ANTLR参考,他说:

i am reading the Definitive ANTLR reference by Terence Parr, where he says:

语义谓词功能强大 识别上下文相关的手段 通过允许语言结构 要运行的运行时信息 识别

Semantic predicates are a powerful means of recognizing context-sensitive language structures by allowing runtime information to drive recognition

但是书中的例子很简单.我需要知道的是:ANTLR是否可以解析上下文敏感规则,例如:

But the examples in the book are very simple. What i need to know is: can ANTLR parse context-sensitive rules like:

xAy-> xBy

如果ANTLR无法解析这些规则,那么是否还有另一个工具可以处理上下文相关的语法?

If ANTLR can't parse these rules, is there is another tool that deals with context-sensitive grammars?

推荐答案

ANTLR仅解析LL(*)的语法.对于完整的上下文相关语言(例如您提供的示例),它无法使用语法进行解析.我认为Parr的意思是ANTLR可以解析需要 some (左)上下文约束的 some 语言.

ANTLR parses only grammars which are LL(*). It can't parse using grammars for full context-sensitive languages such as the example you provided. I think what Parr meant was that ANTLR can parse some languages that require some (left) context constraints.

尤其是,人们可以在减少动作"上使用语义谓词(我们对GLR解析器执行此操作) 由我们的 DMS软件再造工具包使用,但我认为ANTLR的想法与此类似)检查到目前为止解析器收集的任何数据,无论是作为其他语义动作的临时副作用,还是在部分构建的解析树中.

In particular, one can use semantic predicates on "reduction actions" (we do this for GLR parsers used by our DMS Software Reengineering Toolkit but the idea is similar for ANTLR, I think) to inspect any data collected by the parser so far, either as ad hoc side effects of other semantic actions, or in a partially-built parse tree.

对于基于DMS的基于DMS的Fortran前端,上下文相关检查,以确保DO循环正确排列.考虑:

For our DMS-based DMS-based Fortran front end, there's a context-sensitive check to ensure that DO-loops are properly lined up. Consider:

 DO  20, I= ...
   DO 10, J = ...
       ...
20  CONTINUE
10  CONTINUE

从解析器的角度来看,词法流看起来像这样:

From the point of view of the parser, the lexical stream looks like this:

DO  <number> , <variable> =  ...
    DO <number> , <variable> = ...
         ...
<number> CONTINUE
<number> CONTINUE

解析器如何才能知道哪个DO语句与哪个CONTINUE语句一起使用? (说每个DO与其最接近的CONTINUE匹配都是行不通的,因为FORTRAN可以 与多个DO头共享CONTINUE语句.

How can the parser then know which DO statement goes with which CONTINUE statement? (saying that each DO matches its closest CONTINUE won't work, because FORTRAN can share a CONTINUE statement with multiple DO-heads).

对于以下规则,我们在语义上使用语义谓词"CheckMatchingNumbers":

We use a semantic predicate "CheckMatchingNumbers" on the reduction for the following rule:

block = 'DO' <number> rest_of_do_head newline 
         block_of_statements
         <number> 'CONTINUE' newline ; CheckMatchingNumbers

检查DO关键字后面的数字和CONTINUE关键字后面的数字是否匹配.如果语义谓词说它们匹配,则该规则的简化成功,并且我们已将DO头与正确的CONTINUE对齐.如果谓词失败,则不建议进行归约(该规则已从候选者中删除以解析本地上下文);其他一些规则必须解析文本.

to check that the number following the DO keyword, and the number following the CONTINUE keyword match. If the semantic predicate says they match, then a reduction for this rule succeeds and we've aligned the DO head with correct CONTINUE. If the predicate fails, then no reduction is proposed (and this rule is removed from candidates for parsing the local context); some other set of rules has to parse the text.

处理共享共享的FORTRAN嵌套的实际规则和语义谓词比这要复杂得多,但我认为这很重要.

The actual rules and semantic predicates to handle FORTRAN nesting with shared continues is more complex than this but I think this makes the point.

您想要的是完整的上下文相关的解析引擎.我知道人们已经构建了它们,但是我不知道任何完整的实现,也不希望它们很快.

What you want is full context-sensitive parsing engine. I know people have built them, but I don't know of any full implementations, and don't expect them to be fast.

我确实在一段时间内遵循了奎因·泰勒·杰克逊的MetaS语法系统;听起来像是一种实际的尝试.

I did follow Quinn Taylor Jackson's MetaS grammar system for awhile; it sounded like a practical attempt to come close.

这篇关于解析上下文敏感语言的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆