解析上下文相关语言 [英] Parsing Context Sensitive Language

查看:26
本文介绍了解析上下文相关语言的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读 Terence Parr 的 Definitive ANTLR reference,他说:

i am reading the Definitive ANTLR reference by Terence Parr, where he says:

语义谓词是一个强大的识别上下文相关的方法通过允许语言结构要驱动的运行时信息识别

Semantic predicates are a powerful means of recognizing context-sensitive language structures by allowing runtime information to drive recognition

但是书中的例子很简单.我需要知道的是:ANTLR 可以解析 context-sensitive 规则,例如:

But the examples in the book are very simple. What i need to know is: can ANTLR parse context-sensitive rules like:

xAy --> xBy

如果 ANTLR 无法解析这些规则,是否还有其他工具可以处理上下文相关的语法?

If ANTLR can't parse these rules, is there is another tool that deals with context-sensitive grammars?

推荐答案

ANTLR 只解析 LL(*) 的语法.它无法使用语法来解析完整的上下文相关语言,例如您提供的示例.我认为 Parr 的意思是 ANTLR 可以解析需要 some(左)上下文约束的 some 语言.

ANTLR parses only grammars which are LL(*). It can't parse using grammars for full context-sensitive languages such as the example you provided. I think what Parr meant was that ANTLR can parse some languages that require some (left) context constraints.

特别是,可以在归约操作"上使用语义谓词(我们为 GLR 解析器执行此操作我们的 DMS Software Reengineering Toolkit 使用了这个想法,但我认为 ANTLR 的想法是相似的) 来检查到目前为止由解析器收集的任何数据,无论是作为其他语义操作的临时副作用,还是在部分构建的解析树中.

In particular, one can use semantic predicates on "reduction actions" (we do this for GLR parsers used by our DMS Software Reengineering Toolkit but the idea is similar for ANTLR, I think) to inspect any data collected by the parser so far, either as ad hoc side effects of other semantic actions, or in a partially-built parse tree.

对于我们基于 DMS 的 基于 DMS 的 Fortran 前端,有一个上下文相关检查以确保 DO 循环正确排列.考虑:

For our DMS-based DMS-based Fortran front end, there's a context-sensitive check to ensure that DO-loops are properly lined up. Consider:

 DO  20, I= ...
   DO 10, J = ...
       ...
20  CONTINUE
10  CONTINUE

从解析器的角度来看,词法流是这样的:

From the point of view of the parser, the lexical stream looks like this:

DO  <number> , <variable> =  ...
    DO <number> , <variable> = ...
         ...
<number> CONTINUE
<number> CONTINUE

解析器如何知道哪个 DO 语句与哪个 CONTINUE 语句对应?(说每个 DO 匹配它最接近的 CONTINUE 是行不通的,因为 FORTRAN 可以与多个 DO 头共享一个 CONTINUE 语句).

How can the parser then know which DO statement goes with which CONTINUE statement? (saying that each DO matches its closest CONTINUE won't work, because FORTRAN can share a CONTINUE statement with multiple DO-heads).

我们使用语义谓词CheckMatchingNumbers"来减少以下规则:

We use a semantic predicate "CheckMatchingNumbers" on the reduction for the following rule:

block = 'DO' <number> rest_of_do_head newline 
         block_of_statements
         <number> 'CONTINUE' newline ; CheckMatchingNumbers

检查 DO 关键字后面的数字和 CONTINUE 关键字后面的数字是否匹配.如果语义谓词说它们匹配,则此规则的归约成功,并且我们已将 DO 头与正确的 CONTINUE 对齐.如果谓词失败,则不建议减少(并且从用于解析本地上下文的候选中删除此规则);其他一些规则必须解析文本.

to check that the number following the DO keyword, and the number following the CONTINUE keyword match. If the semantic predicate says they match, then a reduction for this rule succeeds and we've aligned the DO head with correct CONTINUE. If the predicate fails, then no reduction is proposed (and this rule is removed from candidates for parsing the local context); some other set of rules has to parse the text.

使用共享继续处理 FORTRAN 嵌套的实际规则和语义谓词比这更复杂,但我认为这很重要.

The actual rules and semantic predicates to handle FORTRAN nesting with shared continues is more complex than this but I think this makes the point.

您想要的是完整的上下文相关解析引擎.我知道人们已经构建了它们,但我不知道任何完整的实现,也不期望它们很快.

What you want is full context-sensitive parsing engine. I know people have built them, but I don't know of any full implementations, and don't expect them to be fast.

我确实关注了 Quinn Taylor Jackson 的 MetaS 语法系统一段时间;这听起来像是一次接近现实的实际尝试.

I did follow Quinn Taylor Jackson's MetaS grammar system for awhile; it sounded like a practical attempt to come close.

这篇关于解析上下文相关语言的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆