在 OCaml 中向我的 AST 添加行信息 [英] Adding line information to my AST in OCaml

查看:26
本文介绍了在 OCaml 中向我的 AST 添加行信息的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在 OCaml 中创建一个编译器,其语法如下:

I am creating a compiler in OCaml where the grammar is as follow:

type expr =
| Cons of const
| Var of string
| List of ( expr list )
| Sum of ( expr * expr )
| Less_than of ( expr * expr ) 
| Conditional of ( expr * expr * expr )
| Array_literal of ( expr )
| Array_read of ( expr * expr )

AST 的节点如下所示:

The node of an AST looks like this:

type 'a astNode = 
{
data: 'a;
metadata: Metadata;
}

元数据模块如下所示:

module Metadata = struct
    type loc = Lexing.position
    type loc_range = loc * loc

    and metadata = ?
end

元数据的语法应该是什么?在行 and metadata = 之后我的代码会是什么样子?基本上,我什么时候需要用元数据信息更新 AST.我应该如何构建我的 AST 以包含元数据信息?元数据对我来说目前意味着它的位置,例如行号、文件名等.这包含在 Lexing.position 模块中.

What should the grammar for metadata be? What would my code look like after the line and metadata = ? Basically, when would I be required to update the AST with the metadata information. How should I structure my AST to contain the metadata information? Metadata for me currently means it's position such as line number, file name etc. This is encompassed in the Lexing.position module.

推荐答案

这更多是一个软件设计问题,有一些常见的解决方案.我会尽量涵盖所有这些.

This is more a software design question and there are a few common solutions. I will try to cover them all.

最常见的解决方案是将您的表达式类型包装到一个记录中,该记录除了表达式负载外,还有一些元数据.元数据的类型可以是抽象的,也可以是具体的,这也是一个品味问题.元数据的抽象类型使其以后更容易扩展.经典方法是在 OCaml 编译器本身中实现的,参考 Locations 模块,它将告诉您如何在经典的 OCaml Yacc/Menhir 解析器中获取位置信息.

The most common solution is to wrap your expression type into a record that has, besides the expression payload, some metadata. The type of metadata could be made abstract or concrete, it is also a matter of taste. Abstract type for metadata makes it easier to extend it later. The classical approach is implemented in OCaml compiler itself, refer to the Locations module, it will tell you how to get location information in classical OCaml Yacc/Menhir parsers.

一种稍微不同的方法是索引树并将标识符附加到每个 AST 节点.然后您可以使用外部映射(如任何关联容器)将任意元数据添加到您的 AST.这种方法用于 BAP Primus Lisp Parser.好处是这种方法可以很容易地将它与散列consing结合起来.它还使可以与节点关联的属性集易于扩展,映射的成本现在是部分的.(或记录和映射之间的常见选择).

A slightly different approach is to index the tree and attach an identifier to each AST node. Then you can use external maps (like any associative container) to add arbitrary metadata to your AST. Such approach is used in BAP Primus Lisp Parser. The nice thing is that this approach is that it makes it easy to combine it with hash-consing. It also makes the set of attributes that you can associate with your nodes easily extensible, with the cost the mapping is now partial. (Or common choice between a record a mapping).

不是将索引直接存储在节点中,您可以选择一些特定的方法来对节点进行编号(例如,DFS 编号).使用这种方法您不需要修改您的 AST 类型,如果您不控制它,那就特别好.这种方法的一个例子是来自 Janestreet 的 Positions 模块parsexp 库,它实现了基于某种遍历(不是 DFS 编号)的紧凑位置集.不幸的是,它们的实现不够通用,无法与不同的 AST 重用,但方法是通用的.

Instead of storing indexes directly in the nodes, you may pick some particular method for numbering your nodes (e.g., DFS numbering). With this approach you do not need to modify your AST type, that is especially nice if you are not controlling it. An example of such approach is the Positions module from the Janestreet's parsexp library that implements the compact set of positions based on some traversal (not the DFS numbering). Unfortunately, their implementation is not general enough to be reused with different AST, but the approach is general.

这篇关于在 OCaml 中向我的 AST 添加行信息的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆