手动编写解析器的最佳方法是什么? [英] What's the best way to write a parser by hand?

查看：13 发布时间：2022/1/5 10:10:30 c# .net parsing f#

本文介绍了手动编写解析器的最佳方法是什么?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我们已经使用 ANTLR 为类似 SQL 的语法创建了一个解析器，虽然在大多数情况下结果令人满意，但我们需要修复一些边缘情况；而且由于我们没有自己编写解析器，因此我们对它的理解还不够深入，无法做出合理的更改.

We've used ANTLR to create a parser for a SQL-like grammar, and while the results are satisfactory in most cases, there are a few edge cases that we need to fix; and since we didn't write the parser ourselves we don't really understand it well enough to be able to make sensible changes.

所以，我们想编写自己的解析器.手动编写解析器的最佳方法是什么?我们应该使用什么样的解析器——推荐使用递归下降；是对的吗?我们将使用 C# 编写它，因此我们将不胜感激地收到任何使用该语言编写解析器的教程.

So, we'd like to write our own parser. What's the best way to go about writing a parser by hand? What sort of parser should we use - recursive descent has been recommended; is that right? We'll be writing it in C#, so any tutorials for writing parsers in that language would be gratefully received.

更新:我也对涉及 F# 的答案感兴趣 - 我一直在寻找在项目中使用它的理由.

UPDATE: I'd also be interested in answers that involve F# - I've been looking for a reason to use that in a project.

推荐答案

我强烈推荐 F# 语言作为在 .NET 平台上进行解析的首选语言.它源于 ML 语言家族，这意味着它对面向语言的编程具有出色的支持.

I would highly recommend the F# language as your language of choice for parsing on the .NET Platform. It's roots in the ML family of languages means it has excellent support for language-oriented programming.

有区别的联合和模式匹配允许对你的 AST 进行非常简洁和强大的规范.高阶函数允许定义解析操作及其组合.对 monadic 类型的一流支持允许隐式处理状态管理，大大简化了解析器的组合.强大的类型推断极大地帮助了这些(复杂)类型的定义.所有这一切都可以交互指定和执行，让您可以快速制作原型.

Discriminated unions and pattern-matching allow for a very succinct and powerful specification of your AST. Higher-order functions allow for definition of parse operations and their composition. First-class support for monadic types allows for state management to be handled implicitly greatly simplifying the composition of parsers. Powerful type-inference greatly aides the definition of these (complex) types. And all of this can be specified and executed interactively allowing you to rapidly prototype.

Stephan Tolksdorf 已通过他的解析器组合库FParsec

Stephan Tolksdorf has put this into practice with his parser combinator library FParsec

从他的例子中我们可以看出 AST 是多么自然地指定:

From his examples we see how naturally an AST is specified:

type expr =
    | Val of string
    | Int of int
    | Float of float
    | Decr of expr

type stmt =
    | Assign of string * expr
    | While of expr * stmt
    | Seq of stmt list
    | IfThen of expr * stmt
    | IfThenElse of expr * stmt * stmt
    | Print of expr

type prog = Prog of stmt list

解析器的实现(部分省略)同样简洁:

the implementation of the parser (partially elided) is just as succinct:

let stmt, stmtRef = createParserForwardedToRef()

let stmtList = sepBy1 stmt (ch ';')

let assign =
    pipe2 id (str ":=" >>. expr) (fun id e -> Assign(id, e))

let print = str "print" >>. expr |>> Print

let pwhile =
    pipe2 (str "while" >>. expr) (str "do" >>. stmt) (fun e s -> While(e, s))

let seq =
    str "begin" >>. stmtList .>> str "end" |>> Seq

let ifthen =
    pipe3 (str "if" >>. expr) (str "then" >>. stmt) (opt (str "else" >>. stmt))
          (fun e s1 optS2 ->
               match optS2 with
               | None    -> IfThen(e, s1)
               | Some s2 -> IfThenElse(e, s1, s2))

do stmtRef:= choice [ifthen; pwhile; seq; print; assign]


let prog =
    ws >>. stmtList .>> eof |>> Prog

在第二行，例如，stmt 和 ch 是解析器，而 sepBy1 是一个采用两个简单解析器的 monadic 解析器组合子并返回一个组合解析器.在这种情况下，sepBy1 p sep 返回一个解析器，该解析器解析一个或多个由 sep 分隔的 p.因此，您可以看到从简单的解析器中组合出强大的解析器的速度有多快.F# 对重写运算符的支持还允许使用简洁的中缀表示法，例如排序组合子和选择组合子可以指定为 >>>. 和 <|>.

On the second line, as an example, stmt and ch are parsers and sepBy1 is a monadic parser combinator that takes two simple parsers and returns a combination parser. In this case sepBy1 p sep returns a parser that parses one or more occurrences of p separated by sep. You can thus see how quickly a powerful parser can be combined from simple parsers. F#'s support for overridden operators also allow for concise infix notation e.g. the sequencing combinator and the choice combinator can be specified as >>. and <|>.

祝你好运，

丹尼

这篇关于手动编写解析器的最佳方法是什么?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

手动编写解析器的最佳方法是什么? [英] What's the best way to write a parser by hand?

问题描述

推荐答案

相关文章

C#/.NET最新文章

热门教程

热门工具

登录关闭

手动编写解析器的最佳方法是什么? [英] What&#39;s the best way to write a parser by hand?

问题描述

推荐答案

相关文章

C#/.NET最新文章

热门教程

热门工具

登录 关闭

手动编写解析器的最佳方法是什么? [英] What's the best way to write a parser by hand?

登录关闭