Parsing in Emacs Lisp


Question

I'm writing a parser in Emacs Lisp. It's a parser for text files that look like this:

rule:
  int: 1, 2, 3, ...
  string: and, or, then, when
  text:
  ----------
  Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Pellentesque
  in tellus. In pharetra consequat augue. In congue. Curabitur
  pellentesque iaculis eros. Proin magna odio, posuere sed, commodo nec,
  varius nec, tortor.
  ----------
  more: ...

rule:
  ...

I don't really care about the keys (int, string, ...). I want the values. So for the file above, int has the value "1, 2, 3, ...", string has "and, or, then, when", and text has "Lorem ..." (excluding the dashes).

I'm thinking about two different solutions, but I don't know which one to use. Should I:

  1. create a simple parser that loops through all lines, matches each line against some regex, and then groups out the parts I want?

  2. write a more sophisticated parser with a lexer and a parser?

Right now the files are quite simple, and I guess I don't need anything as advanced as the second option. But these files may get a bit more complicated, so I want the parser to be easy to extend.

How would you solve this?

Recommended answer

Are you already familiar with recursive descent parsers? They're relatively easy to write by hand in your favourite programming language, which would include Emacs Lisp. For very simple parsing, you can often get by with looking-at and search-forward. These would also form the basis of any tokenizing routines called by your recursive descent parser, or by any other style of parser.
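For the file format in the question, the looking-at approach might look like the sketch below. It is only an illustration under some assumptions: the function name my-rules-parse-string is hypothetical, keys are assumed to be lowercase words, and a text block is assumed to be delimited by two lines consisting only of dashes. Keys are discarded and only the values are kept, one list per rule.

```elisp
(require 'subr-x) ; for `string-trim' on older Emacs versions

;; Hypothetical helper, not part of the original answer.
(defun my-rules-parse-string (string)
  "Parse STRING and return a list of rules, each a list of value strings."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((rules '())
          (vals '()))
      (while (not (eobp))
        (cond
         ;; "rule:" starts a new rule; store the previous one, if any.
         ((looking-at "^rule:[ \t]*$")
          (when vals (push (nreverse vals) rules))
          (setq vals '())
          (forward-line 1))
         ;; "  key:" followed by a line of dashes opens a text block.
         ((looking-at "^[ \t]+[a-z]+:[ \t]*\n[ \t]*-+[ \t]*$")
          (goto-char (match-end 0))
          (forward-line 1)
          (let ((start (point)))
            ;; The block ends at the next line made only of dashes.
            (unless (re-search-forward "^[ \t]*-+[ \t]*$" nil t)
              (error "Unterminated text block"))
            (push (string-trim
                   (buffer-substring-no-properties start (match-beginning 0)))
                  vals)
            (forward-line 1)))
         ;; "  key: value" on a single line; keep only the value.
         ((looking-at "^[ \t]+[a-z]+:[ \t]*\\(.*\\)$")
          (push (match-string 1) vals)
          (forward-line 1))
         ;; Anything else (blank lines, for example) is skipped.
         (t (forward-line 1))))
      (when vals (push (nreverse vals) rules))
      (nreverse rules))))
```

Each new kind of line becomes one more cond clause, which keeps this style easy to extend as long as the format stays line-oriented. Nested or recursive structure is where a tokenizer plus a recursive descent parser starts to pay off.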

[11 Feb 2009] I added an example recursive descent parser in Emacs Lisp below. It parses simple arithmetic expressions including addition, subtraction, multiplication, division, exponentiation, and parenthesized sub-expressions. Right now, it assumes all tokens are in the global variable *token*, but if you modify gettok and peektok as necessary you can have them walk through a buffer. To use it as is, just try out the following:

(setq *token* '( 3 ^ 5 ^ 7 + 5 * 3 + 7 / 11))
(rdh/expr)
=> (+ (+ (^ 3 (^ 5 7)) (* 5 3)) (/ 7 11))

The parsing code is as follows.

;; Remaining input tokens: numbers, operator symbols (+ - * / ^),
;; and the strings "(" and ")" for parentheses.
(defvar *token* nil)

;; Consume and return the next token, or nil when the input is exhausted.
(defun gettok ()
  (and *token* (pop *token*)))

;; Return the next token without consuming it.
(defun peektok ()
  (and *token* (car *token*)))

;; expr := factor (("+" | "-") factor)*   (left-associative)
(defun rdh/expr ()
  (rdh/expr-tail (rdh/factor)))

(defun rdh/expr-tail (expr)
  (let ((tok (peektok)))
    (cond ((or (null tok)
               (equal tok ")"))
           expr)
          ((member tok '(+ -))
           (gettok)
           (let ((fac (rdh/factor)))
             (rdh/expr-tail (list tok expr fac))))
          (t (error "bad expr")))))

;; factor := term (("*" | "/") term)*   (left-associative)
(defun rdh/factor ()
  (rdh/factor-tail (rdh/term)))

(defun rdh/factor-tail (fac)
  (let ((tok (peektok)))
    (cond ((or (null tok)
               (member tok '(")" + -)))
           fac)
          ((member tok '(* /))
           (gettok)
           (let ((term (rdh/term)))
             (rdh/factor-tail (list tok fac term))))
          (t (error "bad factor")))))

;; term := prim ("^" term)?   (right-associative)
(defun rdh/term ()
  (let* ((prim (rdh/prim))
         (tok (peektok)))
    (cond ((or (null tok)
               (member tok '(")" + - / *)))
           prim)
          ((equal tok '^)
           (gettok)
           (list tok prim (rdh/term)))
          (t (error "bad term")))))

;; prim := number | "(" expr ")"
(defun rdh/prim ()
  (let ((tok (gettok)))
    (cond ((numberp tok) tok)
          ((equal tok "(")
           (let* ((expr (rdh/expr))
                  (tok (peektok)))
             (if (not (equal tok ")"))
                 (error "bad parenthesized expr")
               (gettok)
               expr)))
          (t (error "bad prim")))))
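As a rough illustration of how gettok and peektok could walk through a buffer instead of a list, here is a minimal sketch. The names my-buffer-gettok, my-buffer-peektok, and my-tokenize-string are hypothetical, and the token set (non-negative integers, the five operators, and parentheses) is an assumption based on the example grammar. Parentheses are returned as the strings "(" and ")" because that is what the parser compares against.

```elisp
;; Hypothetical buffer-backed counterparts of `gettok' and `peektok'.
(defun my-buffer-peektok ()
  "Return the next token at or after point without consuming it, or nil."
  (save-excursion (my-buffer-gettok)))

(defun my-buffer-gettok ()
  "Consume and return the next token at or after point, or nil at end of buffer."
  (skip-chars-forward " \t\n")
  (cond ((eobp) nil)
        ;; Integers become Lisp numbers.
        ((looking-at "[0-9]+")
         (goto-char (match-end 0))
         (string-to-number (match-string 0)))
        ;; Parentheses stay strings, as the parser expects.
        ((looking-at "[()]")
         (goto-char (match-end 0))
         (match-string 0))
        ;; Operators become symbols such as + or ^.
        ((looking-at "[-+*/^]")
         (goto-char (match-end 0))
         (intern (match-string 0)))
        (t (error "Unexpected character: %c" (char-after)))))

;; Convenience wrapper (also hypothetical): tokenize STRING into a list.
(defun my-tokenize-string (string)
  "Return the list of tokens found in STRING."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((tokens '())
          tok)
      (while (setq tok (my-buffer-gettok))
        (push tok tokens))
      (nreverse tokens))))
```

With these, (my-tokenize-string "(1 + 2) * 3 ^ 2") returns ("(" 1 + 2 ")" * 3 ^ 2). Assigning that list to the parser's token variable and calling (rdh/expr) should then give (* (+ 1 2) (^ 3 2)). Alternatively, the bodies of gettok and peektok could be replaced by calls to the buffer versions so the parser reads directly from the current buffer.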
