How do I manipulate parse trees?

Question

I've been playing around with natural language parse trees and manipulating them in various ways. I've been using Stanford's Tregex and Tsurgeon tools, but the code is a mess and doesn't fit in well with my mostly Python environment (those tools are Java and aren't ideal for tweaking). I'd like a toolset that allows for easy hacking when I need more functionality. Are there any other tools that are well suited to pattern matching on trees and then manipulating the matched branches?

For example, I'd like to take the following tree as input:

(ROOT
  (S
    (NP
      (NP (NNP Bank))
      (PP (IN of)
        (NP (NNP America))))
    (VP (VBD used)
      (S
        (VP (TO to)
          (VP (VB be)
            (VP (VBN called)
              (NP
                (NP (NNP Bank))
                (PP (IN of)
                  (NP (NNP Italy)))))))))))

and (this is a simplified example):

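  1. Find any node labeled NP whose first child is labeled NP and has a descendant named "Bank", and whose second child is labeled PP.
  2. If that matches, move all of the PP node's children to the end of the matched NP child.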

For example, take this part of the tree:

(NP
  (NP (NNP Bank))
  (PP (IN of)
    (NP (NNP America))))

and turn it into this:

(NP
  (NP (NNP Bank) (IN of) (NP (NNP America))))

Since my input trees are S-expressions, I've considered using Lisp (embedded into my Python program), but it's been so long since I've written anything significant in Lisp that I have no idea where to even start.

What would be a good way to describe the patterns? What would be a good way to describe the manipulations? What's a good way to think about this problem?

Recommended answer

This is a typical use case for Lisp. You need a function that walks the tree and applies another function to each node.

Here is a procedural matching example in Common Lisp. Lisp also has pattern matchers that work over list structures, and using one of those instead would simplify the example (see my other answer for an example using a pattern matcher).

Code:

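;; A node is a list: (TYPE child...). A leaf node looks like (NNP Bank).

(defun node-children (node)
  (rest node))

;; For a leaf node such as (NNP Bank), the name is the word itself.
(defun node-name (node)
  (second node))

(defun node-type (node)
  (first node))


;; Walk TREE top-down. Where MATCHER returns true, replace the node with
;; the result of TRANSFORMER (and do not descend into it); otherwise
;; rebuild the node from its recursively processed children.
(defun treemap (tree matcher transformer)
  (cond ((null tree) nil)
        ((consp tree)
         (if (funcall matcher tree)
             (funcall transformer tree)
           (cons (node-type tree)
                 (mapcar (lambda (child)
                           (treemap child matcher transformer))
                         (node-children tree)))))
        (t tree)))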
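
Since the question mentions a mostly Python environment, the same traversal can be sketched in Python. This is only an illustration, not part of the original answer: it assumes a tree is represented as nested lists of the form `[label, child, ...]` with plain strings as words, and the Python name `treemap` simply mirrors the Lisp function above.

```python
def treemap(tree, matcher, transformer):
    """Return a new tree in which every matching node has been replaced.

    Assumed representation: a node is a list [label, child, ...] and a
    word is a plain string. As in the Lisp version, a node that matches
    is replaced and not descended into.
    """
    if not isinstance(tree, list) or not tree:
        return tree  # a word (or an empty node): return unchanged
    if matcher(tree):
        return transformer(tree)
    # Keep the label and rebuild the node from its processed children.
    return [tree[0]] + [treemap(child, matcher, transformer) for child in tree[1:]]


# Small demo: relabel every NNP node as NN.
tree = ["NP",
        ["NP", ["NNP", "Bank"]],
        ["PP", ["IN", "of"], ["NP", ["NNP", "America"]]]]
result = treemap(tree,
                 lambda node: node[0] == "NNP",
                 lambda node: ["NN"] + node[1:])
print(result)
```

The function builds a new tree instead of mutating its input, so the original `tree` stays available for further matching.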

Example:

(defvar *tree*
  '(ROOT
    (S
     (NP
      (NP (NNP Bank))
      (PP (IN of)
          (NP (NNP America))))
     (VP (VBD used)
         (S
          (VP (TO to)
              (VP (VB be)
                  (VP (VBN called)
                      (NP
                       (NP (NNP Bank))
                       (PP (IN of)
                           (NP (NNP Italy))))))))))))



(defun example ()
  (pprint
   (treemap *tree*
            ;; Matcher: exactly two children, the first an NP that
            ;; contains the word BANK, the second a PP.
            (lambda (node)
              (and (= (length (node-children node)) 2)
                   (eq (node-type (first (node-children node))) 'np)
                   (some (lambda (node)
                           (eq (node-name node) 'bank))
                         (node-children (first (node-children node))))
                   (eq (first (second (node-children node))) 'pp)))
            ;; Transformer: append the PP's children to the first NP child.
            (lambda (node)
              (list (node-type node)
                    (append (first (node-children node))
                            (node-children (second (node-children node)))))))))

Running the example:

CL-USER 75 > (example)

(ROOT
 (S
  (NP
   (NP (NNP BANK) (IN OF) (NP (NNP AMERICA))))
  (VP
   (VBD USED)
   (S
    (VP
     (TO TO)
     (VP
      (VB BE)
      (VP
       (VBN CALLED)
       (NP
        (NP
         (NNP BANK)
         (IN OF)
         (NP (NNP ITALY)))))))))))
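
For comparison, here is a rough Python sketch of the whole round trip for the rule in the question: read a bracketed tree, apply the "Bank" rewrite, and print the result. It is not from the original answer. The helper names (`parse_sexp`, `to_sexp`, `contains_word`, `is_bank_np`, `merge_pp`, `rewrite`) are hypothetical, and it assumes labels and words contain no whitespace or parentheses.

```python
import re


def parse_sexp(text):
    """Parse one bracketed tree, e.g. '(NP (NNP Bank))', into nested lists."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token  # a label or a word
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1  # consume the closing ")"
        return node

    return read()


def to_sexp(tree):
    """Serialize nested lists back into a one-line bracketed string."""
    if isinstance(tree, list):
        return "(" + " ".join(to_sexp(item) for item in tree) + ")"
    return tree


def contains_word(tree, word):
    """True if `word` occurs as a word (not a label) anywhere below `tree`."""
    if isinstance(tree, list):
        return any(contains_word(child, word) for child in tree[1:])
    return tree == word


def is_bank_np(node):
    """NP with exactly two children: an NP containing 'Bank', then a PP."""
    if len(node) != 3 or node[0] != "NP":
        return False
    first, second = node[1], node[2]
    return (isinstance(first, list) and first[:1] == ["NP"]
            and isinstance(second, list) and second[:1] == ["PP"]
            and contains_word(first, "Bank"))


def merge_pp(node):
    """Move the PP's children to the end of the first NP child."""
    label, first, second = node
    return [label, first + second[1:]]


def rewrite(tree):
    """Apply the rule top-down; a rewritten node is not descended into."""
    if not isinstance(tree, list) or not tree:
        return tree
    if is_bank_np(tree):
        return merge_pp(tree)
    return [tree[0]] + [rewrite(child) for child in tree[1:]]


source = "(NP (NP (NNP Bank)) (PP (IN of) (NP (NNP America))))"
result = to_sexp(rewrite(parse_sexp(source)))
print(result)
# (NP (NP (NNP Bank) (IN of) (NP (NNP America))))
```

Unlike the Lisp matcher above, this sketch also checks that the outer node itself is labeled NP, as the question asks, and it keeps the original capitalization because nothing is read as a Lisp symbol.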
