Using Stanford Parser (CoreNLP) to find phrase heads

Question

I am going to use Stanford CoreNLP 2013 to find phrase heads. I saw this thread.

However, the answer was not clear to me, and I couldn't add a comment to continue that thread, so I apologize for the duplication.

What I have at the moment is the parse tree of a sentence, produced with Stanford CoreNLP (I also tried the CoNLL format that Stanford CoreNLP generates). What I need is exactly the heads of noun phrases.

I don't know how to use the dependencies and the parse tree to extract the heads of noun phrases. What I know is that if I have nsubj(x, y), y is the head of the subject. If I have dobj(x, y), y is the head of the direct object. If I have iobj(x, y), y is the head of the indirect object.

However, I am not sure whether this is the correct way to find all phrase heads. If it is, which rules should I add to get the heads of all noun phrases?

It may be worth mentioning that I need the heads of noun phrases in Java code.

Recommended answer

Since I couldn't comment on the answer given by Chaitanya, I am adding more to his answer here.

The Stanford CoreNLP suite implements the Collins head finder heuristics and a semantic head finder heuristic in the form of

  1. CollinsHeadFinder
  2. ModCollinsHeadFinder
  3. SemanticHeadFinder

All you need to do is instantiate one of the three and do the following.

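// sentence: an annotated CoreMap; out: a PrintWriter; any of the three head finders can be used.
HeadFinder headFinder = new CollinsHeadFinder();
Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
headFinder.determineHead(tree).pennPrint(out);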

You can iterate through the nodes of the tree and determine head words wherever required.

PS: My answer is based on the Stanford CoreNLP suite released as of 20140104.

Here is a simple DFS that lets you extract the head words of all noun phrases in a sentence:

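// Imports needed at the top of the enclosing class file.
import java.util.List;
import edu.stanford.nlp.trees.HeadFinder;
import edu.stanford.nlp.trees.Tree;

// Depth-first traversal that prints every noun phrase and its head word.
public static void dfs(Tree node, Tree parent, HeadFinder headFinder) {
    if (node == null || node.isLeaf()) {
        return;
    }
    // If the node is an NP, collect its terminal nodes to get the words in the NP.
    if ("NP".equals(node.value())) {
        System.out.println(" Noun Phrase is ");
        List<Tree> leaves = node.getLeaves();
        for (Tree leaf : leaves) {
            System.out.print(leaf.toString() + " ");
        }
        System.out.println();

        System.out.println(" Head string is ");
        System.out.println(node.headTerminal(headFinder, parent));
    }

    // Recurse into the children, passing the current node as their parent.
    for (Tree child : node.children()) {
        dfs(child, node, headFinder);
    }
}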
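
To get a feel for what a head finder does while walking the tree, here is a minimal self-contained sketch that runs without CoreNLP. Everything in it is an assumption made for illustration: `ToyHeadFinder`, `Node`, `headChild`, `headWord`, `npHeads` and `sampleTree` are hypothetical names, not CoreNLP classes or methods, and the NP rule (take the rightmost child labelled NN*, NP or PRP) is a drastic simplification of the Collins head table, not the rule set CoreNLP actually applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToyHeadFinder {

    // Hypothetical stand-in for a parse tree node; not a CoreNLP class.
    public static final class Node {
        final String label;
        final List<Node> children;

        public Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    // Simplified head rule (an assumption, much smaller than the Collins table):
    // for an NP, pick the rightmost child labelled NN*, NP or PRP;
    // for anything else, or if no such child exists, pick the rightmost child.
    public static Node headChild(Node node) {
        if (node.label.equals("NP")) {
            for (int i = node.children.size() - 1; i >= 0; i--) {
                String label = node.children.get(i).label;
                if (label.startsWith("NN") || label.equals("NP") || label.equals("PRP")) {
                    return node.children.get(i);
                }
            }
        }
        return node.children.get(node.children.size() - 1);
    }

    // Follow head children down to a leaf; the leaf holds the head word.
    public static String headWord(Node node) {
        while (!node.isLeaf()) {
            node = headChild(node);
        }
        return node.label;
    }

    // Depth-first (pre-order) traversal collecting the head word of every NP.
    public static List<String> npHeads(Node node, List<String> heads) {
        if (node.isLeaf()) {
            return heads;
        }
        if (node.label.equals("NP")) {
            heads.add(headWord(node));
        }
        for (Node child : node.children) {
            npHeads(child, heads);
        }
        return heads;
    }

    // Hand-built tree for "the quick fox chased the dog in the park".
    public static Node sampleTree() {
        Node subject = new Node("NP", new Node("DT", new Node("the")),
                new Node("JJ", new Node("quick")), new Node("NN", new Node("fox")));
        Node dog = new Node("NP", new Node("DT", new Node("the")), new Node("NN", new Node("dog")));
        Node park = new Node("NP", new Node("DT", new Node("the")), new Node("NN", new Node("park")));
        Node pp = new Node("PP", new Node("IN", new Node("in")), park);
        Node object = new Node("NP", dog, pp);
        Node vp = new Node("VP", new Node("VBD", new Node("chased")), object);
        return new Node("S", subject, vp);
    }

    public static void main(String[] args) {
        // Prints [fox, dog, dog, park]: "dog" appears twice because both the
        // outer NP "the dog in the park" and the inner NP "the dog" share that head.
        System.out.println(npHeads(sampleTree(), new ArrayList<>()));
    }
}
```

The traversal mirrors the dfs above: the head of a phrase is found by repeatedly choosing one child per node until a word is reached. The real head finders differ only in how much richer their per-category rules are.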
