How do I do dependency parsing in NLTK?


Question

Going through the NLTK book, it's not clear to me how to generate a dependency tree from a given sentence.

The relevant section of the book, the sub-chapter on dependency grammar, gives an example figure, but it doesn't show how to parse a sentence to come up with those relationships. Or maybe I'm missing something fundamental in NLP?

I want something similar to what the Stanford Parser does: given the sentence "I shot an elephant in my sleep", it should return something like:

nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)

Recommended answer

We can use the Stanford Parser from NLTK.

You need to download two things from their website:

  1. The Stanford CoreNLP parser.
  2. The language model for your desired language (e.g. the English language model).

Warning!

Make sure that your language model version matches your Stanford CoreNLP parser version!

The current CoreNLP version as of May 22, 2018 is 3.9.1.

After downloading the two files, extract the zip file anywhere you like.

Next, load the model and use it through NLTK:

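from nltk.parse.stanford import StanfordDependencyParser

# Adjust both paths to wherever you extracted the downloads.
path_to_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser.jar'
path_to_models_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar'

dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)

# raw_parse returns an iterator of DependencyGraph objects.
result = dependency_parser.raw_parse('I shot an elephant in my sleep')
dep = next(result)  # result.next() only works on Python 2

# Each triple is ((head, head_tag), relation, (dependent, dependent_tag)).
list(dep.triples())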

Output

The output of the last line is:

[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')),
 ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')),
 ((u'elephant', u'NN'), u'det', (u'an', u'DT')),
 ((u'shot', u'VBD'), u'prep', (u'in', u'IN')),
 ((u'in', u'IN'), u'pobj', (u'sleep', u'NN')),
 ((u'sleep', u'NN'), u'poss', (u'my', u'PRP$'))]
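
The triples above carry no word positions, while the question asks for Stanford-style lines such as `nsubj(shot-2, I-1)`. Here is a minimal sketch of one way to produce that format. The helper `format_dependencies` is hypothetical (it is not part of NLTK), and it assumes the input looks like the `nodes` mapping of an NLTK `DependencyGraph`: a dict keyed by word position, where each entry has `word`, `head`, and `rel` fields. The `example_nodes` data is written out by hand here so the sketch runs without the parser; with a real parse you would pass `dep.nodes` instead.

```python
def format_dependencies(nodes):
    """Render dependency nodes as `rel(head-i, dep-j)` strings.

    `nodes` maps a 1-based word position to a dict with the keys
    'word', 'head' and 'rel' (assumed to mirror DependencyGraph.nodes).
    """
    lines = []
    for address in sorted(nodes):
        node = nodes[address]
        head = node.get("head")
        # Skip the artificial root node (no word) and the word attached to it (head 0).
        if node.get("word") is None or not head:
            continue
        head_word = nodes[head]["word"]
        lines.append(
            "{}({}-{}, {}-{})".format(node["rel"], head_word, head, node["word"], address)
        )
    return lines


# Hand-written stand-in for dep.nodes, matching the sentence from the question.
example_nodes = {
    0: {"word": None, "head": None, "rel": None},
    1: {"word": "I", "head": 2, "rel": "nsubj"},
    2: {"word": "shot", "head": 0, "rel": "root"},
    3: {"word": "an", "head": 4, "rel": "det"},
    4: {"word": "elephant", "head": 2, "rel": "dobj"},
    5: {"word": "in", "head": 2, "rel": "prep"},
    6: {"word": "my", "head": 7, "rel": "poss"},
    7: {"word": "sleep", "head": 5, "rel": "pobj"},
}

for line in format_dependencies(example_nodes):
    print(line)
```

For the hand-written data this prints the six relations in the order listed in the question, starting with `nsubj(shot-2, I-1)`.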

I think this is what you want.
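
One caveat: as far as I know, newer NLTK releases mark `StanfordDependencyParser` as deprecated and point to `nltk.parse.corenlp.CoreNLPDependencyParser`, which talks to a CoreNLP server over HTTP instead of launching the jars directly. A rough sketch of that route follows. It assumes you have already started a CoreNLP server yourself and that it listens on `http://localhost:9000`; the URL and the wrapper function `parse_with_corenlp` are my own choices for illustration, not something from the answer above.

```python
def parse_with_corenlp(sentence, url="http://localhost:9000"):
    """Return dependency triples for `sentence` from a running CoreNLP server.

    Assumes a CoreNLP server is already listening at `url`.
    """
    # Imported lazily so that defining this function does not require NLTK.
    from nltk.parse.corenlp import CoreNLPDependencyParser

    parser = CoreNLPDependencyParser(url=url)
    # raw_parse returns an iterator of DependencyGraph objects; take the first parse.
    graph = next(parser.raw_parse(sentence))
    return list(graph.triples())


# Usage, once the server is up:
# triples = parse_with_corenlp("I shot an elephant in my sleep")
```

The relation labels you get back depend on the CoreNLP version and its models, so they may not match the `dobj`/`prep`/`pobj` names shown earlier.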
