Stanford CoreNLP OpenIE annotator


Question

I have a question regarding the Stanford CoreNLP OpenIE annotator.

I am using Stanford CoreNLP version stanford-corenlp-full-2015-12-09 to extract relations with OpenIE. I don't know much Java, which is why I am using the pycorenlp wrapper for Python 3.4.

I want to extract the relation between all words of a sentence. Below is the code I used. I am also interested in showing the confidence of each triplet:

from pycorenlp import StanfordCoreNLP

# Connect to a CoreNLP server that is already running locally
nlp = StanfordCoreNLP("http://localhost:9000/")
s = "Twenty percent electric motors are pulled from an assembly line"
output = nlp.annotate(s, properties={"annotators": "tokenize,ssplit,pos,depparse,natlog,openie",
                                     "outputFormat": "json", "triple.strict": "true"})
# Collect one list of OpenIE triples per sentence
result = [sentence["openie"] for sentence in output["sentences"]]
print(result)
for i in result:
    for rel in i:
        # Each triple is printed as (relation, subject, object)
        relationSent = rel['relation'], rel['subject'], rel['object']
        print(relationSent)

This is the result I get:

[[{'relationSpan': [4, 6], 'subject': 'Twenty percent electric motors', 'objectSpan': [8, 10], 'relation': 'are pulled from', 'object': 'assembly line', 'subjectSpan': [0, 4]}, {'relationSpan': [4, 6], 'subject': 'percent electric motors', 'objectSpan': [8, 10], 'relation': 'are pulled from', 'object': 'assembly line', 'subjectSpan': [1, 4]}, {'relationSpan': [4, 5], 'subject': 'Twenty percent electric motors', 'objectSpan': [5, 6], 'relation': 'are', 'object': 'pulled', 'subjectSpan': [0, 4]}, {'relationSpan': [4, 5], 'subject': 'percent electric motors', 'objectSpan': [5, 6], 'relation': 'are', 'object': 'pulled', 'subjectSpan': [1, 4]}]]

The triplets are:

('are pulled from', 'Twenty percent electric motors', 'assembly line')
('are pulled from', 'percent electric motors', 'assembly line')
('are', 'Twenty percent electric motors', 'pulled')
('are', 'percent electric motors', 'pulled')

The first problem is that the confidence is not shown in the result. The second problem is that I only want to retrieve the triplet that includes all words of the sentence, i.e. this triplet:

('are pulled from', 'Twenty percent electric motors', 'assembly line')

What I am getting instead is more than one triplet. I tried the option "triple.strict":"true" because it extracts "triples only if they consume the entire fragment", but it is not working.

Can anyone advise me?

Recommended answer

You should try this setting:

"openie.triple.strict":"true"
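
As a minimal sketch of how that setting could slot into the request from the question: the helper names `openie_properties`, `flatten_triples` and `annotate_openie` below are hypothetical, and the server URL is the one used in the question. Only the property key changes, from the bare `triple.strict` to `openie.triple.strict`.

```python
def openie_properties(strict=True):
    """Build the properties dict sent with the annotate request."""
    return {
        "annotators": "tokenize,ssplit,pos,depparse,natlog,openie",
        "outputFormat": "json",
        # Key suggested in the answer: prefixed with the annotator name
        "openie.triple.strict": "true" if strict else "false",
    }


def flatten_triples(output):
    """Return (subject, relation, object) tuples for every sentence."""
    return [
        (t["subject"], t["relation"], t["object"])
        for sentence in output["sentences"]
        for t in sentence["openie"]
    ]


def annotate_openie(text, url="http://localhost:9000/"):
    """Send text to a running CoreNLP server and return its triples."""
    # Imported here so the helpers above work without pycorenlp installed
    from pycorenlp import StanfordCoreNLP

    nlp = StanfordCoreNLP(url)
    output = nlp.annotate(text, properties=openie_properties())
    return flatten_triples(output)
```

With a server running, `annotate_openie("Twenty percent electric motors are pulled from an assembly line")` would return the triples as tuples. Which triples remain under the strict setting depends on the CoreNLP version in use.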

Looking through the code, it appears that at this time the confidence is not stored with the returned JSON, so you cannot get it from the CoreNLP server.

Since you brought this up, I will push a change that adds the confidence to the output JSON and let you know when it is live on GitHub.
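
Separately from the server-side setting, the span fields in the JSON shown in the question allow a client-side filter for the triple that covers the most words. The sketch below is not part of CoreNLP or pycorenlp; the function names are hypothetical, and it assumes each span is a half-open `[start, end)` pair of token offsets, which matches the sample output in the question.

```python
def covered_tokens(triple):
    """Count the tokens covered by the subject, relation and object spans."""
    spans = (triple["subjectSpan"], triple["relationSpan"], triple["objectSpan"])
    # Assumption: each span is a half-open [start, end) token range
    return sum(end - start for start, end in spans)


def widest_triples(openie):
    """Keep only the triples whose spans cover the most tokens."""
    if not openie:
        return []
    best = max(covered_tokens(t) for t in openie)
    return [t for t in openie if covered_tokens(t) == best]
```

Applied to `output["sentences"][0]["openie"]` from the question, this keeps only the triple whose subject is 'Twenty percent electric motors' and whose relation is 'are pulled from', because its spans cover 8 tokens against 7, 6 and 5 for the others.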
