Distant supervision: how to connect named entities to Freebase (KB) relations


Question

I'm trying to create a distant supervision corpus. So far I've assembled the data and passed it through an NER system; an example is shown below.

Raw data:

<p>
Myles Brand, the president of the National Collegiate Athletic Association, said in a telephone interview that he had not been approached about whether the N.C.A.A. might oversee a panel for the major bowl games similar to the one that chooses teams for the men's and women's basketball tournaments.
</p>

Processed with Stanford NER:

<p>
<PERSON>Myles Brand</PERSON>, the president of the <ORGANIZATION>National Collegiate Athletic Association</ORGANIZATION>, said in a telephone interview that he had not been approached about whether the <ORGANIZATION>N.C.A.A.</ORGANIZATION> might oversee a panel for the major bowl games similar to the one that chooses teams for the men's and women's basketball tournaments.
</p>

This sentence contains the person Myles Brand and the organization National Collegiate Athletic Association.

In Freebase, these two entities are connected by the relationship President, as you can observe:

Freebase Relationship: [screenshot omitted]

One would think the following query would do the trick, based on this question, but it doesn't, even though, as you can see from the picture above, Freebase seems to maintain the relationship between these two entities. Am I doing something wrong?

I have been trying it out with this query:

[{ 
 "type" : "/type/link", 
 "source" : { "id" : "/en/myles_brand" }, 
 "master_property" : null, 
 "target" : { "id" : "/en/national_collegiate_athletic_association" }, 
 "target_value" : null 
}]

Moreover, I have many thousands of entity pairs. I guess I could write a short Java program using the Freebase Java API to look up the relationships for each pair in turn. Does anyone have an example of such a program that I could take a look at?

What I really want to know, though, is this: once I have the relationships, what is the best way to associate them with a distant supervision corpus? I'm confused about how it all looks once it has been fitted together.

Recommended answer

You've got a couple of problems on the Freebase side of things. First, the relationship between Myles Brand and the NCAA isn't a direct one; it is mediated by a node representing his employment. This node has links to the employer, the employee, their title, the start date, and the end date. Second, reflection queries have stronger directionality than standard MQL queries, and in this case Myles Brand is the target, not the source.

This query will show you the links to the /business/employment_tenure nodes:

[{
  "type": "/type/link",
  "source": {
    "id": null
  },
  "master_property": null,
  "target": {
    "id": "/en/myles_brand"
  }
}]

but it would need to be extended to deal with the multi-hop relationship that you're trying to find (and also to extract the title).
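One way to picture the multi-hop step is to run the reflection query once per entity and intersect the source nodes that come back. The Python sketch below only illustrates that idea on mocked results: the node ids and the `/hypothetical/...` property are invented, the result shape is assumed to mirror the query above, and it assumes both links are stored with the employment node as their source.

```python
def shared_mediators(links_a, links_b):
    """Map each node that links to BOTH entities to the properties it uses.

    links_a / links_b are result lists from two reflection queries, one per
    target entity, shaped like the /type/link query in the answer.
    """
    # Index the first result set by source node id.
    by_source = {}
    for link in links_a:
        by_source.setdefault(link["source"]["id"], []).append(link["master_property"])

    # Keep only source nodes that also appear in the second result set.
    shared = {}
    for link in links_b:
        source_id = link["source"]["id"]
        if source_id in by_source:
            entry = shared.setdefault(source_id, {"a": by_source[source_id], "b": []})
            entry["b"].append(link["master_property"])
    return shared


# Mocked results; every id below is hypothetical, not real Freebase data.
links_to_person = [
    {"source": {"id": "/m/hypothetical_tenure"},
     "master_property": "/business/employment_tenure/person"},
    {"source": {"id": "/m/hypothetical_other"},
     "master_property": "/hypothetical/other_type/person"},
]
links_to_org = [
    {"source": {"id": "/m/hypothetical_tenure"},
     "master_property": "/business/employment_tenure/company"},
]

print(shared_mediators(links_to_person, links_to_org))
```

A node that survives the intersection is a candidate mediator; its properties (here the person and company sides of an employment tenure) indicate which kind of relationship it represents.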

Rather than doing this using reflection, you could test for the relationships directly if the set of relationships you're interested in is small enough.

For example, you could test for an employment relationship (and fetch the title, if any) using:

[{
 "/business/employment_tenure/person" : { "id" : "/en/myles_brand" },
 "/business/employment_tenure/company" : { "id" : "/en/national_collegiate_athletic_association" },
 "/business/employment_tenure/title": null
}]
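For many thousands of pairs, the same query can be generated per pair, and the answers can then be joined back onto the sentences. The Python sketch below is only an illustration under stated assumptions: it does not call any Freebase API (submitting the queries is left out), the `kb_results` dictionary is mocked, the second mention and the `NA` label are hypothetical, and the row layout is just one plausible shape for a distantly supervised corpus, where every sentence mentioning a matched pair is labelled with that pair's relation.

```python
import json

NO_RELATION = "NA"  # assumed label for pairs with no match in the knowledge base


def employment_query(person_id, company_id):
    """Build the direct employment-tenure MQL query for one entity pair."""
    return [{
        "/business/employment_tenure/person": {"id": person_id},
        "/business/employment_tenure/company": {"id": company_id},
        "/business/employment_tenure/title": None,  # serialised as JSON null
    }]


def label_mentions(mentions, kb_results):
    """Attach a relation label to each sentence based on its entity pair."""
    rows = []
    for mention in mentions:
        pair = (mention["person_id"], mention["org_id"])
        matches = kb_results.get(pair, [])
        relation = "/business/employment_tenure" if matches else NO_RELATION
        rows.append({"sentence": mention["sentence"], "pair": pair, "relation": relation})
    return rows


# Sentences after NER and entity linking; the second entry is hypothetical.
mentions = [
    {"sentence": "Myles Brand, the president of the National Collegiate Athletic Association, said ...",
     "person_id": "/en/myles_brand",
     "org_id": "/en/national_collegiate_athletic_association"},
    {"sentence": "A hypothetical person visited a hypothetical organization.",
     "person_id": "/en/hypothetical_person",
     "org_id": "/en/hypothetical_org"},
]

# One serialised query per entity pair, ready to be submitted by whatever client is used.
queries = {
    (m["person_id"], m["org_id"]): json.dumps(employment_query(m["person_id"], m["org_id"]))
    for m in mentions
}

# Mocked query results keyed by entity pair (not real Freebase output).
kb_results = {
    ("/en/myles_brand", "/en/national_collegiate_athletic_association"):
        [{"/business/employment_tenure/title": "President"}],
}

rows = label_mentions(mentions, kb_results)
for row in rows:
    print(row["relation"], "|", row["sentence"])
```

Each output row pairs a sentence with its entity pair and a relation label, which is the form a relation classifier would typically be trained on; whether unmatched pairs are kept as negative examples is a design choice left open here.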
