RDF representation of sentences

Question

I need to represent sentences in RDF format.

In other words, "John likes coke" would be automatically represented as:

Subject : John
Predicate : Likes
Object : Coke

Does anyone know where I should start? Are there any programs that can do this automatically, or would I need to do everything from scratch?

Recommended answer

It looks like you want the typed dependencies of a sentence, e.g. for "John likes coke":

 nsubj(likes-2, John-1)
 dobj(likes-2, coke-3)
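
As a rough illustration of how these two dependencies map onto the Subject/Predicate/Object form from the question, here is a minimal Python sketch. It assumes the dependencies arrive as plain strings in exactly the format shown above; the function names are hypothetical, and only nsubj and dobj are handled.

```python
import re

# Matches one typed dependency line such as "nsubj(likes-2, John-1)".
DEP_PATTERN = re.compile(r"(\w+)\((.+)-(\d+), (.+)-(\d+)\)")


def parse_dependency(line):
    """Return (relation, (head_word, head_index), dependent_word)."""
    match = DEP_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a typed dependency: {line!r}")
    relation, head, head_idx, dependent, _dep_idx = match.groups()
    return relation, (head, int(head_idx)), dependent


def to_triples(lines):
    """Pair each verb's nsubj with its dobj as (subject, predicate, object)."""
    subjects, objects = {}, {}
    for line in lines:
        relation, head, dependent = parse_dependency(line)
        if relation == "nsubj":
            subjects[head] = dependent
        elif relation == "dobj":
            objects[head] = dependent
    # Keep only verbs that have both a subject and a direct object.
    return [(subjects[h], h[0], objects[h]) for h in subjects if h in objects]


print(to_triples(["nsubj(likes-2, John-1)", "dobj(likes-2, coke-3)"]))
# prints [('John', 'likes', 'coke')]
```

Keying on the head word together with its index keeps two occurrences of the same verb in one sentence from being merged.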

I'm not aware of any dependency parser that directly produces RDF. However, many of them produce parses in a standardized tab-delimited representation known as CoNLL-X, and it shouldn't be too hard to convert from CoNLL-X to RDF.
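
One way such a conversion could look is sketched below in Python, using only the standard library. It reads one sentence in the ten-column CoNLL-X layout and writes each dependency arc as an N-Triples line of the form head, relation, dependent. The http://example.org/ namespace, the IRI scheme, and the function names are assumptions made for illustration; they are not part of any parser's output.

```python
from urllib.parse import quote

# Column order of the CoNLL-X format.
CONLL_COLUMNS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]

BASE = "http://example.org/"  # hypothetical namespace, chosen for this sketch


def read_conllx(text):
    """Parse one CoNLL-X sentence into a list of token dicts."""
    tokens = []
    for line in text.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != len(CONLL_COLUMNS):
            raise ValueError(f"expected 10 tab-separated fields: {line!r}")
        tokens.append(dict(zip(CONLL_COLUMNS, fields)))
    return tokens


def token_iri(token):
    """Build an IRI for a token from its position and percent-encoded form."""
    return f"<{BASE}token/{token['id']}-{quote(token['form'], safe='')}>"


def conllx_to_ntriples(text):
    """Emit one N-Triples line per dependency arc: head, relation, dependent."""
    tokens = read_conllx(text)
    by_id = {tok["id"]: tok for tok in tokens}
    lines = []
    for tok in tokens:
        if tok["head"] == "0":  # head 0 marks the root, which has no head token
            continue
        relation = f"<{BASE}dep/{quote(tok['deprel'], safe='')}>"
        lines.append(f"{token_iri(by_id[tok['head']])} {relation} {token_iri(tok)} .")
    return lines


sample = "\n".join("\t".join(row) for row in [
    ["1", "John", "_", "NNP", "NNP", "_", "2", "nsubj", "_", "_"],
    ["2", "likes", "_", "VBZ", "VBZ", "_", "0", "root", "_", "_"],
    ["3", "coke", "_", "NN", "NN", "_", "2", "dobj", "_", "_"],
])
for triple in conllx_to_ntriples(sample):
    print(triple)
```

This keeps the full parse as RDF; collapsing it into a single subject/predicate/object statement per verb would be a further step on top of these arcs.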

Open source dependency parsers

There are a number of parsers to choose from that extract typed dependencies, including the following state-of-the-art open source options:

  • Stanford Parser - see online demo.
  • MaltParser
  • MSTParser

The Stanford Parser includes a pre-trained model for parsing English. To get typed dependencies you'll need to use the flag -outputFormat typedDependencies.
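
For illustration, the flag could be passed from Python roughly as follows. This is a sketch under assumptions: the classpath "stanford-parser/*" is a hypothetical location for the unpacked jars, and the parser class and English model path are taken to be the ones shipped with the Stanford Parser distribution, so check them against your download.

```python
import subprocess

# Assumed to match the Stanford Parser distribution; verify against your copy.
PARSER_CLASS = "edu.stanford.nlp.parser.lexparser.LexicalizedParser"
ENGLISH_MODEL = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def build_parser_command(input_path, classpath="stanford-parser/*"):
    """Build the java command line; classpath is a hypothetical jar location."""
    return [
        "java", "-cp", classpath, PARSER_CLASS,
        "-outputFormat", "typedDependencies",
        ENGLISH_MODEL, input_path,
    ]


def parse_file(input_path):
    """Run the parser on a text file and return its standard output."""
    result = subprocess.run(build_parser_command(input_path),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling parse_file requires Java and the parser jars to be installed; build_parser_command on its own only assembles the argument list.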

For MaltParser, you can download an English model here.

The MSTParser includes a small 200-sentence English training set that you can use to create your own English parsing model. However, training on so little data will hurt the accuracy of the resulting parser. So, if you decide to use this parser, you are probably better off using the pretrained model available here.

All of the pretrained models linked above produce parses according to the Stanford Dependency formalism (see the ACL paper and the manual).

Of these three, the Stanford Parser is the most accurate. MaltParser is the fastest, with some configurations of the package able to parse 1800 sentences in only 8 seconds.
