RDF representation of sentences


Problem description

I need to represent sentences in RDF format.

In other words, "John likes coke" would be automatically represented as:

Subject : John
Predicate : Likes
Object : Coke

Does anyone know where I should start? Are there any programs that can do this automatically, or would I need to do everything from scratch?

Recommended answer

It looks like you want the typed dependencies of a sentence, e.g. for "John likes coke":

 nsubj(likes-2, John-1)
 dobj(likes-2, coke-3)
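To show how output like the two lines above maps onto the subject/predicate/object form asked for in the question, here is a minimal Python sketch. It assumes the dependencies arrive as plain text lines in exactly the `relation(head-index, dependent-index)` shape shown above, and it only pairs an `nsubj` with a `dobj` on the same verb. The function names are made up for illustration, and real sentences (passives, copulas, prepositional objects) would need more rules than this.

```python
import re

# Matches lines such as "nsubj(likes-2, John-1)".
DEP_PATTERN = re.compile(r"^\s*([\w:]+)\((.+)-(\d+)'*, (.+)-(\d+)'*\)\s*$")


def parse_dependencies(lines):
    """Return (relation, head, head_index, dependent, dependent_index) tuples."""
    deps = []
    for line in lines:
        match = DEP_PATTERN.match(line)
        if match:
            rel, head, head_idx, dep, dep_idx = match.groups()
            deps.append((rel, head, int(head_idx), dep, int(dep_idx)))
    return deps


def extract_triples(deps):
    """Pair each nsubj with a dobj that shares the same head word."""
    subjects = {}
    objects = {}
    for rel, head, head_idx, dep, _ in deps:
        key = (head, head_idx)
        if rel == "nsubj":
            subjects[key] = dep
        elif rel == "dobj":
            objects[key] = dep
    # Keep only heads that have both a subject and a direct object.
    return [(subjects[k], k[0], objects[k]) for k in subjects if k in objects]


triples = extract_triples(parse_dependencies([
    "nsubj(likes-2, John-1)",
    "dobj(likes-2, coke-3)",
]))
print(triples)
```

For the example sentence this yields a single (subject, predicate, object) tuple with John, likes, and coke.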

I'm not aware of any dependency parser that directly produces RDF. However, many of them produce parses in a standardized tab-delimited representation known as CoNLL-X, and it shouldn't be too hard to convert from CoNLL-X to RDF.
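As a rough illustration of that conversion, the sketch below reads one CoNLL-X sentence (ten tab-separated columns, with the token ID in column 1, the word form in column 2, the head ID in column 7, and the dependency relation in column 8) and emits N-Triples lines. The `http://example.org/` namespace, the URI layout, and the function name are assumptions chosen for illustration, not an established vocabulary; a real project would probably pick an existing ontology and an RDF library instead of string formatting.

```python
BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def conllx_to_ntriples(conll_text, sentence_id="s1"):
    """Convert one CoNLL-X sentence into a list of N-Triples lines."""
    tokens = []
    for line in conll_text.strip().splitlines():
        cols = line.split("\t")
        if len(cols) < 8:
            continue  # skip blank or malformed lines
        tokens.append(
            {"id": cols[0], "form": cols[1], "head": cols[6], "deprel": cols[7]}
        )

    def token_uri(token_id):
        return f"<{BASE}{sentence_id}/token/{token_id}>"

    triples = []
    for tok in tokens:
        # Record the surface form of the token as an escaped string literal.
        form = tok["form"].replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{token_uri(tok["id"])} <{BASE}form> "{form}" .')
        # HEAD 0 marks the root, which has no governing token.
        if tok["head"] != "0":
            predicate = f'<{BASE}dep/{tok["deprel"]}>'
            triples.append(
                f'{token_uri(tok["head"])} {predicate} {token_uri(tok["id"])} .'
            )
    return triples


sample = "\n".join([
    "1\tJohn\tjohn\tNNP\tNNP\t_\t2\tnsubj\t_\t_",
    "2\tlikes\tlike\tVBZ\tVBZ\t_\t0\troot\t_\t_",
    "3\tcoke\tcoke\tNN\tNN\t_\t2\tdobj\t_\t_",
])
ntriples = conllx_to_ntriples(sample)
for line in ntriples:
    print(line)
```

Here each dependency becomes a triple from the head token to the dependent token, with the relation name as the predicate, so the parse is kept in full. Collapsing it further into a single "John likes coke" triple is a separate design decision.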

Open source dependency parsers

There are a number of parsers to choose from that extract typed dependencies, including the following state-of-the-art open source options:

  • Stanford Parser - see online demo.
  • MaltParser
  • MSTParser

The Stanford Parser includes a pre-trained model for parsing English. To get typed dependencies you'll need to use the flag -outputFormat typedDependencies.

For the MaltParser, you can download an English model here.

The MSTParser includes a small 200-sentence English training set that you can use to create your own English parsing model. However, training on so little data will hurt the accuracy of the resulting parser. So, if you decide to use this parser, you are probably better off using the pretrained model available here.

All of the pretrained models linked above produce parses according to the Stanford Dependency formalism (see the ACL paper and the manual).

Of these three, the Stanford Parser is the most accurate. The MaltParser is the fastest; some configurations of the package can parse 1800 sentences in only 8 seconds.
