Getting actor IDs and biographies from the Freebase data dumps or the Freebase API


Question

Does anyone know the best way to get actor IDs from the Freebase data dumps, and then get the IMDb IDs and biographies for those actors from the Freebase API?

Recommended answer

Actors have the type /film/actor and look like this in the dump:

ns:m.010q36     rdf:type        ns:film.actor.

You can find them all in a few minutes from the compressed dump with a simple grep:

zgrep $'rdf:type\tns:film.actor.' freebase-rdf-<date of dump>.gz | cut -f 1 | cut -d ':' -f 2 > actor-mids.txt

This generates a list of MIDs in the form m.010q36, which represents the MID /m/010q36.
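As a small illustration of the two spellings, the helpers below convert between the dump form and the slash form. The function names are made up for this sketch, and it assumes well-formed IDs like the one shown above.

```python
def dump_id_to_mid(dump_id):
    """Convert the dump form 'm.010q36' to the slash form '/m/010q36'."""
    # Split on the first dot only: 'm' is the namespace, the rest is the local ID.
    namespace, _, local = dump_id.partition(".")
    return f"/{namespace}/{local}"


def mid_to_dump_id(mid):
    """Convert the slash form '/m/010q36' back to the dump form 'm.010q36'."""
    # Drop the leading slash, then turn the remaining separator into a dot.
    return mid.strip("/").replace("/", ".", 1)
```

The dump form is what you compare against the first column of the dump (after the ns: prefix); the slash form is the one shown as the MID itself.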

Using the list of MIDs, look for all lines that have one of those MIDs in the first column and one of your desired properties in the second. You could do this with Python, grep, or the tool or language of your choice. Of course, if you're using a programming language like Python, you could fold the initial search for actors into the same script.
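A minimal Python sketch of that second pass might look like the following. It assumes the dump is tab-separated (as the grep pattern above implies), that each statement ends with a trailing period, and that actor-mids.txt holds one MID per line. The function names are hypothetical, not part of any Freebase tooling.

```python
import gzip


def load_mids(path):
    """Read one MID per line (e.g. 'm.010q36') into a set for fast lookup."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def filter_triples(lines, mids, predicates):
    """Yield (mid, predicate, object) for lines whose subject and predicate match."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # not a complete subject/predicate/object line
        subject, predicate, obj = parts[0], parts[1], parts[2].strip()
        mid = subject.split(":", 1)[-1]  # 'ns:m.010q36' -> 'm.010q36'
        if mid in mids and predicate in predicates:
            if obj.endswith("."):
                obj = obj[:-1]  # drop the statement terminator
            yield mid, predicate, obj


def scan_dump(dump_path, mids_path, predicates):
    """Stream the compressed dump once and collect the matching triples."""
    mids = load_mids(mids_path)
    with gzip.open(dump_path, "rt", encoding="utf-8") as dump:
        return list(filter_triples(dump, mids, predicates))
```

Calling scan_dump with the path to the compressed dump, "actor-mids.txt", and {"ns:type.object.key"} would collect every key for the actors found earlier. Holding the MIDs in a set keeps each lookup cheap, which matters because the dump is read line by line in a single pass.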

Wikipedia and IMDb IDs are stored as what Freebase calls keys and look like this (MusicBrainz and Netflix keys are included too):

ns:m.010q36     ns:type.object.key      "/wikipedia/en/Mr$002ERodgers".
ns:m.010q36     ns:type.object.key      "/authority/imdb/name/nm0736872".
ns:m.010q36     ns:type.object.key      "/authority/musicbrainz/87467525-3724-412d-ad3e-595ecb6a3bfd".
ns:m.010q36     ns:type.object.key      "/authority/netflix/role/30006685".

Keys can be encoded (like the Wikipedia key above). You can find documentation on how to deal with them on the Freebase wiki.
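As a rough sketch of handling such keys in Python, the snippet below assumes that each escape is a dollar sign followed by four hexadecimal digits giving a character's code point, which matches the $002E (a period) in the Wikipedia key above. Check the Freebase documentation for the full rules. The function names are invented for this example.

```python
import re

# Assumed escape format: '$' followed by exactly four hex digits.
_ESCAPE = re.compile(r"\$([0-9A-Fa-f]{4})")

IMDB_NAME_PREFIX = "/authority/imdb/name/"


def decode_key(key):
    """Replace each $XXXX escape with the character at that hex code point."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), key)


def imdb_id(key_literal):
    """Return the IMDb name ID from a quoted key literal, or None for other namespaces."""
    key = key_literal.strip().strip('"')
    if key.startswith(IMDB_NAME_PREFIX):
        return key[len(IMDB_NAME_PREFIX):]
    return None
```

With the sample lines above, decode_key turns the Wikipedia key into /wikipedia/en/Mr.Rodgers, and imdb_id pulls nm0736872 out of the IMDb key while returning None for the MusicBrainz and Netflix keys.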
