Extracting the hierarchy for a DBpedia entity using SPARQL
Question
I am trying to extract the Wikipedia category or YAGO classification hierarchy for DBpedia resources using the SPARQL endpoint. For instance, I would like to find all the categories and classes of an entity, say http://dbpedia.org/resource/Nokia, in hierarchical form, like Thing → Organization → Company → … → Nokia.
Recommended answer
A simple SPARQL SELECT query can retrieve the information you're interested in, though the results won't be arranged hierarchically. You want all the types of a resource, as well as the rdfs:subClassOf relations between them. Here's a very simple query for Nokia that can be run on the DBpedia SPARQL endpoint:
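# Prefixes declared explicitly so the query does not rely on endpoint defaults.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT * WHERE {
  dbpedia:Nokia a ?c1 ; a ?c2 .
  ?c1 rdfs:subClassOf ?c2 .
}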
If you treat each pair of classes in that result set as a directed edge and perform a topological sort, you'll see the hierarchy of the classes to which the Nokia resource belongs. In fact, since it is probably convenient to treat this as a graph, you can get it in the form of an RDF graph by using a SPARQL CONSTRUCT query:
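# Prefixes declared explicitly so the query does not rely on endpoint defaults.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
CONSTRUCT WHERE {
  dbpedia:Nokia a ?c1 ; a ?c2 .
  ?c1 rdfs:subClassOf ?c2 .
}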
The construct query produces this graph (in N3 format):
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix yago: <http://dbpedia.org/class/yago/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
dbpedia-owl:Agent rdfs:subClassOf owl:Thing .
dbpedia-owl:Company rdfs:subClassOf dbpedia-owl:Organisation .
dbpedia-owl:Organisation rdfs:subClassOf dbpedia-owl:Agent .
yago:CompaniesBasedInEspoo rdfs:subClassOf yago:Company108058098 .
dbpedia:Nokia rdf:type yago:CompaniesListedOnTheHelsinkiStockExchange ,
owl:Thing ,
yago:CompaniesBasedInEspoo ,
dbpedia-owl:Agent ,
yago:DisplayTechnologyCompanies ,
yago:ElectronicsCompaniesOfFinland ,
dbpedia-owl:Company ,
dbpedia-owl:Organisation ,
yago:Company108058098 ,
yago:CompaniesEstablishedIn1865 .
yago:CompaniesEstablishedIn1865 rdfs:subClassOf yago:Company108058098 .
yago:CompaniesListedOnTheHelsinkiStockExchange rdfs:subClassOf yago:Company108058098 .
yago:DisplayTechnologyCompanies rdfs:subClassOf yago:Company108058098 .
yago:ElectronicsCompaniesOfFinland rdfs:subClassOf yago:Company108058098 .
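The topological sort mentioned above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: it assumes Python 3.9+ (for the standard-library graphlib module), and the edge list is copied by hand from the rdfs:subClassOf triples in the graph above rather than parsed from the endpoint's response. The names SUBCLASS_EDGES and order_classes are made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# rdfs:subClassOf edges copied from the CONSTRUCT result above,
# written as (subclass, superclass) pairs.
SUBCLASS_EDGES = [
    ("dbpedia-owl:Agent", "owl:Thing"),
    ("dbpedia-owl:Company", "dbpedia-owl:Organisation"),
    ("dbpedia-owl:Organisation", "dbpedia-owl:Agent"),
    ("yago:CompaniesBasedInEspoo", "yago:Company108058098"),
    ("yago:CompaniesEstablishedIn1865", "yago:Company108058098"),
    ("yago:CompaniesListedOnTheHelsinkiStockExchange", "yago:Company108058098"),
    ("yago:DisplayTechnologyCompanies", "yago:Company108058098"),
    ("yago:ElectronicsCompaniesOfFinland", "yago:Company108058098"),
]


def order_classes(edges):
    """Return the classes ordered so every superclass precedes its subclasses."""
    sorter = TopologicalSorter()
    for subclass, superclass in edges:
        # add(node, *predecessors): the superclass must be emitted first.
        sorter.add(subclass, superclass)
    # static_order() raises graphlib.CycleError if the edges contain a cycle.
    return list(sorter.static_order())


ordered = order_classes(SUBCLASS_EDGES)
for name in ordered:
    print(name)
```

The output lists general classes before specific ones, so owl:Thing comes before dbpedia-owl:Agent, which comes before dbpedia-owl:Organisation, and so on. The DBpedia ontology classes and the YAGO classes are not connected by any edge here, so they form two separate chains whose relative order in the output is arbitrary.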
Remarks
The queries above retrieve the rdf:type hierarchy for Nokia. In the question, you also mention Wikipedia categories. DBpedia resources are associated with the Wikipedia categories to which their corresponding articles belong by the dcterms:subject property. Those Wikipedia categories are then structured hierarchically by skos:broader. These are not really types for the individuals, though. For instance, the data contain:
dbpedia:Nokia dcterms:subject category:Finnish_brands
category:Finnish_brands skos:broader category:Brands_by_country
While it probably makes sense to say that Nokia is a Finnish_brand, it makes much less sense to say that Nokia is a Brand_by_country.
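If you still want the category chains, the same data can be walked as a graph. The following Python sketch is an illustration under stated assumptions: the triples are the two shown above, held in memory as plain tuples instead of being fetched from the endpoint, and the names TRIPLES and category_chains are invented for this example. The check against revisiting a category is a precaution, since nothing in the data model guarantees that skos:broader links are free of cycles.

```python
# The two triples shown above, as (subject, predicate, object) tuples.
TRIPLES = [
    ("dbpedia:Nokia", "dcterms:subject", "category:Finnish_brands"),
    ("category:Finnish_brands", "skos:broader", "category:Brands_by_country"),
]


def category_chains(resource, triples):
    """Yield each path from a direct category of `resource` up via skos:broader."""
    broader = {}  # category -> list of broader categories
    direct = []   # categories attached to the resource by dcterms:subject
    for subject, predicate, obj in triples:
        if predicate == "skos:broader":
            broader.setdefault(subject, []).append(obj)
        elif predicate == "dcterms:subject" and subject == resource:
            direct.append(obj)

    def walk(path):
        # Ignore categories already on this path so a cycle cannot loop forever.
        parents = [c for c in broader.get(path[-1], []) if c not in path]
        if not parents:
            yield path
        for parent in parents:
            yield from walk(path + [parent])

    for category in direct:
        yield from walk([category])


chains = list(category_chains("dbpedia:Nokia", TRIPLES))
for chain in chains:
    print(" -> ".join(chain))
```

Each chain reads from the most specific category to the broadest one reached. As noted above, these are topical groupings rather than types, so a chain should not be read as a series of "is a" statements about Nokia.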