SPARQL to get all parents of all nodes

Problem description

I have been using this post to get the parents or lineage of a single RDF node: SPARQL query to get all parent of a node

This works nicely on my Virtuoso server. Sorry, I couldn't find a public endpoint containing data with a similar structure.

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix bto: <http://purl.obolibrary.org/obo/>
select (group_concat(distinct ?midlab ; separator = "|") AS ?lineage)
where
{ 
  bto:BTO_0000207 rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid rdfs:label ?midlab .
}
group by ?lineage
order by (count(?mid) as ?ordercount)

This gives:

+---------------------------------------------------------+
|                         lineage                         |
+---------------------------------------------------------+
| bone|cartilage|connective tissue|tibia|tibial cartilage |
+---------------------------------------------------------+

I then wondered whether I could get the lineage of every node by changing the select to

select ?s (group_concat(distinct ?midlab ; separator = "|") AS ?lineage)

and the first line in the where statement to

?s rdfs:subClassOf* ?mid .

Those who have more SPARQL experience than I do will probably not be surprised that the query timed out.

Is this a reasonable approach? Am I doing something wrong syntactically?

I suspect that the distinct keyword or the group by clause is the bottleneck, because this only takes a second or two:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix bto: <http://purl.obolibrary.org/obo/>
select ?s ?midlab
where
{ 
  ?s rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid rdfs:label ?midlab .
  ?s <http://www.geneontology.org/formats/oboInOwl#hasOBONamespace> "BrendaTissueOBO"^^<http://www.w3.org/2001/XMLSchema#string> .
}

Recommended answer

Your first query isn't legal; you can check it with sparql.org's query validator. While you can order by count(?mid), you can't bind the value to a variable and order by it in the same clause. Dropping the binding would give you:

select (group_concat(distinct ?midlab ; separator = "|") AS ?lineage)
where
{ 
  bto:BTO_0000207 rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid rdfs:label ?midlab .
}
group by ?lineage
order by count(?mid)

Now, that's legal, but it doesn't make much sense. group_concat requires that you have some groups, and it concatenates the values within each group. In the absence of a group by clause, you get a single implicit group, so group_concat without a group by is OK. But you've got a group by ?lineage, which doesn't make a whole lot of sense, because ?lineage already has only one value per group (since it is itself an aggregate). Better would be to group by ?s, as in the following. This seems more correct, and might not time out:

select ?s (group_concat(distinct ?midlab ; separator = "|") AS ?lineage)
where
{ 
  ?s rdfs:subClassOf* ?mid .
  ?mid rdfs:subClassOf* ?class .
  ?mid rdfs:label ?midlab .
}
group by ?s
order by count(?mid)
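
A possible further variation, offered only as an untested sketch: the ?class variable is never used in the output, so the pattern ?mid rdfs:subClassOf* ?class only multiplies the rows that distinct then has to collapse. The query below drops that pattern, restricts ?s with the same namespace triple that made the question's second query fast, and binds the count in the select clause so that order by can refer to it by name. The oboInOwl and xsd prefix names and the ?ancestors variable are chosen here for illustration. Whether this avoids the timeout on a particular Virtuoso server is an assumption, not something that has been measured.

```sparql
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?s
       (group_concat(distinct ?midlab ; separator = "|") as ?lineage)
       (count(distinct ?mid) as ?ancestors)
where
{
  # Restrict ?s to the ontology's own terms, as in the question's fast query.
  ?s oboInOwl:hasOBONamespace "BrendaTissueOBO"^^xsd:string .
  # Every ancestor of ?s, including ?s itself (the * path also matches zero steps).
  ?s rdfs:subClassOf* ?mid .
  ?mid rdfs:label ?midlab .
}
group by ?s
# The aggregate is bound in the select clause and only referenced here.
order by ?ancestors
```

Note that SPARQL 1.1 does not define the order of values inside group_concat, so the labels in ?lineage are not guaranteed to appear in root-to-leaf order. Here ?ancestors counts the distinct superclasses of each ?s, which differs from count(?mid) in the query above because the extra ?class rows are gone.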
