How to list and count the different types of node and edge entities in graph data using a SPARQL query?
Problem description
I'm looking to provide some summary statistics for a data set, and I want to list the different types of edge entities and node (vertex) entities in the graph.
For example:
-> In a Twitter social network graph of users and "following" relationships (a homogeneous graph), there is only one type of vertex entity (user), but a heterogeneous graph such as the ConceptNet data will have multiple vertex types.
-> I believe the edge entities can be computed by simply counting the distinct predicates, using the query:
SELECT DISTINCT (?p AS ?DistinctEdges) { ?s ?p ?o }
But I am not sure how to do the same for vertices. A vertex can come from the subject or the object position of a triple, and the object in turn can be either a value (literal) or another resource.
Please excuse me if I have used the wrong vocabulary anywhere. I have just started building a semantic web application.
Answer
You can use the UNION clause to combine two patterns, together with a FILTER clause that uses the IsLiteral() function to omit literals, e.g.
SELECT DISTINCT ?vertex
WHERE
{
  {
    ?vertex ?p []
  }
  UNION
  {
    [] ?p ?vertex
    FILTER(!IsLiteral(?vertex))
  }
}
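Since the question also asks for counts, the same pattern can be wrapped in aggregates. The following is a minimal sketch, assuming a SPARQL 1.1 endpoint (aggregates and subqueries are not part of SPARQL 1.0); the names ?edgeTypeCount and ?vertexCount are illustrative and do not come from the original answer.

```sparql
# Sketch: one result row with the number of distinct predicates
# and the number of distinct non-literal vertices.
SELECT ?edgeTypeCount ?vertexCount
WHERE
{
  # Distinct predicates, i.e. edge types
  {
    SELECT (COUNT(DISTINCT ?p) AS ?edgeTypeCount)
    WHERE { ?s ?p ?o }
  }
  # Distinct subjects plus distinct non-literal objects
  {
    SELECT (COUNT(DISTINCT ?vertex) AS ?vertexCount)
    WHERE
    {
      { ?vertex ?p [] }
      UNION
      {
        [] ?p ?vertex
        FILTER(!isLiteral(?vertex))
      }
    }
  }
}
```

The two subqueries share no variables, so joining them yields a single row holding both counts. On a large data set each subquery scans every triple, so this may be slow.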
The [] is an anonymous blank node, which in a query pattern behaves like an anonymous variable. Since you don't care about some of the positions on either side of the UNION, putting [] there matches any value without carrying those values out of the query.
The FILTER clause on the right-hand side of the UNION filters out objects that are literals. It is not needed on the left-hand side because RDF forbids literal subjects, so any ?vertex value from the left-hand side must be a resource, i.e. a URI or a blank node.
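Note that the UNION query lists individual vertices rather than vertex types. If the data set declares classes with rdf:type (an assumption; not every data set does), a per-type breakdown could be sketched as below. The name ?vertexCount is illustrative, and the query again assumes SPARQL 1.1 aggregates.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Sketch: one row per declared class, with the number of
# distinct vertices typed as that class, largest first.
SELECT ?type (COUNT(DISTINCT ?vertex) AS ?vertexCount)
WHERE
{
  ?vertex rdf:type ?type
}
GROUP BY ?type
ORDER BY DESC(?vertexCount)
```

A vertex with several rdf:type statements is counted once under each of its types, and a vertex with no rdf:type statement does not appear at all, so the per-type counts need not add up to the total number of vertices.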