SPARQL queries with blank nodes can be complex

Question

I read the blog article Problems of the RDF model: Blank Nodes, which mentions that using blank nodes can complicate the handling of data.

Can you give me an example of why blank nodes make a SPARQL query difficult to perform? I do not understand the complexity of blank nodes. Can you explain the meaning and semantics of an existential variable? I do not clearly understand the explanation given in the RDF Semantics Recommendation, 1.5. Blank Nodes as Existential Variables.

Recommended answer

Existential Variables

In the (first-order) predicate calculus, there is existential quantification which lets us make assertions about things that exist, without saying (or, possibly, knowing) which specific individuals in the domain we're actually talking about. For instance, a sentence like

hasUserId(JoshuaTaylor,1281433)

entails the sentence

∃x.hasUserId(x,1281433)

Of course, there are lots of scenarios in which the second sentence could be true without the first one being true. In that sense, the second sentence gives us less information than the first. It's also important to note that the variable x in the second sentence doesn't provide any way to find out which element in the domain of discourse actually has the given userId. It also doesn't make any claim that there's only one thing that has the given user id. To make that clearer, we might use an example:

∃y.hasAge(y,29)

This is presumably true, since someone or something out there is age 29. Note that we can't talk about y as the individual that is age 29, though, because there could be lots of them. All this sentence tells us is that there is at least one.

Even though we used different variables in the two sentences, there's nothing to say that the individuals with the specified properties might not be the same. This is particularly important in nested quantification, e.g.,

∃x.∃y.likes(x, y)

This sentence could be true because there is one individual in the domain that likes itself. Just because x and y have different names in the sentence doesn't mean that they can't refer to the same individual.
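To make the point concrete, here is a minimal Python sketch that evaluates these existential sentences over a tiny finite domain. The domain, the individual "Alice", the two "worlds", and the function names are illustrative assumptions, not part of the original answer.

```python
def exists_x_has_user_id(domain, has_user_id, user_id):
    """Evaluate ∃x.hasUserId(x, user_id): true if any individual has that id."""
    return any((x, user_id) in has_user_id for x in domain)


def exists_x_exists_y_likes(domain, likes):
    """Evaluate ∃x.∃y.likes(x, y); x and y range over the same domain independently."""
    return any((x, y) in likes for x in domain for y in domain)


# Hypothetical domain and two different "worlds" (sets of ground facts).
domain = {"JoshuaTaylor", "Alice"}
world_a = {("JoshuaTaylor", 1281433)}
world_b = {("Alice", 1281433)}

# The existential sentence holds in both worlds, so knowing that it is true
# does not tell us which individual has the user id.
result_a = exists_x_has_user_id(domain, world_a, 1281433)
result_b = exists_x_has_user_id(domain, world_b, 1281433)

# A single individual that likes itself is enough to satisfy ∃x.∃y.likes(x, y).
self_like = exists_x_exists_y_likes(domain, {("Alice", "Alice")})
```

All three results are true: the first two show that the existential sentence cannot distinguish the two worlds, and the third shows that differently named variables may be witnessed by the same individual.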

RDF Semantics defines an entailment model for RDF. This is described in more detail in another Stack Overflow question, RDF Graph Entailment. The idea is that an RDF graph is treated as one big existential quantification over the blank nodes mentioned in the graph. E.g., if the triples in the graph are t1, …, tn, and the blank nodes that appear in those triples are b1, …, bm, then the graph is the formula:

∃b1, …, bm.(t1 ∧ … ∧ tn)

Based on the discussion of existential variables above, note that this means that blank nodes in the data might refer to the same element of the domain, or to different elements, and that it's not required that exactly one element could take the place of a blank node. This means that a graph with blank nodes, when interpreted in this manner, provides much less information than you might expect.
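One way to see how little such a graph asserts is a brute-force check of simple entailment: a graph G simply entails a graph E if some mapping of E's blank nodes to terms of G turns E into a subset of G. The sketch below assumes triples are plain Python tuples and blank nodes are strings starting with `_:`; the function names and the Mike/Carol data are hypothetical, and this is not how a real RDF library implements entailment.

```python
from itertools import product


def is_blank(term):
    """In this sketch, blank nodes are strings with a leading '_:'."""
    return isinstance(term, str) and term.startswith("_:")


def simply_entails(g, e):
    """Return True if some mapping of e's blank nodes to terms of g makes e a subset of g."""
    blanks = sorted({t for triple in e for t in triple if is_blank(t)})
    terms = list({t for triple in g for t in triple})
    # Try every assignment of a term of g to each blank node of e.
    for choice in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, choice))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in e}
        if instance <= g:
            return True
    return False


# Carol and Mike share one named address.
named = {
    (":Carol", ":hasAddress", ":address1267389"),
    (":Mike", ":hasAddress", ":address1267389"),
}
# Carol and Mike each have some address (two blank nodes).
two_blanks = {
    (":Carol", ":hasAddress", "_:x"),
    (":Mike", ":hasAddress", "_:y"),
}
# Carol and Mike have the same, unnamed address (one blank node).
shared_blank = {
    (":Carol", ":hasAddress", "_:z"),
    (":Mike", ":hasAddress", "_:z"),
}

named_entails_two = simply_entails(named, two_blanks)          # both blanks map to the IRI
shared_entails_two = simply_entails(shared_blank, two_blanks)  # both blanks map to _:z
two_entails_shared = simply_entails(two_blanks, shared_blank)  # no single witness exists
```

The first two checks succeed because distinct blank nodes may be mapped to the same term. The last one fails: a graph with two blank nodes does not let us conclude that the two addresses are the same.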

Now, the discussion above is useful if people are using blank nodes as existential variables. In many cases, authors think of them more as anonymous, but definite and distinct objects. E.g., if we casually write

@prefix : <https://stackoverflow.com/q/20629437/1281433/> .

:Carol :hasAddress [ :hasNumber 4222 ;
                     :hasStreet :Clinton_Way ] .

we may well be trying to say that there is a single address out there with the specified properties, but according to the RDF entailment model, that's not what we're doing.

In practice, this isn't so much of a problem, because we're usually not using RDF entailment. What is a problem, though, is that since the scope of blank nodes is local to a graph, we can't run a SPARQL query against an endpoint asking for Carol's address and get back an IRI that we can reuse. If we run a query like this:

prefix : <https://stackoverflow.com/q/20629437/1281433/>

construct {
  :Mike :hasAddress ?address
}
where {
  :Carol :hasAddress ?address
}

then we get back the following (unhelpful) graph as a result:

@prefix :      <https://stackoverflow.com/q/20629437/1281433/> .

:Mike   :hasAddress  []  .

We won't have a way to get more information about the address because all we have now is a blank node. If we had used IRIs, e.g.,

@prefix : <https://stackoverflow.com/q/20629437/1281433/> .

:Carol :hasAddress :address1267389 .
:address1267389 :hasNumber 4222 ;
                :hasStreet :Clinton_Way .

then the query would have produced something more helpful:

@prefix :      <https://stackoverflow.com/q/20629437/1281433/> .

:Mike   :hasAddress  :address1267389 .

Why is this more useful? The first case is like having the data

∃ x.(hasAddress(Carol,x) ∧ hasNumber(x,4222) ∧ hasStreet(x,ClintonWay))

and getting back the result

∃ y.hasAddress(Mike,y)

Sure, it's possible that Mike and Carol have the same address, but from these sentences there's no way to know for sure. It's much more helpful to have data like

hasAddress(Carol,address1267389)
hasNumber(address1267389,4222)
hasStreet(address1267389,ClintonWay)

and get back the result

hasAddress(Mike,address1267389)

From this, you know that they have the same address, and you can ask things about it.

How much this will affect your data and its consumers depends on what the typical use cases are. For automatically constructed graphs, it may be hard to know in advance just what kind of data you'll need to be able to refer to later, so it's a good idea to generate IRIs for as many of your resources as you can. Since IRIs are free-form, it's usually not too hard to do this. For instance, if you've got some sensible "base" IRI, e.g.,

http://example.org/myData/

then you can easily append suffixes to identify your resources. E.g.,

http://example.org/myData/addresses/addr1
http://example.org/myData/addresses/addr2
http://example.org/myData/addresses/addr3
http://example.org/myData/individuals/ind34
http://example.org/myData/individuals/ind35
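As a rough sketch of how such IRIs might be generated automatically, the following Python helper appends a collection name and a numbered suffix to a base IRI. The class name `IriMinter`, its `mint` method, and the counter-per-collection scheme are assumptions made for illustration; any scheme that yields stable, unique IRIs would serve.

```python
from itertools import count
from urllib.parse import quote


class IriMinter:
    """Hypothetical helper that mints IRIs of the form <base><collection>/<prefix><n>."""

    def __init__(self, base):
        # Make sure the base ends with a slash so suffixes can be appended directly.
        self.base = base if base.endswith("/") else base + "/"
        self._counters = {}

    def mint(self, collection, prefix):
        """Return the next unused IRI for the given collection and prefix."""
        counter = self._counters.setdefault((collection, prefix), count(1))
        # quote() percent-encodes characters that are not safe in a path segment.
        return f"{self.base}{quote(collection)}/{quote(prefix)}{next(counter)}"


minter = IriMinter("http://example.org/myData/")
first_address = minter.mint("addresses", "addr")
second_address = minter.mint("addresses", "addr")
first_individual = minter.mint("individuals", "ind")
```

Each call yields a fresh IRI, so a resource that would otherwise have been a blank node gets a name that later queries and other graphs can refer to.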
