SPARQL query works in Fuseki but not in Jena TDB

Problem description

My data is organised in multiple graphs, and the graph in which a triple is stored matters. The data structure is complicated, but it can be simplified like this:

My store contains cakes. There is a hierarchy of cake types, all of them subclasses of <cake>:

<http://example.com/a1> a <http://example.com/applecake>
<http://example.com/a2> a <http://example.com/rainbowcake>
...

Depending on how a user creates them in the UI, cakes end up in different graphs. For instance, if the user "bakes" a cake, it goes into the <http://example.com/homemade> graph; if they "buy" one, it goes into the <http://example.com/shopbought> graph.

When I retrieve my cakes from the store, I want to know for each cake whether it is homemade or shop-bought. There is no property for this; I want to derive the information purely from the graph in which the triple is stored.

I have tried various ways of achieving this, but none of them works in Jena TDB: all cakes come back as "shopbought". All of the queries do work in Fuseki, however (on the exact same dataset), so I am wondering whether this is a TDB bug or whether there is another way. Here are the simplified queries (without variations):

Version 1:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT  *
FROM <http://example.com/homemade>
FROM <http://example.com/shopbought>
FROM NAMED <http://example.com/homemade>
FROM NAMED <http://example.com/shopbought>
WHERE {
    ?cake rdf:type ?caketype .
    ?caketype rdfs:subClassOf* <cake> .
      {
          GRAPH <http://example.com/homemade> { ?cake rdf:type ?typeHomemade }
      } UNION {
          GRAPH <http://example.com/shopbought> { ?cake rdf:type ?typeShopbought }
      }
    BIND(str(if(bound(?typeHomemade), true, false)) AS ?homemade)
}

Version 2:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT  *
FROM <http://example.com/homemade>
FROM <http://example.com/shopbought>
FROM NAMED <http://example.com/homemade>
FROM NAMED <http://example.com/shopbought>
WHERE {
    ?cake rdf:type ?caketype .
    ?caketype rdfs:subClassOf* <cake> .
    GRAPH ?g {
      ?cake rdf:type ?caketype .
    }
    BIND(STR(IF(?g=<http://example.com/homemade>, true, false)) AS ?homemade)
}

Any ideas why this works in Fuseki but not in TDB?

Edit: I'm beginning to think it has something to do with the GRAPH keyword. Here are some much simpler queries (which work in Fuseki and tdbquery), together with the results I get using the Jena API:

SELECT * WHERE { GRAPH <http://example.com/homemade> { ?s ?p ?o }}

0 results

SELECT * WHERE { GRAPH ?g { ?s ?p ?o }}

0 results

SELECT * FROM <http://example.com/homemade> WHERE { ?s ?p ?o }

x results

SELECT * FROM <http://example.com/homemade> WHERE { GRAPH <http://example.com/homemade> { ?s ?p ?o }}

0 results

SELECT * FROM NAMED <http://example.com/homemade> WHERE { GRAPH <http://example.com/homemade> { ?s ?p ?o }}

0 results

Recommended answer

OK, so the solution actually has to do with the way I executed the query. My initial idea was to pre-filter the dataset so that a query only runs on the relevant graphs (the dataset contains many graphs, and they can be quite large, which would make querying "everything" slow). This can be done either by naming the graphs in the SPARQL query or directly in Jena (although the latter would not work for other triple stores). Combining both approaches "to be on the safe side", however, does not work.

This query runs on the entire dataset and works as expected:

Query query = QueryFactory.create("SELECT * WHERE { GRAPH ?g { ?s ?p ?o } }", Syntax.syntaxARQ);
QueryExecution qexec = QueryExecutionFactory.create(query, dataset);
ResultSet result = qexec.execSelect();

The same query can instead be executed on a specific graph only. In that case it returns no results, regardless of which graph is used:

//run only on one graph
Model target = dataset.getNamedModel("http://example.com/homemade");
//OR run on the union of all graphs
Model target = dataset.getNamedModel("urn:x-arq:UnionGraph");
//OR run on a union of specific graphs
Model target = ModelFactory.createUnion(dataset.getNamedModel("http://example.com/shopbought"), dataset.getNamedModel("http://example.com/homemade"), ...);
[...]
QueryExecution qexec = QueryExecutionFactory.create(query, target);
[...]

My workaround is to always query the entire dataset (which supports the SPARQL GRAPH keyword fine) and to always specify in each query the graphs on which it should run, which avoids querying everything. I am not sure whether this is the expected behaviour of the Jena API.
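A likely explanation, offered as an assumption rather than something confirmed in the original post: when a query is executed against a single Model, that model acts as the default graph of a dataset that has no named graphs, so a GRAPH pattern has nothing to match. The sketch below shows one way the workaround could look as a query run against the whole dataset, with the graphs of interest restricted inside the query itself. The VALUES clause and the choice of projected variables are illustrative and not taken from the original; the subclass check against <cake> is left out because the post does not say which graph holds the class hierarchy.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?cake ?caketype ?homemade
WHERE {
  # Restrict matching to the two graphs of interest (illustrative choice;
  # FROM NAMED clauses would be another way to limit the named graphs).
  VALUES ?g { <http://example.com/homemade> <http://example.com/shopbought> }

  # ?g is bound to the graph in which the type triple is actually stored.
  GRAPH ?g { ?cake rdf:type ?caketype }

  # Boolean flag: true when the triple lives in the homemade graph.
  BIND(?g = <http://example.com/homemade> AS ?homemade)
}
```

Such a query would be passed to QueryExecutionFactory.create(query, dataset), as in the working example above, rather than to a Model obtained from getNamedModel.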
