SPARQL query running forever

Question

I'm struggling with a SPARQL query in Jena that behaves in a way I don't understand.

I'm trying to query the ESCO ontology (https://ec.europa.eu/esco/download), and I'm using TDB to load the ontology and create the model (sorry if my terminology is inaccurate; I'm not very experienced).

My goal is to find the URI of a job position in the ontology that matches text I have previously extracted. For example: extracted term "acuponcteur" -> label in the ontology "Acuponcteur"@fr -> URI <http://ec.europa.eu/esco/occupation/14918>

What I call "weird behaviour" concerns the results I get (or don't get) when executing queries. When I execute the following query:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#> 
PREFIX esco: <http://ec.europa.eu/esco/model#>      
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>   
SELECT ?position    
WHERE {     
    ?s rdf:type esco:Occupation. 
    { ?position skos:prefLabel ?label. } 
    UNION 
    { ?position skos:altLabel ?label. } 
    FILTER (lcase(?label) = "acuponcteur"@fr)
}
LIMIT 10 

I get these results after 1 minute:

-----------------------------------------------
| position                                    |
===============================================
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
| <http://ec.europa.eu/esco/occupation/14918> |
-----------------------------------------------

However, when I add the DISTINCT keyword:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#> 
PREFIX esco: <http://ec.europa.eu/esco/model#>      
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>   
SELECT DISTINCT ?position   
WHERE {     
    ?s rdf:type esco:Occupation. 
    { ?position skos:prefLabel ?label. } 
    UNION 
    { ?position skos:altLabel ?label. } 
    FILTER (lcase(?label) = "acuponcteur"@fr)
}
LIMIT 10 

the query seems to run forever (I stopped the execution after waiting 20 minutes).

I get the same behaviour when I run the first query (without DISTINCT) against a different label, one that I'm sure is not in the ontology. I expect an empty result, but the query seems to keep running and I have to kill it after a while (again, I waited 20 minutes at most):

PREFIX skos: <http://www.w3.org/2004/02/skos/core#> 
PREFIX esco: <http://ec.europa.eu/esco/model#>      
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>   
SELECT ?position    
WHERE {     
    ?s rdf:type esco:Occupation. 
    { ?position skos:prefLabel ?label. } 
    UNION 
    { ?position skos:altLabel ?label. } 
    FILTER (lcase(?label) = "assistante scolaire"@fr)
}
LIMIT 10 

Could it be a problem in the code I'm running? Here it is:

public static void main(String[] args) {

    // Make a TDB-backed dataset
    String directory = "data/testtdb" ;
    Dataset dataset = TDBFactory.createDataset(directory) ;

    // transaction (protects a TDB dataset against data corruption, unexpected process termination and system crashes)
    dataset.begin( ReadWrite.WRITE );
    // assume we want the default model, or we could get a named model here
    Model model = dataset.getDefaultModel();

    try {

          // read the input file - only needs to be done once
          String source = "data/esco.rdf";
          FileManager.get().readModel(model, source, "RDF/XML-ABBREV");

          // run a query

          String queryString =
                    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#> " +
                    "PREFIX esco: <http://ec.europa.eu/esco/model#> " +     
                    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +  
                    "SELECT ?position " +   
                    "WHERE { "  +   
                    "   ?s rdf:type esco:Occupation. " +
                    "   { ?position skos:prefLabel ?label. } " +
                    "   UNION " +
                    "   { ?position skos:altLabel ?label. }" +
                    "   FILTER (lcase(?label)= \"acuponcteur\"@fr ) " +
                    "}" +
                    "LIMIT 1 "  ;

          Query query = QueryFactory.create(queryString) ;

          // execute the query
          QueryExecution qexec = QueryExecutionFactory.create(query, model) ;
          try {
              ResultSet results = qexec.execSelect() ;
              // taken from apache Jena tutorial 
              ResultSetFormatter.out(System.out, results, query) ;

          } finally { 
              qexec.close() ; 
          }

      } finally {
          model.close() ;
          dataset.end();
      }

}

What am I doing wrong here? Any ideas?

Thanks!

Answer

As a first point, which may or may not make much difference, you can use a property path to simplify

{ ?position skos:prefLabel ?label. } 
UNION 
{ ?position skos:altLabel ?label. } 

as

?position skos:prefLabel|skos:altLabel ?label 

This makes the query:

SELECT ?position    
WHERE {     
    ?s rdf:type esco:Occupation.                   # (1)
    ?position skos:prefLabel|skos:altLabel ?label  # (2)
    FILTER (lcase(?label)="acuponcteur"@fr ) 
}

What's the point of ?s in this query? Some number n of ?position/?label pairs match (2), and some number m of values of ?s match (1). Because the two patterns share no variable, the query produces m×n results, yet you never use the value of ?s. It looks like you used DISTINCT to get rid of repeated values without checking why you were getting them in the first place. You should simply remove the useless line (1), giving the query:

SELECT DISTINCT ?position    
WHERE {     
    ?position skos:prefLabel|skos:altLabel ?label
    FILTER (lcase(?label)="acuponcteur"@fr ) 
}

I wouldn't be surprised if, at that point, you don't even need the DISTINCT anymore.
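To make the m×n argument concrete, here is a minimal plain-Java sketch that simulates the join. It does not use Jena and is not how a SPARQL engine is actually implemented; the class name, method names, and sample values are all hypothetical and only illustrate why a pattern that shares no variable with the rest of the query multiplies the rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java simulation of how an unconnected triple pattern multiplies result rows. */
class CrossProductDemo {

    /** Joins two pattern results that share no variable: every pairing is a solution. */
    static List<String> joinUnconnected(List<String> occupations, List<String> positions) {
        List<String> rows = new ArrayList<>();
        for (String s : occupations) {          // m bindings for ?s, pattern (1)
            for (String position : positions) { // n bindings for ?position, pattern (2)
                rows.add(position);             // SELECT ?position projects ?s away
            }
        }
        return rows;
    }

    /** What DISTINCT has to do: collapse the projected rows back to unique values. */
    static Set<String> distinct(List<String> rows) {
        return new LinkedHashSet<>(rows);
    }

    public static void main(String[] args) {
        // Hypothetical data: 4 occupations, 1 position whose label passes the FILTER.
        List<String> occupations = List.of("occ/1", "occ/2", "occ/3", "occ/4");
        List<String> positions = List.of("occupation/14918");

        List<String> rows = joinUnconnected(occupations, positions);
        System.out.println("rows without DISTINCT: " + rows.size());           // 4
        System.out.println("rows with DISTINCT:    " + distinct(rows).size()); // 1
    }
}
```

With a single matching position, every row is the same URI repeated m times, which matches the ten identical rows in the question. It would also plausibly explain the timings: LIMIT 10 alone can stop after the first ten rows, whereas DISTINCT with LIMIT 10 may have to work through the whole product looking for further distinct values, and a label with no match may likewise require scanning everything before returning an empty result.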
