Fuseki indexed (Lucene) text search returns no results


Question

I have a very large ontology RDF file (almost 4M instances) that I'm currently streaming via Fuseki v2.0.0. My assembler file looks like this:

@prefix :        <#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix myprefix: <http://www.example.org/some/path/myprefix#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
# Solr index
text:TextIndexSolr    rdfs:subClassOf   text:TextIndex .

## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.

:text_dataset rdf:type     text:TextDataset ;
    text:dataset   <#dataset> ;
    text:index     <#indexLucene> ;
    .

# A TDB dataset used for RDF storage
<#dataset> rdf:type      tdb:DatasetTDB ;
    tdb:location "DB" ;
    tdb:unionDefaultGraph true ; # Optional
    .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:Lucene> ;
    ##text:directory "mem" ;
    text:entityMap <#entMap> ;
    .

# Mapping in the index
# URI stored in field "uri"
# myprefix:foo is mapped to field "text"
<#entMap> a text:EntityMap ;
    text:entityField      "uri" ;
    text:defaultField     "text" ;
    text:map (
         [ text:field "text" ; text:predicate myprefix:foo ]
         ) .

To perform text searches on a particular element within a reasonable response time, I imported the RDF file using text indexing:

$ java -cp $FUSEKI_HOME/fuseki-server.jar tdb.tdbloader --tdb=run/text-config.ttl ontologies.rdf 

... and

$ java -cp $FUSEKI_HOME/fuseki-server.jar jena.textindexer --desc=run/text-config.ttl 

... then ran the Fuseki server as:

./fuseki-server -v --debug -loc=DB /dataset

The import reported no errors, and I can run various SPARQL queries against this new dataset with no issues. But when I try to perform a full-text query, I get 0 results:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix myprefix: <http://www.example.org/some/path/myprefix#>

SELECT ?s ?sci_name
{ ?s text:query (myprefix:foo '123test' 10) ; 
    myprefix:foo ?sci_name 
}

Am I missing something obvious here? I see no warnings or errors in the Fuseki server logs, even with the verbose and debug flags set. I can get the same results with a regular SPARQL query, but it is (understandably) quite slow:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix myprefix: <http://www.example.org/some/path/myprefix#>

SELECT ?s
{ ?s myprefix:foo ?o .
  FILTER regex(str(?o), "123test", "i")
}

Any help with this would be appreciated, as I'm new to Fuseki/Jena and I'm hitting a dead end.

Recommended answer

If you run the server with

./fuseki-server -v --debug -loc=DB /dataset

then it is not using your configuration file, so the text index is never attached to the dataset. Try:

./fuseki-server --desc text-config.ttl

or, better, have a Fuseki configuration file with a service and dataset description (see examples) and run:

./fuseki-server --config config.ttl
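
As a rough sketch of what such a configuration file might contain, the fragment below adds a Fuseki service description that points at the :text_dataset resource from the assembler file in the question, so that queries go through the text index rather than straight to TDB. It assumes the fragment is appended to that same file (so that :text_dataset resolves). The service name "dataset", the label, and the endpoint name are illustrative choices, not something given in the original answer.

```turtle
# Sketch only: assumes these lines are added to the assembler file shown in
# the question, so that :text_dataset refers to the text:TextDataset there.
@prefix :        <#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .

# Service name and endpoint name are assumptions chosen for illustration.
<#service_text> rdf:type fuseki:Service ;
    rdfs:label          "TDB dataset with Lucene text index" ;
    fuseki:name         "dataset" ;        # served at /dataset
    fuseki:serviceQuery "query" ;          # SPARQL query endpoint
    fuseki:dataset      :text_dataset ;    # the text dataset, not <#dataset>
    .
```

The key point is that fuseki:dataset names the text:TextDataset wrapper. Pointing it at the bare TDB dataset would again bypass the Lucene index, which is the same effect as starting the server with -loc=DB.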
