Load DBpedia locally using Jena TDB?

Question

I need to run the following query against DBpedia:

SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {
  ?poi  <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
  ?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .                      
  OPTIONAL {?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .
  OPTIONAL {?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .
  FILTER ( ?lat > x && ?lat < y &&
           ?long > z && ?long < ω && 
           langMatches( lang(?label), "EN" ))
} 
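The placeholders x, y, z and ω in the FILTER stand for the bounding-box limits. A minimal sketch of filling them in from Java follows, assuming plain string assembly is acceptable here; the class name `PoiQueryBuilder`, the method `build`, and the sample coordinates are hypothetical and not part of the original question.

```java
public class PoiQueryBuilder {

    /** Builds the query text for one bounding box (all four bounds are exclusive). */
    public static String build(double minLat, double maxLat, double minLong, double maxLong) {
        return String.join("\n",
            "SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label WHERE {",
            "  ?poi <http://www.w3.org/2000/01/rdf-schema#label> ?label .",
            "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .",
            "  ?poi <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .",
            "  ?poi <http://dbpedia.org/property/hasPhotoCollection> ?photos .",
            "  OPTIONAL { ?poi <http://dbpedia.org/property/wikiPageUsesTemplate> ?template } .",
            "  OPTIONAL { ?poi <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type } .",
            // The four numbers replace the x, y, z and ω placeholders of the question.
            "  FILTER ( ?lat > " + minLat + " && ?lat < " + maxLat + " &&",
            "           ?long > " + minLong + " && ?long < " + maxLong + " &&",
            "           langMatches( lang(?label), \"EN\" ) )",
            "}");
    }

    public static void main(String[] args) {
        // Hypothetical bounding box, chosen only for illustration.
        System.out.println(build(37.9, 38.1, 23.6, 23.8));
    }
}
```

The resulting string can be passed to QueryFactory.create(...) in the same way as the query strings in the Java snippets in this thread.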

I'm guessing this information is scattered across several dump (.nt) files, and that the SPARQL endpoint somehow combines them to serve a result set. I need to download just those .nt files locally (not all of DBpedia), run my query only once, and store the results locally (I don't want to use the SPARQL endpoint).
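One way to check that guess is to count which predicates each candidate dump file actually contains, and keep only the files that contain the predicates used in the query. The sketch below uses only the Java standard library and assumes well-formed N-Triples with one triple per line, so the second whitespace-separated token is the predicate. The class name `PredicateCounter` is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class PredicateCounter {

    /** Counts how often each predicate occurs in a stream of N-Triples lines. */
    public static Map<String, Long> countPredicates(Stream<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        lines.map(String::trim)
             // Skip blank lines and comment lines.
             .filter(l -> !l.isEmpty() && !l.startsWith("#"))
             // Split into subject, predicate, and the rest (object plus final dot).
             .map(l -> l.split("\\s+", 3))
             .filter(parts -> parts.length == 3)
             .forEach(parts -> counts.merge(parts[1], 1L, Long::sum));
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PredicateCounter <file.nt>");
            return;
        }
        // Files.lines streams the file lazily, so large dumps are not held in memory.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            countPredicates(lines)
                .forEach((predicate, n) -> System.out.println(n + "\t" + predicate));
        }
    }
}
```

Running it on each downloaded dump shows, for example, whether a file contains http://www.w3.org/2003/01/geo/wgs84_pos#lat at all before it is loaded into TDB.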


  • Which parts of Jena should I use for this one-time run?

Regarding this post:


So, you can load the entire DBPedia data into a single TDB location on disk (i.e. a single directory). This way, you can run SPARQL queries over it.




  • How do I load DBpedia into a single TDB location, in Jena terms, if we have three DBpedia .nt files? How do we apply the above query to those .nt files? (Any code would help.)

Example: is this wrong?

       String tdbDirectory = "C:\\TDB";
       String dbdump1 = "C:\\Users\\dump1_en.nt";
       String dbdump2 = "C:\\Users\\dump2_en.nt";
       String dbdump3 = "C:\\Users\\dump3_en.nt";
       Dataset dataset = TDBFactory.createDataset(tdbDirectory);
       Model tdb = dataset.getDefaultModel(); // <-- What is the default model? Should I care?
       // Model tdb = TDBFactory.createModel(tdbDirectory); // <-- Is this preferred?
       FileManager.get().readModel(tdb, dbdump1, "N-TRIPLES");
       FileManager.get().readModel(tdb, dbdump2, "N-TRIPLES");
       FileManager.get().readModel(tdb, dbdump3, "N-TRIPLES");
       String q = "my big fat query";
       Query query = QueryFactory.create(q);
       QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
       ResultSet results = qexec.execSelect();
       while (results.hasNext()) {
           // do something significant with it
       }
       qexec.close();
       tdb.close();
       dataset.close();
      




  • In the above code we used dataset.getDefaultModel() (to get the default graph as a Jena Model). Is this valid? Do we need to create a Dataset to run the query, or should we go with TDBFactory.createModel(tdbDirectory)?
Recommended answer

To let Jena index the data locally:

          /** The Constant tdbDirectory. */
          public static final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels"; 
          
          /** The Constant dbdump0. */
          public static final String dbdump0 = "C:\\Users\\Public\\Documents\\TDB\\dbpedia_3.8\\dbpedia_3.8.owl";
          
          /** The Constant dbdump1. */
          public static final String dbdump1 = "C:\\Users\\Public\\Documents\\TDB\\geo_coordinates_en\\geo_coordinates_en.nt";
          
           ...
          
          Model tdbModel = TDBFactory.createModel(tdbDirectory);
          
          /*Incrementally read data to the Model, once per run , RAM > 6 GB*/
          FileManager.get().readModel( tdbModel, dbdump0);
          FileManager.get().readModel( tdbModel, dbdump1, "N-TRIPLES");
          FileManager.get().readModel( tdbModel, dbdump2, "N-TRIPLES");
          FileManager.get().readModel( tdbModel, dbdump3, "N-TRIPLES");
          FileManager.get().readModel( tdbModel, dbdump4, "N-TRIPLES");
          FileManager.get().readModel( tdbModel, dbdump5, "N-TRIPLES");
          FileManager.get().readModel( tdbModel, dbdump6, "N-TRIPLES");
          tdbModel.close();
          

To query with Jena:

          String queryStr = "dbpedia query ";
          
          Dataset dataset = TDBFactory.createDataset(tdbDirectory);
          Model tdb = dataset.getDefaultModel();
          
          Query query = QueryFactory.create(queryStr);
          QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
          
          /*Execute the Query*/
          ResultSet results = qexec.execSelect();
          
          while (results.hasNext()) {
              // Do something important
          }
          
          qexec.close();
          tdb.close() ;
          
