How to set up Neo4j with DBpedia on top of a Ruby on Rails application?
Question
I am trying to use DBpedia with Neo4j on top of Ruby on Rails.
Assume I have installed Neo4j and downloaded one of the DBpedia datasets.
How do I import the DBpedia dataset into Neo4j?
Recommended answer
The simplest way to load DBpedia into Neo4j is to use the dbpedia4neo library. It is a Java library, but you don't need to know any Java because all you need to do is run the executable.
You could rewrite it in JRuby if you want, but regular Ruby won't work because the loader relies on Blueprints, a Java library with no Ruby equivalent.
Here are the two key files that implement the loading procedure:
- https://github.com/oleiade/dbpedia4neo/blob/master/src/main/java/org/acaro/dbpedia4neo/inserter/DBpediaLoader.java
- https://github.com/oleiade/dbpedia4neo/blob/master/src/main/java/org/acaro/dbpedia4neo/inserter/TripleHandler.java
Here is a description of what's involved.
Blueprints translates the RDF data to a graph representation. To understand what's going on under the hood, see the Blueprints Sail implementation.
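As a rough mental model of that translation (not the actual Blueprints Sail code), each RDF statement can be pictured as two vertices joined by an edge labeled with the predicate. The sketch below uses a hypothetical in-memory `TinyGraph` class and illustrative triples; none of these names come from Blueprints or dbpedia4neo.

```ruby
require "set"

# Hypothetical in-memory model for illustration only; this is not the
# Blueprints API and not how dbpedia4neo stores data internally.
Edge = Struct.new(:from, :label, :to)

class TinyGraph
  attr_reader :vertices, :edges

  def initialize
    @vertices = Set.new
    @edges = []
  end

  # Store one RDF statement as two vertices joined by a labeled edge.
  def add_triple(subject, predicate, object)
    @vertices << subject << object
    @edges << Edge.new(subject, predicate, object)
  end

  # All edges leaving the given vertex.
  def out_edges(vertex)
    @edges.select { |edge| edge.from == vertex }
  end
end

# Illustrative triples (example values, not taken from a real dump).
RUBY_URI = "http://dbpedia.org/resource/Ruby_(programming_language)"

graph = TinyGraph.new
graph.add_triple(RUBY_URI,
                 "http://dbpedia.org/ontology/designer",
                 "http://dbpedia.org/resource/Yukihiro_Matsumoto")
graph.add_triple(RUBY_URI,
                 "http://www.w3.org/2000/01/rdf-schema#label",
                 '"Ruby"@en')

graph.out_edges(RUBY_URI).each do |edge|
  puts "#{edge.from} --[#{edge.label}]--> #{edge.to}"
end
```

In this picture a literal such as `"Ruby"@en` is also a vertex; whether you keep literals as vertices or fold them into node properties is a modeling decision you make when you build your own import.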
After you download the DBpedia dump files, you should be able to build the dbpedia4neo Java library and run it without modifying the Java code.
First, clone oleiade's fork of the GitHub repository and change to the dbpedia4neo directory:
$ git clone https://github.com/oleiade/dbpedia4neo.git
$ cd dbpedia4neo
(oleiade's fork includes a minor Blueprints update that calls sail.initialize(); see https://groups.google.com/d/msg/gremlin-users/lfpNcOwZ49Y/WI91ae-UzKQJ.)
Before you build it, you will need to update pom.xml to use more current Blueprints versions and the current Blueprints repository (Sonatype).
To do this, open pom.xml and, at the top of the dependencies section, change all of the TinkerPop Blueprints versions from 0.6 to 0.9.
While you are in the file, add the Sonatype repository to the repositories section at the end of the file:
<repository>
  <id>sonatype-nexus-snapshots</id>
  <name>Sonatype Nexus Snapshots</name>
  <url>https://oss.sonatype.org/content/repositories/releases</url>
</repository>
Save the file and then build it using Maven:
$ mvn clean install
This will download and install all the dependencies for you and create a jar file in the target directory.
To load DBpedia, use Maven to run the executable:
$ mvn exec:java \
    -Dexec.mainClass=org.acaro.dbpedia4neo.inserter.DBpediaLoader \
    -Dexec.args="/path/to/dbpedia-dump.nt"
The DBpedia dump is large, so it will take a while to load.
Now that the data is loaded, you can access the graph in one of two ways:
- Use JRuby and the Blueprints-Neo4j API directly.
- Use regular Ruby and the Rexster REST server, which is similar to Neo4j Server except that it supports multiple graph databases.
For an example of how to create a Rexster client, see Bulbs, a Python framework I wrote that supports both Neo4j Server and Rexster.
- http://bulbflow.com/
- https://github.com/espeed/bulbs
- https://github.com/espeed/bulbs/tree/master/bulbs/rexster
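For the regular-Ruby route, a minimal Rexster client might look something like the sketch below. It assumes Rexster is listening on localhost:8182 and that the graph is configured under the name "dbpedia" in rexster.xml; both values, and the helper names `vertex_uri` and `fetch_vertex`, are assumptions for illustration, so adjust them to your setup.

```ruby
require "net/http"
require "json"
require "uri"

# Assumed defaults: Rexster on localhost:8182 and a graph named "dbpedia".
# Neither value comes from the original answer; change them as needed.
REXSTER_BASE = "http://localhost:8182"
GRAPH_NAME   = "dbpedia"

# Build the URI of a single vertex resource.
# Integer() rejects anything that is not a numeric id.
def vertex_uri(id, base: REXSTER_BASE, graph: GRAPH_NAME)
  URI("#{base}/graphs/#{graph}/vertices/#{Integer(id)}")
end

# Fetch one vertex over HTTP and return the parsed JSON body as a Hash.
def fetch_vertex(id)
  response = Net::HTTP.get_response(vertex_uri(id))
  unless response.is_a?(Net::HTTPSuccess)
    raise "Rexster returned HTTP #{response.code}"
  end
  JSON.parse(response.body)
end
```

With a Rexster server running, `fetch_vertex(1)` would request the vertex with id 1 and return the decoded response. In a Rails application you would probably wrap these helpers in a small service class rather than calling them from controllers directly.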
Another approach would be to process the DBpedia RDF dump file in Ruby, write out the nodes and relationships to CSV files, and use the Neo4j batch importer to load them. But this requires that you manually translate the RDF data into Neo4j relationships.
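A minimal sketch of that manual translation, assuming the dump is in N-Triples format with one statement per line: every URI becomes a node with a sequential id, and every triple whose object is a URI becomes a relationship. The helper names and the CSV column layout (`id,uri` and `start,end,type`) are assumptions for illustration; check the batch importer's documentation for the exact format it expects.

```ruby
require "csv"

# Matches "<subject> <predicate> object ." with the object captured raw.
TRIPLE = /\A<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*\z/

# Convert N-Triples input into a URI => id map and relationship rows.
# Triples with literal objects are skipped to keep the sketch short;
# a fuller version would store them as node properties.
def triples_to_rows(io)
  ids = {}   # URI => sequential node id
  rels = []  # [start_id, end_id, predicate]
  io.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    match = TRIPLE.match(line)
    next unless match
    subject, predicate, object = match.captures
    target = object[/\A<([^>]+)>\z/, 1]
    next unless target
    start_id = (ids[subject] ||= ids.size + 1)
    end_id = (ids[target] ||= ids.size + 1)
    rels << [start_id, end_id, predicate]
  end
  [ids, rels]
end

# Write the node and relationship rows to two CSV files.
def write_csv(ids, rels, nodes_path, rels_path)
  CSV.open(nodes_path, "w") do |csv|
    csv << %w[id uri]
    ids.each { |uri, id| csv << [id, uri] }
  end
  CSV.open(rels_path, "w") do |csv|
    csv << %w[start end type]
    rels.each { |row| csv << row }
  end
end
```

A typical call would be `File.open("/path/to/dbpedia-dump.nt") { |f| write_csv(*triples_to_rows(f), "nodes.csv", "rels.csv") }`. This version keeps the whole URI-to-id map in memory, which is fine for a sample but probably not for a full DBpedia dump; for that you would want an on-disk key-value store or a multi-pass approach.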