Jena/ARQ: Difference between Model, Graph and DataSet
Problem description
I'm starting to work with the Jena engine and I think I have a grasp of what semantics are. However, I'm having a hard time understanding the different ways to represent a bunch of triples in Jena and ARQ:
- The first thing you stumble upon when starting is `Model`, and the documentation says it is Jena's name for RDF graphs.
- However, there is also `Graph`, which seemed to be the necessary tool when I want to query a union of models. It does not seem to share a common interface with `Model`, although one can get the `Graph` out of a `Model`.
- Then there is `DataSet` in ARQ, which also seems to be a collection of triples of some sort.
Sure, after some looking around in the API, I found ways to somehow convert from one into another. However, I suspect there is more to it than three different interfaces for the same thing.
So, the question is: What are the key design differences between these three? When should I use which one? Especially: when I want to hold individual bunches of triples but query them as one big bunch (union), which of these data structures should I use (and why)?
Also, do I "lose" anything when "converting" from one into another (e.g. does `model.getGraph()` contain less information in some way than `model`)?
Recommended answer
Jena is divided into an API, for application developers, and an SPI for systems developers, such as people making storage engines, reasoners etc.
`Dataset`, `Model`, `Statement`, `Resource` and `Literal` are API interfaces and provide many conveniences for application developers.
`DatasetGraph`, `Graph`, `Triple` and `Node` are SPI interfaces. They're pretty spartan and simple to implement (as you'd hope if you've got to implement the things).
The wide variety of API operations all resolve down to SPI calls. To give an example, the `Model` interface has four different `contains` methods. Internally each results in a call to:
Graph#contains(Node, Node, Node)
For example:
graph.contains(nodeS, nodeP, nodeO); // model.contains(s, p, o) or model.contains(statement)
graph.contains(nodeS, nodeP, Node.ANY); // model.contains(s, p)
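To make the layering concrete, here is a minimal sketch of the same idea using toy stand-ins. `ToyGraph` and `ToyModel` are hypothetical classes invented for illustration, not Jena's real `Graph` and `Model`; nodes are plain strings and `ToyGraph.ANY` plays the role of `Node.ANY`. It only shows how several convenience overloads can funnel into one spartan lookup method.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-ins, NOT Jena's real classes: nodes are plain strings here.
class ToyGraph {
    static final String ANY = "*"; // wildcard, playing the role of Node.ANY
    private final Set<List<String>> triples = new HashSet<>();

    void add(String s, String p, String o) {
        triples.add(Arrays.asList(s, p, o));
    }

    // The single SPI-style lookup: every position is a value or the wildcard.
    boolean contains(String s, String p, String o) {
        for (List<String> t : triples) {
            if (matches(s, t.get(0)) && matches(p, t.get(1)) && matches(o, t.get(2))) {
                return true;
            }
        }
        return false;
    }

    private static boolean matches(String pattern, String value) {
        return ANY.equals(pattern) || pattern.equals(value);
    }
}

class ToyModel {
    private final ToyGraph graph = new ToyGraph();

    ToyGraph getGraph() { return graph; }

    void add(String s, String p, String o) { graph.add(s, p, o); }

    // Convenience overloads: each one is a single call on the graph.
    boolean contains(String s, String p, String o) { return graph.contains(s, p, o); }

    boolean contains(String s, String p) { return graph.contains(s, p, ToyGraph.ANY); }
}
```

The model holds no triples of its own in this sketch; everything it answers comes from the graph underneath, which is why going from the model to its graph need not lose any triples.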
Concerning your question about losing information: with `Model` and `Graph` you don't (as far as I recall). The more interesting case is `Resource` versus `Node`. A `Resource` knows which model it belongs to, so you can (in the API) write `resource.addProperty(...)`, which eventually becomes a `Graph#add`. `Node` has no such convenience, and is not associated with a particular `Graph`. Hence `Resource#asNode` is lossy.
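A rough sketch of why that conversion is lossy, again with hypothetical miniature types (`MiniNode`, `MiniGraph`, `MiniResource`) rather than Jena's own classes. The assumption is simply that the API-level object keeps a back-reference to its store and the SPI-level object does not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature types, not Jena's: they only mimic who holds a back-reference.
class MiniNode {
    final String uri;

    MiniNode(String uri) { this.uri = uri; }
}

class MiniGraph {
    private final List<MiniNode[]> triples = new ArrayList<>();

    void add(MiniNode s, MiniNode p, MiniNode o) {
        triples.add(new MiniNode[] { s, p, o });
    }

    int size() { return triples.size(); }
}

class MiniResource {
    private final MiniGraph owner; // the back-reference a bare node lacks
    private final MiniNode node;

    MiniResource(MiniGraph owner, String uri) {
        this.owner = owner;
        this.node = new MiniNode(uri);
    }

    // Convenience call: possible only because the resource knows its owner.
    MiniResource addProperty(String predicateUri, String objectUri) {
        owner.add(node, new MiniNode(predicateUri), new MiniNode(objectUri));
        return this;
    }

    // Lossy: the returned node carries the URI but not the owner.
    MiniNode asNode() { return node; }
}
```

Once you hold only the `MiniNode`, there is nothing to call `addProperty` on; you would have to name the target graph explicitly again.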
Finally:
When I want to hold individual bunches of triples but query them as one big bunch (union), which of these data structures should I use (and why)?
You're clearly a normal user, so you want the API. You want to store triples, so use `Model`. Now you want to query the models as one union. You could:
- `Model#union()` everything, which will copy all the triples into a new model.
- `ModelFactory.createUnion()` everything, which will create a dynamic union (i.e. no copying).
- Store your models as named models in a TDB or SDB dataset store, and use the `unionDefaultGraph` option.
The last of these works best for large numbers of models, and for large models, but is a little more involved to set up.
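The practical difference between the first two options (copying versus a dynamic union) can be sketched without Jena at all. In the toy below, triples are just strings and `UnionSketch` is a hypothetical class made up for this illustration; it is not how Jena implements either call, only the behaviour you should expect from a copy versus a live view.

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration only: triples are plain strings, and nothing here is Jena code.
class UnionSketch {
    // Eager union: copies every triple into a new, independent set.
    static Set<String> copyUnion(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(a);
        out.addAll(b);
        return out;
    }

    // Dynamic union: stores nothing and consults both operands on every query.
    static final class DynamicUnion {
        private final Set<String> left;
        private final Set<String> right;

        DynamicUnion(Set<String> left, Set<String> right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(String triple) {
            return left.contains(triple) || right.contains(triple);
        }
    }
}
```

If you add a triple to one of the source sets after building both unions, the `DynamicUnion` sees it and the copied set does not. The copy costs memory up front but is isolated from later changes; the dynamic view costs nothing to build but pays on every lookup.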