How to achieve mapping between namespaces in Apache Jena through reasoning?
Question
I want to achieve a rule-based mapping between ontologies in order to fulfill a common data migration task.
To achieve this, I developed an abstract data structure that can store all the information provided by the XML representation of any datatype. Then I wrote a parser that constructs an ontology from the target document-type definition. Now, when I read the data in, it is first associated with the abstractDatatype namespace; let's call it aS. The target data structure lies in the namespace tS.
If I try to express type equality between two resources with the same name but different namespaces via a rule like this:
[mappingRule1: (aS:?a rdf:type aS:?b) (tS:?c rdf:type tS:?b) -> (aS:?a rdf:type tS:?b)]
the reasoner does not get it. Maybe there is a mistake in the rule, which should be interpreted as: if the same type name exists in the namespace tS as in aS, then all individuals of that type in aS also get the corresponding type in tS. The other problem is that this kind of rule might not work if there are no individuals of a type, and I've been told that expressing it like that might not be sufficient. Alternatively, I could create subClassOf rules that map between all combinations, but that would produce a lot of dirt in the model, and I would like to be able to add even more filtering conditions rather than making the rule more general.
If someone has experience with rule-based ontology mapping, I would be very glad to get some insights.
Here is a Java unit test that demonstrates the mapping that does not work:
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.junit.Before;
import org.junit.Test;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.reasoner.Derivation;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;
import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
import com.hp.hpl.jena.reasoner.rulesys.Rule;
import com.hp.hpl.jena.util.PrintUtil;
import com.hp.hpl.jena.vocabulary.RDF;
import com.hp.hpl.jena.vocabulary.RDFS;
public class ReasonerTest {
String aS = "http://www.custom.eu/abstractDatascheme#";
String tS = "http://www.custom.eu/targetDatascheme#";
Model model = ModelFactory.createDefaultModel();
InfModel inf;
Resource AA = model.createResource(aS + "A");
Resource AB = model.createResource(aS + "B");
Resource AC = model.createResource(aS + "C");
Resource AD = model.createResource(aS + "D");
Resource TA = model.createResource(tS + "A");
Resource TB = model.createResource(tS + "B");
Property p = model.createProperty(aS, "p");
Property q = model.createProperty(aS, "q");
@Before
public void init() {
PrintUtil.registerPrefix("aS", aS);
PrintUtil.registerPrefix("tS", tS);
AA.addProperty(p, "foo");
// Get an RDFS reasoner
GenericRuleReasoner rdfsReasoner = (GenericRuleReasoner) ReasonerRegistry.getRDFSReasoner();
// Steal its rules, and add one of our own, and create a reasoner with these rules
List<Rule> rdfRules = new ArrayList<>( rdfsReasoner.getRules() );
List<Rule> rules = new ArrayList<>();
String customRules = "[transitiveRule: (?a aS:p ?b) (?b aS:p ?c) -> (?a aS:p ?c)] \n" +
"[mappingRule1: (aS:?a rdf:type aS:?b) (tS:?c rdf:type tS:?b) -> (aS:?a rdf:type tS:?b)] \n" +
"[mappingRule2a: -> (aS:?a rdfs:subClassOf tS:?a)] \n" +
"[mappingRule2b: -> (tS:?a rdfs:subClassOf aS:?a)]";
rules.addAll(rdfRules);
rules.add(Rule.parseRule(customRules));
Reasoner reasoner = new GenericRuleReasoner(rules);
reasoner.setDerivationLogging(true);
inf = ModelFactory.createInfModel(reasoner, model);
}
@Test
public void mapping() {
AA.addProperty(RDF.type, model.createResource(aS + "CommonType"));
TA.addProperty(RDF.type, model.createResource(tS + "CommonType"));
String trace = null;
trace = getDerivations(trace, AA, RDF.type, TA);
assertNotNull(trace);
}
private String getDerivations(String trace, Resource subject, Property predicate, Resource object) {
PrintWriter out = new PrintWriter(System.out);
for (StmtIterator i = inf.listStatements(subject, predicate, object); i.hasNext(); ) {
Statement s = i.nextStatement();
System.out.println("Statement is " + s);
for (Iterator<Derivation> id = inf.getDerivation(s); id.hasNext(); ) {
Derivation deriv = (Derivation) id.next();
deriv.printTrace(out, true);
trace += deriv.toString();
}
}
out.flush();
return trace;
}
@Test
public void subProperty() {
// Hierarchy
model.add(p, RDFS.subPropertyOf, q);
StmtIterator stmts = inf.listStatements(AA, q, (RDFNode) null);
assertTrue(stmts.hasNext());
while (stmts.hasNext()) {
System.out.println("Statement: " + stmts.next());
}
}
@Test
public void derivation() {
// Derivations
AA.addProperty(p, AB);
AB.addProperty(p, AC);
AC.addProperty(p, AD);
String trace = null;
trace = getDerivations(trace, AA, p, AD);
assertNotNull(trace);
}
@Test
public void derivations() {
String trace = null;
PrintWriter out = new PrintWriter(System.out);
for (StmtIterator i = inf.listStatements(); i.hasNext(); ) {
Statement s = i.nextStatement();
System.out.println("Statement is " + s);
for (Iterator<Derivation> id = inf.getDerivation(s); id.hasNext(); ) {
Derivation deriv = (Derivation) id.next();
deriv.printTrace(out, true);
trace += deriv.toString();
}
}
out.flush();
assertNotNull(trace);
}
@Test
public void listStatements() {
StmtIterator stmtIterator = inf.listStatements();
while (stmtIterator.hasNext()) {
System.out.println(stmtIterator.nextStatement());
}
}
@Test
public void listRules() {
List<Rule> rules = ((GenericRuleReasoner) inf.getReasoner()).getRules();
for (Rule rule : rules) {
System.out.println(rule.toString());
}
}
@Test
public void saveDerivation() {
DataOutputStream out1;
try {
out1 = new DataOutputStream(new BufferedOutputStream(new FileOutputStream("target/test-output/testOnto.owl")));
inf.write(out1);
}
catch (IOException ex) {
Logger.getLogger(ReasonerTest.class.getName()).log(Level.SEVERE, null, ex);
}
}
@Test
public void printRdfRules() {
GenericRuleReasoner rdfsReasoner = (GenericRuleReasoner) ReasonerRegistry.getRDFSReasoner();
List<Rule> customRules = new ArrayList<>(rdfsReasoner.getRules());
PrintWriter writer = null;
try {
File directory = new File("target/test-output/");
if (!directory.exists()) {
directory.mkdir();
}
writer = new PrintWriter("target/test-output/rfd.rules", "UTF-8");
}
catch (IOException ex) {
Logger.getLogger(ReasonerTest.class.getName()).log(Level.SEVERE, null, ex);
}
for (Rule customRule : customRules) {
writer.println(customRule.toString());
}
writer.close();
}
}
Recommended answer
You can't just write ns:?x and expect it to match a URI resource whose string form begins with whatever ns: stands for, and to bind ?x to the remainder (or to the whole thing). If you want to use a rule that looks at the string forms of URIs, you'll have to get their string form with strConcat, and do some matching and extraction with regex. Here's an example that sees that m:Person is used as a type, that x:a a n:Person is in the data, and that m:Person and n:Person have the same suffix under the prefixes m: and n:, and infers x:a a m:Person as a result.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.rulesys.GenericRuleReasoner;
import com.hp.hpl.jena.reasoner.rulesys.Rule;
import com.hp.hpl.jena.util.PrintUtil;
public class TypeMappingExample {
public static void main(String[] args) throws IOException {
PrintUtil.registerPrefix( "n", "urn:ex:n/" );
PrintUtil.registerPrefix( "m", "urn:ex:m/" );
String content = "\n" +
"@prefix n: <urn:ex:n/>.\n" +
"@prefix m: <urn:ex:m/>.\n" +
"@prefix x: <urn:ex:x/>.\n" +
"\n" +
"x:a a n:Person.\n" +
"x:b a m:Person.\n" +
"";
Model model = ModelFactory.createDefaultModel();
try ( InputStream in = new ByteArrayInputStream( content.getBytes() )) {
model.read( in, null, "TTL" );
}
String rule = "\n" +
"[strConcat(n:,'(.*)',?nprefix),\n" +
" strConcat(m:,'(.*)',?mprefix),\n" +
" (?x rdf:type ?ntype), strConcat(?ntype,?ntypestr),\n" +
" (?y rdf:type ?mtype), strConcat(?mtype,?mtypestr)," +
" regex(?ntypestr,?nprefix,?nsuffix),\n" +
" regex(?mtypestr,?mprefix,?msuffix),\n" +
" equal(?nsuffix,?msuffix)\n" +
" -> \n" +
"(?x rdf:type ?mtype)]";
Reasoner reasoner = new GenericRuleReasoner( Rule.parseRules( rule ));
InfModel imodel = ModelFactory.createInfModel( reasoner, model );
imodel.write( System.out, "TTL" );
}
}
Output:
@prefix n: <urn:ex:n/> .
@prefix m: <urn:ex:m/> .
@prefix x: <urn:ex:x/> .
x:a a m:Person , n:Person .
x:b a m:Person .
As you can see, the string processing is rather rough; Jena's builtins are really designed for getting strings from URIs, etc. Some of the SPARQL functions would make this a bit easier, but it'll still be a bit inelegant, because IRIs are really supposed to be opaque identifiers.
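To make the matching step concrete, here is a minimal sketch in plain Java (no Jena involved) of what the rule's strConcat and regex calls amount to: build a pattern from the namespace URI plus a capture group, match the type URI against it, and compare the captured suffixes. The class and method names are hypothetical, and the use of Pattern.quote is an addition of this sketch; the rule itself concatenates the raw namespace string into the pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Jena.
final class SuffixTypeMapping {

    /** Returns the part of uri after namespace, or empty if uri does not start with it. */
    static Optional<String> suffix(String namespace, String uri) {
        // Pattern.quote keeps characters such as '.' in the namespace literal
        // (an assumption of this sketch; the rule uses the raw string).
        Matcher m = Pattern.compile(Pattern.quote(namespace) + "(.*)").matcher(uri);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True when both type URIs have the same suffix under their respective namespaces. */
    static boolean sameLocalName(String nNs, String nType, String mNs, String mType) {
        Optional<String> a = suffix(nNs, nType);
        Optional<String> b = suffix(mNs, mType);
        return a.isPresent() && a.equals(b);
    }

    public static void main(String[] args) {
        // Same suffix "Person" under n: and m:
        System.out.println(sameLocalName("urn:ex:n/", "urn:ex:n/Person",
                                         "urn:ex:m/", "urn:ex:m/Person"));
        // Different suffixes
        System.out.println(sameLocalName("urn:ex:n/", "urn:ex:n/Person",
                                         "urn:ex:m/", "urn:ex:m/Place"));
    }
}
```

Inside the rule, the equal(?nsuffix,?msuffix) test plays the role of sameLocalName here.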
A much easier solution would be to make sure that all the classes have labels, and to say that if two classes have the same label, then instances of one are instances of the other. If you've made good use of rdfs:isDefinedBy, you can make this very slick, with something like:
[(?c1 a rdfs:Class) (?c1 rdfs:isDefinedBy ?ont1) (?c1 rdfs:label ?name)
(?c2 a rdfs:Class) (?c2 rdfs:isDefinedBy ?ont2) (?c2 rdfs:label ?name)
->
[(?x rdf:type ?c1) -> (?x rdf:type ?c2)]]
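As a rough illustration of what this rule is meant to derive, the following plain-Java sketch (again without Jena) groups classes by label and gives every instance the types of all classes that share a label with one of its asserted types. The class name, method name, and map-based data model are assumptions made for this example. The rdfs:isDefinedBy triples are left out because the rule above binds ?ont1 and ?ont2 without comparing them; a check on them could be added as a further filter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the label-based rule, for illustration only.
final class LabelTypeMapping {

    /**
     * classLabels:   class URI -> rdfs:label
     * assertedTypes: instance URI -> asserted class URIs
     * Returns, per instance, the asserted types plus every class with a matching label.
     */
    static Map<String, Set<String>> inferTypes(Map<String, String> classLabels,
                                               Map<String, Set<String>> assertedTypes) {
        // Group class URIs by their label.
        Map<String, Set<String>> classesByLabel = new HashMap<>();
        for (Map.Entry<String, String> e : classLabels.entrySet()) {
            classesByLabel.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : assertedTypes.entrySet()) {
            Set<String> types = new TreeSet<>(e.getValue());
            for (String type : e.getValue()) {
                String label = classLabels.get(type);
                if (label != null) {
                    // Same label: instances of one class are instances of the other.
                    types.addAll(classesByLabel.get(label));
                }
            }
            result.put(e.getKey(), types);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> labels = Map.of(
                "urn:ex:n/Person", "Person",
                "urn:ex:m/Person", "Person",
                "urn:ex:m/Place", "Place");
        Map<String, Set<String>> asserted = Map.of("urn:ex:x/a", Set.of("urn:ex:n/Person"));
        System.out.println(inferTypes(labels, asserted));
    }
}
```

With the sample data, x:a is asserted as n:Person and also ends up typed as m:Person, while m:Place is left alone because its label differs.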