Parsing RDF items

Question

I have a couple of lines of (I think) RDF data:

<http://www.test.com/meta#0001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> 
<http://www.test.com/meta#0002> <http://www.test.com/meta#CONCEPT_hasType> "BEAR"^^<http://www.w3.org/2001/XMLSchema#string>

Each line has 3 items in it. I want to pull out the item before and after the URL. So that would result in:

0001, type, Class
0002, CONCEPT_hasType, (BEAR, string)

Is there a library out there (Java or Scala) that would do this split for me? Or do I just need to shove string.splits and assumptions in my code?

Recommended answer

Most RDF libraries will have something to facilitate this. For example, if you parse your RDF data using Eclipse RDF4J's Rio parser, you will get back each line as an org.eclipse.rdf4j.model.Statement, with a subject, predicate and object value. The subject in both your lines will be an org.eclipse.rdf4j.model.IRI, which has a getLocalName() method you can use to get the part after the last #. See the Javadocs for more details.

Assuming your data is in N-Triples syntax (which it seems to be given the example you showed us), here's a simple bit of code that does this and prints it out to STDOUT:

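  import java.io.FileInputStream;
  import java.io.InputStream;
  import org.eclipse.rdf4j.model.IRI;
  import org.eclipse.rdf4j.model.Model;
  import org.eclipse.rdf4j.model.Resource;
  import org.eclipse.rdf4j.model.Statement;
  import org.eclipse.rdf4j.rio.RDFFormat;
  import org.eclipse.rdf4j.rio.Rio;

  // parse the file into a Model object; try-with-resources closes the stream
  Model model;
  try (InputStream in = new FileInputStream("/path/to/rdf-data.nt")) {
      model = Rio.parse(in, RDFFormat.NTRIPLES);
  }

  for (Statement st : model) {
      Resource subject = st.getSubject();
      if (subject instanceof IRI) {
          // IRI subject: print only the local name
          System.out.print(((IRI) subject).getLocalName());
      } else {
          // any other kind of subject (e.g. a blank node): print its full string value
          System.out.print(subject.stringValue());
      }
      // ... etc for predicate and object (the 2nd and 3rd elements in each RDF statement)
  }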

Update: if you don't want to read data from a file but simply use a String, you could use a java.io.StringReader instead of an InputStream:

 StringReader r = new StringReader("<http://www.test.com/meta#0001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .");
 org.eclipse.rdf4j.model.Model model = Rio.parse(r, RDFFormat.NTRIPLES);

Alternatively, if you don't want to parse the data at all and just want to do String processing, there is an org.eclipse.rdf4j.model.util.URIUtil class which you can feed a string, and it gives you back the index at which the local name part starts:

  String uri = "http://www.test.com/meta#0001";
  String localpart = uri.substring(URIUtil.getLocalNameIndex(uri));  // will be "0001" 
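
If adding a library is not an option, the same string-processing idea can be sketched with the Java standard library alone. The following is a rough illustration only, not RDF4J code: the class NTriplesLineSketch and its methods localName and shortTerms are hypothetical names, and the regular expression assumes simple lines like the two in the question (IRIs in angle brackets, literals without escaped quotes, no language tags or blank nodes). A real N-Triples parser handles many cases this does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RDF4J: plain string handling for simple N-Triples lines.
class NTriplesLineSketch {

    // One term: either <iri>, or "lexical form" optionally followed by ^^<datatype-iri>.
    // Assumption: no escaped quotes, language tags, or blank nodes in the input.
    private static final Pattern TERM =
            Pattern.compile("<([^>]*)>|\"([^\"]*)\"(?:\\^\\^<([^>]*)>)?");

    // Returns the part after the last '#', or after the last '/' if there is no '#'.
    static String localName(String iri) {
        int idx = iri.lastIndexOf('#');
        if (idx < 0) {
            idx = iri.lastIndexOf('/');
        }
        return iri.substring(idx + 1);
    }

    // Shortens every term found on one line, in order of appearance.
    static List<String> shortTerms(String line) {
        List<String> terms = new ArrayList<>();
        Matcher m = TERM.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                // IRI term: keep only the local name
                terms.add(localName(m.group(1)));
            } else if (m.group(3) != null) {
                // typed literal: keep the value and the datatype's local name
                terms.add("(" + m.group(2) + ", " + localName(m.group(3)) + ")");
            } else {
                // plain literal: keep the value as-is
                terms.add(m.group(2));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String line = "<http://www.test.com/meta#0002> <http://www.test.com/meta#CONCEPT_hasType> "
                + "\"BEAR\"^^<http://www.w3.org/2001/XMLSchema#string>";
        System.out.println(String.join(", ", shortTerms(line)));
    }
}
```

This is exactly the "string splits and assumptions" route from the question, so it is only reasonable when the input is known to stay this simple; for anything else, parsing with an RDF library is the safer choice.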

(disclosure: I am on the RDF4J development team)
