listOfDict to RDF conversion in Python targeting Apache Jena Fuseki
Question
To store some data in Apache Jena from Python, I'd like to have a generic conversion from a list of dicts to RDF, and possibly back again when querying.
For the list-of-dicts-to-RDF part I tried implementing "insertListOfDicts" (see below) and tested it with "testListOfDictInsert" (see below). The generated update is shown below; it leads to a 400: Bad Request when sent to an Apache Jena Fuseki server.
What needs to be fixed for the simple string type, and what might need to be fixed for the other primitive Python types to make this work?
Please also find the source code at:
- https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/dg/jena.py
- https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/tests/testJena.py
@prefix foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#name "Elizabeth Alexandra Mary Windsor".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#born "1926-04-21".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q9682".
foaf:Person/George+of+Cambridge foaf:Person#name "George of Cambridge".
foaf:Person/George+of+Cambridge foaf:Person#born "2013-07-22".
foaf:Person/George+of+Cambridge foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#name "Harry Duke of Sussex".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#born "1984-09-15".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q152316".
}
testListOfDictInsert
def testListOfDictInsert(self):
    '''
    test inserting a list of Dicts using FOAF example
    https://en.wikipedia.org/wiki/FOAF_(ontology)
    '''
    listofDicts=[
        {'name': 'Elizabeth Alexandra Mary Windsor', 'born': '1926-04-21', 'age': 94, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
        {'name': 'George of Cambridge', 'born': '2013-07-22', 'age': 7, 'ofAge': False, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
        {'name': 'Harry Duke of Sussex', 'born': '1984-09-15', 'age': 36, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
    ]
    jena=self.getJena(mode='update',debug=True)
    jena.insertListOfDicts(listofDicts,'foaf:Person','name','@prefix foaf: <http://xmlns.com/foaf/0.1/>')
insertListOfDicts
def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
    '''
    insert the given list of dicts mapping datatypes according to
    https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
    mapped from
    https://docs.python.org/3/library/stdtypes.html
    compare to
    https://www.w3.org/2001/sw/rdb2rdf/directGraph/
    http://www.bobdc.com/blog/json2rdf/
    https://www.w3.org/TR/json-ld11-api/#data-round-tripping
    https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
    '''
    errors=[]
    insertCommand='%s\nINSERT DATA {\n' % prefixes
    for index,record in enumerate(listOfDicts):
        if not primaryKey in record:
            # append is a method call; format the message before storing it
            errors.append("missing primary key %s in record %d" % (primaryKey,index))
        else:
            primaryValue=record[primaryKey]
            # quote_plus encodes spaces as "+" (see the generated update above)
            encodedPrimaryValue=urllib.parse.quote_plus(primaryValue)
            tSubject="%s/%s" %(entityType,encodedPrimaryValue)
            for keyValue in record.items():
                key,value=keyValue
                valueType=type(value)
                if self.debug:
                    print("%s(%s)=%s" % (key,valueType,value))
                tPredicate="%s#%s" % (entityType,key)
                tObject=value
                # only str values are emitted so far; other types are skipped
                if valueType == str:
                    insertCommand+=' %s %s "%s".\n' % (tSubject,tPredicate,tObject)
    insertCommand+="\n}"
    if self.debug:
        print (insertCommand)
    self.insert(insertCommand)
    return errors
Recommended answer
+ is the special character in HTTP Form encoding for a space but it should only be used in application/x-www-form-urlencoded.
For URIs, use %20 or decide on a replacement character such as _ for space because it looks a bit like a space.
In all these cases, there is not a space character in the URI - there is a +, %20 (three characters) or _. It is encoding, not an escape mechanism.
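The difference between the two encodings can be seen with Python's standard urllib.parse functions. The following is only a minimal sketch of the point made above: the helper name subject_iri and the base IRI http://example.org/Person/ are hypothetical and not part of the original code, and writing the subject as a full IRI in angle brackets is an assumption about how the result might be used, not a confirmed fix for the 400 response.

```python
from urllib.parse import quote, quote_plus

name = "Elizabeth Alexandra Mary Windsor"

# quote_plus targets application/x-www-form-urlencoded data: space becomes "+"
form_encoded = quote_plus(name)

# quote percent-encodes for use inside a URI: space becomes "%20"
# (safe="" also encodes "/" so the value stays a single path segment)
uri_encoded = quote(name, safe="")

# alternative: choose a replacement character for spaces before encoding
underscored = quote(name.replace(" ", "_"), safe="")


def subject_iri(base, primary_value):
    """Hypothetical helper: build a full subject IRI in angle brackets."""
    return "<%s%s>" % (base, quote(primary_value, safe=""))


print(form_encoded)   # Elizabeth+Alexandra+Mary+Windsor
print(uri_encoded)    # Elizabeth%20Alexandra%20Mary%20Windsor
print(underscored)    # Elizabeth_Alexandra_Mary_Windsor
print(subject_iri("http://example.org/Person/", "George of Cambridge"))
```

In insertListOfDicts this would amount to swapping urllib.parse.quote_plus for urllib.parse.quote (or replacing spaces with _ first) when building the subject.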