Converting a CSV to RDF where one column is a set of values


Question

I want to convert a CSV to RDF.

One of the columns of that CSV is, in fact, a set of values joined with a separator character (in my case, the space character).

Here is a sample CSV (with header):

col1,col2,col3
"A","B C D","John"
"M","X Y Z","Jack"

I would like the conversion process to create RDF similar to this:

:A :aProperty :B, :C, :D; :anotherProperty "John".
:M :aProperty :X, :Y, :Z; :anotherProperty "Jack".

I usually use Tarql for CSV conversion. It works well for iterating per row, but it has no feature to sub-iterate "inside" a column value.

SPARQL-Generate may help (with iter:regex and sub-generate, as far as I understand), but I cannot find any example that matches my use case.

PS: maybe RML can help too, but I have no prior knowledge of this technology.

Recommended answer

You can do this with RML and FnO.

First, we need to access each row, which RML can do. RML lets you iterate over each row of the CSV file (ql:CSV) with a LogicalSource. Specifying an iterator (rml:iterator) is not needed, since the default iterator in RML is row-based. This results in the following RDF (Turtle):

<#LogicalSource>
    a rml:LogicalSource;
    rml:source "data.csv";
    rml:referenceFormulation ql:CSV.

The actual triples are generated by a TriplesMap, which uses the LogicalSource to retrieve the data from each CSV row:

<#MyTriplesMap>
    a rr:TriplesMap;
    rml:logicalSource <#LogicalSource>;

    rr:subjectMap [
        rr:template "http://example.org/{col1}";
    ];

    rr:predicateObjectMap [
        rr:predicate ex:aProperty;
        rr:objectMap <#FunctionMap>;
    ];

    rr:predicateObjectMap [
        rr:predicate ex:anotherProperty;
        rr:objectMap [
            rml:reference "col3";
        ];
    ].

The col3 CSV column is used to create the following triple:

<http://example.org/A> <http://example.org/ns#anotherProperty> "John".
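
The subject of that triple comes from expanding the rr:template once per row. As a rough illustration only, the Python sketch below mimics that expansion. It is not how an RML processor is implemented: the helper name expand_template is hypothetical, and urllib.parse.quote is only an approximation of the IRI-safe encoding that R2RML templates call for.

```python
import re
from urllib.parse import quote


def expand_template(template, row):
    """Replace each {column} placeholder with the percent-encoded row value.

    Hypothetical helper for illustration; not part of any RML library.
    """
    def substitute(match):
        # match.group(1) is the column name between the braces
        return quote(row[match.group(1)], safe="")

    return re.sub(r"\{([^{}]+)\}", substitute, template)


# One row of the sample CSV, as a column-name -> value mapping
row = {"col1": "A", "col2": "B C D", "col3": "John"}
print(expand_template("http://example.org/{col1}", row))
```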

However, the string in the CSV column col2 needs to be split first. This can be achieved with FnO (Function Ontology) and an RML processor that supports executing FnO functions. The RML Mapper is one such processor, but others can be used too. The following RDF invokes an FnO function that splits the input string on a space separator, with our LogicalSource as input data:

<#FunctionMap>
    fnml:functionValue [
        rml:logicalSource <#LogicalSource>; # our LogicalSource
        rr:predicateObjectMap [
            rr:predicate fno:executes; 
            rr:objectMap [ 
                rr:constant grel:string_split # function to use
            ];
        ];
        rr:predicateObjectMap [
            rr:predicate grel:valueParameter;
            rr:objectMap [ 
                rml:reference "col2" # input string
            ];
        ];
        rr:predicateObjectMap [
            rr:predicate grel:p_string_sep;
            rr:objectMap [ 
                rr:constant " "; # space separator
            ];
        ];
    ].
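
Conceptually, the function map turns one cell into several object values, and the predicate-object map then emits one triple per value. The Python sketch below shows that fan-out under stated assumptions: the names object_values and triples_for_cell are hypothetical, and dropping empty parts is a choice made here for illustration, not documented behaviour of grel:string_split.

```python
def object_values(cell, separator=" "):
    """Split one CSV cell into the individual values it holds.

    Empty parts (e.g. from doubled separators) are dropped; that is an
    assumption of this sketch, not a statement about grel:string_split.
    """
    return [part for part in cell.split(separator) if part]


def triples_for_cell(subject, predicate, cell):
    """Emit one (subject, predicate, object) tuple per split value."""
    return [(subject, predicate, value) for value in object_values(cell)]


for triple in triples_for_cell(
    "http://example.org/A", "http://example.org/ns#aProperty", "B C D"
):
    print(triple)
```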

The FnO functions supported by the RML Mapper are listed at https://rml.io/docs/rmlmapper/default-functions/ where you can find each function's name and its parameters.

Mapping rules

@base <http://example.org> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#> .
@prefix ex: <http://example.org/ns#> .

<#LogicalSource>
    a rml:LogicalSource;
    rml:source "data.csv";
    rml:referenceFormulation ql:CSV.


<#MyTriplesMap>
    a rr:TriplesMap;
    rml:logicalSource <#LogicalSource>;

    rr:subjectMap [
        rr:template "http://example.org/{col1}";
    ];

    rr:predicateObjectMap [
        rr:predicate ex:aProperty;
        rr:objectMap <#FunctionMap>;
    ];

    rr:predicateObjectMap [
        rr:predicate ex:anotherProperty;
        rr:objectMap [
            rml:reference "col3";
        ];
    ].

<#FunctionMap>
    fnml:functionValue [
        rml:logicalSource <#LogicalSource>;
        rr:predicateObjectMap [
            rr:predicate fno:executes; 
            rr:objectMap [ 
                rr:constant grel:string_split 
            ];
        ];
        rr:predicateObjectMap [
            rr:predicate grel:valueParameter;
            rr:objectMap [ 
                rml:reference "col2" 
            ];
        ];
        rr:predicateObjectMap [
            rr:predicate grel:p_string_sep;
            rr:objectMap [ 
                rr:constant " ";
            ];
        ];
    ].

Output

<http://example.org/A> <http://example.org/ns#aProperty> "B".
<http://example.org/A> <http://example.org/ns#aProperty> "C".
<http://example.org/A> <http://example.org/ns#aProperty> "D".
<http://example.org/A> <http://example.org/ns#anotherProperty> "John".
<http://example.org/M> <http://example.org/ns#aProperty> "X".
<http://example.org/M> <http://example.org/ns#aProperty> "Y".
<http://example.org/M> <http://example.org/ns#aProperty> "Z".
<http://example.org/M> <http://example.org/ns#anotherProperty> "Jack".
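
To cross-check this output without an RML processor, the same transformation can be approximated in plain Python. This is only a sketch of the expected result, not a replacement for the mapping: the function name csv_to_ntriples and the constants are hypothetical, and it assumes the cell values contain no quotes, backslashes, or characters that need IRI escaping.

```python
import csv
import io

# The sample CSV from the question
CSV_TEXT = '''col1,col2,col3
"A","B C D","John"
"M","X Y Z","Jack"
'''

BASE = "http://example.org/"    # prefix used by the subject template
NS = "http://example.org/ns#"   # the ex: namespace of the mapping


def csv_to_ntriples(text):
    """Return one triple line per value, in the style of the output above."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row['col1']}>"
        # col2 holds several space-separated values: one triple each
        for value in row["col2"].split(" "):
            lines.append(f'{subject} <{NS}aProperty> "{value}".')
        # col3 is a single literal value
        lines.append(f'{subject} <{NS}anotherProperty> "{row["col3"]}".')
    return lines


for line in csv_to_ntriples(CSV_TEXT):
    print(line)
```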

Note: I contribute to RML and its technologies.
