Converting CSV to hierarchical XML using XSLT


Question

I need to create an XSLT stylesheet to convert a CSV (comma-separated) file into hierarchical XML.

This is the input file:

<root>
L11,L12,L21,L22,L31,L32
1,A,1,C,1,G
1,A,1,C,2,H
1,A,2,D,1,I
1,A,2,D,2,J
2,B,1,E,1,K
2,B,1,E,2,L
2,B,2,F,1,M
2,B,2,F,2,N
</root>

This is the desired output XML:

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <Level1>
    <L11>1</L11>
    <L12>A</L12>
    <Level2>
      <L21>1</L21>
      <L22>C</L22>
      <Level3>
        <L31>1</L31>
        <L32>G</L32>
      </Level3>
      <Level3>
        <L31>2</L31>
        <L32>H</L32>
      </Level3>
    </Level2>
    <Level2>
      <L21>2</L21>
      <L22>D</L22>
      <Level3>
        <L31>1</L31>
        <L32>I</L32>
      </Level3>
      <Level3>
        <L31>2</L31>
        <L32>J</L32>
      </Level3>
    </Level2>
  </Level1>
  <Level1>
    <L11>2</L11>
    <L12>B</L12>
    <Level2>
      <L21>1</L21>
      <L22>E</L22>
      <Level3>
        <L31>1</L31>
        <L32>K</L32>
      </Level3>
      <Level3>
        <L31>2</L31>
        <L32>L</L32>
      </Level3>
    </Level2>
    <Level2>
      <L21>2</L21>
      <L22>F</L22>
      <Level3>
        <L31>1</L31>
        <L32>M</L32>
      </Level3>
      <Level3>
        <L31>2</L31>
        <L32>N</L32>
      </Level3>
    </Level2>
  </Level1>
</Document>

I've been trying to find examples online, but couldn't find anything similar. I've never done XSLT transformations before, so I'd appreciate it if you could point me in the right direction.

Update 1: I am thinking of a two-step transformation. For example, the first step is to transform the CSV to XML:

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <row><L11>1</L11><L12>A</L12><L21>1</L21><L22>C</L22><L31>1</L31><L32>G</L32></row>
  <row><L11>1</L11><L12>A</L12><L21>1</L21><L22>C</L22><L31>2</L31><L32>H</L32></row>
  <row><L11>1</L11><L12>A</L12><L21>2</L21><L22>D</L22><L31>1</L31><L32>I</L32></row>
  <row><L11>1</L11><L12>A</L12><L21>2</L21><L22>D</L22><L31>2</L31><L32>J</L32></row>
  <row><L11>2</L11><L12>B</L12><L21>1</L21><L22>E</L22><L31>1</L31><L32>K</L32></row>
  <row><L11>2</L11><L12>B</L12><L21>1</L21><L22>E</L22><L31>2</L31><L32>L</L32></row>
  <row><L11>2</L11><L12>B</L12><L21>2</L21><L22>F</L22><L31>1</L31><L32>M</L32></row>
  <row><L11>2</L11><L12>B</L12><L21>2</L21><L22>F</L22><L31>2</L31><L32>N</L32></row>
</Document>

The second step is to transform that XML into the desired format using some sort of grouping. I don't mind having two transformations if there's no other way to achieve that.

Any suggestions?

Update 2: The Microsoft .NET Framework XSLT processor will be used.

If the abstract example is hard to read, you can see a real-life example of the required transformation here: http://servingxml.sourceforge.net/examples/#timesheets-eg

As I understand it, doing this in a single transformation is impossible, so if someone could show me how to transform XML from the Update 1 format into the desired XML format, half of the job would be done and I will accept that answer.

Recommended answer

For clarity, I have modified your input slightly, so that the labels make some sense:

XML

<root>
GroupName,GroupValue,SubGroupName,SubGroupValue,ItemName,ItemValue
1,A,1,C,1,G
1,A,1,C,2,H
1,A,2,D,1,I
1,A,2,D,2,J
2,B,1,E,1,K
2,B,1,E,2,L
2,B,2,F,1,M
2,B,2,F,2,N
</root>

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:key name="k1" match="row" use="cell[1]"/>
<xsl:key name="k2" match="row" use="concat(cell[1], '|', cell[3])"/>

<xsl:template match="/">
    <!-- tokenize csv -->
    <xsl:variable name="rows">
        <xsl:call-template name="tokenize">
            <xsl:with-param name="text" select="root"/>
        </xsl:call-template>
    </xsl:variable>
    <xsl:variable name="data">
        <xsl:for-each select="exsl:node-set($rows)/row[position() > 1]">
            <row>
                <xsl:call-template name="tokenize">
                    <xsl:with-param name="text" select="."/>
                    <xsl:with-param name="delimiter" select="','"/>
                    <xsl:with-param name="name" select="'cell'"/>
                </xsl:call-template>
            </row>
        </xsl:for-each>
    </xsl:variable>
    <!-- output -->
    <document>
        <xsl:for-each select="exsl:node-set($data)/row[count(. | key('k1', cell[1])[1]) = 1]">
            <group>
                <name>
                    <xsl:value-of select="cell[1]"/>
                </name>
                <value>
                    <xsl:value-of select="cell[2]"/>
                </value>
                <xsl:for-each select="key('k1', cell[1])[count(. | key('k2', concat(cell[1], '|', cell[3]))[1]) = 1]">
                    <subgroup>
                        <name>
                            <xsl:value-of select="cell[3]"/>
                        </name>
                        <value>
                            <xsl:value-of select="cell[4]"/>
                        </value>
                        <items>
                            <xsl:for-each select="key('k2', concat(cell[1], '|', cell[3]))">
                                <item>
                                    <name>
                                        <xsl:value-of select="cell[5]"/>
                                    </name>
                                    <value>
                                        <xsl:value-of select="cell[6]"/>
                                    </value>
                                </item>
                            </xsl:for-each>
                        </items>
                    </subgroup>
                </xsl:for-each>
            </group>
        </xsl:for-each>
    </document>
</xsl:template>

<xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:param name="delimiter" select="'&#10;'"/>
    <xsl:param name="name" select="'row'"/>
    <xsl:variable name="token" select="substring-before(concat($text, $delimiter), $delimiter)" />
    <xsl:if test="$token">
        <xsl:element name="{$name}">
            <xsl:value-of select="$token"/>
        </xsl:element>
    </xsl:if>
    <xsl:if test="contains($text, $delimiter)">
        <!-- recursive call -->
        <xsl:call-template name="tokenize">
            <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
            <xsl:with-param name="delimiter" select="$delimiter"/>
            <xsl:with-param name="name" select="$name"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<document>
   <group>
      <name>1</name>
      <value>A</value>
      <subgroup>
         <name>1</name>
         <value>C</value>
         <items>
            <item>
               <name>1</name>
               <value>G</value>
            </item>
            <item>
               <name>2</name>
               <value>H</value>
            </item>
         </items>
      </subgroup>
      <subgroup>
         <name>2</name>
         <value>D</value>
         <items>
            <item>
               <name>1</name>
               <value>I</value>
            </item>
            <item>
               <name>2</name>
               <value>J</value>
            </item>
         </items>
      </subgroup>
   </group>
   <group>
      <name>2</name>
      <value>B</value>
      <subgroup>
         <name>1</name>
         <value>E</value>
         <items>
            <item>
               <name>1</name>
               <value>K</value>
            </item>
            <item>
               <name>2</name>
               <value>L</value>
            </item>
         </items>
      </subgroup>
      <subgroup>
         <name>2</name>
         <value>F</value>
         <items>
            <item>
               <name>1</name>
               <value>M</value>
            </item>
            <item>
               <name>2</name>
               <value>N</value>
            </item>
         </items>
      </subgroup>
   </group>
</document>
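
The same two-key grouping can also be applied to the second step on its own, that is, going from the intermediate row format of Update 1 to the Level1/Level2/Level3 output requested in the question. The following is only a sketch under stated assumptions: it presumes every row element contains well-formed L11, L12, L21, L22, L31 and L32 children, and the key names by-l1 and by-l2 are made up for illustration. It uses generate-id() comparison, which is an equivalent way of writing the "first row of each group" test used in the stylesheet above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <!-- Index rows by level-1 id, and by the level-1 + level-2 id pair -->
  <xsl:key name="by-l1" match="row" use="L11"/>
  <xsl:key name="by-l2" match="row" use="concat(L11, '|', L21)"/>

  <xsl:template match="/Document">
    <Document>
      <!-- One Level1 per distinct L11: keep only the first row of each group -->
      <xsl:for-each select="row[generate-id() = generate-id(key('by-l1', L11)[1])]">
        <Level1>
          <xsl:copy-of select="L11 | L12"/>
          <!-- One Level2 per distinct (L11, L21) pair inside this Level1 -->
          <xsl:for-each select="key('by-l1', L11)[generate-id() = generate-id(key('by-l2', concat(L11, '|', L21))[1])]">
            <Level2>
              <xsl:copy-of select="L21 | L22"/>
              <!-- One Level3 per row in this (L11, L21) group -->
              <xsl:for-each select="key('by-l2', concat(L11, '|', L21))">
                <Level3>
                  <xsl:copy-of select="L31 | L32"/>
                </Level3>
              </xsl:for-each>
            </Level2>
          </xsl:for-each>
        </Level1>
      </xsl:for-each>
    </Document>
  </xsl:template>
</xsl:stylesheet>
```

Because this variant reads real elements from the source document rather than a result tree fragment, it needs no node-set() extension function. The '|' separator in the composite key only has to be a character that cannot occur in the id values.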

Note:

  1. The element names are hard-coded into the stylesheet and not taken from the input (although that too would be possible with more effort);

  2. You may have to use the msxsl:node-set() function instead of the EXSLT one.
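
If it is unclear which of the two functions a given processor provides, one possible approach is to declare both namespaces and pick at run time with function-available(). The sketch below is a stand-alone illustration rather than a drop-in patch: the literal row elements and the out wrapper are placeholders, and only the namespace URIs and the function names come from EXSLT and the Microsoft extension namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="exsl msxsl">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- A result tree fragment; in XSLT 1.0 it must be converted
         to a node-set before it can be navigated with XPath -->
    <xsl:variable name="rows">
      <row>a</row>
      <row>b</row>
    </xsl:variable>
    <out>
      <xsl:choose>
        <!-- Microsoft processors -->
        <xsl:when test="function-available('msxsl:node-set')">
          <xsl:copy-of select="msxsl:node-set($rows)/row"/>
        </xsl:when>
        <!-- Processors that implement the EXSLT common module -->
        <xsl:otherwise>
          <xsl:copy-of select="exsl:node-set($rows)/row"/>
        </xsl:otherwise>
      </xsl:choose>
    </out>
  </xsl:template>
</xsl:stylesheet>
```

In the stylesheet above, the equivalent change would be to add the msxsl namespace declaration and replace each exsl:node-set(...) call with msxsl:node-set(...), or to wrap each call in the same xsl:choose test.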
