XSL - create well-formed XML from a text file


Question

I have a pipe-delimited text file, shown below, that I need to transform into a well-formed XML structure (example below) using XSL. The XSL below is my latest attempt, but I cannot find a way to nest the level 002 elements inside the preceding level 001 element, i.e. to maintain the parent-child relationship while iterating through the file line by line. Could anyone help?

Pipe-delimited file - input

001|XXX|YYY
002|AAA|BBB
002|CCC|DD
001|EEF|XXX
002|HHH|GGG

XML file - desired output

<root>
   <level001>
      <elem name="field1">001</elem>
      <elem name="field2">XXX</elem>
      <elem name="field3">YYY</elem>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">AAA</elem>
         <elem name="field3">BBB</elem>
      </level002>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">CCC</elem>
         <elem name="field3">DD</elem>
      </level002>
   </level001>
   <level001>
      <elem name="field1">001</elem>
      <elem name="field2">EEF</elem>
      <elem name="field3">XXX</elem>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">HHH</elem>
         <elem name="field3">GGG</elem>
      </level002>
   </level001>
</root>

Current XSL

<xsl:variable name="Cols">
<col>field1,1</col>
<col>field2,2</col>
<col>field3,3</col> 
</xsl:variable>


 <xsl:template match="/" name="main">
<xsl:choose>
    <xsl:when test="unparsed-text-available($pathToCSV, $encoding)">
       <xsl:variable name="csv" select="unparsed-text($pathToCSV, $encoding)" />
       <xsl:variable name="lines" select="tokenize($csv, '\n')" as="xs:string+" />
       <root>
       <xsl:for-each select="$lines[position() &gt; 0]">
        <xsl:if test="translate(., '&#160; &#9;&#10;&#13;',  '') != ''">
            <level001>
            <xsl:variable name="line" select="." />
            <xsl:variable name="columns" select="tokenize(.,'\|')" as="xs:string+"/>    
            <xsl:choose>
                <xsl:when test="$columns[1]='001'">
                    <xsl:for-each select="$Cols/col">
                        <xsl:variable name="column" select="number(substring-after(.,','))"/>
                        <elem name="{substring-before(.,',')}">
                            <!-- trims the whitespace from the beginning and the ending of the value -->
                            <xsl:value-of select="replace(replace($columns[$column],'\s+$',''),'^\s+','')"/>
                        </elem>
                    </xsl:for-each>
                </xsl:when>
                <xsl:when test="$columns[1]='002'">
                    <level002>
                    <xsl:for-each select="$Cols/col">
                        <xsl:variable name="column" select="number(substring-after(.,','))"/>
                        <elem name="{substring-before(.,',')}">
                            <!-- trims the whitespace from the beginning and the ending of the value -->
                            <xsl:value-of select="replace(replace($columns[$column],'\s+$',''),'^\s+','')"/>
                        </elem>
                    </xsl:for-each>
                    </level002>
                </xsl:when>
            </xsl:choose>                               
            </level001>
        </xsl:if>
       </xsl:for-each>
       </root>
    </xsl:when>         
</xsl:choose>
 </xsl:template>

Recommended answer

I would first transform the flat text into a flat XML structure and then group that with xsl:for-each-group and its group-starting-with attribute, as in the following code sample:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="mf xs"
  version="2.0">

<xsl:param name="text-url" as="xs:string" select="'test2012090401.txt'"/>
<xsl:param name="sep" as="xs:string" select="'\|'"/>
<xsl:param name="field" as="xs:string" select="'field'"/>

<xsl:output indent="yes"/>

<xsl:function name="mf:group" as="node()*">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$nodes" group-starting-with="line[xs:integer(elem[1]) eq $level]">
    <xsl:element name="level{*[1]}">
      <xsl:copy-of select="*"/>
      <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
    </xsl:element>
  </xsl:for-each-group>
</xsl:function>

<xsl:template name="main">
  <xsl:variable name="flat">
    <xsl:for-each select="tokenize(unparsed-text($text-url), '\r?\n')">
      <line>
        <xsl:for-each select="tokenize(., $sep)">
          <elem name="{$field}{position()}">
            <xsl:value-of select="."/>
          </elem>
        </xsl:for-each>
      </line>
    </xsl:for-each>
  </xsl:variable>
  <root>
    <xsl:sequence select="mf:group($flat/line, 1)"/>
  </root>
</xsl:template>

</xsl:stylesheet>

When I apply that stylesheet with Saxon 9 using java -jar saxon9he.jar -it:main -xsl:sheet.xsl, the result I get is:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <level001>
      <elem name="field1">001</elem>
      <elem name="field2">XXX</elem>
      <elem name="field3">YYY</elem>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">AAA</elem>
         <elem name="field3">BBB</elem>
      </level002>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">CCC</elem>
         <elem name="field3">DD</elem>
      </level002>
   </level001>
   <level001>
      <elem name="field1">001</elem>
      <elem name="field2">EEF</elem>
      <elem name="field3">XXX</elem>
      <level002>
         <elem name="field1">002</elem>
         <elem name="field2">HHH</elem>
         <elem name="field3">GGG</elem>
         <level/>
      </level002>
   </level001>
</root>

The stylesheet has a parameter named text-url that points to the plain text file. You can set it when running the stylesheet, for example by appending text-url=input.txt to the Saxon command line (input.txt being a placeholder file name).
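The stray `<level/>` inside the last `level002` of the result above appears to come from an empty final line in the input, for example a trailing newline. `tokenize` then yields an empty last token, which becomes an empty `line` element and is grouped one level deeper. The following is a minimal sketch of one way to avoid that, not part of the original answer. It assumes a hypothetical `text` parameter holding inline sample data in place of `unparsed-text($text-url)`, and a wrapper element named `flat`. The only real change is the `[normalize-space()]` predicate on the tokenized lines.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">

<!-- Hypothetical inline sample standing in for unparsed-text($text-url).
     The character reference after the last record is a trailing newline. -->
<xsl:param name="text" as="xs:string"
  select="'001|XXX|YYY&#10;002|AAA|BBB&#10;'"/>
<xsl:param name="sep" as="xs:string" select="'\|'"/>
<xsl:param name="field" as="xs:string" select="'field'"/>

<xsl:output indent="yes"/>

<xsl:template name="main">
  <flat>
    <!-- The predicate keeps only tokens containing non-whitespace text,
         so the empty token after the final newline creates no line element. -->
    <xsl:for-each select="tokenize($text, '\r?\n')[normalize-space()]">
      <line>
        <xsl:for-each select="tokenize(., $sep)">
          <elem name="{$field}{position()}">
            <xsl:value-of select="."/>
          </elem>
        </xsl:for-each>
      </line>
    </xsl:for-each>
  </flat>
</xsl:template>

</xsl:stylesheet>
```

Run with -it:main as before, this sketch should emit two `line` elements and no empty one. Applying the same predicate to the `tokenize(unparsed-text($text-url), '\r?\n')` expression in the answer's stylesheet would be the corresponding change there.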
