如何使用此XSLT文件解析Excel Excel导出文件? [英] How can I parse this Excel XML export file with this XSLT file?

查看:157
本文介绍了如何使用此XSLT文件解析Excel Excel导出文件?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



测试XML:

 <?xml version =1.0encoding =UTF-8?> 
<?xml-stylesheet href =newrows.xsltype =text / xsl?>
<工作簿>
<工作表>
<表>
< Row>
< Cell>< / Cell>
< Cell>(info ...)< / Cell>
< Cell>< / Cell>
< / Row>
< Row>
<单元>名字< / Cell>
<单元格>姓< / Cell>
< Cell> age< / Cell>
< / Row>
< Row>
< Cell> Jim< / Cell>
<单元> Smith< / Cell>
<单元> 34< /单元>
< / Row>
< Row>
< Cell> Roy< / Cell>
< Cell> Rogers< / Cell>
< Cell> 22< / Cell>
< / Row>
< Row>
< Cell>(info ...)< / Cell>
< Cell>< / Cell>
< Cell>(info ...)< / Cell>
< / Row>

< Row>
< Cell> Sally< / Cell>
<单元>云< /单元>
< Cell> 26< / Cell>
< / Row>

< Row>
< Cell> John< / Cell>
<单元> Randall< / Cell>
< Cell> 44< / Cell>
< / Row>

< / Table>
< / Worksheet>
< / Workbook>

XSL:

 code><?xml version =1.0encoding =UTF-8?> 
< xsl:stylesheet xmlns:xsl =http://www.w3.org/1999/XSL/Transformversion =1.0>

< xsl:output method =xmlindent =yes/>

< xsl:param name =range-1-beginselect =1/>
< xsl:param name =range-1-endselect =3/>

< xsl:param name =range-2-beginselect =5/>
< xsl:param name =range-2-endselect =6/>

< xsl:template match =Table>
< test>
< xsl:for-each select =Row>
< xsl:if test =(position()& gt; = $ range-1-begin and position()& lt; = $ range-1-end)
or ()& gt; = $ range-2-begin和position()& lt; = $ range-2-end)>
< Row>
< xsl:for-each select =Cell>
< xsl:if test =position()= 1或position()= 3>
<单元格>
< xsl:value-of select =。/>
< / Cell>
< / xsl:if>
< / xsl:for-each>
< / Row>
< / xsl:if>
< / xsl:for-each>
< / test>
< / xsl:template>

< / xsl:stylesheet>

但是,当我们尝试解析从Excel导出的类似XML文件 ,它会导出不含XML元素标签的每个字段的内容。我们甚至可以输入 kksljflskdjf 而不是 Table ,并输出每个XML元素的内容。



我必须在XML / XSL文件中更改哪些内容,以便XSL文件正确解析XML?



Excel XML(exceprts):

 <?xml version =1.0?> 
<?xml-stylesheet href =blackbox.xsltype =text / xsl?>
< Workbook
xmlns =urn:schemas-microsoft-com:office:spreadsheetxmlns:o =urn:schemas-microsoft-com:office:officexmlns:x =urn: schemas-microsoft-com:office:excel
xmlns:ss =urn:schemas-microsoft-com:office:spreadsheetxmlns:html =http://www.w3.org/TR/REC- HTML40\" >
< DocumentProperties xmlns =urn:schemas-microsoft-com:office:office>
<作者> MM< / Author>
< LastAuthor> xx< / LastAuthor>
...
<工作表ss:Name =OFFSET Individual>
<名称>
< NamedRange ss:Name =_ FilterDatabasess:RefersTo =='OFFSET Individual'!R3C2:R3C12ss:Hidden =1/>
< NamedRange ss:Name =Print_Areass:RefersTo =='OFFSET Individual'!R4C2:R435C15/>
< NamedRange ss:Name =Musterss:RefersTo =='OFFSET Individual'!C1:C9/>
< NamedRange ss:Name =PAPss:RefersTo =='OFFSET Individual'!C2/>
< / Names>
< Table ss:ExpandedColumnCount =31ss:ExpandedRowCount =443x:FullColumns =1x:FullRows =1ss:StyleID =s90ss:DefaultColumnWidth =59ss :DefaultRowHeight = 15 >
< Column ss:StyleID =s416ss:Hidden =1ss:AutoFitWidth =0ss:Width =61/>
<列ss:StyleID =s91ss:AutoFitWidth =0ss:Width =287/>
<列ss:StyleID =s547ss:AutoFitWidth =0ss:Width =216/>
<列ss:StyleID =s91ss:AutoFitWidth =0ss:Width =87/>
<列ss:StyleID =s92ss:AutoFitWidth =0ss:Width =202/>
<列ss:StyleID =s90ss:AutoFitWidth =0ss:Width =87/>
<列ss:StyleID =s101ss:AutoFitWidth =0ss:Width =284/>
<列ss:StyleID =s132ss:Hidden =1ss:AutoFitWidth =0ss:Width =52/>
<列ss:StyleID =s137ss:Hidden =1ss:AutoFitWidth =0ss:Width =47/>
<列ss:StyleID =s90ss:Hidden =1ss:AutoFitWidth =0ss:Width =42/>
<列ss:StyleID =s90ss:Hidden =1ss:AutoFitWidth =0ss:Width =39/>
<列ss:StyleID =s90ss:Hidden =1ss:AutoFitWidth =0ss:Width =37/>
<列ss:StyleID =s113ss:AutoFitWidth =0ss:Width =47/>
<列ss:StyleID =s87ss:Hidden =1ss:AutoFitWidth =0ss:Width =275/>
<列ss:StyleID =s458ss:AutoFitWidth =0ss:Width =89/>
<列ss:StyleID =s179ss:AutoFitWidth =0ss:Span =1/>
< Column ss:Index =18ss:StyleID =s168ss:Hidden =1ss:AutoFitWidth =0/>
<列ss:StyleID =s90ss:Hidden =1ss:AutoFitWidth =0/>
<列ss:StyleID =s377ss:AutoFitWidth =0ss:Width =202ss:Span =2/>
< Column ss:Index =23ss:StyleID =s377ss:AutoFitWidth =0ss:Width =203/>
< Row ss:AutoFitHeight =0ss:Height =23>
<单元格ss:索引=2ss:样式ID =s142>
<数据ss:Type =String>纸张概述< / Data>
< NamedCell ss:Name =PAP/>
< NamedCell ss:Name =Muster/>
< / Cell>
< / Row>
< Row ss:AutoFitHeight =0>
<单元格ss:索引=2ss:样式ID =s141>
<数据ss:Type =String> Stand:10.03.2011; 13:00 Uhr< / Data>
< NamedCell ss:Name =PAP/>
< NamedCell ss:Name =Muster/>
< / Cell>
< / Row>
...

以下是生成的XML文件的示例: p>



附录



这是现在完整的解决方案,谢谢@Dimitre!

 < xsl:stylesheet version =1.0
xmlns:xsl =http://www.w3.org/1999/XSL/Transform
xmlns =urn:schemas-microsoft-com:office:spreadsheet
xmlns:y =urn:schemas-microsoft-com:office:spreadsheet
xmlns:o =urn:schemas-microsoft -com:office:office
xmlns:x =urn:schemas-microsoft-com:office:excel
xmlns:ss =urn:schemas-microsoft-com:office:spreadsheet
xmlns:html =http://www.w3.org/TR/REC-html40
exclude-result-prefixes =yox ss html
>

< xsl:strip-space elements =*/>
< xsl:output method =xmlindent =yes/>

< xsl:param name =range-1-beginselect =1/>
< xsl:param name =range-1-endselect =3/>

< xsl:param name =range-2-beginselect =5/>
< xsl:param name =range-2-endselect =6/>

< xsl:template match =text()/>

< xsl:template match =y:Table>
< test>
< xsl:for-each select =y:Row>
< xsl:if test =(position()& gt; = $ range-1-begin and position()& lt; = $ range-1-end)
or ()& gt; = $ range-2-begin和position()& lt; = $ range-2-end)>
< Row>
< xsl:for-each select =y:Cell>
< xsl:if test =position()= 1或position()= 3>
<单元格>
< xsl:value-of select =。/>
< / Cell>
< / xsl:if>
< / xsl:for-each>
< / Row>
< / xsl:if>
< / xsl:for-each>
< / test>
< / xsl:template>

< / xsl:stylesheet>


解决方案


更改
XML / XSL文件,以便XSL文件
正确解析XML?


首先总而言之,你的术语是不正确的。 XSLT转换应用于已解析的XML文档。解析(由XML解析器)是可以应用转换的先决条件。



这是XML,XPath和XSLT中最常见的问题



第二个文档无法按名称选择任何元素的原因是因为它中定义了一个默认的名称( xmlns =urn:schemas-microsoft-com:office:spreadsheet)。



在XPath中,任何未经修饰的名称都被认为是没有命名空间。因此,模板匹配< xsl:for-each> 选择 / code>元素将不匹配/选择任何元素,因为在XML文档中没有no namespace中的这样的元素。



最可读的解决方案是在XSLT样式表中定义相同的命名空间,并在任何XPath表达式/匹配模式中使用前缀名。



因此,在修正的XSLT样式表中,您将具有

 < xsl:stylesheet version =1.0 
xmlns:xsl =http://www.w3.org/1999/XSL/Transform
xmlns:y =urn:schemas-microsoft-com:office:spreadsheet
xmlns:o =urn:schemas-microsoft-com:office:office
xmlns:x =urn:schemas-microsoft-com:office:excel
xmlns:ss =urn:schemas -microsoft-com:office:spreadsheet
xmlns:html =http://www.w3.org/TR/REC-html40
exclude-result-prefixes =yox ss html
>
< xsl:output method =xmlindent =yes/>

< xsl:param name =range-1-beginselect =1/>
< xsl:param name =range-1-endselect =3/>
< xsl:param name =range-2-beginselect =5/>
< xsl:param name =range-2-endselect =6/>

< xsl:template match =y:Table>
< test>
< xsl:for-each select =y:Row>
< xsl:if test =(position()& gt; = $ range-1-begin和position()& lt; = $ range-1-end)或(position()& gt; = $ range-2-begin和position()& lt; = $ range-2-end)>
< Row>
< xsl:for-each select =Cell>
< xsl:if test =position()= 1或position()= 3>
<单元格>
< xsl:value-of select =。/>
< / Cell>
< / xsl:if>
< / xsl:for-each>
< / Row>
< / xsl:if>
< / xsl:for-each>
< / test>
< / xsl:template>
< / xsl:stylesheet>


We can parse this test XML file with this XSL file fine:

Test XML:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="newrows.xsl" type="text/xsl"?>
<Workbook>
    <Worksheet>
        <Table>
            <Row>
                <Cell></Cell>
                <Cell>(info...)</Cell>
                <Cell></Cell>
            </Row>
            <Row>
                <Cell>first name</Cell>
                <Cell>last name</Cell>
                <Cell>age</Cell>
            </Row>
            <Row>
                <Cell>Jim</Cell>
                <Cell>Smith</Cell>
                <Cell>34</Cell>
            </Row>
            <Row>
                <Cell>Roy</Cell>
                <Cell>Rogers</Cell>
                <Cell>22</Cell>
            </Row>
            <Row>
                <Cell>(info...)</Cell>
                <Cell></Cell>
                <Cell>(info...)</Cell>
            </Row>

            <Row>
                <Cell>Sally</Cell>
                <Cell>Cloud</Cell>
                <Cell>26</Cell>
            </Row>

            <Row>
                <Cell>John</Cell>
                <Cell>Randall</Cell>
                <Cell>44</Cell>
            </Row>  

        </Table>
    </Worksheet>
</Workbook>

XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  version="1.0">

    <xsl:output method="xml" indent="yes"/>

    <xsl:param name="range-1-begin"  select="1"/>
    <xsl:param name="range-1-end"  select="3"/>

    <xsl:param name="range-2-begin"  select="5"/>
    <xsl:param name="range-2-end"  select="6"/>

    <xsl:template match="Table">
        <test>
            <xsl:for-each select="Row">
                <xsl:if test="(position() &gt;= $range-1-begin and position() &lt;= $range-1-end)
                    or (position() &gt;= $range-2-begin and position() &lt;= $range-2-end)">
                    <Row>
                       <xsl:for-each select="Cell">
                            <xsl:if test="position() = 1 or position() = 3">
                                <Cell>
                                    <xsl:value-of select="."/>
                                </Cell>
                            </xsl:if>
                        </xsl:for-each>
                    </Row>
                </xsl:if>
            </xsl:for-each>
        </test>
    </xsl:template>

</xsl:stylesheet>

However, when we try to parse this similar XML file exported from Excel, it exports the content of every field with no XML element tags. We can even type in kksljflskdjf instead of Table and it outputs the content of every XML element.

What do I have to change in the XML/XSL file so that the XSL file correctly parses the XML?

Excel XML (exceprts):

<?xml version="1.0"?>
<?xml-stylesheet href="blackbox.xsl" type="text/xsl"?>
<Workbook 
xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" 
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
    <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
        <Author>MM</Author>
        <LastAuthor>xx</LastAuthor>
        ...
<Worksheet ss:Name="OFFSET Individual">
        <Names>
            <NamedRange ss:Name="_FilterDatabase" ss:RefersTo="='OFFSET Individual'!R3C2:R3C12" ss:Hidden="1"/>
            <NamedRange ss:Name="Print_Area" ss:RefersTo="='OFFSET Individual'!R4C2:R435C15"/>
            <NamedRange ss:Name="Muster" ss:RefersTo="='OFFSET Individual'!C1:C9"/>
            <NamedRange ss:Name="PAP" ss:RefersTo="='OFFSET Individual'!C2"/>
        </Names>
        <Table ss:ExpandedColumnCount="31" ss:ExpandedRowCount="443" x:FullColumns="1" x:FullRows="1" ss:StyleID="s90" ss:DefaultColumnWidth="59" ss:DefaultRowHeight="15">
            <Column ss:StyleID="s416" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="61"/>
            <Column ss:StyleID="s91" ss:AutoFitWidth="0" ss:Width="287"/>
            <Column ss:StyleID="s547" ss:AutoFitWidth="0" ss:Width="216"/>
            <Column ss:StyleID="s91" ss:AutoFitWidth="0" ss:Width="87"/>
            <Column ss:StyleID="s92" ss:AutoFitWidth="0" ss:Width="202"/>
            <Column ss:StyleID="s90" ss:AutoFitWidth="0" ss:Width="87"/>
            <Column ss:StyleID="s101" ss:AutoFitWidth="0" ss:Width="284"/>
            <Column ss:StyleID="s132" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="52"/>
            <Column ss:StyleID="s137" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="47"/>
            <Column ss:StyleID="s90" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="42"/>
            <Column ss:StyleID="s90" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="39"/>
            <Column ss:StyleID="s90" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="37"/>
            <Column ss:StyleID="s113" ss:AutoFitWidth="0" ss:Width="47"/>
            <Column ss:StyleID="s87" ss:Hidden="1" ss:AutoFitWidth="0" ss:Width="275"/>
            <Column ss:StyleID="s458" ss:AutoFitWidth="0" ss:Width="89"/>
            <Column ss:StyleID="s179" ss:AutoFitWidth="0" ss:Span="1"/>
            <Column ss:Index="18" ss:StyleID="s168" ss:Hidden="1" ss:AutoFitWidth="0"/>
            <Column ss:StyleID="s90" ss:Hidden="1" ss:AutoFitWidth="0"/>
            <Column ss:StyleID="s377" ss:AutoFitWidth="0" ss:Width="202" ss:Span="2"/>
            <Column ss:Index="23" ss:StyleID="s377" ss:AutoFitWidth="0" ss:Width="203"/>
            <Row ss:AutoFitHeight="0" ss:Height="23">
                <Cell ss:Index="2" ss:StyleID="s142">
                    <Data ss:Type="String">Paper Overview</Data>
                    <NamedCell ss:Name="PAP"/>
                    <NamedCell ss:Name="Muster"/>
                </Cell>
            </Row>
            <Row ss:AutoFitHeight="0">
                <Cell ss:Index="2" ss:StyleID="s141">
                    <Data ss:Type="String">Stand: 10.03.2011; 13:00 Uhr</Data>
                    <NamedCell ss:Name="PAP"/>
                    <NamedCell ss:Name="Muster"/>
                </Cell>
            </Row>
                        ...

Here is an example of the resulting "XML" file:

Addendum

This is the full solution which now works, thanks @Dimitre!

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="urn:schemas-microsoft-com:office:spreadsheet" 
    xmlns:y="urn:schemas-microsoft-com:office:spreadsheet" 
    xmlns:o="urn:schemas-microsoft-com:office:office" 
    xmlns:x="urn:schemas-microsoft-com:office:excel" 
    xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" 
    xmlns:html="http://www.w3.org/TR/REC-html40"
  exclude-result-prefixes="y o x ss html"
 >

 <xsl:strip-space elements="*"/>
    <xsl:output method="xml" indent="yes"/>

    <xsl:param name="range-1-begin"  select="1"/>
    <xsl:param name="range-1-end"  select="3"/>

    <xsl:param name="range-2-begin"  select="5"/>
    <xsl:param name="range-2-end"  select="6"/>

    <xsl:template match="text()"/> 

    <xsl:template match="y:Table">
        <test>
            <xsl:for-each select="y:Row">
                <xsl:if test="(position() &gt;= $range-1-begin and position() &lt;= $range-1-end)
                    or (position() &gt;= $range-2-begin and position() &lt;= $range-2-end)">
                    <Row>
                       <xsl:for-each select="y:Cell">
                            <xsl:if test="position() = 1 or position() = 3">
                                <Cell>
                                    <xsl:value-of select="."/>
                                </Cell>
                            </xsl:if>
                        </xsl:for-each>
                    </Row>
                </xsl:if>
            </xsl:for-each>
        </test>
    </xsl:template>

</xsl:stylesheet>

解决方案

What do I have to change in the XML/XSL file so that the XSL file correctly parses the XML?

First of all, your terminology is quite incorrect. An XSLT transformation is applied on an already parsed XML document. The parsing (by an XML parser) is a prerequisit for being able to apply a transformation.

This is the most FAQ on XML, XPath and in XSLT:

The reason for not being able to select any element by name the second document is because there is a default namesace defined in it (xmlns="urn:schemas-microsoft-com:office:spreadsheet").

In XPath any unprefixed name is considered to be in "no namespace". Therefore the template matching Table and the <xsl:for-each> selecting Row elements will not match/select any element, because in the XML document there are no such elements that are in "no namespace".

The most readable solution is to define the same namespaces in the XSLT stylesheet and to use prefixed names in any XPath expression/match-pattern.

Thus, in the corrected XSLT stylesheet you will have:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:y="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40"
  exclude-result-prefixes="y o x ss html"
 >
    <xsl:output method="xml" indent="yes"/>

    <xsl:param name="range-1-begin"  select="1"/>
    <xsl:param name="range-1-end"  select="3"/>
    <xsl:param name="range-2-begin"  select="5"/>
    <xsl:param name="range-2-end"  select="6"/>

    <xsl:template match="y:Table">
        <test>
            <xsl:for-each select="y:Row">
                <xsl:if test="(position() &gt;= $range-1-begin and position() &lt;= $range-1-end)                     or (position() &gt;= $range-2-begin and position() &lt;= $range-2-end)">
                    <Row>
                        <xsl:for-each select="Cell">
                            <xsl:if test="position() = 1 or position() = 3">
                                <Cell>
                                    <xsl:value-of select="."/>
                                </Cell>
                            </xsl:if>
                        </xsl:for-each>
                    </Row>
                </xsl:if>
            </xsl:for-each>
        </test>
    </xsl:template>
</xsl:stylesheet>

这篇关于如何使用此XSLT文件解析Excel Excel导出文件?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆