使用 xslt-1.0 对具有相同属性的值进行分组 [英] Grouping values with same attributes using xslt-1.0

查看:37
本文介绍了使用 xslt-1.0 对具有相同属性的值进行分组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

鉴于此输入 XML:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<agrisResources xmlns:ags="http://purl.org/agmes/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <agrisResource bibliographicLevel="AM" ags:ARN="^aSF17^b00003">
        <dc:subject xml:lang="en">Penaeidae</dc:subject>
        <dc:subject xml:lang="en">Vibrio harveyi</dc:subject>
        <dc:subject xml:lang="en">Vibrio parahaemolyticus</dc:subject>
        <dc:subject>
            <ags:subjectClassification scheme="ags:ASC">ASFA-1</ags:subjectClassification>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases</ags:subjectThesaurus>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Fish diseases</ags:subjectThesaurus>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Genes</ags:subjectThesaurus>
        </dc:subject>
    </agrisResource>
</agrisResources>

我想对具有相同属性的项目进行分组,因此输出如下:

I would like to group items with the same attributes, so the output would be like this:

<dc:subject xml:lang="en">Penaeidae||Vibrio harveyi||Vibrio parahaemolyticus</dc:subject>
<dc:subject>
    <ags:subjectClassification scheme="ags:ASC">ASFA-1</ags:subjectClassification>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases||Fish diseases||Genes</ags:subjectThesaurus>
</dc:subject>

基本上,我的分组规则是,如果节点具有多个值,则组合节点的值,例如 dc:subjectags:subjectThesaurus.我在标题中指定对具有相同属性的值进行分组,因为我不确定是否可以仅按标签对它们进行分组而不指定它们的属性来区分它们.

Basically, my rule for the grouping is to combine the values of the nodes if that node have multiple values, eg dc:subject, and ags:subjectThesaurus. I specify in my title to group values with same attributes because I'm not really sure if it is possible to just group them by their tags without specifying their attributes to differentiate them.

换句话说,区分

<dc:subject>Penaeidae</dc:subject>

来自

<dc:subject>
    <ags:subjectThesaurus>Bacterial diseases</ags:subjectThesaurus>
</dc:subject>

更新

输入 XML

<?xml version="1.0" encoding="ISO-8859-1" ?>
<agrisResources xmlns:ags="http://purl.org/agmes/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <agrisResource bibliographicLevel="AM" ags:ARN="^aSF17^b00003">
        <dc:creator>
            <ags:creatorPersonal>Doe, John</ags:creatorPersonal>
            <ags:creatorPersonal>Smith, Jason T.</ags:creatorPersonal>
            <ags:creatorPersonal>Doe, Jane E.</ags:creatorPersonal>
        </dc:creator>
        <dc:subject xml:lang="en">Penaeidae</dc:subject>
        <dc:subject xml:lang="en">Vibrio harveyi</dc:subject>
        <dc:subject xml:lang="en">Vibrio parahaemolyticus</dc:subject>
        <dc:subject>
            <ags:subjectClassification scheme="ags:ASC">ASFA-1</ags:subjectClassification>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases</ags:subjectThesaurus>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Fish diseases</ags:subjectThesaurus>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Genes</ags:subjectThesaurus>
        </dc:subject>
    </agrisResource>
</agrisResources>

期望输出

分组规则:使用双管道||作为重复元素的分隔符组合值,例如.不符合该规则的其他元素保持原样.

Rules on grouping: Combine the values using double pipe || as separator for repeating elements, eg <ags:creatorPersonal>, <dc:subject xml:lang="en"> and <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">. Leave other elements as is that does not meet that rule.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<agrisResources xmlns:ags="http://purl.org/agmes/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <agrisResource bibliographicLevel="AM" ags:ARN="^aSF17^b00003">
        <dc:creator>
            <ags:creatorPersonal>Doe, John||Smith, Jason T.||Doe, Jane E.</ags:creatorPersonal>
        </dc:creator>
        <dc:subject xml:lang="en">Penaeidae||Vibrio harveyi||Vibrio parahaemolyticus</dc:subject>
        <dc:subject>
            <ags:subjectClassification scheme="ags:ASC">ASFA-1</ags:subjectClassification>
            <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases||Fish diseases||Genes</ags:subjectThesaurus>
        </dc:subject>
    </agrisResource>
</agrisResources>

以下是我基于此答案的代码:

Below is my code based from this answer:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:dc="http://purl.org/dc/terms/"
            xmlns:ags="http://purl.org/agmes/1.1/"
            xmlns:agls="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2"
            xmlns:dcterms="http://purl.org/dc/terms/">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="ags:subjectThesaurus|dc:subject">
        <xsl:copy>
            <xsl:apply-templates select="@* | text()"/>
                <xsl:call-template name="NextSibling"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="ags:subjectThesaurus[@scheme = preceding-sibling::*[1][self::ags:subjectThesaurus]/@scheme]|dc:subject[@xml:lang = preceding-sibling::*[1][self::dc:subject]/@xml:lang]"/>

    <xsl:template match="ags:subjectThesaurus|dc:subject" mode="includeSib">
        <xsl:value-of select="concat('||', .)"/>
            <xsl:call-template name="NextSibling"/>
        </xsl:template>

    <xsl:template name="NextSibling">
        <xsl:apply-templates select="following-sibling::*[1][self::ags:subjectThesaurus and @scheme = current()/@scheme]|following-sibling::*[1][self::dc:subject and @xml:lang = current()/@xml:lang]" mode="includeSib"/>
    </xsl:template>
</xsl:stylesheet>

我唯一的问题是它只转换了 ags:subjectThesaurus 而不是 dc:subject 节点.我的输出如下所示:

My only problem is that it is only transforming the ags:subjectThesaurus but not the dc:subject node. My output looks like this:

<dc:subject xml:lang="en">Penaeidae</dc:subject>
<dc:subject xml:lang="en">Vibrio harveyi</dc:subject>
<dc:subject xml:lang="en">Vibrio parahaemolyticus</dc:subject>
<dc:subject>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases||Fish diseases||Genes</ags:subjectThesaurus>
</dc:subject>

如何修改我的代码,使其也将具有相同 xml:lang 属性的 dc:subject 节点分组?

How can I modify my code such that it will also group the dc:subject node with the same xml:lang attribute?

编辑

基于 michael.hor257k 的建议和来自这个 回答 使用 Muenchian 方法,下面是我尝试的:

Based on the suggestion of michael.hor257k and from this answer to use the Muenchian method, below is what I tried:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:dc="http://purl.org/dc/terms/"
            xmlns:ags="http://purl.org/agmes/1.1/"
            xmlns:agls="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2"
            xmlns:dcterms="http://purl.org/dc/terms/">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:key name="kNodeSubject" match="dc:subject[@xml:lang]" use="@xml:lang"/>
    <xsl:key name="subjectThesaurus" match="dc:subject/ags:subjectThesaurus" use="@scheme"/>
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="dc:subject[generate-id() = generate-id(key('kNodeSubject', @xml:lang)[1])]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="key('kNodeSubject', @xml:lang)" mode="concat"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="dc:subject/ags:subjectThesaurus[generate-id() = generate-id(key('subjectThesaurus', @scheme)[1])]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="key('subjectThesaurus', @scheme)" mode="concat"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="dc:subject|subjectThesaurus" mode="concat">
        <xsl:value-of select="."/>
            <xsl:if test="position() != last()">
                <xsl:text>||</xsl:text>
            </xsl:if>
    </xsl:template>

    <xsl:template match="dc:subject"/>
    <xsl:template match="ags:subjectThesaurus"/>
</xsl:stylesheet>

当我应用上面的代码时,节点 ags:subjectThesaurus 消失了,<dc:subject xml:lang="en"> 的值是也不分组.我不知道我的匹配是否正确,我使用了 match="dc:subject[@xml:lang]" 作为 <xsl:key name="kNodeSubject" 因为节点 ags:subjectThesaurus 的子节点.

When I applied the code above, the nodes ags:subjectThesaurus are gone and the values of <dc:subject xml:lang="en"> are not grouped either. I don't know if I have the match right, I used the match="dc:subject[@xml:lang]" for the <xsl:key name="kNodeSubject" because the node ags:subjectThesaurus is the child of <dc:subject>.

提前致谢.

推荐答案

考虑以下示例:

XML

<root xmlns:dc="http://purl.org/dc/terms/" xmlns:ags="http://purl.org/agmes/1.1/">
  <dc:subject xml:lang="en">Penaeidae</dc:subject>
  <dc:subject xml:lang="en">Vibrio harveyi</dc:subject>
  <dc:subject xml:lang="fr">Franca premier</dc:subject>
  <dc:subject xml:lang="fr">Franca deux</dc:subject>
  <dc:subject xml:lang="en">Vibrio parahaemolyticus</dc:subject>
  <dc:subject>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases</ags:subjectThesaurus>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Fish diseases</ags:subjectThesaurus>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Genes</ags:subjectThesaurus>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:B">Bees</ags:subjectThesaurus>
    <ags:subjectThesaurus xml:lang="en" scheme="ags:B">Birds</ags:subjectThesaurus>
  </dc:subject>
</root>

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:ags="http://purl.org/agmes/1.1/">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="subj-by-lang" match="dc:subject[@xml:lang]" use="@xml:lang"/>
<xsl:key name="thes-by-scheme" match="ags:subjectThesaurus" use="@scheme"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="root">
    <xsl:copy>
        <!-- group subjects by lang -->
        <xsl:for-each select="dc:subject[@xml:lang][count(. | key('subj-by-lang', @xml:lang)[1]) = 1]">
             <dc:subject xml:lang="{@xml:lang}">
                <xsl:for-each select="key('subj-by-lang', @xml:lang)">
                    <xsl:value-of select="."/>
                    <xsl:if test="position() != last()">
                        <xsl:text>||</xsl:text>
                    </xsl:if>
                </xsl:for-each>
             </dc:subject>  
        </xsl:for-each>
        <!-- process other nodes -->
        <xsl:apply-templates select="node()[not(self::dc:subject[@xml:lang])]"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="dc:subject">
    <xsl:copy>
        <!-- group thesauri by scheme    -->
        <xsl:for-each select="ags:subjectThesaurus[count(. | key('thes-by-scheme', @scheme)[1]) = 1]">
             <dc:subjectThesaurus xml:lang="{@xml:lang}" scheme="{@scheme}">
                <xsl:for-each select="key('thes-by-scheme', @scheme)">
                    <xsl:value-of select="."/>
                    <xsl:if test="position() != last()">
                        <xsl:text>||</xsl:text>
                    </xsl:if>
                </xsl:for-each>
             </dc:subjectThesaurus> 
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

结果

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:dc="http://purl.org/dc/terms/" xmlns:ags="http://purl.org/agmes/1.1/">
  <dc:subject xml:lang="en">Penaeidae||Vibrio harveyi||Vibrio parahaemolyticus</dc:subject>
  <dc:subject xml:lang="fr">Franca premier||Franca deux</dc:subject>
  <dc:subject>
    <dc:subjectThesaurus xml:lang="en" scheme="ags:ASFAT">Bacterial diseases||Fish diseases||Genes</dc:subjectThesaurus>
    <dc:subjectThesaurus xml:lang="en" scheme="ags:B">Bees||Birds</dc:subjectThesaurus>
  </dc:subject>
</root>

<小时>

添加:

根据您的说明,我怀疑您想做更简单的事情:只需将一些叶节点(即没有子元素的节点)连接在一起,而其他保持原样.


Added:

Based on your clarifications, I suspect you want to do something much simpler: just join together some leaf nodes (i.e. nodes with no child elements) and leave the others as is.

这是一个在 agrisResource 中加入 dc:subject 叶节点的示例:

Here's an example joining the dc:subject leaf nodes within agrisResource:

<xsl:template match="agrisResource">
    <xsl:copy>
        <!-- join subjects with no children -->
        <dc:subject>
            <!-- copy the attributes of the first subject with no children -->
            <xsl:copy-of select="dc:subject[not(*)][1]/@*"/>
            <!-- concat the values of all subjects with any attributes -->
            <xsl:for-each select="dc:subject[not(*)]">
                <xsl:value-of select="."/>
                <xsl:if test="position() != last()">
                    <xsl:text>||</xsl:text>
                </xsl:if>
            </xsl:for-each>
         </dc:subject>  
        <!-- process other nodes -->
        <xsl:apply-templates select="node()[not(self::dc:subject[not(*)])]"/>
    </xsl:copy>
</xsl:template>

这可以通过使用基于元素名称的键来概括.

This could be generalized by using a key based on an element's name.

这篇关于使用 xslt-1.0 对具有相同属性的值进行分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆