How to handle multiple XPath expressions at once (based on feed structure) or create my own feeds with the same structure


Problem description

The code below is tested and working; it prints the contents of a feed that has this structure.

<rss>
    <channel>
        <item>
            <pubDate/>
            <title/>
            <description/>
            <link/>
            <author/>
        </item>
    </channel>
</rss>

What I have not managed to do is print feeds that follow the structure below (the difference is <feed><entry><published>), even though I changed the XPath to /feed//entry. You can see the structure in the page source.

<feed>
    <entry>
        <published/>
        <title/>
        <description/>
        <link/>
        <author/>
    </entry>
</feed>

I should add that the code sorts all item elements by their pubDate. For the second feed structure, I guess it should sort all entry elements by their published.

I am probably making a mistake in the XPath that I can't find. However, if I do manage to print that feed correctly, how can I modify the code to handle different structures all at once?

Is there any service that allows me to create and host my own feeds based on those feeds, so that they all have the same structure? I hope I made myself clear. Thank you.

<?php

$feeds = array();

// Get all feed entries
$entries = array();
foreach ($feeds as $feed) {
    $xml = simplexml_load_file($feed);
    $entries = array_merge($entries, $xml->xpath(''));
}

?>

Recommended answer

The main contribution of this answer is a solution (at the end) that can be used with any number of formats, just by specifying all alternative "entry" element names in the external (global) parameter $postElements and all alternative "published-date" element names in the external (global) parameter $pub-dateElements.

Besides this, here is how to specify an XPath expression that selects all /rss//item and all /feed//entry elements.

In the simple case of just two possible document formats, this XPath expression (as proposed by @Josh Davis) works correctly:

/rss//item  |   /feed//entry
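As a rough illustration of how the union expression might be used from PHP, the sketch below merges and sorts entries from both formats with SimpleXML. The in-memory sample feeds and the entryTimestamp() helper are assumptions made for this example; they are not part of the original question or answer.

```php
<?php
// Two tiny in-memory feeds standing in for real feed URLs (hypothetical sample data).
$feedSources = [
    '<rss><channel>'
    . '<item><pubDate>2011-06-05</pubDate><title>Title1</title></item>'
    . '<item><pubDate>2011-06-07</pubDate><title>Title3</title></item>'
    . '</channel></rss>',
    '<feed>'
    . '<entry><published>2011-06-06</published><title>Title2</title></entry>'
    . '</feed>',
];

// Hypothetical helper: timestamp of an <item> or <entry>, whichever date child exists.
function entryTimestamp(SimpleXMLElement $entry): int
{
    $date = isset($entry->pubDate) ? (string) $entry->pubDate : (string) $entry->published;
    return (int) strtotime($date);
}

$entries = [];
foreach ($feedSources as $source) {
    $xml = simplexml_load_string($source);
    if ($xml === false) {
        continue; // skip sources that are not well-formed XML
    }
    // The union selects <item> under <rss> or <entry> under <feed>.
    $found = $xml->xpath('/rss//item | /feed//entry');
    if (is_array($found)) {
        $entries = array_merge($entries, $found);
    }
}

// Newest first.
usort($entries, fn($a, $b) => entryTimestamp($b) <=> entryTimestamp($a));

$titles = array_map(fn($e) => (string) $e->title, $entries);
echo implode(', ', $titles), PHP_EOL; // prints: Title3, Title2, Title1
```

Note that real Atom feeds normally declare the default namespace http://www.w3.org/2005/Atom, and in XPath 1.0 an unprefixed name such as entry only matches elements in no namespace. That may be why /feed//entry returned nothing in the question; registering a prefix with SimpleXMLElement::registerXPathNamespace and querying with prefixed names is one possible workaround.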

A more general XPath expression allows the wanted elements to be selected from any number of document formats:

/*[contains($topElements, concat('|',name(),'|'))]
    //*[contains($postElements, concat('|',name(),'|'))]

where the variable $topElements should be substituted by a pipe-delimited string of all possible names for a top element, and $postElements should be substituted by a pipe-delimited string of all possible names for an "entry" element. Each string must start and end with a pipe (for example '|feed|rss|'), so that every name is enclosed by delimiters. Because of the // step, the "entry" elements may be at different depths in the different document formats.

In particular, for this concrete case the XPath expression will be:

/*[contains('|feed|rss|', concat('|',name(),'|'))]
    //*[contains('|item|entry|', concat('|',name(),'|'))]
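The substitution described above could be done in PHP by assembling the expression from lists of names. This is only a sketch: buildEntryXPath() is a hypothetical helper, and it assumes the element names contain neither quotes nor pipes.

```php
<?php
// Hypothetical helper: builds the generic selector from lists of candidate names.
// Assumes names contain no quote or pipe characters.
function buildEntryXPath(array $topNames, array $postNames): string
{
    // Leading and trailing pipes make sure every name is fully delimited.
    $top  = '|' . implode('|', $topNames) . '|';
    $post = '|' . implode('|', $postNames) . '|';

    return "/*[contains('$top', concat('|',name(),'|'))]"
         . "//*[contains('$post', concat('|',name(),'|'))]";
}

$xpath = buildEntryXPath(['feed', 'rss'], ['item', 'entry']);

// Minimal sample documents, one per format (illustrative data only).
$rss  = simplexml_load_string('<rss><channel><item><title>A</title></item></channel></rss>');
$atom = simplexml_load_string('<feed><entry><title>B</title></entry></feed>');

$rssCount  = count($rss->xpath($xpath));
$atomCount = count($atom->xpath($xpath));

echo $xpath, PHP_EOL;
echo "rss: $rssCount, feed: $atomCount", PHP_EOL;
```

Supporting another format then only means adding its names to the two arrays, rather than appending another branch to a union expression.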

The rest of this post shows how the complete wanted processing can be done entirely in XSLT, easily and elegantly.

I. A gentle introduction

Such processing is really easy with XSLT:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <myFeed>
   <xsl:apply-templates/>
  </myFeed>
 </xsl:template>

 <xsl:template match="channel|feed">
  <xsl:apply-templates select="*">
   <xsl:sort select="pubDate|published" order="descending"/>
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="item|entry">
  <post>
    <xsl:apply-templates mode="identity"/>
  </post>
 </xsl:template>

 <xsl:template match="pubDate|published" mode="identity">
  <publicationDate>
   <xsl:apply-templates/>
  </publicationDate>
 </xsl:template>

  <xsl:template match="node()|@*" mode="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*" mode="identity"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on this XML document (format 1):

<rss>
    <channel>
        <item>
            <pubDate>2011-06-05</pubDate>
            <title>Title1</title>
            <description>Description1</description>
            <link>Link1</link>
            <author>Author1</author>
        </item>
        <item>
            <pubDate>2011-06-06</pubDate>
            <title>Title2</title>
            <description>Description2</description>
            <link>Link2</link>
            <author>Author2</author>
        </item>
        <item>
            <pubDate>2011-06-07</pubDate>
            <title>Title3</title>
            <description>Description3</description>
            <link>Link3</link>
            <author>Author3</author>
        </item>
    </channel>
</rss>

and when it is applied on this equivalent document (format 2):

<feed>
        <entry>
            <published>2011-06-05</published>
            <title>Title1</title>
            <description>Description1</description>
            <link>Link1</link>
            <author>Author1</author>
        </entry>
        <entry>
            <published>2011-06-06</published>
            <title>Title2</title>
            <description>Description2</description>
            <link>Link2</link>
            <author>Author2</author>
        </entry>
        <entry>
            <published>2011-06-07</published>
            <title>Title3</title>
            <description>Description3</description>
            <link>Link3</link>
            <author>Author3</author>
        </entry>
</feed>

in both cases the same wanted, correct result is produced:

<myFeed>
   <post>
      <publicationDate>2011-06-07</publicationDate>
      <title>Title3</title>
      <description>Description3</description>
      <link>Link3</link>
      <author>Author3</author>
   </post>
   <post>
      <publicationDate>2011-06-06</publicationDate>
      <title>Title2</title>
      <description>Description2</description>
      <link>Link2</link>
      <author>Author2</author>
   </post>
   <post>
      <publicationDate>2011-06-05</publicationDate>
      <title>Title1</title>
      <description>Description1</description>
      <link>Link1</link>
      <author>Author1</author>
   </post>
</myFeed>
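For readers who would rather stay in PHP, a comparable normalization can be approximated without XSLT. The sketch below maps both formats onto one array shape; normalizePost() and normalizeFeed() are hypothetical helpers, and the string comparison assumes ISO-style dates (YYYY-MM-DD) as in the sample documents, which real RSS pubDate values usually are not.

```php
<?php
// Hypothetical helper: maps an <item> or <entry> to one common array shape.
function normalizePost(SimpleXMLElement $node): array
{
    $post = [];
    foreach ($node->children() as $child) {
        $name = $child->getName();
        // Both date element names collapse into a single key.
        $key = in_array($name, ['pubDate', 'published'], true) ? 'publicationDate' : $name;
        $post[$key] = (string) $child;
    }
    return $post;
}

// Hypothetical helper: returns all posts of a feed, newest first.
function normalizeFeed(string $xmlSource): array
{
    $xml = simplexml_load_string($xmlSource);
    if ($xml === false) {
        return []; // not well-formed XML
    }
    $nodes = $xml->xpath('/rss//item | /feed//entry') ?: [];
    $posts = array_map('normalizePost', $nodes);
    // ISO dates compare correctly as plain strings (assumption about the input).
    usort($posts, fn($a, $b) => strcmp($b['publicationDate'] ?? '', $a['publicationDate'] ?? ''));
    return $posts;
}

$rss = '<rss><channel>'
     . '<item><pubDate>2011-06-05</pubDate><title>Title1</title></item>'
     . '<item><pubDate>2011-06-06</pubDate><title>Title2</title></item>'
     . '</channel></rss>';
$atom = '<feed>'
      . '<entry><published>2011-06-05</published><title>Title1</title></entry>'
      . '<entry><published>2011-06-06</published><title>Title2</title></entry>'
      . '</feed>';

$fromRss  = normalizeFeed($rss);
$fromAtom = normalizeFeed($atom);

// Both formats yield the same structure.
var_dump($fromRss === $fromAtom);
```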

II. The full solution

This can be generalized to a parameterized solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:param name="postElements" select=
 "'|entry|item|'"/>
 <xsl:param name="pub-dateElements" select=
  "'|published|pubDate|'"/>

  <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/">
  <myFeed>
   <xsl:apply-templates select=
   "//*[contains($postElements, concat('|',name(),'|'))]">
    <xsl:sort order="descending" select=
     "*[contains($pub-dateElements, concat('|',name(),'|'))]"/>
   </xsl:apply-templates>
  </myFeed>
 </xsl:template>

 <xsl:template match="*">
  <xsl:choose>
   <xsl:when test=
    "contains($postElements, concat('|',name(),'|'))">
    <post>
      <xsl:apply-templates/>
    </post>
   </xsl:when>
   <xsl:when test=
   "contains($pub-dateElements, concat('|',name(),'|'))">
    <publicationDate>
     <xsl:apply-templates/>
    </publicationDate>
   </xsl:when>
   <xsl:otherwise>
    <xsl:call-template name="identity"/>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>

</xsl:stylesheet>

This transformation can be used with any number of formats, just by specifying all alternative "entry" element names in the external (global) parameter $postElements and all alternative "published-date" element names in the external (global) parameter $pub-dateElements.

Anyone can try this transformation to verify that, when applied on the two XML documents above, it again produces the same wanted, correct result.
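Since the question uses PHP, here is one way the two global parameters might be supplied from PHP with XSLTProcessor::setParameter. This is a sketch under several assumptions: PHP's DOM and XSL extensions are installed, transformFeed() is a hypothetical helper, the element names journal, article and date are invented to stand for a third format, and the stylesheet is inlined in condensed form with the identity template recursing in the default mode.

```php
<?php
// Condensed variant of the parameterized stylesheet, inlined to keep the sketch self-contained.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:param name="postElements" select="'|entry|item|'"/>
 <xsl:param name="pub-dateElements" select="'|published|pubDate|'"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/">
  <myFeed>
   <xsl:apply-templates select="//*[contains($postElements, concat('|',name(),'|'))]">
    <xsl:sort order="descending"
     select="*[contains($pub-dateElements, concat('|',name(),'|'))]"/>
   </xsl:apply-templates>
  </myFeed>
 </xsl:template>

 <xsl:template match="*">
  <xsl:choose>
   <xsl:when test="contains($postElements, concat('|',name(),'|'))">
    <post><xsl:apply-templates/></post>
   </xsl:when>
   <xsl:when test="contains($pub-dateElements, concat('|',name(),'|'))">
    <publicationDate><xsl:apply-templates/></publicationDate>
   </xsl:when>
   <xsl:otherwise><xsl:call-template name="identity"/></xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical helper: applies a stylesheet with the given global parameters.
// Returns null when the XSL extension is missing or the transformation fails.
function transformFeed(string $xslSource, string $xmlSource, array $params = []): ?string
{
    if (!class_exists('XSLTProcessor')) {
        return null;
    }
    $style = new DOMDocument();
    $style->loadXML($xslSource);
    $doc = new DOMDocument();
    $doc->loadXML($xmlSource);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($style);
    foreach ($params as $name => $value) {
        $proc->setParameter('', $name, $value); // overrides the xsl:param default
    }
    $result = $proc->transformToXml($doc);
    return is_string($result) ? $result : null;
}

// A made-up third format: <article> entries dated by <date>.
$journal = '<journal>'
         . '<article><date>2011-06-05</date><title>Title1</title></article>'
         . '<article><date>2011-06-07</date><title>Title3</title></article>'
         . '</journal>';

$out = transformFeed($xsl, $journal, [
    'postElements'     => '|entry|item|article|',
    'pub-dateElements' => '|published|pubDate|date|',
]);
echo $out ?? 'XSL extension not available', PHP_EOL;
```

Passing the name lists as parameters keeps the stylesheet itself unchanged when a new feed format has to be supported.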
