Transform complex and variable XML


Problem description

I have a complex XML document that I want to transform into HTML. Some tags need to be replaced with HTML tags.

The XML is this:

<root>
<div>
    <p>
        <em>bol text</em>, some normale text
    </p>
</div>
<list>
    <listitem>
        normal text inside list <em>bold inside list</em>
    </listitem>
    <listitem>
        another text in list...
    </listitem>
</list>
<p>
    A sample paragraph
</p>
</root>

The text inside the elements is variable, which means that the other XML documents I parse can differ completely.

The output I want is this (for this scenario):

<root>
    <div>
        <p>
            <strong>bol text</strong>, some normale text
        </p>
    </div>
    <ul>
        <li>
            normal text inside list <strong>bold inside list</strong>
        </li>
        <li>
            another text in list...
        </li>
    </ul>
    <p>
        A sample paragraph
    </p>
</root>

I wrote a recursive function to visit every node of the XML and replace it with an HTML tag (but it doesn't work):

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->load('section.xml');
echo $doc->saveHTML();

function printHtml(DOMNode $node)
{
    if ($node->hasChildNodes())
    {
        foreach ($node->childNodes as $child)
        {
            printHtml($child);
        }
    }

    if ($node->nodeName == 'em')
    {
        $newNode = $node->ownerDocument->createElement('strong', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }

    if ($node->nodeName == 'listitem')
    {
        $newNode = $node->ownerDocument->createElement('li', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }
}

Can anyone help me?

This is an example of a complete XML document:

<root>
    <div>
        <p>
            <em>bol text</em>, some normale text
        </p>
    </div>
    <list>
        <listitem>
            normal text inside list <em>bold inside list</em>
        </listitem>
        <listitem>
            another text in list...
        </listitem>
    </list>
    <media>
        <info isVisible="false">
            <title>
                <p>Image title <em>in bold</em> not in bold</p>
            </title>
        </info>
        <file isVisible="true">
            <href>
                "path/to/file.jpg"
            </href>
        </file>
    </media>
    <p>
        A sample paragraph
    </p>
</root>

Which has to be transformed into:

<root>
    <div>
        <p>
            <strong>bol text</strong>, some normale text
        </p>
    </div>
    <ul>
        <li>
            normal text inside list <strong>bold inside list</strong>
        </li>
        <li>
            another text in list...
        </li>
    </ul>
    <!-- the media tag can be presented in two modes: with the title visible and with the title hidden -->
    <!-- this is the case when the title is hidden -->
    <img src="path/to/file.jpg" />
    
    <!-- this is the case when the title is visible -->
    <!-- the info tag (inside media tag) has an attribute isVisible="false" which means it doesn't have to be shown. -->
    <!-- if the info tag has isVisible="true", the media tag must be translated into
     <div>
        <img src="path/to/file.jpg" />
        <p>Image title <strong>in bold</strong> not in bold</p>
     </div>
     -->
    <p>
        A sample paragraph
    </p>
</root>

Solution

There's a language specially designed for this task: it's called XSLT, and you can easily express your desired transformation in XSLT and invoke it from your PHP program. There's a learning curve, of course, but it's a much better solution than writing low-level DOM code.

In XSLT you write a set of template rules saying how individual elements should be handled. Many elements in your example are copied through unchanged, so you can start with a default rule that does this:

<xsl:template match="*">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>

The "match" part says what part of the input you are matching; the body of the rule says what output to produce. The xsl:apply-templates does a recursive descent to process the children of the current element.

Some of your elements are simply renamed, for example

<xsl:template match="listitem">
 <li><xsl:apply-templates/></li>
</xsl:template>
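
As a rough sketch, the default rule and the renaming rules can be combined into one small stylesheet. The rules for list and em are my own additions in the same pattern as the listitem rule, and they assume that every em should become strong, as in the first expected output. The xsl:output settings are also an assumption, chosen because the expected output keeps the non-HTML root element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: serialize as XML without a declaration, since the
       expected output still has a <root> wrapper -->
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Default rule: copy any element that no more specific rule matches,
       then process its children -->
  <xsl:template match="*">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- Renaming rules: emit a different element, keep processing children -->
  <xsl:template match="list">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>

  <xsl:template match="listitem">
    <li><xsl:apply-templates/></li>
  </xsl:template>

  <xsl:template match="em">
    <strong><xsl:apply-templates/></strong>
  </xsl:template>

</xsl:stylesheet>
```

A rule that names an element takes priority over the match="*" rule, so the order of the templates does not matter. Text nodes need no rule of their own: XSLT's built-in rule copies them to the output, so nested content such as an em inside a listitem is kept and converted rather than flattened.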

Some of the rules are a little bit more complex, but still easily expressed:

<xsl:template match="media/file[@isVisible='true']">
  <img src="{href}"/>
</xsl:template>
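
The two presentation modes of media described in the question could be handled along these lines. This is a sketch under a few assumptions: the quotation marks and surrounding whitespace inside href in the sample are treated as literal text to be stripped, the title is taken to be the element content of info/title, and em is renamed to strong so that the title renders as in the expected output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Default rule: copy elements through unchanged -->
  <xsl:template match="*">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <xsl:template match="em">
    <strong><xsl:apply-templates/></strong>
  </xsl:template>

  <!-- Title hidden (or no info at all): media collapses to the image only -->
  <xsl:template match="media">
    <xsl:apply-templates select="file[@isVisible='true']"/>
  </xsl:template>

  <!-- Title visible: wrap the image and the title content in a div.
       The predicate gives this rule a higher default priority than the
       plain "media" rule above, so it wins when both match. -->
  <xsl:template match="media[info/@isVisible='true']">
    <div>
      <xsl:apply-templates select="file[@isVisible='true']"/>
      <xsl:apply-templates select="info/title/*"/>
    </div>
  </xsl:template>

  <!-- Assumption: href holds the path wrapped in whitespace and literal
       quotes, so normalize the whitespace and delete the quote characters -->
  <xsl:template match="media/file[@isVisible='true']">
    <img src="{translate(normalize-space(href), '&quot;', '')}"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the media rules select only the children they want, the info element is never copied to the output when its title is hidden, and the media wrapper itself never appears in the result.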

I hope you agree that this declarative rule-based approach is much clearer than your procedural code; it's also much easier for someone else to change the rules in six months' time.
