Transform complex and variable XML
Problem description
I have a complex XML document that I want to transform into HTML. Some of its tags need to be replaced with HTML tags.
The XML is this:
<root>
  <div>
    <p>
      <em>bol text</em>, some normale text
    </p>
  </div>
  <list>
    <listitem>
      normal text inside list <em>bold inside list</em>
    </listitem>
    <listitem>
      another text in list...
    </listitem>
  </list>
  <p>
    A sample paragraph
  </p>
</root>
The text inside the elements is variable, which means that other XML documents I parse can have completely different content.
The output I want is this (for this scenario):
<root>
  <div>
    <p>
      <strong>bol text</strong>, some normale text
    </p>
  </div>
  <ul>
    <li>
      normal text inside list <strong>bold inside list</strong>
    </li>
    <li>
      another text in list...
    </li>
  </ul>
  <p>
    A sample paragraph
  </p>
</root>
I wrote a recursive function that visits every node of the XML and replaces it with the corresponding HTML tag, but it doesn't work:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->load('section.xml');
echo $doc->saveHTML();
function printHtml(DOMNode $node)
{
    if ($node->hasChildNodes())
    {
        foreach ($node->childNodes as $child)
        {
            printHtml($child);
        }
    }
    if ($node->nodeName == 'em')
    {
        $newNode = $node->ownerDocument->createElement('strong', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }
    if ($node->nodeName == 'listitem')
    {
        $newNode = $node->ownerDocument->createElement('li', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }
}
Can anyone help me?
This is an example of a complete XML document:
<root>
  <div>
    <p>
      <em>bol text</em>, some normale text
    </p>
  </div>
  <list>
    <listitem>
      normal text inside list <em>bold inside list</em>
    </listitem>
    <listitem>
      another text in list...
    </listitem>
  </list>
  <media>
    <info isVisible="false">
      <title>
        <p>Image title <em>in bold</em> not in bold</p>
      </title>
    </info>
    <file isVisible="true">
      <href>
        "path/to/file.jpg"
      </href>
    </file>
  </media>
  <p>
    A sample paragraph
  </p>
</root>
Which has to be transformed into:
<root>
  <div>
    <p>
      <strong>bol text</strong>, some normale text
    </p>
  </div>
  <ul>
    <li>
      normal text inside list <strong>bold inside list</strong>
    </li>
    <li>
      another text in list...
    </li>
  </ul>
  <!-- the media tag can be presented in two modes: with the title visible, or with the title hidden -->
  <!-- this is the case when the title is hidden -->
  <img src="path/to/file.jpg" />
  <!-- this is the case when the title is visible -->
  <!-- the info tag (inside the media tag) has the attribute isVisible="false", which means it must not be shown. -->
  <!-- if the info tag has isVisible="true", the media tag must be translated into
  <div>
    <img src="path/to/file.jpg" />
    <p>Image title <strong>in bold</strong> not in bold</p>
  </div>
  -->
  <p>
    A sample paragraph
  </p>
</root>
Solution
There's a language specially designed for this task: it's called XSLT, and you can easily express your desired transformation in XSLT and invoke it from your PHP program. There's a learning curve, of course, but it's a much better solution than writing low-level DOM code.
In XSLT you write a set of template rules saying how individual elements should be handled. Many elements in your example are copied through unchanged, so you can start with a default rule that does this:
<xsl:template match="*">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
The "match" part says what part of the input you are matching; the body of the rule says what output to produce. The xsl:apply-templates does a recursive descent to process the children of the current element.
Some of your elements are simply renamed, for example
<xsl:template match="listitem">
  <li><xsl:apply-templates/></li>
</xsl:template>
Some of the rules are a little bit more complex, but still easily expressed:
<xsl:template match="media/file[@isVisible='true']">
  <img src="{href}"/>
</xsl:template>
I hope you agree that this declarative rule-based approach is much clearer than your procedural code; it's also much easier for someone else to change the rules in six months' time.
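To show how the rules above might be combined and invoked from PHP, here is a minimal sketch using PHP's XSLTProcessor class (which requires the xsl extension). Only the default rule, the listitem rule and the media/file rule come from the answer. The rules for em, list and media, the helper name transformToHtml, and the stripping of the literal quotes around the href text are assumptions made for illustration, so treat this as a starting point rather than a finished stylesheet.

```php
<?php
// Stylesheet kept in a nowdoc so the example is self-contained; in practice it
// would probably live in its own .xsl file.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Default rule (from the answer): copy the element, then process its children -->
  <xsl:template match="*">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- Simple renames; only the listitem rule is from the answer, the rest are assumed -->
  <xsl:template match="em"><strong><xsl:apply-templates/></strong></xsl:template>
  <xsl:template match="list"><ul><xsl:apply-templates/></ul></xsl:template>
  <xsl:template match="listitem"><li><xsl:apply-templates/></li></xsl:template>

  <!-- Assumed rule: hidden title, so only the visible file is emitted -->
  <xsl:template match="media">
    <xsl:apply-templates select="file[@isVisible='true']"/>
  </xsl:template>

  <!-- Assumed rule: visible title, so the image and the title content are wrapped in a div -->
  <xsl:template match="media[info/@isVisible='true']">
    <div>
      <xsl:apply-templates select="file[@isVisible='true']"/>
      <xsl:apply-templates select="info/title/node()"/>
    </div>
  </xsl:template>

  <!-- From the answer, with one assumption: surrounding whitespace and the
       literal double quotes in the href text are stripped -->
  <xsl:template match="media/file[@isVisible='true']">
    <img src="{translate(normalize-space(href), '&quot;', '')}"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical helper: applies an XSLT stylesheet (as a string) to an XML string.
function transformToHtml(string $xml, string $xsl): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    $result = $processor->transformToXml($source);
    if (!is_string($result)) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $result;
}

// Shortened version of the sample document from the question.
$xml = <<<'XML'
<root>
  <list><listitem>normal text inside list <em>bold inside list</em></listitem></list>
  <media>
    <info isVisible="false"><title><p>Image title <em>in bold</em> not in bold</p></title></info>
    <file isVisible="true"><href>"path/to/file.jpg"</href></file>
  </media>
</root>
XML;

echo transformToHtml($xml, $xsl);
```

The more specific pattern media[info/@isVisible='true'] has a higher default priority than the plain media pattern, so the processor picks the div variant automatically when the title is marked visible. To transform the file from the question, the XML string could be read with file_get_contents('section.xml') instead.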