PHP SimpleXML doesn't preserve line breaks in XML attributes

Question

I have to parse externally provided XML that has attributes with line breaks in them. Using SimpleXML, the line breaks seem to be lost. According to another Stack Overflow question, line breaks in attribute values should be valid XML (even though far less than ideal!).

Why are they lost? And how can I preserve them?

Here is a demo script (note that when the line breaks are not in an attribute, they are preserved).

PHP file with embedded XML

$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<Rows>
    <data Title='Data Title' Remarks='First line of the row.
Followed by the second line.
Even a third!' />
    <data Title='Full Title' Remarks='None really'>First line of the row.
Followed by the second line.
Even a third!</data>
</Rows>
XML;

$xml = new SimpleXMLElement( $xml );
print '<pre>'; print_r($xml); print '</pre>';

Output of print_r

SimpleXMLElement Object
(
    [data] => Array
        (
            [0] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [Title] => Data Title
                            [Remarks] => First line of the row. Followed by the second line. Even a third!
                        )

                )

            [1] => First line of the row.
Followed by the second line.
Even a third!
        )

)

Recommended answer

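The character reference for a newline (line feed) is &#10;. An XML parser normalizes literal line breaks inside attribute values to spaces, which is why they disappear, whereas a newline written as a character reference survives. I played with your code until I found something that did the trick. It's not very elegant, I warn you: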

// First, remove any indentation (the demo indents with four spaces):
$xml = str_replace("    ", "", $xml);
$xml = str_replace("\t", "", $xml);

// Next, unify all line endings into Unix LF (CRLF first, then bare CR):
$xml = str_replace("\r\n", "\n", $xml);
$xml = str_replace("\r", "\n", $xml);

// Next, replace every newline with its character reference:
$xml = str_replace("\n", "&#10;", $xml);

// Finally, turn any newline reference sitting between > and < back into a real newline:
$xml = str_replace(">&#10;<", ">\n<", $xml);
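As a quick check, the steps above can be wrapped in a function and run against a small sample. This is only a minimal sketch: the function name encodeNewlinesForSimpleXml and the sample string are made up for illustration, and it assumes four-space indentation and treats CRLF and bare CR as a single newline.

```php
<?php
// Hypothetical helper wrapping the preprocessing steps; the name is an assumption.
function encodeNewlinesForSimpleXml(string $xml): string
{
    // Strip indentation (four-space runs and tabs).
    $xml = str_replace(["    ", "\t"], "", $xml);
    // Normalize CRLF and bare CR to LF.
    $xml = str_replace(["\r\n", "\r"], "\n", $xml);
    // Encode every newline as a character reference...
    $xml = str_replace("\n", "&#10;", $xml);
    // ...then restore the ones that sit directly between two tags.
    return str_replace(">&#10;<", ">\n<", $xml);
}

// Illustrative sample: one attribute value spanning two lines.
$raw = "<Rows>\n    <data Remarks='First line.\nSecond line.' />\n</Rows>";

$plain   = new SimpleXMLElement($raw);
$encoded = new SimpleXMLElement(encodeNewlinesForSimpleXml($raw));

var_dump((string) $plain->data['Remarks']);   // literal newline was normalized to a space
var_dump((string) $encoded->data['Remarks']); // newline survives as a character reference
```

Casting the attribute to a string before inspecting it avoids relying on how print_r renders SimpleXMLElement objects.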

The assumption, based on your example, is that any newline occurring inside a node or attribute is followed by more text on the next line, not by a < that opens a new element.

This would of course fail if the next line began with text wrapped in an inline element.
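One way to avoid depending on what follows each newline is to encode newlines only inside quoted attribute values and leave everything else alone. The sketch below is not from the answer: the function name encodeAttributeNewlines is hypothetical, and the regular expression is a simplification that assumes well-formed start tags and no tag-like text inside comments or CDATA sections.

```php
<?php
// Hypothetical helper (not from the answer): encode newlines only inside
// quoted attribute values, leaving element text and layout untouched.
function encodeAttributeNewlines(string $xml): string
{
    // Match start tags, allowing quoted attribute values that span lines.
    $tagPattern = '/<[A-Za-z_][^<>"\']*(?:(?:"[^"]*"|\'[^\']*\')[^<>"\']*)*>/';

    $result = preg_replace_callback($tagPattern, function (array $tag): string {
        // Within one start tag, rewrite only the quoted attribute values.
        $rewritten = preg_replace_callback(
            '/"[^"]*"|\'[^\']*\'/',
            function (array $value): string {
                return str_replace(["\r\n", "\r", "\n"], "&#10;", $value[0]);
            },
            $tag[0]
        );
        return $rewritten ?? $tag[0]; // fall back to the original tag on a regex error
    }, $xml);

    return $result ?? $xml; // fall back to the original document on a regex error
}

$xml = <<<XML
<Rows>
    <data Title='Data Title' Remarks='First line of the row.
Followed by the second line.
Even a third!' />
    <data Title='Full Title' Remarks='None really'>First line of the row.
Followed by the second line.</data>
</Rows>
XML;

$rows = new SimpleXMLElement(encodeAttributeNewlines($xml));
var_dump((string) $rows->data[0]['Remarks']); // three lines, newlines intact
var_dump((string) $rows->data[1]);            // element text is left as it was
```

Because indentation and the whitespace between tags are never modified, no newline references end up outside the root element, and element text containing nested tags is parsed exactly as before.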
