Reading text in `<![CDATA[...]]>` with SimpleXMLElement
Question
I'm importing an RSS feed with SimpleXMLElement in PHP. I'm having trouble with the title and description. For some reason, the website I get the feed from puts the title and description in <![CDATA[...]]>:
<item>
<title><![CDATA[...title...]]></title>
<link>...url...</link>
<description><![CDATA[...title...]]></description>
<pubDate>...date...</pubDate>
<guid>...link...</guid>
</item>
When I do a var_dump() on the SimpleXMLElement, I get (for this part):
[2]=>
object(SimpleXMLElement)#5 (5) {
["title"]=>
object(SimpleXMLElement)#18 (0) {
}
["link"]=>
string(95) "...link..."
["description"]=>
object(SimpleXMLElement)#19 (0) {
}
["pubDate"]=>
string(31) "...date..."
["guid"]=>
string(48) "...link..."
}
How can I get the value in <![CDATA[...]]> to read the title and description from the feed?
Recommended answer
SimpleXML reads CDATA nodes absolutely fine. The only problem you're having is that print_r, var_dump, and similar functions don't give an accurate representation of SimpleXML objects, because support for those objects in the debugging functions is not fully implemented in PHP. The empty objects in your dump are an artifact of the dump, not a sign that the content is missing.
If you run echo $myNode->description you will see the content of the CDATA section just fine. The reason is that when you ask for a SimpleXMLElement to be converted to a string, it automatically combines all the text and CDATA content for you - but until you do, it remembers the distinction.
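As a minimal sketch of that behaviour, assuming a hypothetical one-item feed shaped like the one in the question (the feed contents and the variable names $feed and $item are made up for illustration):

```php
<?php
// Hypothetical feed modelled on the <item> structure from the question.
$xml = <<<XML
<rss><channel>
  <item>
    <title><![CDATA[Example title]]></title>
    <link>http://example.com/post</link>
    <description><![CDATA[Some <b>HTML</b> description]]></description>
  </item>
</channel></rss>
XML;

$feed = new SimpleXMLElement($xml);
$item = $feed->channel->item[0];

// Converting the element to a string merges its text and CDATA content.
$title = (string) $item->title;
$description = (string) $item->description;

echo $title, "\n";        // prints: Example title
echo $description, "\n";  // prints: Some <b>HTML</b> description
```

The markup inside the CDATA section comes back as literal text, which is usually what you want for an RSS description.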
As a general case, to extract the string content of any element or attribute in SimpleXML, cast to string with (string)$myNode. This also prevents other issues, such as functions complaining about getting an object when they were expecting a string, or failure to serialize when saving to a session.
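The following sketch illustrates the casting advice for both an element and an attribute. The sample XML, the isPermaLink attribute value, and the $row array are assumptions chosen for the example, not part of the original feed:

```php
<?php
// Hypothetical item; the guid attribute is included only to show attribute casting.
$xml = '<item><title><![CDATA[Example title]]></title>'
     . '<guid isPermaLink="false">abc-123</guid></item>';
$item = new SimpleXMLElement($xml);

// Cast elements and attributes to plain strings before storing them.
$row = [
    'title'     => (string) $item->title,
    'guid'      => (string) $item->guid,
    'permalink' => (string) $item->guid['isPermaLink'],
];

// Plain strings serialize cleanly; passing a SimpleXMLElement itself
// to serialize() would throw an exception instead.
$stored   = serialize($row);
$restored = unserialize($stored);

var_dump($restored === $row); // bool(true)
```

Copying values out into an array of strings like this is one way to keep SimpleXML objects from leaking into session data or other code that expects scalars.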
See also my previous answer at https://stackoverflow.com/a/13830559/157957