Reading text in `<![CDATA[...]]>` with SimpleXMLElement

Question

I'm importing an RSS feed with SimpleXMLElement in PHP. I'm having trouble with the title and description. For some reason, the website I get the feed from puts the title and description in <![CDATA[...]]>:

<item>
<title><![CDATA[...title...]]></title>
<link>...url...</link>
<description><![CDATA[...title...]]></description>
<pubDate>...date...</pubDate>
<guid>...link...</guid>
</item>

When I do a var_dump() on the SimpleXMLElement, I get (for this part):

  [2]=>
  object(SimpleXMLElement)#5 (5) {
    ["title"]=>
    object(SimpleXMLElement)#18 (0) {
    }
    ["link"]=>
    string(95) "...link..."
    ["description"]=>
    object(SimpleXMLElement)#19 (0) {
    }
    ["pubDate"]=>
    string(31) "...date..."
    ["guid"]=>
    string(48) "...link..."
  }

How can I get the value in <![CDATA[...]]> to read the title and description from the feed?

Recommended answer

SimpleXML reads CDATA nodes absolutely fine. The only problem you're having is that print_r, var_dump, and similar functions don't give an accurate representation of SimpleXML objects, because they are not implemented fully in PHP.

If you run echo $myNode->description you will see the content of the CDATA section just fine. The reason is that when you ask for a SimpleXMLElement to be converted to a string, it automatically combines all the text and CDATA content for you - but until you do, it remembers the distinction.
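A minimal sketch of this behaviour, assuming a made-up feed fragment shaped like the `<item>` in the question (the titles, URL, and the `$feed` variable are placeholders, not taken from the real feed):

```php
<?php
// Hypothetical feed fragment mirroring the <item> structure from the question.
$xml = <<<XML
<rss><channel>
<item>
<title><![CDATA[Example title]]></title>
<link>http://example.com/post</link>
<description><![CDATA[Some <b>description</b> text]]></description>
</item>
</channel></rss>
XML;

$feed = new SimpleXMLElement($xml);
$myNode = $feed->channel->item[0];

// echo asks for a string, so SimpleXML combines the text and CDATA content.
echo $myNode->title, "\n";
echo $myNode->description, "\n";
```

Running var_dump($myNode) on the same node may still show the title and description as empty objects, which is the misleading output described in the question.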

As a general case, to extract the string content of any element or attribute in SimpleXML, cast to string with (string)$myNode. This also prevents other issues, such as functions complaining about getting an object when they were expecting a string, or failure to serialize when saving to a session.
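As a rough illustration of the cast, the sketch below copies an element and an attribute into plain strings before storing them. The `<item>` content, the `isPermaLink` attribute value, and the `$row` array are assumptions made up for the example:

```php
<?php
// Hypothetical item; the values are placeholders for illustration only.
$item = new SimpleXMLElement(
    '<item><title><![CDATA[Example title]]></title>'
    . '<guid isPermaLink="false">abc-123</guid></item>'
);

// Cast elements and attributes to plain strings before passing them on.
$title = (string) $item->title;
$isPermaLink = (string) $item->guid['isPermaLink'];

// An array of plain strings serializes without trouble, so it is safe
// to keep in a session; the SimpleXMLElement object itself is not.
$row = [
    'title' => $title,
    'guid' => (string) $item->guid,
    'isPermaLink' => $isPermaLink,
];
$stored = serialize($row);
$restored = unserialize($stored);

echo $restored['title'], "\n";
```

Casting at the point where the data leaves the XML layer keeps SimpleXMLElement objects from leaking into code that expects strings.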

See also my previous answer at https://stackoverflow.com/a/13830559/157957
