How can I extract all text from XML data using PHP (i.e. SimpleXMLElement)?

Question

Here is my XML data:

$data = '<title>Report of the <org reg="International Foo and Bar Conference, 5th">Fifth International Foo and Bar Conference</org>, <org>Foobar Hall</org>, London, July 14 to 16, 1908.</title>'; 

I can load it:

$xml = simplexml_load_string( $data ); 
print_r( $xml );

This returns:

SimpleXMLElement Object (
    [org] => Array (
        [0] => Fifth International Foo and Bar Conference
        [1] => Foobar Hall ) )

But when I try to get it back as a string:

$flat = (string) $xml;
print_r( $flat ); 

This is what I see:

Report of the , , London, July 14 to 16, 1908.

But I would rather get this:

Report of the Fifth International Foo and Bar Conference, Foobar Hall, London, July 14 to 16, 1908.

Is there an easy way to do that with PHP, without explicitly recursing through every node? That is to say, is there a way to flatten the XML and extract all the text from it, regardless of tags?

Recommended answer

This is easy to do with DOM. DOM element nodes have a textContent property that returns the node's text content, including the text of all descendant nodes.

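// Parse the XML string into a DOM document.
$document = new DOMDocument();
$document->loadXML($data);
// textContent on the root element concatenates every descendant text node.
var_dump($document->documentElement->textContent);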

Output:

string(99) "Report of the Fifth International Foo and Bar Conference, Foobar Hall, London, July 14 to 16, 1908."

If you do not already have the element node in a variable, XPath is more convenient.

$document = new DOMDocument();
$document->loadXML($data);
$xpath = new DOMXPath($document);
// string() casts the first matched node to its full text value.
var_dump($xpath->evaluate('string(/title)'));
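
The same approach can target a nested element or an attribute. The following is a minimal sketch, assuming the $data string from the question. The variable names $firstOrg, $reg and $flat are illustrative only, and normalize-space() is a standard XPath 1.0 function that was not part of the original answer.

```php
<?php
$data = '<title>Report of the <org reg="International Foo and Bar Conference, 5th">Fifth International Foo and Bar Conference</org>, <org>Foobar Hall</org>, London, July 14 to 16, 1908.</title>';

$document = new DOMDocument();
$document->loadXML($data);
$xpath = new DOMXPath($document);

// string() returns the text value of the first node the expression matches.
$firstOrg = $xpath->evaluate('string(/title/org[1])');

// Attribute values can be read the same way.
$reg = $xpath->evaluate('string(/title/org[1]/@reg)');

// normalize-space() also trims the text and collapses runs of whitespace,
// which may help when the source XML is indented.
$flat = $xpath->evaluate('normalize-space(/title)');

var_dump($firstOrg, $reg, $flat);
```

Because evaluate() returns a plain string for these expressions, no loop over the result is needed.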

It is even possible to convert a SimpleXMLElement into a DOM element node.

$element = new SimpleXMLElement($data);
$node = dom_import_simplexml($element);
var_dump($node->textContent);
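
If the rest of the code already works with SimpleXML, the conversion could be wrapped in a small helper. This is only a sketch: flattenText() is a hypothetical name, not a built-in function, and it assumes the element passed in was loaded successfully.

```php
<?php
$data = '<title>Report of the <org reg="International Foo and Bar Conference, 5th">Fifth International Foo and Bar Conference</org>, <org>Foobar Hall</org>, London, July 14 to 16, 1908.</title>';

// Hypothetical helper: returns all text inside a SimpleXMLElement, ignoring tags.
function flattenText(SimpleXMLElement $element): string
{
    // dom_import_simplexml() gives the DOM node for the same element.
    $node = dom_import_simplexml($element);

    // textContent includes the text of every descendant node.
    return $node->textContent;
}

$xml = simplexml_load_string($data);

$whole = flattenText($xml);          // text of the whole <title>
$first = flattenText($xml->org[0]);  // text of the first <org> only

echo $whole, PHP_EOL;
echo $first, PHP_EOL;
```

The helper works on any element in the tree, so the same call flattens either the whole document or a single branch.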
