SimpleXML get Element Content between Child Elements

Question

I am parsing XML in PHP with SimpleXML and have an XML like this:

<xml>
    <element>
        textpart1
            <subelement>subcontent1</subelement>
        textpart2
            <subelement>subcontent2</subelement>
        textpart3
    </element>
</xml>

When I do $xml->element it naturally gives me the whole element, as in all three textparts.

So if I parse this into an array (with a foreach for the children) I get:

0 => textpart1textpart2textpart3, 1 => subcontent1, 2 => subcontent2

I need a way to parse the <element> node so that each textpart that ends at a subelement, or begins after one, is treated as its own element.

As a result, I am looking for an ordered list that could be expressed as an array like this:

0 => textpart1, 1 => subcontent1, 2 => textpart2, 3 => subcontent2, 4 => textpart3

Is that possible without altering the XML file? Thanks in advance for any hints!

Recommended answer

As others have said, SimpleXML doesn't have any support for accessing individual text nodes as separate entities, so you will need to supplement it with some DOM methods. Thankfully, you can switch between the two at will using dom_import_simplexml and simplexml_import_dom.
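
To make that switching concrete, here is a minimal sketch of a round trip between the two APIs. The sample XML string and the variable names ($sx, $dom, $back) are assumptions made up for illustration; only dom_import_simplexml and simplexml_import_dom come from the answer itself.

```php
<?php
// Hypothetical sample document, loosely modelled on the question's XML.
$sx = simplexml_load_string(
    '<xml><element>text<subelement>sub</subelement></element></xml>'
);

// SimpleXML -> DOM: returns the DOMElement for the same underlying node.
$dom = dom_import_simplexml($sx->element);
echo $dom->nodeName, "\n";           // name of the imported element
echo $dom->childNodes->length, "\n"; // one text node plus one element

// DOM -> SimpleXML: the second child is the <subelement> node.
$back = simplexml_import_dom($dom->firstChild->nextSibling);
echo $back->getName(), "\n";
```

Both functions hand back a view onto the same document rather than a copy, so it should be cheap to drop into DOM only for the part that SimpleXML cannot express.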

The key pieces of DOM functionality you need are:

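  • The DOMElement->childNodes property, for directly accessing all nodes under a particular element as an iterable list
  • The DOMNode->nodeType property, to determine whether a particular child is a text node or an element
  • The DOMNode->nodeValue property, to get the actual text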

Given those, you can write a function which returns an array with a mixture of SimpleXML objects for child elements, and strings for child text nodes, something like this:

function get_child_elements_and_text_nodes($sx_element)
{
    $return = array();

    // Switch to DOM so that text nodes are visible as separate children
    $dom_element = dom_import_simplexml($sx_element);
    foreach ( $dom_element->childNodes as $dom_child )
    {
        switch ( $dom_child->nodeType )
        {
            case XML_TEXT_NODE:
                // Text between child elements: keep it as a plain string
                $return[] = $dom_child->nodeValue;
                break;
            case XML_ELEMENT_NODE:
                // Child element: hand it back as a SimpleXML object
                $return[] = simplexml_import_dom($dom_child);
                break;
        }
    }

    return $return;
}

In your case, you need to recurse down the tree, which makes it a little confusing if you mix DOM and SimpleXML as you go, so you could instead write the recursion entirely in DOM and convert the SimpleXML object before running it:

function recursively_find_text_nodes($dom_element)
{
    $return = array();

    foreach ( $dom_element->childNodes as $dom_child )
    {
        switch ( $dom_child->nodeType )
        {
            case XML_TEXT_NODE:
                // Collect the text exactly as it appears, whitespace included
                $return[] = $dom_child->nodeValue;
                break;
            case XML_ELEMENT_NODE:
                // Descend into the child and append its text in document order
                $return = array_merge($return, recursively_find_text_nodes($dom_child));
                break;
        }
    }

    return $return;
}

$text_nodes = recursively_find_text_nodes(dom_import_simplexml($simplexml->element));
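
One thing to watch for: with indented XML like the sample in the question, each text node carries its surrounding newlines and indentation. The following is a sketch of one possible variation that trims each text node and skips the ones that are only whitespace. The function name find_trimmed_text_nodes is a hypothetical helper, not something from the original answer, and whether trimming is appropriate depends on your data.

```php
<?php
// Hypothetical variation on the recursive function: same depth-first walk,
// but each text node is trimmed and whitespace-only nodes are skipped.
function find_trimmed_text_nodes($dom_element)
{
    $return = array();

    foreach ( $dom_element->childNodes as $dom_child )
    {
        switch ( $dom_child->nodeType )
        {
            case XML_TEXT_NODE:
                $text = trim($dom_child->nodeValue);
                if ( $text !== '' )
                {
                    $return[] = $text;
                }
                break;
            case XML_ELEMENT_NODE:
                // Recurse so nested text keeps its document order
                $return = array_merge($return, find_trimmed_text_nodes($dom_child));
                break;
        }
    }

    return $return;
}

// The XML from the question, inlined so the sketch runs on its own.
$xml_string = <<<XML
<xml>
    <element>
        textpart1
            <subelement>subcontent1</subelement>
        textpart2
            <subelement>subcontent2</subelement>
        textpart3
    </element>
</xml>
XML;

$simplexml = simplexml_load_string($xml_string);
$text_nodes = find_trimmed_text_nodes(dom_import_simplexml($simplexml->element));
print_r($text_nodes);
```

For the question's XML this yields the five strings textpart1, subcontent1, textpart2, subcontent2, textpart3, in that order. Note that this sketch only looks at plain text nodes; CDATA sections have their own node type and would need an extra case.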

The original answer links to a live demo of this last function.
