How to get the first level of DOM elements with DOMDocument?

Question

How can I get the first level of DOM elements with PHP's DOMDocument?

Example code that does not work, taken from the Q&A "How to get nodes in first level using PHP DOMDocument?":

<?php
$str=<<< EOD
<div id="header">
</div>
<div id="content">
    <div id="sidebar">
    </div>
    <div id="info">
    </div>
</div>
<div id="footer">
</div>
EOD;

$doc = new DOMDocument();
$doc->loadHTML($str);
$xpath = new DOMXpath($doc);
$entries = $xpath->query("/");
foreach ($entries as $entry) {
    var_dump($entry->firstChild->nodeValue);
}
?>


Recommended answer

The first level of elements below the root node can be accessed with

$dom->documentElement->childNodes

The childNodes property contains a DOMNodeList, which you can iterate with foreach.

See DOMDocument::documentElement:


This is a convenience attribute that allows direct access to the child node that is the document element of the document.

and DOMNode::childNodes:


A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.

Since childNodes is a property of DOMNode, any class extending DOMNode (which is most of the classes in DOM) has this property. To get the first level of elements below a DOMElement, access that DOMElement's childNodes property.
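
As a small illustration of childNodes being available on every node, here is a minimal sketch. It assumes a made-up XML string loaded with loadXML(), written without whitespace between tags so that the tree contains exactly the elements in the input; the variable names are only for illustration.

```php
<?php
// Hypothetical input: no whitespace between tags, so there are no text nodes.
$dom = new DOMDocument;
$dom->loadXML('<root><a><b/></a><c/></root>');

// First level below the document element (<root>).
$firstLevel = [];
foreach ($dom->documentElement->childNodes as $node) {
    $firstLevel[] = $node->nodeName;
}

// childNodes is inherited from DOMNode, so the same works on any element.
$a = $dom->documentElement->firstChild;
$belowA = [];
foreach ($a->childNodes as $node) {
    $belowA[] = $node->nodeName;
}

echo implode(',', $firstLevel), PHP_EOL; // a,c
echo implode(',', $belowA), PHP_EOL;     // b
```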

Note that if you use DOMDocument::loadHTML() on invalid HTML or partial documents, the HTML parser module will add an HTML skeleton with html and body tags, so in the DOM tree, the HTML in your example will be

<!DOCTYPE html … ">
<html><body><div id="header">
</div>
<div id="content">
    <div id="sidebar">
    </div>
    <div id="info">
    </div>
</div>
<div id="footer">
</div></body></html>

which you have to take into account when traversing or using XPath. Consequently, using

$dom = new DOMDocument;
$dom->loadHTML($str);
foreach ($dom->documentElement->childNodes as $node) {
    echo $node->nodeName; // body
}

will only iterate the <body> DOMElement node. Knowing that libxml will add the skeleton, you will have to iterate over the childNodes of the <body> element to get the div elements from your example code, e.g.

$dom->getElementsByTagName('body')->item(0)->childNodes

However, doing so will also take into account any whitespace nodes, so you either have to make sure to set preserveWhiteSpace to false or query for the right element nodeType if you only want to get DOMElement nodes, e.g.

foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $node) {
    if ($node->nodeType === XML_ELEMENT_NODE) {
        echo $node->nodeName;
    }
}

or using XPath

$dom = new DOMDocument;
$dom->loadHTML($str);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('/html/body/*') as $node) {
    echo $node->nodeName;
}
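
The two approaches above can be put side by side in one runnable sketch. The elementChildren() function is a hypothetical helper, not part of the DOM extension, and the markup is a condensed version of the example from the question. The XPath variant assumes you already hold the parent node and passes it as the context node to DOMXPath::query().

```php
<?php
// Hypothetical helper (not part of ext/dom): returns only the element
// children of $parent, skipping text, comment and other node types.
function elementChildren(DOMNode $parent): array
{
    $elements = [];
    foreach ($parent->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $elements[] = $child;
        }
    }
    return $elements;
}

$html = <<<EOD
<div id="header"></div>
<div id="content"><div id="sidebar"></div><div id="info"></div></div>
<div id="footer"></div>
EOD;

$dom = new DOMDocument;
$dom->loadHTML($html); // the parser wraps the fragment in <html><body>
$body = $dom->getElementsByTagName('body')->item(0);

// Variant 1: filter childNodes by nodeType.
$ids = [];
foreach (elementChildren($body) as $div) {
    $ids[] = $div->getAttribute('id');
}

// Variant 2: XPath relative to a context node; '*' selects element children.
$xpath = new DOMXPath($dom);
$xpathIds = [];
foreach ($xpath->query('*', $body) as $div) {
    $xpathIds[] = $div->getAttribute('id');
}

echo implode(', ', $ids), PHP_EOL;
```

Both loops visit only the three top-level div elements. The nested sidebar and info elements are reached by calling the helper again on the content element.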

Additional information:

  • DOMDocument in php
  • Printing content of a XML file using XML DOM
