How to get the first level of DOM elements with DOMDocument?
Question
How do I get the first level of DOM elements with PHP's DOMDocument?
Example code that does not work, taken from the Q&A "How to get nodes in first level using PHP DOMDocument?":
<?php
$str=<<< EOD
<div id="header">
</div>
<div id="content">
<div id="sidebar">
</div>
<div id="info">
</div>
</div>
<div id="footer">
</div>
EOD;
$doc = new DOMDocument();
$doc->loadHTML($str);
$xpath = new DOMXpath($doc);
$entries = $xpath->query("/");
foreach ($entries as $entry) {
var_dump($entry->firstChild->nodeValue);
}
?>
Recommended answer
The first level of elements below the root node can be accessed with
$dom->documentElement->childNodes
The childNodes property contains a DOMNodeList, which you can iterate with foreach.
See DOMDocument::documentElement:
This is a convenience attribute that allows direct access to the child node that is the document element of the document.
and DOMNode::childNodes:
A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.
Since childNodes is a property of DOMNode, any class extending DOMNode (which is most of the classes in DOM) has this property. To get the first level of elements below a DOMElement, access that DOMElement's childNodes property.
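A minimal sketch of that idea follows. The helper name firstLevelElements is hypothetical (it is not part of the DOM extension), and the small XML string is made up for illustration. It reads childNodes of an arbitrary node and keeps only the DOMElement children, since text and comment nodes are also children.

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns only the
// DOMElement children of a node, skipping text and comment nodes.
function firstLevelElements(DOMNode $parent): array
{
    $elements = [];
    foreach ($parent->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $elements[] = $child;
        }
    }
    return $elements;
}

// Illustrative document: <nested/> is second level and must not be returned.
$dom = new DOMDocument();
$dom->loadXML('<root><a><nested/></a> <b/><!-- note --><c/></root>');

$names = [];
foreach (firstLevelElements($dom->documentElement) as $element) {
    $names[] = $element->nodeName;
}
echo implode(', ', $names), PHP_EOL; // a, b, c
```

The same helper should work on any DOMElement, for example one returned by getElementsByTagName() or an XPath query.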
Note that if you use DOMDocument::loadHTML() on invalid HTML or partial documents, the HTML parser module will add an HTML skeleton with html and body tags, so in the DOM tree, the HTML in your example will be
<!DOCTYPE html … ">
<html><body><div id="header">
</div>
<div id="content">
<div id="sidebar">
</div>
<div id="info">
</div>
</div>
<div id="footer">
</div></body></html>
which you have to take into account when traversing or using XPath. Consequently, using
$dom = new DOMDocument;
$dom->loadHTML($str);
foreach ($dom->documentElement->childNodes as $node) {
echo $node->nodeName; // body
}
will only iterate the <body> DOMElement node. Knowing that libxml will add the skeleton, you will have to iterate over the childNodes of the <body> element to get the div elements from your example code, e.g.
$dom->getElementsByTagName('body')->item(0)->childNodes
However, doing so will also take into account any whitespace nodes, so you either have to make sure to set preserveWhiteSpace to false or check each node's nodeType if you only want to get DOMElement nodes, e.g.
foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $node) {
if ($node->nodeType === XML_ELEMENT_NODE) {
echo $node->nodeName;
}
}
or use XPath:
$dom = new DOMDocument;
$dom->loadHTML($str);
$xpath = new DOMXPath($dom);
// "*" matches element nodes only, so whitespace text nodes are skipped
foreach ($xpath->query('/html/body/*') as $node) {
echo $node->nodeName;
}
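Putting the pieces together, here is a self-contained sketch that uses the HTML from the question. It first shows that the document element is html rather than one of the divs, then collects the id attribute of each first-level element under body. Collecting the ids into an array is an illustrative choice, not part of the original answer.

```php
<?php
$str = <<<EOD
<div id="header">
</div>
<div id="content">
<div id="sidebar">
</div>
<div id="info">
</div>
</div>
<div id="footer">
</div>
EOD;

$dom = new DOMDocument();
$dom->loadHTML($str);

// The parser wrapped the fragment: the document element is <html>, not a <div>.
echo $dom->documentElement->nodeName, PHP_EOL; // html

// "/html/body/*" selects only the element children of <body>.
$xpath = new DOMXPath($dom);
$ids = [];
foreach ($xpath->query('/html/body/*') as $node) {
    $ids[] = $node->getAttribute('id');
}
echo implode(', ', $ids), PHP_EOL; // header, content, footer
```

The nested sidebar and info divs are not listed because the path stops one level below body.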
Additional information:
- DOMDocument in php
- Printing content of a XML file using XML DOM