Extract HTML of a Scraped Page Using PHP's DOM
Question
Is it possible to create HTML output from the contents of an HTML snippet that has been extracted via PHP's DOM tools (e.g. $div = $dom->getElementsByTagName('table')->item(0);) such that the HTML created contains just the elements with specified tag name, and their descendants?
Otherwise, are there perhaps any other ways to easily extract a snippet of HTML from the full HTML of a page? I just want to extract the first table of a page I scraped, and display just that table and its content.
Recommended answer
Yes, you can pass a node to DOMDocument::saveXML():
echo $dom->saveXML($div);
This gives you the outerHTML of the node, that is, the element itself together with all of its descendants.
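For context, here is a minimal end-to-end sketch of the approach. The markup in $html is a made-up stand-in for the scraped page, and the names $snippet and $snippetHtml are only illustrative. It also shows DOMDocument::saveHTML() with a node argument (accepted since PHP 5.3.6), which may be the better fit if the result is going to be displayed as HTML, because saveXML() writes empty elements in self-closing form.

```php
<?php
// Hypothetical input standing in for the scraped page.
$html = '<html><body><p>Intro</p>'
      . '<table id="first"><tr><td>A</td><td>B</td></tr></table>'
      . '<table id="second"><tr><td>C</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();

// Scraped markup is often malformed; collect parser warnings
// internally instead of printing them, then restore the old setting.
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// item(0) returns null when the page contains no <table> at all.
$table = $dom->getElementsByTagName('table')->item(0);

$snippet = '';
$snippetHtml = '';
if ($table !== null) {
    // XML serialization of the node and its descendants, as in the answer.
    $snippet = $dom->saveXML($table);

    // HTML serialization of the same node (node argument needs PHP 5.3.6+).
    $snippetHtml = $dom->saveHTML($table);
}

// Prints only the first table; the paragraph and the second table are left out.
echo $snippet, PHP_EOL;
```

Checking for null before serializing avoids an error on pages that turn out to have no table.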