How do I parse partial HTML?

Published: 2017/6/24 22:01:10 · Tags: php, html, dom, parsing


Question

I'm trying to parse some HTML with DOM in PHP, but I'm having some problems. First, in case this changes the solution, the HTML that I have is not a full page; it's only part of one.

<!-- This is the HTML that I have -->
<a href='/games/'>
<div id='game'>
<img src='http://images.example.com/games.gif' width='300' height='137' border='0'>
<br><b> Game </b>
</div>
<div id='double'>
<img src='http://images.example.com/double.gif' width='300' height='27' border='0' alt='' title=''>
</div>
</a>

Now I'm trying to get only the div with the id double. I've tried the following code, but it doesn't seem to be working properly. What might I be doing wrong?

//The HTML has been loaded into the variable $html
$dom=new domDocument;
$dom->loadHTML($html);
$dom->preserveWhiteSpace = false; 
$keepme = $dom->getElementById('double'); 

$contents = '<div style="text-align:center">'.$keepme.'</a></div>';
echo $contents;

Answer

I think DOMDocument::getElementById will not work in your case (quoting the PHP manual):

For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute or a DTD which defines an attribute to be of type ID.
In the latter case, you will need to validate your document with DOMDocument::validate or DOMDocument->validateOnParse before using this function.

A solution that might work is to use an XPath query to extract the element you are looking for.

First of all, let's load the HTML portion, as you did originally:

$dom=new domDocument;
$dom->loadHTML($html);
var_dump($dom->saveHTML());

The var_dump is here only to prove that the HTML portion has been loaded successfully; judging from its output, it has.

Then, instantiate the DOMXPath class and use it to query for the element you want to get:

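// Build an XPath evaluator bound to the loaded document
$xpath = new DOMXPath($dom);
// Match any element whose id attribute equals 'double'
$result = $xpath->query("//*[@id = 'double']");
// item(0) is the first match, or null if nothing matched
$keepme = $result->item(0);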
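
As a rough sketch of the same lookup with a guard for the case where nothing matches, the query could be wrapped in a small helper. The function name findById and the shortened sample markup are assumptions made for illustration; they are not part of the original answer. The helper also assumes the id contains no single quote.

```php
<?php
// Hypothetical helper (not from the original answer): find an element by its
// id attribute with XPath, returning null when nothing matches.
function findById(DOMDocument $dom, string $id): ?DOMElement
{
    $xpath = new DOMXPath($dom);
    // Simplification: assumes $id contains no single quote
    $result = $xpath->query("//*[@id = '" . $id . "']");
    if ($result === false || $result->length === 0) {
        return null;
    }
    $node = $result->item(0);
    return $node instanceof DOMElement ? $node : null;
}

// Shortened stand-in for the fragment from the question
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' alt=''></div></a>";

$dom = new DOMDocument;
// Collect parser warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$keepme  = findById($dom, 'double');      // the wanted <div>
$missing = findById($dom, 'no-such-id');  // no match, so null
```

Checking for null before using the result avoids calling methods on a missing node when the id is not present in the fragment.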

We now have the element you want ;-)

But, in order to inject its HTML content into another HTML segment, we must first get its HTML content.

I don't remember any "easy" way to do that, but something like this should do the trick:

$tempDom = new DOMDocument();
$tempImported = $tempDom->importNode($keepme, true);
$tempDom->appendChild($tempImported);
$newHtml = $tempDom->saveHTML();
var_dump($newHtml);
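
As a possible shortcut, DOMDocument::saveHTML accepts an optional node argument since PHP 5.3.6, which serializes just that node and its children without a temporary document. The sketch below assumes such a PHP version; the sample markup, including the outer wrapper div, is invented for illustration.

```php
<?php
// Invented sample markup: an outer wrapper around the <div> we want
$html = "<div id='outer'><div id='double'>"
      . "<img src='http://images.example.com/double.gif' alt=''></div></div>";

$dom = new DOMDocument;
// Collect parser warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$keepme = $xpath->query("//*[@id = 'double']")->item(0);

// Serialize only this node and its children; no temporary document needed
$newHtml = $dom->saveHTML($keepme);
var_dump($newHtml);
```

The temporary-document approach remains useful on older PHP versions where saveHTML takes no argument.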

And... we have the HTML content of your double <div>:

string '<div id="double">
<img src="http://images.example.com/double.gif" width="300" height="27" border="0" alt="" title="">
</div>
' (length=125)

Now, you just have to do whatever you want with it ;-)
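
For instance, the centered wrapper from the question could be assembled roughly as follows. This is a sketch under assumptions rather than a definitive version: the markup is a shortened stand-in for the question's fragment, and the stray </a> from the question's $contents line is dropped here because the wrapper never opens an <a>.

```php
<?php
// Shortened stand-in for the fragment from the question
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' alt=''></div></a>";

$dom = new DOMDocument;
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$keepme = $xpath->query("//*[@id = 'double']")->item(0);

$contents = '';
if ($keepme !== null) {
    // Copy the node into a temporary document and serialize that document
    $tempDom = new DOMDocument();
    $tempDom->appendChild($tempDom->importNode($keepme, true));
    // Concatenate the serialized string, not the DOMElement object itself
    $contents = '<div style="text-align:center">' . trim($tempDom->saveHTML()) . '</div>';
}
echo $contents;
```

The key difference from the question's code is that a string produced by saveHTML is concatenated, whereas a DOMElement cannot be converted to a string directly.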
