Parsing HTML to get all Option tags with PHP
I'm parsing an HTML page that contains a:
<select>
<option value="somevalue">Somedata</option>
</select>
And I need to get both somevalue and Somedata out of there.
What's the easiest way to go about this? It should be noted that somevalue and Somedata are always different (so to speak).
The markup looks like this:
<select name="attrib1" class="Input">
<option value="0"> </option>
<option value="140">140</option>
<option value="141">150</option>
<option value="142">160</option>
</select>
Please note, the name is ALWAYS attrib1!
Okay, since I can't see the full HTML, I'm not sure whether it's well formed, so I'll do this using PHP's more forgiving DOM functions. First off, I'm going to use this minimal HTML file as a sample:
test.html
<html>
<body>
<select name="attrib1" class="Input">
<option value="0"> </option>
<option value="140">140</option>
<option value="141">150</option>
<option value="142">160</option>
</select>
</body>
</html>
Now then, the first thing we need to do is create a DOM parser. We'll do this like so:
$doc = new DOMDocument();
$doc->loadHTMLFile("test.html");
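If the page is already in a string, or the markup is sloppy enough to trigger parser warnings, a lenient load might look like the sketch below. The helper name loadLenientHtml and the sample markup are hypothetical and not part of the original answer; libxml_use_internal_errors() and DOMDocument::loadHTML() are standard PHP functions.

```php
<?php
// Hypothetical helper: parse an HTML string leniently and return the document.
function loadLenientHtml(string $html): DOMDocument
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the earlier setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// Sloppy sample markup: no <html>/<body> wrapper and a stray closing </div>.
$html = '<select name="attrib1">'
      . '<option value="140">140</option>'
      . '<option value="141">150</option>'
      . '</select></div>';

$doc = loadLenientHtml($html);

// Number of <option> elements the parser recovered.
echo $doc->getElementsByTagName('option')->length, "\n";
```

Restoring the previous error setting keeps the helper from changing libxml behaviour for the rest of the script.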
Okay, next we'll need to look at the requirements:
I'm parsing an HTML page that contains a:
<select>
<option value="somevalue">Somedata</option>
</select>
And I need to get both somevalue and Somedata out of there.
You also mention:
Please note, the name is ALWAYS attrib1!
Based on these requirements, I'm going to select all option tags that are children of a select with the name "attrib1". To do so, I'm going to use XPath, a very flexible way to select DOM elements based on specific conditions. Let's build the expression step by step:
*
selects every child element of the context node (by default, PHP's DOMXPath runs queries relative to the root element, here <html>)
*/select
selects every select element that is a child of one of those elements
*/select[@name='attrib1']
narrows that down to select elements whose name attribute is attrib1
*/select[@name='attrib1']/option
selects every option element that is a direct child of a select named attrib1
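One caveat with a path that starts with * is that it only reaches a select sitting exactly two levels below the root, as in test.html. On a real page the select is often nested deeper, in which case the descendant form //select[...] may be the safer choice. The sketch below compares the two on hypothetical markup (the form and div wrappers are assumptions for illustration):

```php
<?php
// Hypothetical markup where the <select> is nested deeper than in test.html.
$html = '<html><body><form><div>'
      . '<select name="attrib1"><option value="140">140</option></select>'
      . '</div></form></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Relative path: requires the <select> to be a child of a child of the context node.
$relative = $xpath->query("*/select[@name='attrib1']/option");

// Descendant path: finds the <select> at any depth in the document.
$anywhere = $xpath->query("//select[@name='attrib1']/option");

echo $relative->length, " vs ", $anywhere->length, "\n";
```

Here the relative path finds nothing because the select sits inside form and div, while the descendant path still finds the option.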
Now then, we need to do this lookup, so we use the XPath functions:
$xpath = new DOMXPath($doc);
$options = $xpath->query("*/select[@name='attrib1']/option");
foreach ($options as $option) {
}
Now we need the value attribute, and the text inside. We'll first get the value attribute:
$optionValue = $option->getAttribute('value');
Then we get what's inside the option tag:
$optionContent = $option->nodeValue;
And once we put this all together:
$doc = new DOMDocument();
$doc->loadHTMLFile("test.html");
$xpath = new DOMXPath($doc);
$options = $xpath->query("*/select[@name='attrib1']/option");
foreach ($options as $option) {
    $optionValue = $option->getAttribute('value');
    $optionContent = $option->nodeValue;
    echo "$optionValue and $optionContent\n";
}
We'll get the following output:
0 and
140 and 140
141 and 150
142 and 160
And there you have it.
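If you would rather have the pairs in an array than echoed, the same steps can be wrapped in a function. This is a minimal sketch, not part of the original answer: the function name extractOptions, the 'value'/'label' keys, and the use of the descendant path //select are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: return one ['value' => ..., 'label' => ...] pair per <option>.
function extractOptions(string $html, string $selectName): array
{
    // Suppress parser warnings for imperfect markup while loading.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $selectName contains no quote characters, since it is
    // interpolated directly into the XPath expression.
    $nodes = $xpath->query("//select[@name='$selectName']/option");

    $result = [];
    foreach ($nodes as $option) {
        $result[] = [
            'value' => $option->getAttribute('value'),
            'label' => trim($option->textContent), // drop surrounding whitespace
        ];
    }
    return $result;
}

// The sample markup from the question.
$html = '<select name="attrib1" class="Input">'
      . '<option value="0"> </option>'
      . '<option value="140">140</option>'
      . '<option value="141">150</option>'
      . '<option value="142">160</option>'
      . '</select>';

$pairs = extractOptions($html, 'attrib1');
print_r($pairs);
```

A list of pairs is used instead of a value => label map because PHP converts numeric string keys such as "140" to integers, which can be surprising when the values are later compared as strings.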