How to get all input elements in a form with HtmlAgilityPack without getting a null reference error
Problem description
Example HTML:
<html><body>
<form id="form1">
<input name="foo1" value="bar1" />
<!-- Other elements -->
</form>
<form id="form2">
<input name="foo2" value="bar2" />
<!-- Other elements -->
</form>
</body></html>
Test code:
HtmlDocument doc = new HtmlDocument();
doc.Load(@"D:\test.html");
foreach (HtmlNode node in doc.GetElementbyId("form2").SelectNodes(".//input"))
{
    Console.WriteLine(node.Attributes["value"].Value);
}
The statement doc.GetElementbyId("form2").SelectNodes(".//input") gives me a null reference.
Did I do anything wrong? Thanks.
Solution
You can do the following:
HtmlNode.ElementsFlags.Remove("form");
HtmlDocument doc = new HtmlDocument();
doc.Load(@"D:\test.html");
HtmlNode secondForm = doc.GetElementbyId("form2");
foreach (HtmlNode node in secondForm.Elements("input"))
{
    HtmlAttribute valueAttribute = node.Attributes["value"];
    if (valueAttribute != null)
    {
        Console.WriteLine(valueAttribute.Value);
    }
}
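By default, HTML Agility Pack parses forms as empty nodes because they are allowed to overlap other HTML elements. The form node therefore has no children, SelectNodes(".//input") finds nothing and returns null, and the foreach fails on that null. The first line, HtmlNode.ElementsFlags.Remove("form");, disables this behavior, allowing you to get the input elements inside the second form.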
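The same idea can be sketched as a small null-safe helper. This is only an illustration under a few assumptions: FormInputs and GetInputValues are hypothetical names that are not part of HtmlAgilityPack, the HTML is passed in as a string and parsed with LoadHtml instead of being loaded from a file, and SelectNodes is assumed to keep its default behavior of returning null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FormInputs
{
    // Hypothetical helper (not part of HtmlAgilityPack): collects the "value"
    // attribute of every input nested anywhere inside the form with the given id.
    public static List<string> GetInputValues(string html, string formId)
    {
        // ElementsFlags is static and process-wide, so this must run before
        // the document is parsed. Remove does nothing if the key is absent.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var values = new List<string>();

        HtmlNode form = doc.GetElementbyId(formId);
        if (form == null)
        {
            return values; // no element with that id
        }

        // Assumption: with default options, SelectNodes returns null rather
        // than an empty collection when no node matches the XPath expression.
        HtmlNodeCollection inputs = form.SelectNodes(".//input");
        if (inputs == null)
        {
            return values;
        }

        foreach (HtmlNode input in inputs)
        {
            // Falls back to an empty string when the attribute is missing.
            values.Add(input.GetAttributeValue("value", string.Empty));
        }

        return values;
    }
}
```

Unlike Elements("input"), which only looks at direct children of the form, the XPath .//input also picks up inputs nested deeper, for example inside a div or table within the form. Because ElementsFlags is a static setting, removing the "form" entry affects every document parsed afterwards in the same process.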
Update:
Example of form elements overlap:
<table>
<form>
<!-- Other elements -->
</table>
</form>
The form element begins inside the table but is closed outside it. Strictly speaking, such overlapping markup is not valid HTML, but it is common in real-world pages and browsers tolerate it, so HTML Agility Pack has to deal with it.