HtmlAgilityPack -- Does <form> close itself for some reason?

Question

I just wrote up this test to see if I was crazy...

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using HtmlAgilityPack;

namespace HtmlAgilityPackFormBug
{
    class Program
    {
        static void Main(string[] args)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(@"
<!DOCTYPE html>
<html>
    <head>
        <title>Form Test</title>
    </head>
    <body>
        <form>
            <input type=""text"" />
            <input type=""reset"" />
            <input type=""submit"" />
        </form>
    </body>
</html>
");
            var body = doc.DocumentNode.SelectSingleNode("//body");
            foreach (var node in body.ChildNodes.Where(n => n.NodeType == HtmlNodeType.Element))
                Console.WriteLine(node.XPath);
            Console.ReadLine();
        }
    }
}

Which outputs:

/html[1]/body[1]/form[1]
/html[1]/body[1]/input[1]
/html[1]/body[1]/input[2]
/html[1]/body[1]/input[3]

But, if I change <form> to <xxx> it gives me:

/html[1]/body[1]/xxx[1]

(As it should). So... it looks like those input elements are not contained within the form, but directly within the body, as if the <form> just closed itself off immediately. What's up with that? Is this a bug?

Digging through the source code, I see:

ElementsFlags.Add("form", HtmlElementFlag.CanOverlap | HtmlElementFlag.Empty);

It has the "empty" flag, like META and IMG. Why?? Forms are most definitely not supposed to be empty.

Answer

This is also reported in this workitem. It contains a suggested workaround from DarthObiwan.


You can change this without recompiling. The ElementsFlags list is a static property on the HtmlNode class. The "form" entry can be removed with

    HtmlNode.ElementsFlags.Remove("form");

before loading the document.
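
For context, here is a minimal sketch of how the workaround might fit into the test program from the question. The class name FormFlagWorkaround and the shortened HTML string are made up for illustration; the only library calls used are the ones already shown in this thread. With the "form" flag removed, the inputs should be parsed as children of the form, so only the form element should be listed under the body.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical class name, used only for this illustration.
class FormFlagWorkaround
{
    static void Main()
    {
        // Drop the built-in "form" entry (CanOverlap | Empty) so that <form>
        // is parsed like an ordinary container element.
        // ElementsFlags is static, so this affects every document parsed
        // afterwards in the same process; do it once, before any LoadHtml call.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><form><input type=\"text\" /><input type=\"submit\" /></form></body></html>");

        // Same check as in the question: list the element children of <body>.
        var body = doc.DocumentNode.SelectSingleNode("//body");
        foreach (var node in body.ChildNodes.Where(n => n.NodeType == HtmlNodeType.Element))
            Console.WriteLine(node.XPath);
    }
}
```

Because the flag table is shared state, removing the entry changes parsing for all callers in the process, which is worth keeping in mind if other code relies on the default behaviour.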
