幽灵HtmlAgilityPack [英] Ghosty HtmlAgilityPack

查看:61
本文介绍了幽灵HtmlAgilityPack的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在这里,我真的有令人毛骨悚然的效果. 我尝试替换一个img节点.如果我将文档html打印一次,将不会发生任何事情. 如果我不打印html文档,则可以成功替换img标签. 真的很奇怪,有人可以解释吗?

I have got really ghosty effect here. I try to replace an img node. and if I print out the document html once, nothing will happen. If I don't print out the document html, the img tag can be successfully replaced. It's really strange, can anyone explain?

我的html代码

<!DOCTYPE html>

<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title></title>
</head>
<body>
    <div id="swap"></div>
</body>
</html>

和我的C#代码

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using HtmlAgilityPack;
using System.IO;
namespace htmlagile
{
    class Program
    {
        static void Main(string[] args)
        {
            HtmlDocument htmldoc = new HtmlDocument();
            string htmlstring;
            using (StreamReader sr = new StreamReader("HTMLPage1.html"))
            {
                htmlstring = sr.ReadToEnd();
            }
            htmldoc.LoadHtml(htmlstring);
            var div = htmldoc.DocumentNode.SelectNodes("//div");

            Console.WriteLine(htmldoc.DocumentNode.OuterHtml);

            foreach (var item in div)
            {
                HtmlNode newTag = htmldoc.CreateElement("p");
                newTag.SetAttributeValue("id", "change");
                item.ParentNode.ReplaceChild(newTag, item);
            }
            Console.WriteLine(htmldoc.DocumentNode.OuterHtml);
        }
    }
}

如果我注释掉第一个控制台.WriteLine,则可以成功更改元素.

if I comment out my first console.WriteLine, the element can be successfully changed.

推荐答案

这是敏捷包中的错误.它们缓存OuterHtml和InnerHtml值.发生更改时,它们只会使直接父级无效.因为您正在打印根,所以它仍然具有旧的缓存值.

It's a bug in the agility pack. They cache the OuterHtml and InnerHtml values. When a change happens, they only invalidate the immediate parent. Because you are printing the root, it still has the old cached value.

http://htmlagilitypack.codeplex.com/workitem/30053

如果您要打印出父div,则应该看到所做的更改实际上是执行的:

If you change to printing out the parent div, you should see that the changes actually were performed:

Console.WriteLine(div.OuterHtml);

这篇关于幽灵HtmlAgilityPack的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆