Remove attributes using HtmlAgilityPack


Problem description


I'm trying to create a code snippet to remove all style attributes regardless of tag using HtmlAgilityPack.

Here's my code:

var elements = htmlDoc.DocumentNode.SelectNodes("//*");

if (elements != null)
{
    foreach (var element in elements)
    {
        element.Attributes.Remove("style");
    }
}

However, I can't get it to stick. If I look at the element object immediately after Remove("style"), I can see that the style attribute has been removed, but it still appears in the DocumentNode object. :/

I'm feeling a bit stupid, but it seems off to me. Has anyone done this using HtmlAgilityPack? Thanks!

Update

I changed my code to the following, and it works properly:

public static void RemoveStyleAttributes(this HtmlDocument html)
{
   var elementsWithStyleAttribute = html.DocumentNode.SelectNodes("//@style");

   if (elementsWithStyleAttribute != null)
   {
      foreach (var element in elementsWithStyleAttribute)
      {
         element.Attributes["style"].Remove();
      }
   }
}

Solution

Your code snippet seems to be correct: it does remove the attributes. The thing is, DocumentNode.InnerHtml (I assume this is the property you were monitoring) is a complex property. It may only get updated under certain circumstances, and you shouldn't really use it to get the document as a string. Use the HtmlDocument.Save method for this instead:

// StringWriter lives in System.IO (add "using System.IO;" at the top of the file).
string result = null;
using (StringWriter writer = new StringWriter())
{
    htmlDoc.Save(writer);
    result = writer.ToString();
}

Now the result variable holds the string representation of your document.

One more thing: your code may be improved by changing your expression to "//*[@style]", which selects only the elements that have a style attribute.
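
Putting both suggestions together, here is a minimal sketch that selects nodes with "//*[@style]", removes the attribute, and serializes with HtmlDocument.Save. The class name StyleStripper, the method name StripStyles, and the sample markup are hypothetical and only there for illustration. It assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class StyleStripper
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the
    // document markup with every style attribute removed.
    public static string StripStyles(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only the elements that actually carry a style attribute.
        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//*[@style]");
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                node.Attributes.Remove("style");
            }
        }

        // Serialize through Save instead of reading DocumentNode.InnerHtml.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Sample input made up for this sketch.
        Console.WriteLine(StripStyles(
            "<div style=\"color:red\"><p style=\"margin:0\">Hello</p></div>"));
    }
}
```

Returning the string from Save keeps the check independent of when InnerHtml gets refreshed, which is the behaviour the question ran into.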
