Remove attributes using HtmlAgilityPack
Problem description
I'm trying to create a code snippet that removes all style attributes, regardless of tag, using HtmlAgilityPack.
Here's my code:
    var elements = htmlDoc.DocumentNode.SelectNodes("//*");
    if (elements != null)
    {
        foreach (var element in elements)
        {
            element.Attributes.Remove("style");
        }
    }
However, the change doesn't seem to stick. If I inspect the element object immediately after Remove("style"), I can see that the style attribute has been removed, but it still appears in the DocumentNode object. :/
I feel a bit stupid, but this seems off to me. Has anyone done this using HtmlAgilityPack? Thanks!
Update
I changed my code to the following, and it works properly:
    public static void RemoveStyleAttributes(this HtmlDocument html)
    {
        var elementsWithStyleAttribute = html.DocumentNode.SelectNodes("//@style");
        if (elementsWithStyleAttribute != null)
        {
            foreach (var element in elementsWithStyleAttribute)
            {
                // Remove the style attribute from each selected node.
                element.Attributes["style"].Remove();
            }
        }
    }
Your code snippet seems to be correct: it does remove the attributes. The thing is, DocumentNode.InnerHtml (I assume this is the property you were monitoring) is a complex property. It may only get updated under circumstances that are not obvious, and you shouldn't really use it to get the document as a string. Instead, use the HtmlDocument.Save method:
    // StringWriter lives in System.IO.
    string result = null;
    using (StringWriter writer = new StringWriter())
    {
        htmlDoc.Save(writer);
        result = writer.ToString();
    }
Now the result variable holds the string representation of your document.
One more thing: your code may be improved by changing the XPath expression to "//*[@style]", which selects only the elements that have a style attribute.
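Putting the two suggestions together, here is a minimal sketch of how this might look. It assumes the HtmlAgilityPack package is referenced; the class name StyleStripper and the method name StripStyleAttributes are made up for illustration and do not come from the thread or the library.

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper, named for illustration only.
public static class StyleStripper
{
    // Removes every style attribute and returns the document as a string.
    public static string StripStyleAttributes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only the elements that carry a style attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var styled = doc.DocumentNode.SelectNodes("//*[@style]");
        if (styled != null)
        {
            foreach (var element in styled)
            {
                element.Attributes.Remove("style");
            }
        }

        // Serialize through Save instead of reading DocumentNode.InnerHtml.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Calling StyleStripper.StripStyleAttributes on a fragment with inline styles should give back the same markup without the style attributes. The narrower XPath keeps the loop from visiting every node in the document.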