How do I use HTML Agility Pack to edit an HTML snippet


Question

So I have an HTML snippet that I want to modify using C#.

<div>
This is a specialSearchWord that I want to link to
<img src="anImage.jpg" />
<a href="foo.htm">A hyperlink</a>
Some more text and that specialSearchWord again.
</div>

and I want to transform it to this:

<div>
This is a <a class="special" href="http://mysite.com/search/specialSearchWord">specialSearchWord</a> that I want to link to
<img src="anImage.jpg" />
<a href="foo.htm">A hyperlink</a>
Some more text and that <a class="special" href="http://mysite.com/search/specialSearchWord">specialSearchWord</a> again.
</div>

I'm going to use HTML Agility Pack, based on the many recommendations here, but I don't know where to start. In particular:

  1. How do I load a partial snippet as a string, instead of a full HTML document?
  2. How do I edit it?
  3. How do I then return the text string of the edited object?

Solution

  1. The same as a full HTML document. It doesn't matter.
  2. There are two options: you can edit the InnerHtml property directly (or Text on text nodes), or modify the DOM tree using e.g. AppendChild, PrependChild etc.
  3. You can use the HtmlDocument.DocumentNode.OuterHtml property or the HtmlDocument.Save method (personally I prefer the second option).

As for the editing itself, I select the text nodes inside your div that contain the search term, and then use the string.Replace method to substitute the link markup:

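using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.LoadHtml(html); // html holds the snippet; a fragment loads like a full document
// Select the text nodes directly under the div that contain the search term
var textNodes = doc.DocumentNode.SelectNodes("/div/text()[contains(.,'specialSearchWord')]");
if (textNodes != null) // SelectNodes can return null when nothing matches
    foreach (HtmlTextNode node in textNodes)
        node.Text = node.Text.Replace("specialSearchWord", "<a class='special' href='http://mysite.com/search/specialSearchWord'>specialSearchWord</a>");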

And saving the result to a string:

// StringWriter lives in System.IO
string result = null;
using (StringWriter writer = new StringWriter())
{
    doc.Save(writer); // serialize the edited document into the writer
    result = writer.ToString();
}
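
For comparison, the other route mentioned in the list above (modifying the DOM tree instead of assigning markup to Text, and reading the result back from OuterHtml) could look roughly like the sketch below. It is an illustration under assumptions, not part of the original answer: the sample html string, the word constant, and the Program wrapper are made up for the example, and the HtmlAgilityPack NuGet package is assumed to be referenced.

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

class Program
{
    static void Main()
    {
        const string word = "specialSearchWord"; // hypothetical search term
        // Hypothetical sample input, shortened from the snippet in the question
        string html = "<div>This is a specialSearchWord that I want to link to " +
                      "<img src=\"anImage.jpg\" /> and that specialSearchWord again.</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // a fragment is loaded the same way as a full document

        var textNodes = doc.DocumentNode.SelectNodes(
            "/div/text()[contains(.,'" + word + "')]");
        if (textNodes != null)
        {
            foreach (HtmlTextNode node in textNodes)
            {
                HtmlNode parent = node.ParentNode;
                // Split the text around every occurrence of the search term
                string[] parts = node.Text.Split(new[] { word }, StringSplitOptions.None);
                for (int i = 0; i < parts.Length; i++)
                {
                    // Re-insert the plain text that surrounded the term
                    if (parts[i].Length > 0)
                        parent.InsertBefore(doc.CreateTextNode(parts[i]), node);

                    // Between two parts there was one occurrence: insert a link element
                    if (i < parts.Length - 1)
                    {
                        HtmlNode link = doc.CreateElement("a");
                        link.SetAttributeValue("class", "special");
                        link.SetAttributeValue("href", "http://mysite.com/search/" + word);
                        link.AppendChild(doc.CreateTextNode(word));
                        parent.InsertBefore(link, node);
                    }
                }
                // The original text node has been replaced by the pieces above
                parent.RemoveChild(node);
            }
        }

        // Option 3, first variant: read the edited markup straight from the root node
        string result = doc.DocumentNode.OuterHtml;
        Console.WriteLine(result);
    }
}
```

This version builds real element nodes, so the inserted links are visible to later XPath queries on the same document, at the cost of more code than the string.Replace approach.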
