How to parse HTML in .NET 2.0?
Question
Could anyone tell me how to parse HTML in .NET?
I have read many articles online, and people seem to both recommend and criticize Regex, MSHTML, Html Agility Pack, and SgmlReader.
All I need is to extract the href value from an <a> tag and get the tag text, e.g.
<a href="www.somesite.com">click here</a>
What I need in this case is the href value and the text "click here".
Thanks
Recommended answer
Try this HTML parser: Html Agility Pack.
See http://blogs.msdn.com/b/smourier/archive/2003/06/04/8265.aspx, "Example on how you would fix all hrefs in an HTML file".
Further reading:
C# Parsing HTML with HtmlAgilityPack
Parsing HTML Documents with the Html Agility Pack
A simple web crawler in C# using HtmlAgilityPack
HtmlAgilityPack is the way to go. What you're looking for is practically what that tool was made for. I can't supply a sample just now as I'm writing this on a mobile device.
Cheers!
--MRB
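Since neither answer includes a sample, here is a minimal sketch of what the question asks for: reading the href value and the link text from each <a> tag with Html Agility Pack. It assumes the HtmlAgilityPack assembly is referenced in the project (it is a third-party library, not part of the .NET Framework) and that the HTML is already available as a string. The names LinkExtractor and ExtractLinks are made up for this illustration. The code avoids var and LINQ so that it stays within C# 2.0.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party library; must be referenced separately

// Hypothetical helper class, named for this example only.
public static class LinkExtractor
{
    // Returns one (href, text) pair for every <a> element that has an href attribute.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        List<KeyValuePair<string, string>> links =
            new List<KeyValuePair<string, string>>();

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", string.Empty);
            string text = anchor.InnerText.Trim();
            links.Add(new KeyValuePair<string, string>(href, text));
        }

        return links;
    }
}

public static class Program
{
    public static void Main()
    {
        // The sample markup from the question.
        string html = "<a href=\"www.somesite.com\">click here</a>";

        foreach (KeyValuePair<string, string> link in LinkExtractor.ExtractLinks(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

The XPath expression //a[@href] skips anchors without an href, so named anchors such as <a name="top"> are not returned. InnerText gives the concatenated text of the element with any nested tags removed, which is usually what "the tag text" means here.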