How to parse HTML in .NET 2.0?

Question

Could anyone tell me how to parse HTML in .NET?

I have read many articles online, and people seem to both recommend and criticize each option: Regex, MSHTML, Html Agility Pack, SGMLReader, and others.

All I need is to extract the href value from an <a> tag and get the tag's text, e.g.:

<a href="www.somesite.com">click here</a>



In this case, what I need is the href value and the text "click here".

Thanks.

Recommended answer

Try this HTML parser: Html Agility Pack.

See http://blogs.msdn.com/b/smourier/archive/2003/06/04/8265.aspx for an "Example on how you would fix all hrefs in an HTML file".

Further reading:

C# Parsing HTML with HtmlAgilityPack

Parsing HTML Documents with the Html Agility Pack

A simple web crawler in C# using HtmlAgilityPack


HtmlAgilityPack is the way to go. What you're looking for is practically what that tool was made for. I can't supply a sample just now, as I'm writing this on a mobile device.

Cheers!

--MRB
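
For the anchor example in the question, a minimal sketch with Html Agility Pack might look like the following. It assumes the HtmlAgilityPack assembly is referenced by the project and that a build targeting your framework version is available. `LinkExtractor` and `Extract` are hypothetical names made up for this illustration, not part of the library. The code sticks to C# 2.0 syntax (no `var`, no LINQ).

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party library, assumed to be referenced by the project

// Hypothetical helper class for this illustration.
public static class LinkExtractor
{
    // Returns one (href, text) pair for every <a> element that has an href attribute.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        List<KeyValuePair<string, string>> links =
            new List<KeyValuePair<string, string>>();

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // The second argument is the default used if the attribute is missing.
            string href = anchor.GetAttributeValue("href", string.Empty);
            string text = anchor.InnerText.Trim();
            links.Add(new KeyValuePair<string, string>(href, text));
        }

        return links;
    }
}

public static class Program
{
    public static void Main()
    {
        string html = "<a href=\"www.somesite.com\">click here</a>";

        foreach (KeyValuePair<string, string> link in LinkExtractor.Extract(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

For the sample markup this should print `www.somesite.com -> click here`. Note that `InnerText` returns the text of all nested elements and leaves HTML entities such as `&amp;` undecoded, so real pages may need extra cleanup.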

