HTML Parser


Question

Anyone know of an HTML parser for VB.NET or C#? I know .NET has a lot of XML support, like XMLReader and XMLWriter. Is there an HTMLWriter or HTMLReader?

Ultimately, what I'd like is a library that will parse an HTML file and raise events based on the tags it finds. Anyone know of a library to do this?

Solution

The HTML Agility Pack is the way to go if you want to parse HTML (it even does a good job on tag soup). Theoretically, the XML parser included in the BCL should be able to parse valid XHTML, but the HTML Agility Pack is a generic solution that can handle ordinary HTML, XHTML, and messy variants of both.

Raising events when finding tags is something you'll have to implement yourself of course, but it should be fairly trivial using the HtmlReader class.
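As a rough illustration of the "implement it yourself" part, here is a minimal sketch that wraps the HTML Agility Pack in a small event-raising class. It assumes the `HtmlAgilityPack` NuGet package is referenced. `TagEventParser`, `TagEventArgs`, `TagFound`, and `Demo` are hypothetical names made up for this example, not part of the library. Note that it does not use a reader class: it loads the whole document with `HtmlDocument` and then walks the resulting tree, so events fire after parsing rather than while streaming.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is installed

// Hypothetical event payload: carries the element node that was found.
public sealed class TagEventArgs : EventArgs
{
    public TagEventArgs(HtmlNode node) { Node = node; }

    public HtmlNode Node { get; }
    public string TagName => Node.Name;
}

// Hypothetical wrapper: parses HTML and raises one event per element.
public sealed class TagEventParser
{
    public event EventHandler<TagEventArgs> TagFound;

    public void Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parse of the input string

        // Descendants() enumerates every node under the document root.
        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            // Skip text and comment nodes; only elements count as "tags" here.
            if (node.NodeType == HtmlNodeType.Element)
            {
                TagFound?.Invoke(this, new TagEventArgs(node));
            }
        }
    }
}

// Hypothetical usage helper: subscribes to the event and records tag names.
public static class Demo
{
    public static List<string> CollectTagNames(string html)
    {
        var names = new List<string>();
        var parser = new TagEventParser();
        parser.TagFound += (sender, e) => names.Add(e.TagName);
        parser.Parse(html);
        return names;
    }
}
```

Calling `Demo.CollectTagNames("<p>Hello <b>world</b></p>")` subscribes a handler and collects the name of each element the parser reports. If you need per-tag dispatch, one option is to switch on `e.TagName` inside the handler, or to expose separate events for the tags you care about.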
