Simple HTML analyzer


Question

Hello, I want a simple HTML analyzer (in C#) that can get the contents of an HTML element. Let me explain: I want to download a page, get the contents of ".class1 .class2 #id1 div", and then display them to the user. Do you have any leads (besides System.Net.WebClient)?

P.S. So far I have found the HTML Agility Pack, which uses XPath to get an element.

Recommended answer



Hi,

I think jQuery can get the information you need from a given HTML element. You can get the inner HTML of a particular div or table through its ID or its associated class.

Suppose you have the following HTML content:
<div class="demo-container">
  <div class="demo-box">Demonstration Box</div>
</div>



You can then extract the inner div using:


$('div.demo-container').html();


The result would be:

<div class="demo-box">Demonstration Box</div>


(The code above is taken from the jQuery documentation.)

Note that this only works if you already know the element hierarchy needed to navigate the HTML.

Hope I answered your query,
Thanks
-Amit Gajjar.


You can use a regex to parse the file if you know the tags you're looking for. XmlDocument also works if the page is XHTML.
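As a rough illustration of the regex suggestion, here is a minimal sketch in JavaScript (the language of the snippet earlier in this thread). The helper name extractByClass and the sample page string are hypothetical, introduced only for this example. It assumes the target tag is not nested inside another tag of the same name, that attributes use double quotes, and that tagName and className contain no regex special characters. Regexes are fragile on real-world HTML, so treat this as a sketch rather than a general parser.

```javascript
// Hypothetical helper (not part of any library): returns the inner HTML of
// every <tagName> element whose class attribute contains className as a
// whole, space-separated token.
// Assumptions: double-quoted attributes, no same-name nesting, and no regex
// special characters in tagName or className.
function extractByClass(html, tagName, className) {
  const pattern = new RegExp(
    `<${tagName}\\b[^>]*\\bclass="(?:[^"]*\\s)?${className}(?:\\s[^"]*)?"[^>]*>` +
      `([\\s\\S]*?)</${tagName}>`,
    'gi'
  );
  const results = [];
  // matchAll requires the "g" flag; capture group 1 is the element content.
  for (const match of html.matchAll(pattern)) {
    results.push(match[1].trim());
  }
  return results;
}

// Sample markup reused from the answer above.
const page = `
<div class="demo-container">
  <div class="demo-box">Demonstration Box</div>
</div>`;

console.log(extractByClass(page, 'div', 'demo-box')); // logs [ 'Demonstration Box' ]
```

The lazy quantifier stops at the first closing tag, which is why nested elements of the same name would be cut short. For a selector like ".class1 .class2 #id1 div", a real HTML parser such as the HTML Agility Pack mentioned in the question is likely the safer choice.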

