Parsing content that contains HTML tags using XmlPullParser


Question

I am building an Android app using XmlPullParser.

How can I get the content from HTML formatted like this?

<div class="content">
"Some text is here."
<br>
"some more text "<a class="link" href="adress">continues here</a>
<br>
</div>

I want to parse all the content like this:

"Some text is here. 
 some more text continues here"

The "continues here" part should also be hyperlinked.

Addition after some comments: the HTML is first passed to Yahoo YQL, which generates XML. I use the generated XML file in my code. The part shown above that I want to parse comes from that generated XML.

Recommended answer

XmlPullParser is meant to deal with XML. Well-structured XHTML pages are rare on the web. An XML parser expects well-formed data and is not supposed to be fault tolerant. HTML, on the other hand, is usually loosely organized.

So, no, it's not a good idea. You should prefer other libraries such as TagSoup or geronimo.
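
To illustrate the strictness described above, here is a minimal sketch. It uses the JDK's StAX pull parser (javax.xml.stream) as a stand-in for Android's XmlPullParser so that it runs on a plain JVM; the event loop has the same general shape, but the API names differ. The class name MixedContentSketch, the method flatten, and the "text [href]" rendering of links are assumptions made for this example only. It also assumes the XML generated by YQL closes its br elements (<br/>); the raw HTML from the question, with an unclosed <br>, is rejected.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
final class MixedContentSketch {

    /**
     * Flattens mixed content into plain text: each br becomes a newline and
     * each a element is rendered as "link text [href]".
     * Throws IllegalArgumentException if the input is not well-formed XML.
     */
    static String flatten(String xml) {
        StringBuilder out = new StringBuilder();
        String href = null; // href of the <a> element currently open, if any
        try {
            XMLStreamReader reader = XMLInputFactory.newFactory()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("br".equals(reader.getLocalName())) {
                            out.append('\n');
                        } else if ("a".equals(reader.getLocalName())) {
                            href = reader.getAttributeValue(null, "href");
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // Text may arrive in several chunks; appending handles that.
                        out.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if ("a".equals(reader.getLocalName()) && href != null) {
                            out.append(" [").append(href).append(']');
                            href = null;
                        }
                        break;
                    default:
                        break;
                }
            }
        } catch (XMLStreamException e) {
            // A strict XML parser gives up at the first well-formedness error.
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String wellFormed = "<div class=\"content\">Some text is here.<br/>"
                + "some more text <a class=\"link\" href=\"adress\">continues here</a><br/></div>";
        System.out.println(flatten(wellFormed));

        String rawHtml = "<div class=\"content\">Some text is here.<br>"
                + "some more text <a class=\"link\" href=\"adress\">continues here</a><br></div>";
        try {
            flatten(rawHtml);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With well-formed input, a pull parser can walk mixed text and elements in document order, which is what the question needs. On loosely written HTML it fails outright, which is why a fault-tolerant HTML parser is the safer choice there.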

PS: when you ask a Stack Overflow question, the best approach is to try something yourself first and, if you get stuck, then ask. Not the other way around.
