SAX Parser: Retrieving HTML tags from XML


Question

I have an XML document to parse, shown below:

<feed>
    <feed_id>12941450184d2315fa63d6358242</feed_id>
    <content> <fieldset><table cellpadding='0'  border='0'  cellspacing='0'  style="clear :both"><tr valign='top' ><td width='35' ><a href='http://mypage.rediff.com/android/32868898'  class='space' onmousedown="return enc(this,'http://track.rediff.com/click?url=___http%3A%2F%2Fmypage.rediff.com%2Fandroid%2F32868898___&service=mypage_feeds&clientip=202.137.232.113&pos=0&feed_id=12941450184d2315fa63d6358242&prc_id=32868898&rowid=674061088')" ><div style='width:25px;height:25px;overflow:hidden;'><img src='http://socialimg04.rediff.com/image.php?uid=32868898&type=thumb'  width='25'  vspace='0'  /></div></a></td> <td><span><a href='http://mypage.rediff.com/android/32868898'  class="space" onmousedown="return enc(this,'http://track.rediff.com/click?url=___http%3A%2F%2Fmypage.rediff.com%2Fandroid%2F32868898___&service=mypage_feeds&clientip=202.137.232.113&pos=0&feed_id=12941450184d2315fa63d6358242&prc_id=32868898&rowid=674061088')" >Android </a> </span><span style='color:#000000 !important;'>testing</span><div class='divtext'></div></td></tr><tr><td height='5' ></td></tr></table></fieldset><br/></content>
    <action>status updated</action>
</feed>

The content tag contains HTML, which holds the data I need. I am using a SAX parser. Here is what I am doing:

private Timeline timeLine; //Object
private String tempStr;

public void characters(char[] ch, int start, int length)
        throws SAXException {
    tempStr = new String(ch, start, length);
}

public void endElement(String uri, String localName, String qName)
        throws SAXException {
    if (localName.equalsIgnoreCase("content")) {
        if (timeLine != null) {
            timeLine.setContent(tempStr);
        }
    }
}

Will this logic work? If not, how should I extract the embedded HTML data from the XML using a SAX parser?

Recommended answer

On start element: if the element is content, initialize your temporary string buffer. Otherwise, if content has already started, capture the current start element and its attributes and append them to the buffer.

On characters: if content has started, append the characters to the current string buffer.

On end element: if content has started, capture the end tag and append it to the string buffer. When the element being closed is content itself, the buffer holds the complete embedded HTML.
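To see why the text has to be appended rather than assigned, here is a minimal sketch. The class and field names are made up for illustration, and it runs on the JDK's built-in SAX parser, which may behave differently from Android's in how it splits text into chunks. It records text both ways: by assignment, as in the question, and by appending.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original question or answer.
class CharactersDemo {

    /** Collects text both ways so the two strategies can be compared. */
    static class TextHandler extends DefaultHandler {
        String lastChunk = "";                              // assignment, as in the question
        final StringBuilder allText = new StringBuilder();  // appending

        @Override
        public void characters(char[] ch, int start, int length) {
            lastChunk = new String(ch, start, length); // discards any earlier text
            allText.append(ch, start, length);         // keeps earlier text
        }
    }

    static TextHandler parse(String xml) throws Exception {
        TextHandler handler = new TextHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        TextHandler h = parse("<content><a>Android</a><span>testing</span></content>");
        System.out.println("assigned: " + h.lastChunk);
        System.out.println("appended: " + h.allText);
    }
}
```

With assignment, only the most recent text run survives, so the text of earlier child elements is lost. Appending keeps all of the text, but characters() never delivers markup, which is why the start and end element steps are also needed to rebuild the tags.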

My assumption:

The XML will have only one content tag.
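The three steps above could be sketched roughly as follows. This is one possible reading of the answer, not a definitive implementation: the class name ContentMarkupHandler and the helpers extract, name, and escape are hypothetical, and the fallback between qName and localName is an assumption made so the sketch does not depend on whether the parser is namespace-aware.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that rebuilds the markup found inside <content>.
class ContentMarkupHandler extends DefaultHandler {
    private StringBuilder buffer; // non-null only while inside <content>
    private String content;       // rebuilt markup once </content> is seen

    // Prefer qName; fall back to localName when the parser leaves qName empty.
    private static String name(String localName, String qName) {
        return (qName == null || qName.isEmpty()) ? localName : qName;
    }

    // The parser hands over decoded text, so special characters are re-escaped.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        String name = name(localName, qName);
        if (buffer == null) {
            // Not capturing yet: start only when <content> opens.
            if ("content".equalsIgnoreCase(name)) buffer = new StringBuilder();
            return;
        }
        // Inside <content>: write the start tag and its attributes back out.
        buffer.append('<').append(name);
        for (int i = 0; i < attrs.getLength(); i++) {
            buffer.append(' ').append(name(attrs.getLocalName(i), attrs.getQName(i)))
                  .append("=\"").append(escape(attrs.getValue(i))).append('"');
        }
        buffer.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append, since text may arrive in several calls.
        if (buffer != null) buffer.append(escape(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (buffer == null) return;
        String name = name(localName, qName);
        if ("content".equalsIgnoreCase(name)) {
            // Relies on the stated assumption: no nested <content> element.
            content = buffer.toString();
            buffer = null;
        } else {
            buffer.append("</").append(name).append('>');
        }
    }

    static String extract(String xml) throws Exception {
        ContentMarkupHandler handler = new ContentMarkupHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.content;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><feed_id>1</feed_id><content> "
                + "<a href='p?x=1&amp;y=2'>Android</a> testing<br/></content>"
                + "<action>status updated</action></feed>";
        System.out.println(extract(xml));
    }
}
```

In the question's code, the captured string would then be passed to timeLine.setContent(...) when content closes. Two caveats: empty elements such as br come back as a start tag followed by an end tag, and the sample XML in the question contains unescaped & characters inside attribute values, which a conforming XML parser will reject unless they are escaped as &amp;amp; or the HTML is wrapped in a CDATA section at the source.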
