Parsing the CDATA section in XML using XML Pull Parser
Question
Sample XML
<feed xmlns="http://www.w3.org/2005/Atom">
<title>NDTV News - Top Stories</title>
<link>http://www.ndtv.com/</link>
<description>Latest entries</description>
<language>en</language>
<pubDate>Wed, 31 Jul 2013 22:33:00 GMT</pubDate>
<lastBuildDate>Wed, 31 Jul 2013 22:33:00 GMT</lastBuildDate>
<entry>
<title>Narendra Modi to be BJP's PM candidate, announcement before crucial assembly polls: sources</title>
<link>http://feedproxy.google.com/~r/NdtvNews-TopStories/~3/XN7dMIDe5YI/story01.htm</link>
<published>Wed, 31 Jul 2013 13:58:31 GMT</published>
<author>
<name>user42715</name>
</author>
<content type="html"><![CDATA[<div align="center"><a href="http://www.ndtv.com/news/images/topstory_thumbnail/ Shatrughan_Sinha_agency_120.jpg"><img border="0" src="http://www.ndtv.com/news/images/topstory_thumbnail/Shatrughan_Sinha_agency_120.jpg" alt="2013-07-29-08-43-05" /></a></div><p><span style="font-size: large;">The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September.</span><br /><br /><span style="font-size: large;"> The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September. </span><br /><br /><span style="font-size: large;">The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September. </span><br /><br /></p>]]></content>
</entry>
</feed>
With the code below I was able to retrieve the title, published, and author name values within the entry tag.
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser parser = factory.newPullParser();
InputStream urlStream = downloadUrl(urlString);
parser.setInput(urlStream, null); // null lets the parser detect the encoding
int eventType = parser.getEventType();
boolean done = false;
while (eventType != XmlPullParser.END_DOCUMENT && !done) {
    tagName = parser.getName(); // null unless the current event is a tag
    switch (eventType) {
        case XmlPullParser.START_DOCUMENT:
            break;
        case XmlPullParser.START_TAG:
            if (tagName.equals("entry")) {
                // start of a new entry; nothing to do yet
            }
            if (tagName.equals("title")) {
                title = parser.nextText();
                Log.i(TITLE, title);
            }
            if (tagName.equals("published")) {
                pubDate = parser.nextText();
                Log.i(PUBLISHEDDATE, pubDate);
            }
            if (tagName.equals("author")) {
                readAuthor(parser); // stores the name in the author field
                Log.i(AUTHOR, author);
            }
            break;
        case XmlPullParser.END_TAG:
            if (tagName.equals("feed")) {
                done = true;
            } else if (tagName.equals("entry")) {
                rssFeed = new RssFeedStructure(title);
                rssFeedList.add(rssFeed);
            }
            break;
    }
    eventType = parser.next();
}
private String readAuthor(XmlPullParser parser) throws IOException,
        XmlPullParserException {
    parser.nextTag(); // advance from <author> to <name>
    parser.require(XmlPullParser.START_TAG, null, "name");
    author = parser.nextText();
    parser.require(XmlPullParser.END_TAG, null, "name");
    return author;
}
From the content tag, how can I retrieve the "href" value within the a tag and the text value (The BJP is likely to anoint Narendra Modi.....) from the span tag?
Answer
You can use Jsoup. Download it from http://jsoup.org/download and add the jar to the libs folder.
To parse, I copied the RSS feed to an XML file in the assets folder (locally).
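XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser xpp = factory.newPullParser();
InputStream is = this.getAssets().open("xmlparser.xml");
xpp.setInput(is, "UTF-8"); // canonical charset name is "UTF-8", not "UTF_8"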
Since you have the URL, you can use the code below. I have shown how to extract the URL and the content. You need to extract the contents of the other tags as you normally would.
try {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    XmlPullParser xpp = factory.newPullParser();
    xpp.setInput(urlStream, null);
    boolean insideItem = false;
    // Returns the type of the current event: START_TAG, END_TAG, etc.
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.START_TAG) {
            if (xpp.getName().equalsIgnoreCase("entry")) {
                insideItem = true;
            } else if (xpp.getName().equalsIgnoreCase("content")) {
                if (insideItem) {
                    // nextText() returns the CDATA body as a plain string of HTML
                    Document doc = Jsoup.parse(xpp.nextText());
                    Elements links = doc.select("a[href]"); // <a> elements that have an href
                    for (Element link : links) {
                        Log.i("........", "" + link.attr("abs:href"));
                    }
                    Element divcontent = doc.select("span").first(); // null if there is no <span>
                    if (divcontent != null) {
                        Log.i("..........", "" + divcontent.text());
                    }
                }
            }
        } else if (eventType == XmlPullParser.END_TAG
                && xpp.getName().equalsIgnoreCase("entry")) {
            insideItem = false;
        }
        eventType = xpp.next(); // move to the next event
    }
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (XmlPullParserException e1) {
    e1.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
Log:
08-03 08:03:04.413: I/........(1524): http://www.ndtv.com/news/images/topstory_thumbnail/ Shatrughan_Sinha_agency_120.jpg
08-03 08:03:04.423: I/..........(1524): The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September.
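The call to nextText() works on the content element because a pull parser hands CDATA to the caller as ordinary character data, so the HTML comes back as one plain string. The sketch below illustrates the same behaviour with the JDK's built-in StAX pull parser (javax.xml.stream) standing in for XmlPullParser, so it can be run outside Android. The class name CdataPullSketch, the helper firstContent, and the shortened sample feed are assumptions made for illustration, not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: StAX is used here in place of Android's XmlPullParser.
final class CdataPullSketch {

    // Returns the text of the first <content> element, or null if there is none.
    static String firstContent(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "content".equals(reader.getLocalName())) {
                        // Comparable to XmlPullParser.nextText(): the CDATA body
                        // is returned as plain text, markup characters included.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened, made-up stand-in for the feed in the question.
        String xml = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><entry>"
                + "<content type=\"html\"><![CDATA[<p><a href=\"http://example.com/a.jpg\">pic</a></p>]]></content>"
                + "</entry></feed>";
        System.out.println(firstContent(xml));
    }
}
```

The returned string is the raw HTML fragment, which is why it still has to be handed to an HTML parser such as Jsoup before the href and span text can be picked out.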
Edit: to loop through the elements
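Elements divcontent = doc.select("span");
// k starts at 1 because the first span (index 0) was already logged above; use k = 0 to include it
for (int k = 1; k < divcontent.size(); k++) {
    String spancontent = divcontent.get(k).text();
    Log.i("..........", spancontent);
}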
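If adding the Jsoup jar is not an option, a rough fallback is possible when the HTML inside the CDATA happens to be well-formed XML, as it is in the sample feed. The sketch below is a swapped-in technique, not the approach from the answer: it wraps the fragment in a synthetic root element and uses the JDK DOM parser to collect every href and every span text. The class name FragmentSketch and its helpers are hypothetical, and this will fail on typical real-world HTML (unclosed tags, entities such as &nbsp;), where Jsoup remains the safer choice.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: assumes the fragment is well-formed XML.
final class FragmentSketch {

    // The fragment can have several top-level elements, so wrap it in one root.
    private static Document parse(String fragment) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<root>" + fragment + "</root>")));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    // Collects the href attribute of every <a> element that has one.
    static List<String> hrefs(String fragment) {
        List<String> out = new ArrayList<>();
        NodeList anchors = parse(fragment).getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            if (a.hasAttribute("href")) {
                out.add(a.getAttribute("href"));
            }
        }
        return out;
    }

    // Collects the trimmed text of every <span>, starting from index 0.
    static List<String> spanTexts(String fragment) {
        List<String> out = new ArrayList<>();
        NodeList spans = parse(fragment).getElementsByTagName("span");
        for (int i = 0; i < spans.getLength(); i++) {
            out.add(spans.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up fragment shaped like the CDATA body in the question.
        String html = "<div><a href=\"http://example.com/x.jpg\"><img src=\"y.jpg\" /></a></div>"
                + "<p><span> one </span><br /><span>two</span></p>";
        System.out.println(hrefs(html));
        System.out.println(spanTexts(html));
    }
}
```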