XmlPullParser fails to parse "(??????) [????] &middot;" inside an XML tag
Problem description
I am parsing the following XML with Jsoup and XmlPullParser:
<item>
<title>(??????) [????]0 BLACK LAGOON -???? &middot; ????- ?01-09?</title>
<guid isPermaLink='true'>http://fenopy.eu/torrent/+black+lagoon+A+01+09+/OTcyOTA3Mw</guid>
<pubDate>Wed, 27 Feb 2013 11:00:04 GMT</pubDate>
<category>Anime</category>
<link>http://fenopy.eu/torrent/+black+lagoon+A+01+09+/OTcyOTA3Mw</link>
<enclosure url="http://fenopy.eu/torrent/-BLACK-LAGOON-01-09-/OTcyOTA3Mw==/download.torrent" length="569296173" type="application/x-bittorrent" />
<description><![CDATA[ Category: Anime<br/>Size: 542.9 MB<br/>Ratio: 0 seeds, 3 leechers<br/> ]]></description>
</item>
Here is my parsing code:
int eventType = -1;
while (eventType != XmlPullParser.END_DOCUMENT) {
    switch (eventType) {
        // at start of document: START_DOCUMENT
        case XmlPullParser.START_DOCUMENT:
            break;
        // at start of a tag: START_TAG
        case XmlPullParser.START_TAG:
            // get tag name
            String tagName = parser.getName();
            if (tagName.equalsIgnoreCase(TAG_TITLE)) {
                // this is the call that throws on the &middot; reference
                String t = parser.nextText();
            }
            break;
    }
    // advance to the next parsing event
    eventType = parser.next();
}
When I call nextText(), it throws the following exception:
org.xmlpull.v1.XmlPullParserException: unresolved: &middot; (position:TEXT (??????) [????] ...@36:59 in java.io.StringReader@40540698)
at org.kxml2.io.KXmlParser.exception(KXmlParser.java:273)
at org.kxml2.io.KXmlParser.error(KXmlParser.java:269)
at org.kxml2.io.KXmlParser.pushEntity(KXmlParser.java:818)
at org.kxml2.io.KXmlParser.pushText(KXmlParser.java:849)
at org.kxml2.io.KXmlParser.nextImpl(KXmlParser.java:354)
at org.kxml2.io.KXmlParser.next(KXmlParser.java:1378)
at org.kxml2.io.KXmlParser.nextText(KXmlParser.java:1432)
Recommended answer
Your XML isn't valid: &middot; is not a valid entity reference in XML. It is an HTML named entity, and an XML parser resolves it only if a DTD declares it.
There are five predefined entity references in XML:
&lt;    <   less than
&gt;    >   greater than
&amp;   &   ampersand
&apos;  '   apostrophe
&quot;  "   quotation mark
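To see which references in a feed would trip the parser, the list above can be turned into a small check. The following is a minimal sketch, not part of the original answer: the class name `EntityCheck` and the method `findUndeclaredEntities` are hypothetical, the pattern only covers simple alphanumeric entity names, and it does not skip CDATA sections (where such text would be legal).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists named entity references that XML does not predefine.
final class EntityCheck {
    // The five entity names that XML predefines (no DTD needed).
    private static final Set<String> PREDEFINED =
            new HashSet<>(Arrays.asList("lt", "gt", "amp", "apos", "quot"));

    // Matches named references such as &middot;.
    // Numeric references such as &#183; start with '#' and are not matched.
    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static List<String> findUndeclaredEntities(String xml) {
        List<String> found = new ArrayList<>();
        Matcher m = NAMED_ENTITY.matcher(xml);
        while (m.find()) {
            // group(1) is the bare name, group(0) the full reference
            if (!PREDEFINED.contains(m.group(1))) {
                found.add(m.group(0));
            }
        }
        return found;
    }
}
```

Running it on the raw feed text before handing it to the parser shows every reference that needs to be replaced or declared.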
Update
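Simply use a regex to strip all HTML entity references from the XML before parsing it:
XMLString = XMLString.replaceAll("(&[^\\s]+?;)", "");
This replaces every entity reference with an empty string. Note that the pattern also matches the five predefined references such as &amp; and &lt;, so those are removed as well.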
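If dropping every reference is too aggressive, a more selective variant is possible. The sketch below is an illustration under stated assumptions, not the answer's method: the class `EntityCleaner` and its `clean` method are hypothetical, and the replacement table holds only two illustrative HTML entities. It keeps the five predefined references, rewrites known HTML entities as numeric character references (which XML accepts without a DTD), and removes anything else.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: makes HTML named entities safe for an XML parser.
final class EntityCleaner {
    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    private static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        // The five predefined XML entities are kept unchanged.
        for (String name : new String[] {"lt", "gt", "amp", "apos", "quot"}) {
            REPLACEMENTS.put(name, "&" + name + ";");
        }
        // Illustrative entries only; extend the table as the feed requires.
        REPLACEMENTS.put("middot", "&#183;"); // U+00B7 MIDDLE DOT
        REPLACEMENTS.put("nbsp", "&#160;");   // U+00A0 NO-BREAK SPACE
    }

    static String clean(String xml) {
        Matcher m = NAMED_ENTITY.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown entity names are dropped, as in the regex above.
            String replacement = REPLACEMENTS.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The cleaned string can then be passed to the parser in place of the raw feed text, for example through a StringReader as in the stack trace above.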