XMLStreamReader not reading complete text value


Question

There is a related question, "Reading escape characters with XMLStreamReader", but the issue I am seeing here is a little different.

I am reading a fairly large XML file in which one of the element values is a large snippet of malformed HTML. The values are enclosed in CDATA and normally cause no problems. Intermittently, however, the getText method of XMLStreamReader returns only the first part of the text in the CDATA section, and the next batch begins with a character sequence such as "<table>", which the parser treats as a start element instead of character data, causing the parse to fail.

Has anyone encountered this issue with a StAX parser before? I am using the sjsxp 1.0.1 implementation on JDK 1.5.
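For context, StAX parsers are generally allowed to report one run of character data as several consecutive text events, so a single getText call is not guaranteed to return the whole value. The sketch below shows one way to collect the full text of an element by appending every text event until the matching end element. It assumes a JDK that bundles javax.xml.stream (Java 6 or later) or a StAX jar on the classpath. The class name, the helper names, and the "body" element are hypothetical, chosen only for illustration. It does not work around a parser that misreads CDATA content as markup; it only rules out the ordinary case of split text events.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class ChunkedTextExample {

    // Hypothetical helper: the reader must be positioned on a START_ELEMENT.
    // Appends every text event until the matching END_ELEMENT is reached.
    static String readFullText(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA
                    || event == XMLStreamConstants.SPACE) {
                text.append(reader.getText());
            } else if (event == XMLStreamConstants.START_ELEMENT) {
                // This sketch only supports text-only elements.
                throw new XMLStreamException(
                        "Unexpected child element: " + reader.getLocalName(),
                        reader.getLocation());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                break;
            }
            // Comments and processing instructions are skipped.
        }
        return text.toString();
    }

    // Hypothetical driver: returns the text of the first <body> element, or null.
    static String parseBody(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask the parser to merge adjacent character data into one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "body".equals(reader.getLocalName())) {
                        return readFullText(reader);
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Failed to parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><body><![CDATA[<table><tr><td>cell</td></tr>]]></body></doc>";
        // Prints: <table><tr><td>cell</td></tr>
        System.out.println(parseBody(xml));
    }
}
```

Setting IS_COALESCING and looping over text events are both shown here for illustration; either one alone is normally enough with a conforming parser.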

Any help or wild ideas would be appreciated, as I have run out of ideas.

Recommended answer

I think I have made some headway with this issue. The problem seems to be in the sjsxp implementation (even the latest version). Sometimes the getText method does not read the entire text, and if you are as unlucky as I was, the remaining text starts with a tag, which causes the failure. We were planning to encode the values, which might work, but we also tried the Woodstox implementation (http://woodstox.codehaus.org), and that seems to handle this case. So I wanted to ask a follow-up question:

Has anyone else used the Woodstox StAX implementation, and do you know of any issues with it compared to sjsxp?
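When comparing StAX implementations, it can help to confirm which one the application actually loads, since XMLInputFactory.newInstance() picks the factory through a lookup (a system property, a configuration file, a service provider on the classpath, then the platform default). The following is a minimal sketch that prints the concrete factory class; the class and method names are hypothetical. With Woodstox on the classpath, the reported class is expected to be in the com.ctc.wstx package, though that depends on the Woodstox version and classpath order and is an assumption rather than something verified here.

```java
import javax.xml.stream.XMLInputFactory;

final class StaxImplementationCheck {

    // Hypothetical helper: returns the class name of the StAX input factory
    // that the standard lookup resolves to in the current environment.
    static String inputFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("StAX input factory: " + inputFactoryClassName());
    }
}
```

Running this once with and once without the alternative implementation on the classpath shows whether the switch took effect before any parsing behaviour is compared.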
