Java SAX parser splits calls to characters()
Question
I am working on a project that parses some data from XML.
For example, the XML is
<abc>abcdefghijklmno</abc>
I need to extract "abcdefghijklmno".
But while testing my parser, I discovered a big problem:
public class parser{
    private boolean hasABC = false;
    //Constructor HERE
    ......................
    ......................
    @Override
    public void startDocument() throws SAXException {
    }

    @Override
    public void endDocument() throws SAXException {
    }

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
        if ("abc".equalsIgnoreCase(localName)) {
            this.hasABC = true;
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
        if ("abc".equalsIgnoreCase(localName)) {
            this.hasABC = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String content = new String(ch, start, length).trim();
        if (this.hasABC) {
            System.out.println("ABC = " + content);
        }
    }
}
I found that the parser delivered the content of the tag in two pieces. The output is:
ABC = abcdefghi
ABC = jklmno <<============ the message is split
Why does the parser call back characters() twice?
Does the XML contain some " " or " "?
Recommended answer
The parser calls the characters() method more than once because the SAX specification allows it to: contiguous character data may be delivered in a single chunk or split across several. This lets parsers stay fast and keep their memory footprint low. If you want a single string, create a new StringBuilder object in startElement, append each chunk to it in characters(), and process the result in endElement.
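The approach described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual class: the name AbcHandler, the values list, and the static parse helper are hypothetical, and the element is matched on qName because the default SAXParserFactory is assumed here to be non-namespace-aware (in which case localName may be empty).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of every <abc> element.
class AbcHandler extends DefaultHandler {
    // Non-null only while the parser is inside an <abc> element.
    private StringBuilder buffer;
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("abc".equalsIgnoreCase(qName)) {
            buffer = new StringBuilder(); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so only append here.
        if (buffer != null) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("abc".equalsIgnoreCase(qName) && buffer != null) {
            // All chunks have arrived; trim once on the complete text.
            values.add(buffer.toString().trim());
            buffer = null;
        }
    }

    List<String> getValues() {
        return values;
    }

    // Convenience helper (an assumption for this sketch): parse an XML string.
    static List<String> parse(String xml) throws Exception {
        AbcHandler handler = new AbcHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getValues();
    }

    public static void main(String[] args) throws Exception {
        for (String value : parse("<abc>abcdefghijklmno</abc>")) {
            System.out.println("ABC = " + value);
        }
    }
}
```

Trimming only once, in endElement, also avoids losing whitespace that happens to fall at a chunk boundary, which trimming each chunk inside characters() could do.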