JAVA SAX parser split calls to characters()
Problem description
I am working on a project that parses some data from XML.
For example, the XML is
<abc>abcdefghijklmno</abc>
I need to extract "abcdefghijklmno".
But while testing my parser, I discovered a big problem:
public class parser {
    private boolean hasABC = false;

    //Constructor HERE
    ......................
    ......................

    @Override
    public void startDocument() throws SAXException {
    }

    @Override
    public void endDocument() throws SAXException {
    }

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
        if ("abc".equalsIgnoreCase(localName)) {
            this.hasABC = true;
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
        if ("abc".equalsIgnoreCase(localName)) {
            this.hasABC = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String content = new String(ch, start, length).trim();
        if (this.hasABC) {
            System.out.println("ABC = " + content);
        }
    }
}
I discovered that the parser reports the text of the tag in two pieces. The printed output is:
ABC = abcdefghi
ABC = jklmno <<============ split the message
Why does the parser call back characters() twice?
Does the XML contain some "\n" or "\r"?
The parser calls the characters method more than once because it can, and the spec allows it: a SAX parser may deliver the text of one element in several chunks. This helps parsers stay fast and keep their memory footprint low. If you want a single string, create a new StringBuilder object in startElement, append each chunk to it in characters, and process the result in the endElement method.
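The approach in the answer could be sketched roughly as follows. This is only an illustration under some assumptions, not the asker's actual class: the names AbcTextHandler, SaxAccumulateDemo and parseAbc are made up for the example, the factory is set to namespace-aware so that localName is populated as in the question's code, and nested <abc> elements are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of each <abc> element.
class AbcTextHandler extends DefaultHandler {
    private final List<String> values = new ArrayList<>();
    private StringBuilder buffer; // non-null only while inside an <abc> element

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
        if ("abc".equalsIgnoreCase(localName)) {
            buffer = new StringBuilder(); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append instead of printing.
        if (buffer != null) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) {
        if (buffer != null && "abc".equalsIgnoreCase(localName)) {
            values.add(buffer.toString().trim()); // the text is complete only here
            buffer = null;
        }
    }

    List<String> getValues() {
        return values;
    }
}

class SaxAccumulateDemo {
    // Hypothetical helper: parses an XML string and returns the text of every <abc>.
    static List<String> parseAbc(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: needed so localName is filled in
            AbcTextHandler handler = new AbcTextHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.getValues();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String value : parseAbc("<abc>abcdefghijklmno</abc>")) {
            System.out.println("ABC = " + value); // prints "ABC = abcdefghijklmno"
        }
    }
}
```

Trimming is done once in endElement rather than on every chunk, so whitespace that happens to fall at a chunk boundary inside the text is not lost.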