SAX parser ignores text because of a <br /> tag


Problem description

I have a slight problem here and I don't know how to fix it. I have an XML file that looks like this:

<?xml version="1.0"?>
<item>
 <title>Item 1</title>
 <description>Description Text 1&lt;br /&gt;Description Text 2</description>
</item>

And I have a SAX parser that looks like this:

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    if ("item".equals(qName)) {
        currentItem = new Item();
    } else if ("title".equals(qName)) {
        parsingTitle = true;
    } else if ("description".equals(qName)) {
        parsingDescription = true;
    }
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {

    System.out.println("Testing endelement");

    if ("item".equals(qName)) {
        Items.add(currentItem);
        currentItem = null;
    } else if ("title".equals(qName)) {
        parsingTitle = false;
    } else if ("description".equals(qName)) {
        parsingDescription = false;
    }
}

@Override
public void characters(char[] ch, int start, int length) throws SAXException {

    System.out.println("writing");

    if (parsingTitle) {
        if (currentItem != null)
            currentItem.setTitle(new String(ch, start, length));
    } else if (parsingDescription) {
        if (currentItem != null) {
            currentItem.setDescription(new String(ch, start, length));
            parsingDescription = false;
        }
    }
}

The problem is that SAX parses only the first part of the text in the description element, up to the "&lt;br /&gt;" text (which is the escaped <br /> tag), and ignores the rest. How do I make the SAX parser ignore "&lt;br /&gt;" and parse the rest of the description?

Thanks.

Recommended answer

As mentioned in the comments, you can't rely on characters() to provide all of an element's text in one shot. A parser may deliver the text in several chunks, and many implementations start a new chunk at each entity reference such as &lt;, so storing only the first chunk drops everything after it. I recommend something like this (look for the comments in the code to see where I modified it) and then making a similar modification for the title:

// buffer to hold description (StringBuilder, since the handler runs on a single thread)
private StringBuilder descriptionBuffer;

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    if ("item".equals(qName)) {
        currentItem = new Item();
    } else if ("title".equals(qName)) {
        parsingTitle = true;
    } else if ("description".equals(qName)) {
        parsingDescription = true;
        // initialize buffer
        descriptionBuffer = new StringBuilder();
    }
}

@Override
public void endElement(String uri, String localName, String qName) throws SAXException {

    System.out.println("Testing endelement");

    if ("item".equals(qName)) {
        Items.add(currentItem);
        currentItem = null;
    } else if ("title".equals(qName)) {
        parsingTitle = false;
    } else if ("description".equals(qName)) {
        // Put contents of buffer into description
        currentItem.setDescription(descriptionBuffer.toString());
        descriptionBuffer = null;
        parsingDescription = false;
    }
}

@Override
public void characters(char[] ch, int start, int length) throws SAXException {

    System.out.println("writing");

    if (parsingTitle) {
        if (currentItem != null)
            currentItem.setTitle(new String(ch, start, length));
    } else if (parsingDescription) {
        // add to buffer
        descriptionBuffer.append(ch, start, length); 
    }
}
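
For anyone who wants to try the buffering approach end to end, here is a minimal, self-contained sketch. It is not the asker's actual class: the names SaxBufferDemo, ItemHandler, SAMPLE_XML and the simplified Item with public fields are assumptions made for illustration, and the sample XML is the one from the question with the title end tag corrected. It uses one shared buffer for both title and description, which is one possible way to make the "similar modification for the title" mentioned above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxBufferDemo {

    // Hypothetical stand-in for the Item class used in the question.
    static class Item {
        String title;
        String description;
    }

    // Sample document from the question (title end tag corrected).
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\"?>"
        + "<item>"
        + "<title>Item 1</title>"
        + "<description>Description Text 1&lt;br /&gt;Description Text 2</description>"
        + "</item>";

    static class ItemHandler extends DefaultHandler {
        final List<Item> items = new ArrayList<>();
        private Item currentItem;
        // Non-null only while inside a <title> or <description> element.
        private StringBuilder text;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("item".equals(qName)) {
                currentItem = new Item();
            } else if ("title".equals(qName) || "description".equals(qName)) {
                text = new StringBuilder(); // start collecting this element's text
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append every chunk.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("item".equals(qName)) {
                items.add(currentItem);
                currentItem = null;
            } else if ("title".equals(qName)) {
                if (currentItem != null) currentItem.title = text.toString();
                text = null;
            } else if ("description".equals(qName)) {
                if (currentItem != null) currentItem.description = text.toString();
                text = null;
            }
        }
    }

    // Parses the given XML string and returns the collected items.
    static List<Item> parse(String xml) throws Exception {
        ItemHandler handler = new ItemHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.items;
    }

    public static void main(String[] args) throws Exception {
        for (Item item : parse(SAMPLE_XML)) {
            System.out.println(item.title + ": " + item.description);
        }
    }
}
```

With the sample document, the description should come out as the full string "Description Text 1<br />Description Text 2", regardless of how many characters() calls the parser makes. The value is only assigned in endElement, which is the first point at which the element's text is known to be complete.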
