How to read escaped characters using SAX parser in Characters method?


Question

I'm parsing the following XML using a SAX parser:

<Person>
<Name>Test</Name>
<Phone>111-111-2222</Phone>
<Address>lee h&amp;y</Address>
</Person>

The characters method of the SAX parser only reads the address data up to 'lee h', as if it did not consider '&' a character. I need to get the complete text of the Address element. Any ideas on how I should do it? This is my SAX handler (here address is a flag indicating that an Address element has been opened):

boolean address = false;

public void startElement(String uri, String localName,
        String qName, Attributes attributes)
        throws SAXException {
    if (qName.equalsIgnoreCase("Address")) {
        address = true;
    }
}

public void characters(char ch[], int start, int length)
        throws SAXException {
    String data = new String(ch, start, length);
    if (address) {
        // Prints only the first chunk, then clears the flag
        System.out.println("Address is: " + data);
        address = false;
    }
}

and the output is: Address is: lee h

Recommended answer

The characters method is called more than once to report the content of the Address element (typically three times here: 'lee h', '&', and 'y') because of the entity reference &amp;. The exact chunking depends on the parser. Your handler prints the first chunk and then clears the flag, so the rest is lost. Accumulate the content of the calls to characters until you receive the endElement event; at that point you have the complete content.

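Note the documentation of the characters method: a SAX parser may return contiguous character data in a single chunk or split it across several calls.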

You could also benefit from the ignorableWhitespace method, used with a validating parser and an appropriate schema (e.g. a DTD), so that the parser knows which whitespace is ignorable (for example, indentation between elements).
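As a rough illustration of the ignorableWhitespace suggestion, the sketch below parses an indented document with an inline DTD using a validating JAXP parser. The class names WhitespaceDemo and CountingHandler, the DTD itself, and the reduced document (Name and Address only) are assumptions made for this example, not part of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WhitespaceDemo {

    // Hypothetical handler: keeps real text apart from ignorable whitespace.
    static class CountingHandler extends DefaultHandler {
        int ignorable = 0;                              // whitespace chars reported as ignorable
        final StringBuilder text = new StringBuilder(); // everything reported via characters()

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorable += length;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void error(SAXParseException e) throws SAXParseException {
            throw e; // surface validation errors instead of silently ignoring them
        }
    }

    static CountingHandler parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // DTD validation
        SAXParser parser = factory.newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Assumed DTD: Person has element-only content, so the indentation
        // inside it can be recognised as ignorable.
        String xml = "<!DOCTYPE Person [\n"
                + "  <!ELEMENT Person (Name, Address)>\n"
                + "  <!ELEMENT Name (#PCDATA)>\n"
                + "  <!ELEMENT Address (#PCDATA)>\n"
                + "]>\n"
                + "<Person>\n"
                + "  <Name>Test</Name>\n"
                + "  <Address>lee h&amp;y</Address>\n"
                + "</Person>";
        CountingHandler h = parse(xml);
        System.out.println("Text: " + h.text);
        System.out.println("Ignorable whitespace chars: " + h.ignorable);
    }
}
```

Without a DTD the parser cannot tell indentation from content, and the same whitespace would arrive through characters instead.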

In Java, accumulating the character data could look like this:

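import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class MyHandler extends DefaultHandler {

    // Collects every chunk reported by characters() since the last endElement
    private StringBuilder acc;

    public MyHandler() {
        acc = new StringBuilder();
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // The element is closed, so the accumulated text is complete
        System.out.printf("Characters accumulated: %s%n", acc.toString());
        acc.setLength(0); // reset for the next element
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // May be called several times per element; append every chunk
        acc.append(ch, start, length);
    }
}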
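To tie this back to the question, here is a minimal end-to-end sketch that combines the accumulator with the question's Address flag and runs it on the sample XML. The names AddressDemo, AddressHandler, and parseAddress are made up for this example, and the XML is the question's document with its closing tags corrected.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AddressDemo {

    // Hypothetical handler: collects only the text inside <Address>.
    static class AddressHandler extends DefaultHandler {
        private final StringBuilder acc = new StringBuilder();
        private boolean inAddress = false;
        String address; // set once </Address> is reached

        @Override
        public void startElement(String uri, String localName,
                String qName, Attributes attributes) {
            if (qName.equalsIgnoreCase("Address")) {
                inAddress = true;
                acc.setLength(0); // start from an empty buffer
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inAddress) {
                acc.append(ch, start, length); // keep every chunk
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equalsIgnoreCase("Address")) {
                address = acc.toString(); // all chunks have arrived by now
                inAddress = false;
            }
        }
    }

    static String parseAddress(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        AddressHandler handler = new AddressHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.address;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Person><Name>Test</Name><Phone>111-111-2222</Phone>"
                + "<Address>lee h&amp;y</Address></Person>";
        System.out.println("Address is: " + parseAddress(xml)); // Address is: lee h&y
    }
}
```

Clearing the flag in endElement rather than in characters is the key difference from the question's handler: the result no longer depends on how the parser splits the text.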
