Escape XML characters within nodes of an XML string in Java


Question

I have a string of XML data. I need to escape the values within the nodes, but not the nodes themselves.

Example:

<node1>R&R</node1>

should escape to:

<node1>R&amp;R</node1>

should not escape to:

&lt;node1&gt;R&amp;R&lt;/node1&gt;

I have been working on this for the last couple of days without much success. I'm not a Java expert, but here is what I have tried that does not work:


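  1. Parsing the XML string into a document. This doesn't work because the data within the nodes contains invalid XML.

  2. Escaping all of the characters. This doesn't work because the program receiving the data won't accept it in that format.

  3. Escaping all of the characters and then parsing into a document. This throws all sorts of errors.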
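The failure in the first attempt can be reproduced with the JDK's built-in parser. The following is a minimal sketch, not code from the original post: the class name ParseAttempt and the method parses are made up for illustration. It feeds a string to DocumentBuilder.parse and reports whether the parser accepts it. A bare & makes the input not well-formed, so the parser rejects it with a SAXException, while the escaped form parses.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only
final class ParseAttempt {
    // Returns true if the string parses as well-formed XML, false if the parser rejects it
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the input is not well-formed XML
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e); // unrelated to the content of the XML
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("<node1>R&R</node1>"));     // bare ampersand in the text
        System.out.println(parses("<node1>R&amp;R</node1>")); // escaped ampersand in the text
    }
}
```

The default parser may also print its own error message to standard error for the rejected input.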

Any help would be greatly appreciated.

Recommended answer

You could use regular expression matching to find each run of text between an opening tag and a closing tag, then loop through and process each one. In this example I've used Apache Commons Lang to do the XML escaping.

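// Imports this method needs; StringEscapeUtils.escapeXml10 comes from Apache Commons Lang 3
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringEscapeUtils;

public String sanitiseXml(String xml)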
{
    // Match the pattern <something>text</something>
    Pattern xmlCleanerPattern = Pattern.compile("(<[^/<>]*>)([^<>]*)(</[^<>]*>)");

    StringBuilder xmlStringBuilder = new StringBuilder();

    Matcher matcher = xmlCleanerPattern.matcher(xml);
    int lastEnd = 0;
    while (matcher.find())
    {
        // Include any non-matching text between this result and the previous result
        if (matcher.start() > lastEnd) {
            xmlStringBuilder.append(xml.substring(lastEnd, matcher.start()));
        }
        lastEnd = matcher.end();

        // Sanitise the characters inside the tags and append the sanitised version
        String cleanText = StringEscapeUtils.escapeXml10(matcher.group(2));
        xmlStringBuilder.append(matcher.group(1)).append(cleanText).append(matcher.group(3));
    }
    // Include any leftover text after the last result
    xmlStringBuilder.append(xml.substring(lastEnd));

    return xmlStringBuilder.toString();
}

This looks for matches of <something>text</something>, captures the opening tag, the contained text, and the closing tag, sanitises the contained text, and then puts the pieces back together. Anything outside a match is copied through unchanged.
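To try the approach without adding a dependency, the sketch below applies the same pattern to the example from the question. It is an illustration under assumptions rather than the answer's own code: the class name XmlTextEscaper and the methods escapeText and sanitise are hypothetical, and escapeText is a hand-written stand-in for StringEscapeUtils.escapeXml10 that handles only the five characters &, <, >, " and '. It uses Matcher.replaceAll with a function argument, which requires Java 9 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, for illustration only
final class XmlTextEscaper {
    // Same pattern as the answer: opening tag, text without angle brackets, closing tag
    private static final Pattern LEAF = Pattern.compile("(<[^/<>]*>)([^<>]*)(</[^<>]*>)");

    // Hand-written stand-in for StringEscapeUtils.escapeXml10 (assumption: five characters only)
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Escapes the text of every matched element and leaves everything else untouched
    static String sanitise(String xml) {
        // quoteReplacement stops '$' and '\' in the data from being read as group references
        return LEAF.matcher(xml).replaceAll(m ->
                Matcher.quoteReplacement(m.group(1) + escapeText(m.group(2)) + m.group(3)));
    }

    public static void main(String[] args) {
        System.out.println(sanitise("<node1>R&R</node1>"));
        System.out.println(sanitise("<a><b>x & y</b><c>$5 & up</c></a>"));
    }
}
```

Two limitations follow directly from the pattern and apply to both versions: text that is already escaped gets escaped again (&amp; becomes &amp;amp;), and an element whose text contains < or > is not matched, so it is left as it was.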
