Java - Removing the double quotes in XML attributes

Question

I have an xml string which I get via a REST call. However, some of the attributes have corrupted values. For example:

<property name="foo" value="Some corrupted String because of "something" like that"/>

How can I replace every double quote that is neither directly preceded by value= nor directly followed by /> with a single quote, and get a valid XML string out of that corrupted one, in Java 6?

Edit:

I have tried to adapt this lookahead/lookbehind regex, which was written for Visual Basic. However, I could not produce a Java version of it, I guess because the escape characters are incompatible. Here it is:

(小于?= ^ [^] *,(大于[^] *[^] *)* [^] *)(?!\\ S + \\ W + = | \\ S * [/?]?&GT;)|(小于?!?(= ^?\\ W + =)] *(大于[^] *[^] *)* [^] * $)

Recommended answer

You can use the following regex:

\s+[\w:.-]+="([^"]*(?:"(?!\s+[\w:.-]+="|\s*(?:\/?|\?)>)[^"]*)*)"

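See the regex demo. The pattern matches any attribute name/value pair and captures the value into Group 1, which we can then modify inside a callback. In the Java code below, the leading whitespace, attribute name, and opening quote are captured as well, so there the value is Group 2.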
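To make the pieces of the expression easier to follow, here is a sketch of the same pattern (in the two-group form used by the Java demo) written with Pattern.COMMENTS, which ignores whitespace and treats # to end of line as a comment. The class name AnnotatedAttributePattern and the sample tag are illustrative assumptions, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatedAttributePattern {
    // Same expression as in the answer, split across lines for readability.
    // Pattern.COMMENTS makes the regex engine skip the whitespace and '#' comments.
    static final Pattern ATTRIBUTE = Pattern.compile(
          "(\\s+[\\w:.-]+=\")        # group 1: whitespace, attribute name, equals sign, opening quote\n"
        + "(                         # group 2: the attribute value\n"
        + "  [^\"]*                  #   characters up to the next quote\n"
        + "  (?:                     #   then, repeatedly:\n"
        + "    \"                    #     a quote that belongs to the value, because it is\n"
        + "    (?!\\s+[\\w:.-]+=\"   #     not followed by another attribute\n"
        + "      |\\s*(?:/?|\\?)>)   #     and not followed by the end of the tag\n"
        + "    [^\"]*                #     followed by more non-quote characters\n"
        + "  )*\n"
        + ")\n"
        + "\"                        # the real closing quote\n",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        // Hypothetical sample tag with stray quotes inside the first attribute value.
        Matcher m = ATTRIBUTE.matcher("<p a=\"x \"y\" z\" b=\"1\"/>");
        while (m.find()) {
            System.out.println("value: " + m.group(2));
        }
    }
}
```

The negative lookahead is the heuristic at the heart of the approach: a quote counts as the closing quote only when it is followed by another attribute or by the end of the tag. A value that itself contains text shaped like an attribute (for example `" b="`) would still be split at the wrong place.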

Here is a Java code demo:

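import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s =  "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <resources> <resource> <properties> <property name=\"name\" value=\"retrieveFoo\"/>\n<property name=\"foo\" value=\"Some corrupted String because of \"something\" like that\"/>";
// appendReplacement only accepts a StringBuffer on Java 6
StringBuffer result = new StringBuffer();
// Group 1: whitespace, attribute name, and opening quote; group 2: the raw attribute value
Matcher m = Pattern.compile("(\\s+[\\w:.-]+=\")([^\"]*(?:\"(?!\\s+[\\w:.-]+=\"|\\s*(?:/?|\\?)>)[^\"]*)*)\"").matcher(s);
while (m.find()) {
    // Escape the stray quotes inside the value and restore the closing quote
    String fixed = m.group(1) + m.group(2).replace("\"", "&quot;") + "\"";
    // quoteReplacement keeps any '$' or '\' in the value from being read as a group reference
    m.appendReplacement(result, Matcher.quoteReplacement(fixed));
}
m.appendTail(result);
System.out.println(result.toString());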

Output:

<?xml version="1.0" encoding="UTF-8"?> <resources> <resource> <properties> <property name="name" value="retrieveFoo"/>
<property name="foo" value="Some corrupted String because of &quot;something&quot; like that"/>
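Since the goal is a valid XML string, one way to check the repair is to feed the result to a standard parser. The sketch below wraps the same regex in a helper and parses the repaired text with the JDK's DocumentBuilder; the class name AttributeQuoteFixer, the method name fixAttributeQuotes, and the shortened sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeQuoteFixer {
    // Same pattern as in the answer: group 1 is the attribute prefix, group 2 the raw value.
    private static final Pattern ATTRIBUTE = Pattern.compile(
        "(\\s+[\\w:.-]+=\")([^\"]*(?:\"(?!\\s+[\\w:.-]+=\"|\\s*(?:/?|\\?)>)[^\"]*)*)\"");

    // Hypothetical helper: escapes stray double quotes inside attribute values.
    static String fixAttributeQuotes(String xml) {
        Matcher m = ATTRIBUTE.matcher(xml);
        StringBuffer out = new StringBuffer(); // Java 6 compatible
        while (m.find()) {
            String fixed = m.group(1) + m.group(2).replace("\"", "&quot;") + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // A complete (closed) document so that the parser can accept it after the repair.
        String broken = "<properties><property name=\"foo\" "
            + "value=\"because of \"something\" like that\"/></properties>";
        String repaired = fixAttributeQuotes(broken);

        // parse() throws SAXException if the repaired text is still not well-formed.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(repaired)));
        Element property = (Element) doc.getElementsByTagName("property").item(0);
        System.out.println(property.getAttribute("value"));
    }
}
```

The parser unescapes &quot; again when the attribute is read, so downstream code sees the original text with its inner quotes. Other characters that are illegal in attribute values, such as a bare < or &, are not handled by this sketch.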
