Using regexp in Java to modify an XML document


Question

I'm trying to modify an XML document using regular expressions in Java, but I can't find the right way to do it. I have XML like this (simplified):

<ROOT>
   <NODE ord="1" />
   <NODE ord="3,2" />
</ROOT>

The real XML represents a sentence with its nodes, chunks and so on in two languages, and it has more attributes. Each sentence is loaded into two RichTextAreas (one for the source sentence, the other for the translated one).

What I need to do is add a style attribute to every node that has a specific value in its ord attribute (the style attribute will show correspondences between the two languages, the way Google Translate does when you mouse over a word). I know this could be done with the DOM (getting all the NODE elements and checking the ord attribute one by one), but I am looking for the fastest way to make the change, because it will run on the client side of my GWT app.

When the ord attribute has a single value (as in the first node), it is easy: take the XML as a string and use the replaceAll() function. The problem is when the attribute holds a composite value (as in the second node).

For example, how could I add that attribute if the value I'm looking for is 2? I believe this could be done with regular expressions, but I can't figure out how. Any hint or help would be appreciated (even if it doesn't use regexp and the replaceAll function).

Thanks.

Recommended answer

XPath can do this for you. You could select:

/ROOT/NODE[contains(concat(',', @ord, ','), ',2,')]
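
To experiment with this expression outside the browser, here is a minimal sketch that evaluates it with the standard javax.xml.xpath API on a regular JVM. It is meant only for trying the expression out, not as something to run in GWT client code. The class name OrdXPathDemo and the helper countMatches are made up for illustration, and the helper assumes the searched value consists of digits only, because it is spliced directly into the expression.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class OrdXPathDemo {

    // Hypothetical helper: counts the NODE elements whose comma-separated
    // ord list contains the given value as a whole item.
    static int countMatches(String xml, String value) {
        // Assumption: value is digits only, so splicing it into the
        // expression cannot break the XPath string literal.
        if (!value.matches("\\d+")) {
            throw new IllegalArgumentException("expected digits: " + value);
        }
        String expr = "/ROOT/NODE[contains(concat(',', @ord, ','), '," + value + ",')]";
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList hits = (NodeList) xpath.evaluate(
                    expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ROOT><NODE ord=\"1\" /><NODE ord=\"3,2\" /><NODE ord=\"12\" /></ROOT>";
        // Only the ord="3,2" node is selected; ord="12" is not a match for 2.
        System.out.println(countMatches(xml, "2"));
    }
}
```

The third NODE (ord="12") is not in the question's sample; it is added here to show that a value such as 12 is not picked up when searching for 2.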

Since you intend to use GWT on the client, you could give gwtxslt a try. With it you can specify an XSLT stylesheet that does the transformation (i.e. adds the attribute) for you:

XsltProcessor processor = new XsltProcessor();
processor.importStyleSheet(styleSheetText);
processor.importSource(sourceText);
processor.setParameter("ord", "2");
processor.setParameter("style", "whatever");
String resultString = processor.transform();
// do something with resultString

where styleSheetText could be an XSLT document along the lines of

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="ord"   select="''" />
  <xsl:param name="style" select="''" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="NODE">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:if test="contains(concat(',', @ord, ','), concat(',', $ord, ','))">
        <xsl:attribute name="style">
          <xsl:value-of select="$style" />
        </xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
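
To check a stylesheet like this before wiring it into the GWT client, one option is to run it on a regular JVM with the standard javax.xml.transform API, for instance from a unit test. The sketch below does that; it does not use gwtxslt. The class name OrdXsltDemo and the method transform are hypothetical, and the parameter names simply mirror the styleSheetText and sourceText variables used earlier.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OrdXsltDemo {

    // Hypothetical helper: applies the stylesheet to the source XML and
    // passes the two stylesheet parameters, "ord" and "style".
    static String transform(String styleSheetText, String sourceText,
                            String ord, String style) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(styleSheetText)));
            // Leave out the <?xml ...?> header so the result is easy to compare.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setParameter("ord", ord);
            transformer.setParameter("style", style);

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceText)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

Calling transform(styleSheetText, sourceText, "2", "whatever") on the sample XML should return a copy in which only the NODE with ord="3,2" carries the extra style attribute. The exact serialization (whitespace, self-closing tags) depends on the XSLT processor in use.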

Note that I use concat() to prevent partial matches in the comma-separated list that the value of @ord actually is (so that, for example, 2 does not match inside 12).
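
The same whole-item idea can be carried over to the replaceAll() approach the question asks about. The following is only a rough sketch, not part of the answer above: the class OrdStyleRegex and both helpers are hypothetical, and the regex assumes the simplified layout from the question (double-quoted attributes, no ">" inside attribute values, no existing style attribute on NODE). It sticks to String methods and validates its inputs instead of using java.util.regex helpers, on the assumption that this keeps it closer to what GWT client code can use; I have not verified it under GWT.

```java
public class OrdStyleRegex {

    // The delimiter trick in plain Java: wrap both sides in commas so that
    // only whole list items can match ("12" does not contain item "2").
    static boolean ordContains(String ord, String value) {
        return ("," + ord + ",").contains("," + value + ",");
    }

    // Hypothetical helper: inserts style="..." right after the ord attribute
    // of every NODE whose ord list contains value as a whole item.
    static String addStyle(String xml, String value, String style) {
        // Assumptions: value is digits only (no regex metacharacters) and
        // style has no characters that are special in a replacement string
        // or that would end the attribute value.
        if (!value.matches("\\d+")) {
            throw new IllegalArgumentException("expected digits: " + value);
        }
        if (style.indexOf('$') >= 0 || style.indexOf('\\') >= 0 || style.indexOf('"') >= 0) {
            throw new IllegalArgumentException("unsupported character in style");
        }
        // ord=" + optional items ending in a comma + value
        //       + optional items starting with a comma + closing quote
        String regex = "(<NODE\\b[^>]*?\\bord=\"(?:[^\"]*,)?" + value + "(?:,[^\"]*)?\")";
        return xml.replaceAll(regex, "$1 style=\"" + style + "\"");
    }

    public static void main(String[] args) {
        String xml = "<ROOT><NODE ord=\"1\" /><NODE ord=\"3,2\" /></ROOT>";
        // prints <ROOT><NODE ord="1" /><NODE ord="3,2" style="color:red" /></ROOT>
        System.out.println(addStyle(xml, "2", "color:red"));
    }
}
```

A regex like this is brittle against XML that deviates from those assumptions, which is one reason the XPath/XSLT route is the safer choice when the markup is not fully under your control.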
