JAXB convert non-ASCII characters to ASCII characters


Problem description


I have some XSD schemas whose element names contain non-ASCII characters. When I generate Java classes with the Generate JAXB Classes command in Eclipse Kepler, the generated classes and their variables contain non-ASCII characters. I want to transform these non-ASCII characters to ASCII characters.

I have already set the locale in JAVA_TOOL_OPTIONS:

-Duser.country=GB -Duser.language=en

For example:

İ -> I
Ç -> C
Ş -> S
Ö -> O
Ğ -> G
Ü -> U
ı -> i
ö -> o
ü -> u
ç -> c
ğ -> g
ş -> s

Solution

EDIT: Since the requirement is for a generic solution that does not use external binding files, I offer two options below:

Option 1 - A Generic Solution - Create a Custom XJC plugin to normalize

The generic solution is effectively:

  1. Extend the com.sun.tools.xjc.Plugin abstract class and override the methods that JAXB uses to name the artifacts; basically, create a plugin
  2. Pack this implementation in a jar, naming the implementation class in a file under the services directory of the jar's META-INF folder
  3. Deploy this newly created jar along with the JAXB libs and run it through Ant (build.xml provided below, read on)

For your purpose, I have created the plugin: you can download the jar from here and the Ant script (build.xml) from here. Put the jar on your build path in Eclipse, edit the Ant file to provide the locations of your JAXB libs, the target package of the generated classes, the project name and the schema location, and run it. That's it!

Explanation:

I created a custom XJC plugin with an extra command line option, -normalize, that replaces the accented characters in the generated Java classes, methods, variables, properties and interfaces with their ASCII equivalents.

XJC supports custom plugins that control the names, annotations and other attributes of the generated classes, variables and so on. This blog post, though old, can get you started with the basics of such plugin implementations.

Long story short, I created a class extending the abstract com.sun.tools.xjc.Plugin class and overrode its methods, the important one being onActivated.

In this method, I call com.sun.tools.xjc.Options#setNameConverter with a custom class that overrides the methods used to derive the names of classes, methods and so on. I have committed the source to my git repo here as well; the code and its detailed usage follow:

package com.popofibo.plugins.jaxb;

import java.text.Normalizer;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;

import com.sun.tools.xjc.BadCommandLineException;
import com.sun.tools.xjc.Options;
import com.sun.tools.xjc.Plugin;
import com.sun.tools.xjc.outline.Outline;
import com.sun.xml.bind.api.impl.NameConverter;

/**
 * {@link Plugin} that normalizes the names of JAXB generated artifacts
 * 
 * @author popofibo
 */
public class NormalizeElements extends Plugin {

    /**
     * Set the command line option
     */
    @Override
    public String getOptionName() {
        return "normalize";
    }

    /**
     * Usage content of the option
     */
    @Override
    public String getUsage() {
        return "  -normalize    :  normalize the classes and method names generated by removing the accented characters";
    }

    /**
     * Set the name converter option to a delegated custom implementation of
     * NameConverter.Standard
     */
    @Override
    public void onActivated(Options opts) throws BadCommandLineException {
        opts.setNameConverter(new NonAsciiConverter(), this);
    }

    /**
     * Always return true
     */
    @Override
    public boolean run(Outline model, Options opt, ErrorHandler errorHandler)
            throws SAXException {
        return true;
    }

}

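/**
 * {@link NameConverter} that delegates to NameConverter.Standard and then
 * strips accented characters from the resulting name
 * 
 * @author popofibo
 */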
class NonAsciiConverter extends NameConverter.Standard {

    /**
     * Override the generated class name
     */
    @Override
    public String toClassName(String s) {
        String origStr = super.toClassName(s);
        return normalize(origStr);
    }

    /**
     * Override the generated property name
     */
    @Override
    public String toPropertyName(String s) {
        String origStr = super.toPropertyName(s);
        return normalize(origStr);
    }

    /**
     * Override the generated variable name
     */
    @Override
    public String toVariableName(String s) {
        String origStr = super.toVariableName(s);
        return normalize(origStr);
    }

    /**
     * Override the generated interface name
     */
    @Override
    public String toInterfaceName(String s) {
        String origStr = super.toInterfaceName(s);
        return normalize(origStr);
    }

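    /**
     * Decompose the accented characters of a String using the Canonical
     * Decomposition (NFD) form of the Normalizer, then remove every character
     * outside the POSIX ASCII character class with a regex replaceAll.
     * Characters that have no canonical decomposition (for example the
     * dotless i, U+0131) are removed rather than replaced.
     * 
     * @param accented the name as produced by NameConverter.Standard
     * @return normalized String
     */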
    private String normalize(String accented) {
        String normalized = Normalizer.normalize(accented, Normalizer.Form.NFD);
        normalized = normalized.replaceAll("[^\\p{ASCII}]", "");
        return normalized;
    }
}
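
The normalize step can be tried on its own, outside XJC. The following is a minimal sketch, not part of the plugin: the class name NormalizeDemo and the method stripAccentsKeepingDotlessI are hypothetical names introduced here for illustration. It uses the same NFD-then-strip approach and shows one caveat for the Turkish mapping in the question: the dotless i (U+0131) has no canonical decomposition, so it is dropped unless it is mapped explicitly first. The literals use Unicode escapes so the file compiles regardless of source encoding.

```java
import java.text.Normalizer;

public class NormalizeDemo {

    // Same approach as the plugin's normalize(): decompose to NFD,
    // then drop every character that is not ASCII.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical variant (an assumption, not in the original plugin):
    // map the dotless i (U+0131) to 'i' before stripping, because NFD
    // cannot decompose it and it would otherwise disappear.
    static String stripAccentsKeepingDotlessI(String s) {
        return stripAccents(s.replace('\u0131', 'i'));
    }

    public static void main(String[] args) {
        // S-cedilla + "h" + I-with-dot-above + "pto", as in the sample schema
        System.out.println(stripAccents("\u015Eh\u0130pto"));           // ShIpto
        // "k" + dotless i + "z": the dotless i is silently removed
        System.out.println(stripAccents("k\u0131z"));                   // kz
        // With the explicit mapping the letter survives as a plain 'i'
        System.out.println(stripAccentsKeepingDotlessI("k\u0131z"));    // kiz
    }
}
```

If the schemas contain the dotless i, a pre-mapping step of this kind could be added to NonAsciiConverter.normalize before the Normalizer call.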

To enable this plugin in a normal XJC run, pack these classes in a jar, add a /META-INF/services/com.sun.tools.xjc.Plugin file to the jar, and put the jar on your build path.

The /META-INF/services/com.sun.tools.xjc.Plugin file within the jar reads:

com.popofibo.plugins.jaxb.NormalizeElements

As mentioned before, I packed it in a jar and put it on my Eclipse build path. The problem I ran into when running Eclipse Kepler with JDK 1.7 is this exception (message):

com.sun.tools.xjc.plugin Provider <my class> not a subtype

Hence, it is better to generate the classes using Ant; the following build.xml does justice to the work done so far:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<project name="SomeProject" default="createClasses">

    <taskdef name="xjc" classname="com.sun.tools.xjc.XJC2Task">
        <classpath>
            <pathelement
                path="C:/Workspace/jaxb-ri-2.2.7/jaxb-ri-2.2.7/lib/jaxb-xjc.jar" />
            <pathelement
                path="C:/Workspace/jaxb-ri-2.2.7/jaxb-ri-2.2.7/lib/jaxb-impl.jar" />
            <pathelement
                path="C:/Workspace/jaxb-ri-2.2.7/jaxb-ri-2.2.7/lib/jaxb2-value-constructor.jar" />
            <pathelement path="C:/Workspace/normalizeplugin_xjc_v0.4.jar" />
        </classpath>
    </taskdef>

    <target name="clean">
        <delete dir="src/com/popofibo/jaxb" />
    </target>

    <target name="createClasses" depends="clean">
        <xjc schema="res/some.xsd" destdir="src" package="com.popofibo.jaxb"
            encoding="UTF-8">
            <arg value="-normalize" />
        </xjc>
    </target>
</project>

The schema I chose to showcase this normalization process was:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="shiporder">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Örderperson" type="xs:string"/>
      <xs:element name="Şhİpto">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="address" type="xs:string"/>
            <xs:element name="Çity" type="xs:string"/>
            <xs:element name="ÇoÜntry" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:element name="İtem" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="note" type="xs:string" minOccurs="0"/>
            <xs:element name="qÜantity" type="xs:positiveInteger"/>
            <xs:element name="price" type="xs:decimal"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="orderid" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>

</xs:schema> 

As you can see, I have set the -normalize argument and the package in which I want my classes generated, and voila: the generated artifacts have ASCII names for classes, methods and variables (the only gap I see is the XML annotations, which does not affect the goal and is also easy to overcome):

The above screenshot shows that the names were normalized and replaced by their ASCII counterparts (to check how it would look without the replacement, please refer to the screenshots in Option 2).
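
One thing worth checking before relying on accent stripping is whether two distinct schema names collapse to the same ASCII name. The sketch below is an illustration only and is not part of the plugin: the class name CollisionCheck and the sibling element "City" are hypothetical. It groups a list of names by their stripped form and reports the groups with more than one member. It covers only the accent-stripping step, not the capitalization rules that NameConverter.Standard applies.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollisionCheck {

    // Same NFD-then-strip approach as the plugin's normalize().
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Groups names by their stripped form and keeps only the groups
    // that contain more than one original name.
    static Map<String, List<String>> findCollisions(List<String> names) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String name : names) {
            groups.computeIfAbsent(stripAccents(name), k -> new ArrayList<>())
                  .add(name);
        }
        groups.values().removeIf(group -> group.size() < 2);
        return groups;
    }

    public static void main(String[] args) {
        // C-cedilla + "ity" comes from the sample schema; the plain "City"
        // sibling is a hypothetical addition to trigger a collision.
        List<String> names = Arrays.asList("\u00C7ity", "City", "name", "address");
        System.out.println(findCollisions(names).keySet()); // [City]
    }
}
```

When a collision like this shows up, the external binding file from Option 2 can be used to give the affected elements distinct names.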

Option 2 - Using External binding file

To remove accented characters, you can create a custom binding file and use it to bind your class and property names while generating your classes. Refer to: Creating an External Binding Declarations File Using JAXB Binding Declarations

I took the xsd already mentioned in Option 1 with element names containing "accented" (Non-ASCII) characters:

If I generate the classes without specifying the external binding, I get the following outputs:


Now, to generate class names and variables of my choice, I customize the binding and write my binding.xml as:

<jxb:bindings xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb" version="2.1">
    <jxb:globalBindings localScoping="toplevel" />

    <jxb:bindings schemaLocation="some.xsd">
        <jxb:bindings node="//xs:element[@name='Şhİpto']">
            <jxb:class name="ShipTo" />
        </jxb:bindings>
        <jxb:bindings node="//xs:element[@name='Örderperson']">
            <jxb:property name="OrderPerson" />
        </jxb:bindings>
        <jxb:bindings node="//xs:element[@name='Şhİpto']//xs:complexType">
            <jxb:class name="ShipToo" />
        </jxb:bindings>
    </jxb:bindings>

</jxb:bindings>

Now I generate my classes through Eclipse by specifying the binding file:

In the next steps, I choose the package and the binding file, and I get:

Note: If you are not using Eclipse to generate your classes, you may want to look at the xjc binding compiler to make use of your external binding file.
