Heroku replacing UTF-8 bytes with � (0xEF 0xBF 0xBD)


Problem description

I'm facing a charset problem (UTF-8) in a Java file hosted on Heroku.

A small example explains it best:

// '…' UTF-8 encoding is 0xE2 0x80 0xA6
// stringToHex() outputs the HEX value to console/log
stringToHex(new String("…".getBytes(), "UTF-8"));

Now, everything works perfectly locally (Tomcat 7): "0xE2 0x80 0xA6" is output in the console.

When I try it on the staging server, hosted on Heroku (Jetty 7), "0xEF 0xBF 0xBD 0xEF 0xBF 0xBD 0xEF 0xBF 0xBD" is written to the log instead.

Both the servers are running java with the parameter "-Dfile.encoding=UTF-8" (so Charset.defaultCharset().toString() outputs "UTF-8" in both).

Can anyone help me solve this bizarre problem?

Thanks.

Update - forgot to say: all the files are encoded in UTF-8 and compiled using javac -encoding UTF-8

Update 2 - tried with '£' instead of '…', and on the staging server I get "0xEF 0xBF 0xBD 0xEF 0xBF 0xBD" instead of "0xC2 0xA3". It seems every single byte is being converted to "0xEF 0xBF 0xBD" (the UTF-8 encoding of U+FFFD, the replacement character �).

Update 3 - since Heroku is using Jetty, I tried using Jetty locally and everything is working perfectly.

Update 4 - here is my stringToHex() function:

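// Logs each UTF-8 byte of the given string as a space-separated hex value.
private void stringToHex(String string) throws UnsupportedEncodingException {
    StringBuilder result = new StringBuilder();
    for (byte b : string.getBytes("UTF-8")) {
        // Mask to 0..255 so negative bytes are not sign-extended.
        String tmp = Integer.toHexString(0xFF & b);
        if (tmp.length() == 1) {
            // Left-pad single-digit values: 0xA must print as "0A", not "A0".
            tmp = "0" + tmp;
        }

        result.append("0x").append(tmp.toUpperCase()).append(" ");
    }

    logger.info(result.toString());
}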

To compile in UTF-8 I use the maven-compiler-plugin. pom.xml relevant part:

<plugins>
    ...
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
            <encoding>UTF-8</encoding>
        </configuration>
    </plugin>
    ...
</plugins>

Solution

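The problem was caused by the AspectJ configuration. If you use AspectJ with Java and Spring, the sources are compiled by the aspectj-maven-plugin, so the encoding has to be specified in that plugin's configuration too (see the <encoding> element near the end); setting it on the maven-compiler-plugin alone was not enough: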

<plugins>
    ...
    <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>aspectj-maven-plugin</artifactId>
        <version>1.0</version>
        <dependencies>
            <dependency>
                <groupId>org.aspectj</groupId>
                <artifactId>aspectjrt</artifactId>
                <version>1.6.10</version>
            </dependency>
            <dependency>
                <groupId>org.aspectj</groupId>
                <artifactId>aspectjtools</artifactId>
                <version>1.6.10</version>
            </dependency>
        </dependencies>
        <executions>
            <execution>
                <goals>
                    <goal>compile</goal>
                    <goal>test-compile</goal>
                </goals>
            </execution>
        </executions>
        <configuration>
            <outxml>true</outxml>
            <verbose>true</verbose>
            <showWeaveInfo>true</showWeaveInfo>
            <aspectLibraries>
                <aspectLibrary>
                    <groupId>org.springframework</groupId>
                    <artifactId>spring-aspects</artifactId>
                </aspectLibrary>
            </aspectLibraries>
            <source>1.6</source>
            <target>1.6</target>
            <encoding>UTF-8</encoding>
        </configuration>
    </plugin>
    ...
</plugins>
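
To illustrate why a missing compile-time encoding would produce exactly the byte pattern from the question, here is a minimal, self-contained sketch. It assumes the build environment decoded the UTF-8 source files with an ASCII-like default charset, which the question does not confirm; the names `ReplacementCharDemo`, `misdecode` and `toHex` are hypothetical and exist only for this illustration. Under that assumption, each non-ASCII byte of a string literal becomes U+FFFD, and U+FFFD encoded as UTF-8 is 0xEF 0xBF 0xBD.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {

    // Simulates a compiler reading UTF-8 source bytes with the wrong charset.
    // Assumption: the wrong charset is US-ASCII. Every byte >= 0x80 is
    // malformed in ASCII, so the decoder substitutes U+FFFD for each one.
    public static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    // Formats the UTF-8 bytes of a string as "0xNN 0xNN ...".
    public static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of its own source encoding.
        System.out.println(toHex("\u2026"));            // intact ellipsis
        System.out.println(toHex(misdecode("\u2026"))); // three bytes -> three U+FFFD
        System.out.println(toHex(misdecode("\u00A3"))); // two bytes -> two U+FFFD
    }
}
```

Running `main` prints the intact bytes of '…' first, followed by the two corrupted patterns reported in the question: one 0xEF 0xBF 0xBD triple per original byte. Because the damage in this scenario happens when the literal is compiled, no runtime setting such as -Dfile.encoding can undo it.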
