Convert UTF-8 Unicode string to ASCII Unicode escaped String


Problem description

I need to convert a Unicode string to a string in which every non-ASCII character is written as a Unicode escape. For example, the string "漢字 Max" should be presented as "\u6F22\u5B57 Max".

I have tried:


  1. Different combinations of

new String(sourceString.getBytes(encoding1), encoding2)

  2. Apache StringEscapeUtils, which also escapes ASCII chars such as the double quote

StringEscapeUtils.escapeJava(source)
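
To illustrate why the first attempt cannot produce the desired output, here is a small sketch. The class name ReencodeDemo and the helper reencode are made up for this example. Encoding a string with one charset and decoding the bytes with another only reinterprets those bytes as different characters; it never generates escape sequences as text.

```java
import java.nio.charset.Charset;

public class ReencodeDemo {
    // Hypothetical helper: encode with one charset, then decode the same bytes with another.
    public static String reencode(String s, String from, String to) {
        return new String(s.getBytes(Charset.forName(from)), Charset.forName(to));
    }

    public static void main(String[] args) {
        // The two CJK characters from the question, written as escapes so the
        // source file stays pure ASCII.
        String garbled = reencode("\u6F22\u5B57", "UTF-8", "ISO-8859-1");
        // Each character takes 3 bytes in UTF-8, and ISO-8859-1 maps every
        // byte to one char, so the result has 6 unrelated characters.
        System.out.println(garbled.length());
        // No backslash-u text appears anywhere in the result.
        System.out.println(garbled.contains("\\u"));
    }
}
```

Under these assumptions the program prints 6 and false, which is why a character-by-character loop is needed instead.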

Is there an easy way to encode such a string? Ideally, only Java 6 SE or Apache Commons should be used to achieve the desired result.

Recommended answer

This is the kind of simple code Jon Skeet had in mind in his comment:

final String in = "šđčćasdf";
final StringBuilder out = new StringBuilder();
for (int i = 0; i < in.length(); i++) {
  final char ch = in.charAt(i);
  // ASCII characters (0..127) are copied unchanged
  if (ch <= 127) out.append(ch);
  // everything else becomes a backslash, 'u' and four lowercase hex digits
  else out.append("\\u").append(String.format("%04x", (int)ch));
}
System.out.println(out.toString());

As Jon said, surrogate pairs will be represented as a pair of \u escapes.
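
The loop above can be wrapped in a reusable method. The following is only a sketch under a few assumptions: the class name UnicodeEscaper and the method name escapeNonAscii are hypothetical, and it uses %04X (uppercase hex) to match the format shown in the question, whereas the answer's snippet uses lowercase %04x. It also demonstrates the surrogate-pair behaviour: a character outside the Basic Multilingual Plane, here U+1F600, is stored as two char values and therefore comes out as two escapes.

```java
public class UnicodeEscaper {
    // Hypothetical helper: copies ASCII unchanged and escapes every other
    // UTF-16 code unit as backslash, 'u' and four uppercase hex digits.
    public static String escapeNonAscii(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            if (ch <= 127) {
                out.append(ch);
            } else {
                // The cast to int is required: the x/X conversions reject a char argument.
                out.append("\\u").append(String.format("%04X", (int) ch));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The example string from the question, written with escapes so the
        // source file itself stays pure ASCII.
        System.out.println(escapeNonAscii("\u6F22\u5B57 Max"));
        // U+1F600 lies outside the BMP, so Java stores it as a surrogate pair.
        System.out.println(escapeNonAscii(new String(Character.toChars(0x1F600))));
    }
}
```

Run as written, this should print \u6F22\u5B57 Max on the first line and \uD83D\uDE00 on the second. Unlike StringEscapeUtils.escapeJava, ASCII characters such as the double quote pass through untouched.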
