Is there a drastic difference between UTF-8 and UTF-16?


Question

I call a web service that returns a response XML encoded in UTF-8. I checked that in Java using the getAllHeaders() method.

In my Java code, I take that response, do some processing on it, and later pass it on to a different service.

I googled a bit and found out that by default the encoding for strings in Java is UTF-16.

In my response XML, one of the elements contained the character É. This got corrupted in the post-processing request that I make to a different service.

Instead of É, it sent some gibberish. Is there really a big difference between these two encodings? And if I wanted to know what É converts to when going from UTF-8 to UTF-16, how can I find that out?

Thanks

Recommended answer


Both UTF-8 and UTF-16 are variable-length encodings. However, in UTF-8 a character occupies a minimum of 8 bits (1 to 4 bytes per code point), while in UTF-16 a character occupies at least 16 bits (2 or 4 bytes per code point).
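To see what É looks like in each encoding, one option is to dump the encoded bytes as hex. The following is a minimal sketch; the class and method names (`EncodingBytes`, `utf8Hex`, `utf16Hex`) are made up for illustration, and `UTF_16BE` is chosen here only so that no byte order mark is written.

```java
import java.nio.charset.StandardCharsets;

class EncodingBytes {
    // Formats a byte array as space-separated uppercase hex, e.g. "C3 89".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hex dump of the string encoded as UTF-8.
    static String utf8Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_8));
    }

    // Hex dump of the string encoded as big-endian UTF-16 without a BOM.
    static String utf16Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_16BE));
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9"; // U+00C9, LATIN CAPITAL LETTER E WITH ACUTE
        System.out.println(utf8Hex(eAcute));  // C3 89
        System.out.println(utf16Hex(eAcute)); // 00 C9
    }
}
```

The same code point U+00C9 is two bytes in both encodings here, but the byte values differ, so bytes written in one encoding cannot be read as the other.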

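Main UTF-8 pros:


  1. Basic ASCII characters such as digits and unaccented Latin letters
    occupy one byte, identical to their US-ASCII representation. Every
    US-ASCII string is therefore valid UTF-8, which provides good backward
    compatibility in many cases.

  2. No null bytes (other than the encoding of U+0000 itself), which allows
    null-terminated strings and also gives a good deal of backward compatibility.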

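Main UTF-8 cons:


  1. Many common characters have different lengths, which makes indexing by
    character position and computing string length slow, since both require scanning the bytes.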
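The cost of computing length can be illustrated with a small sketch. It assumes the input is well-formed UTF-8 and counts code points by skipping continuation bytes (those of the form 10xxxxxx); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf8Length {
    // Counts code points in well-formed UTF-8 by counting every byte that
    // is not a continuation byte (10xxxxxx). This is a linear scan: the
    // byte length alone does not say how many characters there are.
    static int countCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] bytes = "caf\u00C9".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length);           // 5
        System.out.println(countCodePoints(bytes)); // 4
    }
}
```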

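Main UTF-16 pros:


  1. Most commonly used characters (Latin, Cyrillic, Chinese, Japanese)
    can be represented with 2 bytes. Unless truly exotic characters are
    needed, the 16-bit subset of UTF-16 can be used as a
    fixed-length encoding, which speeds up indexing.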
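In Java this shows up as the difference between UTF-16 code units and code points. The sketch below uses hypothetical helper names; `String.length()` and `String.codePointCount` are standard methods. U+1F600 is used as an example of a character outside the 16-bit subset.

```java
class Utf16Units {
    // Number of 16-bit UTF-16 code units; this is what String.length() returns.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "A\u00C9\u4E2D";      // three characters, all in the 16-bit subset
        String outside = "\uD83D\uDE00";   // U+1F600, stored as a surrogate pair
        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));         // 3 3
        System.out.println(codeUnits(outside) + " " + codePoints(outside)); // 2 1
    }
}
```

As long as the text stays within the 16-bit subset, `charAt(i)` is a direct index; once surrogate pairs appear, code units and characters no longer line up.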

Main UTF-16 cons:


  1. Lots of null bytes in US-ASCII strings, which means null-terminated
    strings cannot be used and a lot of memory is wasted.
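The null-byte point is easy to check by encoding a plain ASCII string both ways. This is an illustrative sketch with made-up names, using `UTF_16BE` so that no byte order mark is included in the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ZeroBytes {
    // Counts the 0x00 bytes produced when s is encoded with the given charset.
    static int countZeroBytes(String s, Charset charset) {
        int zeros = 0;
        for (byte b : s.getBytes(charset)) {
            if (b == 0) {
                zeros++;
            }
        }
        return zeros;
    }

    public static void main(String[] args) {
        String ascii = "abc";
        System.out.println(ascii.getBytes(StandardCharsets.UTF_8).length);    // 3
        System.out.println(ascii.getBytes(StandardCharsets.UTF_16BE).length); // 6
        System.out.println(countZeroBytes(ascii, StandardCharsets.UTF_8));    // 0
        System.out.println(countZeroBytes(ascii, StandardCharsets.UTF_16BE)); // 3
    }
}
```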

In general, UTF-16 is usually better for in-memory representation, while UTF-8 is extremely good for text files and network protocols.
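For the situation in the question, a garbled É is often a sign that bytes were decoded or re-encoded with the wrong charset somewhere along the way, not that the in-memory UTF-16 form is at fault. That is an assumption, since the asker's code is not shown. The sketch below (hypothetical names, with ISO-8859-1 picked only as an example of a wrong charset) shows how naming the charset explicitly at both ends keeps the character intact.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetRoundTrip {
    // Decodes bytes with an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // True if decoding as UTF-8 and re-encoding as UTF-8 yields the same bytes.
    static boolean survivesUtf8RoundTrip(byte[] utf8) {
        String text = decode(utf8, StandardCharsets.UTF_8);
        return Arrays.equals(utf8, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] wire = {(byte) 0xC3, (byte) 0x89}; // UTF-8 bytes for U+00C9
        String right = decode(wire, StandardCharsets.UTF_8);
        String wrong = decode(wire, StandardCharsets.ISO_8859_1);
        System.out.println(right.equals("\u00C9"));       // true
        System.out.println(wrong.equals("\u00C3\u0089")); // true: two wrong characters
        System.out.println(survivesUtf8RoundTrip(wire));  // true
    }
}
```

Calling `new String(bytes)` or `getBytes()` without a charset uses the platform default, which may differ between machines, so passing the charset explicitly is the safer habit.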
