Byte and char conversion in Java

Views: 123 | Published: 2017/8/16 19:27:12 | Tags: java encoding unicode utf-16

Question

If I convert a character to byte and then back to char, that character mysteriously disappears and becomes something else. How is this possible?

This is the code:

char a = 'È';       // line 1       
byte b = (byte)a;   // line 2       
char c = (char)b;   // line 3
System.out.println((char)c + " " + (int)c);

Until line 2 everything is fine:

In line 1 I could print "a" in the console and it would show "È".
In line 2 I could print "b" in the console and it would show -56, which is 200 read as a signed byte. And 200 is "È". So it's still fine.

But what's wrong in line 3? "c" becomes something else and the program prints "? 65480". That's something completely different.

What should I write in line 3 in order to get the correct result?

Solution

A char in Java is a UTF-16 code unit, which is treated as an unsigned 16-bit number. So if you perform c = (char)b, the value you get is 2^16 - 56, that is 65536 - 56 = 65480.

Or more precisely, the byte is first converted to a signed integer with the value 0xFFFFFFC8 using sign extension in a widening conversion. This in turn is then narrowed down to 0xFFC8 when casting to a char, which translates to the positive number 65480.
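The two steps can be traced by making the intermediate int explicit. This is only a minimal sketch: the class and method names are made up for illustration, and 'È' is written as the escape \u00C8 so the result does not depend on the source-file encoding.

```java
public class ByteToCharTrace {
    // Step 1: widening primitive conversion byte -> int (sign-extends).
    static int widen(byte b) {
        return b;
    }

    // Step 2: narrowing primitive conversion int -> char (keeps the low 16 bits).
    static char narrow(int i) {
        return (char) i;
    }

    public static void main(String[] args) {
        char a = '\u00C8';          // 'È', code unit 200 (0x00C8)
        byte b = (byte) a;          // keeps the low 8 bits: 0xC8, read as -56
        int widened = widen(b);     // 0xFFFFFFC8
        char c = narrow(widened);   // 0xFFC8

        System.out.println(b);                             // -56
        System.out.println(Integer.toHexString(widened));  // ffffffc8
        System.out.println((int) c);                       // 65480
    }
}
```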

From the language specification:

5.1.4. Widening and Narrowing Primitive Conversion

First, the byte is converted to an int via widening primitive conversion (§5.1.2), and then the resulting int is converted to a char by narrowing primitive conversion (§5.1.3).

To get the right code point, use char c = (char) (b & 0xFF). The mask zeroes the top 24 bits after the byte has been widened to an int: 0xFFFFFFC8 becomes 0x000000C8, which is the positive number 200 in decimal.
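A short sketch of the masking fix follows; the class and helper names are hypothetical. Note that it can only recover characters in the range U+0000 to U+00FF, because the earlier cast from char to byte has already discarded the upper 8 bits.

```java
public class UnsignedByteToChar {
    // Treats the byte as an unsigned value (0..255) and returns the char with that code unit.
    static char toChar(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xC8;                       // -56 as a signed byte
        char c = toChar(b);

        System.out.println((int) c);                // 200
        System.out.println(c == '\u00C8');          // true
        // Byte.toUnsignedInt (Java 8+) gives the same value as b & 0xFF.
        System.out.println(Byte.toUnsignedInt(b));  // 200
    }
}
```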

The above is a direct explanation of what happens during conversion between the byte, int and char primitive types.

If you want to encode/decode characters from bytes, use Charset, CharsetEncoder, CharsetDecoder or one of the convenience methods such as new String(byte[] bytes, Charset charset) or String#getBytes(Charset charset). You can get the common character sets (such as UTF-8 or ISO-8859-1) from StandardCharsets; others, such as Windows-1252, can be looked up with Charset.forName.
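As a rough illustration of the charset-based route, the sketch below encodes and decodes a one-character string with String#getBytes(Charset) and new String(byte[], Charset). The class and the roundTrip helper are hypothetical names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Encodes the text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "\u00C8"; // "È"

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);    // 2 (bytes 0xC3 0x88)
        System.out.println(latin1.length);  // 1 (byte 0xC8)
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

The same character takes a different number of bytes depending on the charset, which is why a plain cast between byte and char is not a substitute for decoding.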
