What is the character encoding of String in Java?


Question

I am confused about how strings are encoded in a language. I have a few questions; please help if you know the answers.

1) What is the native encoding of Java strings in memory? That is, when I write String a = "Hello", in which format is it stored? Since Java is machine independent, I don't think it is the system's encoding.

2) I read online that "UTF-16" is the default encoding, but I got confused: when I write int a = 'c', the result is the ASCII value. So are ASCII and UTF-16 the same?

3) Also, on what factors does the in-memory storage of a string depend: the OS, the language?

Solution

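1) Strings are objects, which typically contain a char array and the string's length. The character array is usually implemented as a contiguous array of 16-bit words, each one holding a UTF-16 code unit in native byte order. A character in the Basic Multilingual Plane occupies one code unit; a character outside it occupies two (a surrogate pair). Since Java 9, HotSpot may instead store a string that contains only Latin-1 characters in a byte[] at one byte per character, but the String API still exposes 16-bit char values.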

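2) Assigning a character value to an integer converts the 16-bit Unicode character code into its integer equivalent. Thus 'c', which is U+0063, becomes 0x0063, or 99. ASCII and UTF-16 are not the same encoding, but the first 128 Unicode code points coincide with ASCII, which is why the result matches the ASCII value.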

3) Since each String is an object, it contains information beyond its class members (e.g., class descriptor word, lock/semaphore word, etc.).
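
Points 1 and 2 can be checked with a short program. This is only a sketch: the class name StringEncodingDemo and the helper codeOf are made up for illustration, and the behaviour shown is that of the public String API, not of any particular JVM's internal layout.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are not from the original answer.
final class StringEncodingDemo {

    // Widening a char to int yields the numeric value of its UTF-16 code unit.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        int a = 'c';
        System.out.println(a);                       // 99
        System.out.println(Integer.toHexString(a));  // 63, i.e. U+0063

        String hello = "Hello";
        System.out.println(hello.length());          // 5 UTF-16 code units

        // U+1F600 lies outside the Basic Multilingual Plane,
        // so it is written as a surrogate pair of two char values.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.length());                           // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));  // 1

        // An external encoding is chosen only when converting to bytes.
        System.out.println(hello.getBytes(StandardCharsets.US_ASCII).length);  // 5
        System.out.println(hello.getBytes(StandardCharsets.UTF_16BE).length);  // 10
    }
}
```

The last two lines show why ASCII and UTF-16 are different encodings even though 'c' has the value 99 in both: the numeric value is the same, but the number of bytes used to store it is not.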

ADDENDUM
The object contents depend on the JVM implementation (which determines the inherent overhead associated with each object), and how the class is actually coded (i.e., some libraries may be more efficient than others).

EXAMPLE
A typical implementation will allocate an overhead of two words per object instance (for the class descriptor/pointer, and a semaphore/lock control word); a String object also contains an int length and a char[] array reference. The actual character contents of the string are stored in a second object, the char[] array, which in turn is allocated two words, plus an array length word, plus as many 16-bit char elements as needed for the string (plus any extra chars that were left hanging around when the string was created).
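
The arithmetic in this example can be written out as a rough estimate. The sketch below assumes exactly the layout described above and ignores alignment padding, extra fields, and leftover chars; the class name StringFootprintEstimate, the method estimateBytes, and the 4-byte word size used in main are assumptions made for illustration, so real JVMs will report different numbers.

```java
// Hypothetical estimator following the "typical implementation" described above.
final class StringFootprintEstimate {

    // charCount: number of 16-bit chars; wordBytes: assumed machine word size in bytes.
    static long estimateBytes(int charCount, int wordBytes) {
        long stringObject = 2L * wordBytes   // class descriptor + lock word
                          + 4                // int length field
                          + wordBytes;       // reference to the char[] array
        long charArray    = 2L * wordBytes   // class descriptor + lock word
                          + wordBytes        // array length word
                          + 2L * charCount;  // 16-bit char elements
        return stringObject + charArray;
    }

    public static void main(String[] args) {
        // "Hello" with an assumed 4-byte word: 16 + 22 = 38 bytes.
        System.out.println(estimateBytes(5, 4)); // 38
    }
}
```

Under these assumptions the five characters account for only 10 of the 38 bytes, which illustrates how much of a short string's footprint is per-object overhead.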

Take a look at the actual source code for Apache's implementation, e.g. at:
http://www.docjar.com/html/api/java/lang/String.java.html
