Java String internal representation

Question

I understand that Java's internal representation of a String is UTF-16. What exactly is the Java String representation?

Also, I know that in a UTF-16 String, each 'character' is encoded with one or two 16-bit code units.
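To make the "one or two 16-bit code units" point concrete, here is a minimal sketch comparing the number of code units with the number of code points. The class name CodeUnitDemo and its helper methods are made up for illustration; only standard String and Character methods are used.

```java
class CodeUnitDemo {
    // Number of 16-bit UTF-16 code units in the string (what length() reports).
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points ("characters" in the sense used above).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "Hello";
        // U+1F600 lies outside the Basic Multilingual Plane, so it needs a surrogate pair.
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(codeUnits(ascii) + " " + codePoints(ascii)); // prints "5 5"
        System.out.println(codeUnits(emoji) + " " + codePoints(emoji)); // prints "2 1"
    }
}
```

For plain ASCII text such as "Hello", the two counts are the same, so every character takes exactly one code unit.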

However, when I debug the following java code

String hello = "Hello";

the variable hello is an array of 5 bytes, 0x48, 0x65, 0x6C, 0x6C, 0x6F (72, 101, 108, 108, 111 in decimal), which is ASCII for "Hello".

How can this be?

Answer

I took a gcore dump of a minimal Java process running this code:

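class Hi {
    public static void main(String[] args) {
        // The string literal we will look for in the memory dump.
        String hello = "Hello";
        try {
            // Keep the JVM alive for a minute so there is time to take the dump.
            Thread.sleep(60_000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}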

I took the memory dump on Ubuntu, using jps to get the pid and passing it to gcore.

I found 48 65 6C 6C 6F in the dump using a hex editor, so the string is somewhere in memory as ASCII.

But I also found 48 00 65 00 6C 00 6C, which is part of the UTF-16 (little-endian) representation of the String.
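The two byte patterns can be reproduced without a memory dump by encoding the same string explicitly. The sketch below is only an illustration: the class name HelloBytes and the helper hex are hypothetical, and it shows what the encodings look like, not where the JVM keeps each copy. As an aside that is not part of the original answer: since JDK 9, compact strings (JEP 254) let a String that contains only Latin-1 characters be backed by a byte[] with one byte per character, which may be why a debugger shows five bytes for "Hello".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HelloBytes {
    // Encode the string with the given charset and format each byte as two hex digits.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String hello = "Hello";
        // One byte per character: the ASCII pattern seen in the dump.
        System.out.println(hex(hello, StandardCharsets.US_ASCII));  // 48 65 6C 6C 6F
        // Two bytes per character, low byte first: the UTF-16 pattern seen in the dump.
        System.out.println(hex(hello, StandardCharsets.UTF_16LE));  // 48 00 65 00 6C 00 6C 00 6F 00
    }
}
```

The first line matches the 48 65 6C 6C 6F sequence and the second begins with the 48 00 65 00 6C 00 6C sequence found in the dump.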
