Difference between compact strings and compressed strings in Java 9


Question

What are the advantages of compact strings over compressed strings in JDK9?

Recommended answer

Compressed strings (Java 6) and compact strings (Java 9) both have the same motivation (strings are often effectively Latin-1, so half the space is wasted) and goal (make those strings small) but the implementations differ a lot.
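To put a number on the "half the space is wasted" remark, here is a minimal sketch that compares two bytes per char with one byte per char where possible. The class and method names (StringFootprint, utf16Bytes, compactBytes) are made up for illustration and only count payload bytes, not object headers or array overhead.

```java
final class StringFootprint {
    // Payload bytes when every char takes two bytes (the classic char[] layout).
    static int utf16Bytes(String s) {
        return s.length() * Character.BYTES;
    }

    // True if every char fits into a single byte (values 0..255, i.e. Latin-1).
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Payload bytes with one byte per char if the whole string is Latin-1,
    // and two bytes per char otherwise.
    static int compactBytes(String s) {
        return isLatin1(s) ? s.length() : utf16Bytes(s);
    }

    public static void main(String[] args) {
        String latin = "compact";
        String mixed = "compact \u2713"; // the check mark is outside Latin-1
        System.out.println(utf16Bytes(latin) + " vs " + compactBytes(latin));
        System.out.println(utf16Bytes(mixed) + " vs " + compactBytes(mixed));
    }
}
```

A single character outside Latin-1 is enough to make the whole string fall back to two bytes per char in this model.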

In an interview Aleksey Shipilëv (who was in charge of implementing the Java 9 feature) had this to say about compressed strings:


UseCompressedStrings feature was rather conservative: while distinguishing between char[] and byte[] case, and trying to compress the char[] into byte[] on String construction, it done most String operations on char[], which required to unpack the String. Therefore, it benefited only a special type of workloads, where most strings are compressible (so compression does not go to waste), and only a limited amount of known String operations are performed on them (so no unpacking is needed). In great many workloads, enabling -XX:+UseCompressedStrings was a pessimization.

[...] UseCompressedStrings implementation was basically an optional feature that maintained a completely distinct String implementation in alt-rt.jar, which was loaded once the VM option is supplied. Optional features are harder to test, since they double the number of option combinations to try.
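The behaviour described in the quote can be modelled roughly as follows. This is a toy model and not the code from alt-rt.jar: the class name CompressedStringModel, the one-byte threshold MAX_ONE_BYTE and the unpack counter are all assumptions made for the sketch. It compresses on construction and then unpacks to char[] whenever an operation that only understands char[] is called.

```java
final class CompressedStringModel {
    // Assumption for this sketch: chars up to 0xFF are treated as compressible.
    static final char MAX_ONE_BYTE = 0xFF;

    private final byte[] packed; // non-null if compression succeeded
    private final char[] plain;  // non-null if it did not
    private int unpackCount;     // how often we had to inflate back to char[]

    CompressedStringModel(String s) {
        char[] chars = s.toCharArray();
        this.packed = tryCompress(chars);
        this.plain = (packed == null) ? chars : null;
    }

    // Returns the one-byte-per-char form, or null if any char does not fit.
    static byte[] tryCompress(char[] chars) {
        byte[] out = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] > MAX_ONE_BYTE) {
                return null;
            }
            out[i] = (byte) chars[i];
        }
        return out;
    }

    // Operations in this model only work on char[], so compressed data is unpacked first.
    private char[] toChars() {
        if (plain != null) {
            return plain;
        }
        unpackCount++;
        char[] out = new char[packed.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = (char) (packed[i] & 0xFF);
        }
        return out;
    }

    int indexOf(char target) {
        char[] chars = toChars();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == target) {
                return i;
            }
        }
        return -1;
    }

    boolean isCompressed() {
        return packed != null;
    }

    int unpackCount() {
        return unpackCount;
    }

    public static void main(String[] args) {
        CompressedStringModel m = new CompressedStringModel("hello");
        m.indexOf('l');
        m.indexOf('o');
        System.out.println("compressed=" + m.isCompressed() + ", unpacks=" + m.unpackCount());
    }
}
```

In this model every operation on a compressed string pays for a fresh char[], which is the kind of cost that makes the feature a pessimization when many operations are performed.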



Compact Strings

In Java 9, on the other hand, compact strings are fully integrated into the JDK source. String is always backed by a byte[]: if every character of a string is Latin-1, each character takes one byte, otherwise each takes two. Most operations check which case applies, e.g. charAt:

public char charAt(int index) {
    if (isLatin1()) {
        // one byte per character
        return StringLatin1.charAt(value, index);
    } else {
        // two bytes per character
        return StringUTF16.charAt(value, index);
    }
}
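To see the dispatch outside the JDK, here is a self-contained sketch of the same idea. CompactText is a hypothetical class, not the real java.lang.String; the coder field and the LATIN1/UTF16 constants are modelled on the JDK's naming, and the big-endian byte order of the two-byte path is an assumption chosen for readability.

```java
final class CompactText {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    private final byte[] value;
    private final byte coder;

    CompactText(String s) {
        if (fitsLatin1(s)) {
            coder = LATIN1;
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
        } else {
            coder = UTF16;
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8); // high byte first (assumed order)
                value[2 * i + 1] = (byte) c;    // then low byte
            }
        }
    }

    private static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    boolean isLatin1() {
        return coder == LATIN1;
    }

    // Shifting by the coder halves the byte count for two-byte strings.
    int length() {
        return value.length >> coder;
    }

    int storageBytes() {
        return value.length;
    }

    char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new StringIndexOutOfBoundsException(index);
        }
        if (isLatin1()) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        CompactText latin = new CompactText("caf\u00e9");
        CompactText wide = new CompactText("\u03c0r");
        System.out.println(latin.isLatin1() + " " + latin.storageBytes());
        System.out.println(wide.isLatin1() + " " + wide.storageBytes());
    }
}
```

Unlike the Java 6 model, nothing is unpacked here: each operation works directly on the byte[] and only branches on the coder.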

Compact strings are enabled by default and can be partially disabled with -XX:-CompactStrings. "Partially" because strings are still backed by a byte[] and operations returning chars must still assemble each char from two separate bytes (due to intrinsics it is hard to say whether this has a performance impact).
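If you want to check from inside a program whether the JVM was started with the switch that turns the feature off (-XX:-CompactStrings), a rough sketch could look like this. CompactStringsFlag and disabledBy are hypothetical names; the sketch only inspects the arguments reported by RuntimeMXBean.getInputArguments(), assumes that the last occurrence of the switch wins, and does not look at options supplied by other means.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class CompactStringsFlag {
    // Returns true if the last CompactStrings switch in the list turns the feature off.
    static boolean disabledBy(List<String> jvmArgs) {
        boolean disabled = false; // the feature is on by default
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:-CompactStrings")) {
                disabled = true;
            } else if (arg.equals("-XX:+CompactStrings")) {
                disabled = false;
            }
        }
        return disabled;
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with, as reported by the runtime MXBean.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("compact strings disabled by flag: " + disabledBy(jvmArgs));
    }
}
```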

If you're interested in more background on compact strings, I recommend reading the interview quoted above and/or watching the great talk by the same Aleksey Shipilëv (which also explains the new string concatenation).
