Difference between compact strings and compressed strings in Java 9
Question
What are the advantages of compact strings over compressed strings in JDK 9?
Answer
Compressed strings (Java 6) and compact strings (Java 9) share the same motivation (strings are often effectively Latin-1, so half the space is wasted) and the same goal (make those strings smaller), but their implementations differ considerably.
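To make the "half the space is wasted" point concrete, here is a minimal sketch that compares how many bytes a piece of text needs at one byte per character (Latin-1) and at two bytes per character (UTF-16). The class and method names are made up for illustration, and it measures encoded byte arrays, not the actual heap footprint of a String object.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Footprint {
    // Bytes needed at one byte per character (Latin-1 encoding).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed at two bytes per character (UTF-16, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // True if every character fits in Latin-1 (values 0..255).
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    public static void main(String[] args) {
        String text = "compact strings";
        System.out.println(fitsLatin1(text));  // true
        System.out.println(latin1Bytes(text)); // 15
        System.out.println(utf16Bytes(text));  // 30
    }
}
```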
In an interview, Aleksey Shipilëv (who was in charge of implementing the Java 9 feature) had this to say about compressed strings:
"UseCompressedStrings feature was rather conservative: while distinguishing between char[] and byte[] case, and trying to compress the char[] into byte[] on String construction, it done most String operations on char[], which required to unpack the String. Therefore, it benefited only a special type of workloads, where most strings are compressible (so compression does not go to waste), and only a limited amount of known String operations are performed on them (so no unpacking is needed). In great many workloads, enabling -XX:+UseCompressedStrings was a pessimization."
"[...] UseCompressedStrings implementation was basically an optional feature that maintained a completely distinct String implementation in alt-rt.jar, which was loaded once the VM option is supplied. Optional features are harder to test, since they double the number of option combinations to try."
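The compress-on-construction, unpack-on-operation behaviour described in the quote can be sketched roughly as follows. This is not the Java 6 implementation; tryCompress, unpack and indexOf are hypothetical helpers that only illustrate why an operation written against char[] forces a compressed string to be inflated again.

```java
public class CompressedSketch {
    // Hypothetical: returns a byte[] copy if every char fits in one byte, else null.
    static byte[] tryCompress(char[] chars) {
        byte[] out = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] > 0xFF) {
                return null; // not compressible, caller keeps the char[]
            }
            out[i] = (byte) chars[i];
        }
        return out;
    }

    // Hypothetical: inflates back to char[] so char[]-based operations can run.
    static char[] unpack(byte[] bytes) {
        char[] out = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = (char) (bytes[i] & 0xFF);
        }
        return out;
    }

    // An operation that only understands char[]; compressed data must be unpacked first.
    static int indexOf(char[] chars, char target) {
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] packed = tryCompress("r\u00e9sum\u00e9".toCharArray());
        System.out.println(packed.length);                                  // 6
        System.out.println(indexOf(unpack(packed), '\u00e9'));              // 1
        System.out.println(tryCompress("\u65e5\u672c".toCharArray()) == null); // true
    }
}
```

Every call to indexOf on compressed data allocates a temporary char[], which is the kind of cost that made the option a pessimization for many workloads.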
Compact Strings
In Java 9, on the other hand, compact strings are fully integrated into the JDK source. String is always backed by a byte[], in which each character takes one byte if the whole string is Latin-1 and two bytes otherwise. Most operations first check which case applies, e.g. charAt:
    public char charAt(int index) {
        if (isLatin1()) {
            return StringLatin1.charAt(value, index);
        } else {
            return StringUTF16.charAt(value, index);
        }
    }
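To show what sits behind a dispatch like the charAt excerpt above, here is a simplified, self-contained model of a byte[]-backed string with a one-byte and a two-byte layout. It is only a sketch: the class CompactSketch is hypothetical, the big-endian byte order in the two-byte case is an assumption chosen for readability, and the real JDK code is considerably more involved.

```java
public class CompactSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value; // backing storage in both layouts
    private final byte coder;   // which layout 'value' uses

    CompactSketch(String s) {
        boolean latin1 = s.chars().allMatch(c -> c <= 0xFF);
        if (latin1) {
            coder = LATIN1;
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
        } else {
            coder = UTF16;
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8); // high byte first (assumed order)
                value[2 * i + 1] = (byte) c;    // low byte
            }
        }
    }

    int byteLength() {
        return value.length;
    }

    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        // Two-byte layout: assemble the char from two separate bytes.
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        System.out.println(new CompactSketch("abc").byteLength());      // 3
        System.out.println(new CompactSketch("ab\u4e2d").byteLength()); // 6
        System.out.println(new CompactSketch("ab\u4e2d").charAt(2) == '\u4e2d'); // true
    }
}
```

Note that a single non-Latin-1 character switches the whole string to the two-byte layout in this model.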
Compact strings are enabled by default and can be partially disabled (with -XX:-CompactStrings). "Partially" because strings are still backed by a byte[], so operations returning chars must still assemble each one from two separate bytes (because of intrinsics, it is hard to say whether this has a performance impact).
If you're interested in more background on compact strings, I recommend reading the interview quoted above and/or watching the great talk by the same Aleksey Shipilëv (which also explains the new string concatenation).