Java 7 String - substring complexity

Question

Until Java 6, substring on String ran in constant time. In Java 7, why did they decide to copy the char array, degrading it to linear time complexity, when something like StringBuilder was meant for exactly that?

Recommended answer

Why they decided to do so is discussed in Oracle bug #4513622: (str) keeping a substring of a field prevents GC for object:


When you call String.substring as in the example, a new character array for storage is not allocated. It uses the character array of the original String. Thus, the character array backing the original String cannot be GC'd until the substring's references can also be GC'd. This is an intentional optimization to prevent excessive allocations when using substring in common scenarios. Unfortunately, the problematic code hits a case where the overhead of the original array is noticeable. It is difficult to optimize for both edge cases. Any optimization for space/size trade-offs is generally complex and can often be platform-specific.
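To make the retention problem concrete, here is a minimal sketch of the shared-array layout described above. It is a simplified model, not the actual java.lang.String source: the class name SharedString and the method retainedChars are hypothetical and exist only for illustration.

```java
// Simplified model of the pre-Java 7 layout; NOT the real java.lang.String source.
// SharedString and retainedChars() are hypothetical names used for illustration.
final class SharedString {
    private final char[] value; // backing array, possibly shared with other instances
    private final int offset;   // index of this string's first character in value
    private final int count;    // number of characters in this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // O(1): no characters are copied; the result points at the same array.
    SharedString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + begin, end - begin);
    }

    int length() {
        return count;
    }

    // Size of the array this instance keeps reachable; may far exceed length().
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

In this model, taking substring(10, 12) of a 1000-character instance yields an object whose length() is 2 but whose retainedChars() is still 1000. As long as that two-character object is reachable, the whole array stays reachable too, which is the situation the bug report describes.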

There's also this note, noting that what once was an optimization had become a pessimization according to tests:


For a long time preparations and planning have been underway to remove the offset and count fields from java.lang.String. These two fields enable multiple String instances to share the same backing character buffer. Shared character buffers were an important optimization for old benchmarks, but with current real-world code and benchmarks it's actually better to not share backing buffers. Shared char array backing buffers only "win" with very heavy use of String.substring. The negatively impacted situations can include parsers and compilers; however, current testing shows that overall this change is beneficial.
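For contrast, a string without the offset and count fields might look like the sketch below. Again, this is a simplified model rather than the real JDK code: CopyingString and retainedChars are hypothetical names, and the copy is done here with Arrays.copyOfRange as one plausible way to do it.

```java
import java.util.Arrays;

// Simplified model of a String without offset/count fields; NOT the real JDK source.
// CopyingString and retainedChars() are hypothetical names used for illustration.
final class CopyingString {
    private final char[] value; // owned by this instance; exactly length() chars

    CopyingString(String s) {
        this.value = s.toCharArray();
    }

    private CopyingString(char[] value) {
        this.value = value;
    }

    // O(end - begin): the selected characters are copied into a fresh array.
    CopyingString substring(int begin, int end) {
        if (begin < 0 || end > value.length || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new CopyingString(Arrays.copyOfRange(value, begin, end));
    }

    int length() {
        return value.length;
    }

    // The retained array is never larger than the string itself.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value);
    }
}
```

Here substring costs time proportional to the length of the result, but each instance carries two fewer int fields and a short substring no longer keeps a large parent array alive. That is the trade-off the note says testing found beneficial overall.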
