What is the optimal StringBuffer initial capacity for inputs with drastically varying lengths?

Question

Good afternoon all, I'm using a java.lang.StringBuilder to store some characters. I have no idea how many characters I'm going to store in advance, except that:


  1. 60% of the time, it is (exactly) 7 characters
  2. 39% of the time, it is (roughly) 3500 characters
  3. 1% of the time, it is roughly 20k characters

How do we go about calculating the optimal initial buffer length that should be used?

Currently I'm using new java.lang.StringBuilder(4000) but that's just because I was too lazy to think previously.

Recommended answer

There are two factors here: time and memory consumption. The time is mostly influenced by the number of times java.lang.AbstractStringBuilder.expandCapacity() is called. Of course the cost of each call is linear to the current size of the buffer, but I am simplifying here and just counting them:


With the default initial capacity (16 characters):

  • In 60% of the cases the StringBuilder will expand 0 times
  • In 39% of the cases the StringBuilder will expand 8 times
  • In 1% of the cases the StringBuilder will expand 11 times

The expected number of expandCapacity calls is 3.23.


With an initial capacity of 4096 characters:

  • In 99% of the cases the StringBuilder will expand 0 times
  • In 1% of the cases the StringBuilder will expand 3 times

The expected number of expandCapacity calls is 0.03.

As you can see, the second scenario seems much faster, as it very rarely has to expand the StringBuilder (three times per 100 inputs). Note, however, that the first expansions are less significant (they copy only a small amount of memory); also, if you add strings to the builder in huge chunks, it will expand more eagerly, in fewer iterations.
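The expansion counts used above can be reproduced with a short simulation. This is only a sketch: it assumes the builder grows by the rule newCapacity = (oldCapacity + 1) * 2, which is how older JDKs implemented expandCapacity, and that characters are appended in small pieces so every growth step is actually taken. The class and method names are made up for illustration.

```java
// Sketch: counts buffer expansions under the ASSUMED growth rule
// newCapacity = (oldCapacity + 1) * 2 (other JDK versions may differ).
final class ExpansionCount {

    /** Expansions needed to hold `length` chars, starting from `initialCapacity`. */
    static int expansions(int initialCapacity, int length) {
        int capacity = initialCapacity;
        int count = 0;
        while (capacity < length) {
            capacity = (capacity + 1) * 2; // assumed growth step
            count++;
        }
        return count;
    }

    /** Probability-weighted expansion count for the question's input mix. */
    static double expectedExpansions(int initialCapacity) {
        double[] probabilities = {0.60, 0.39, 0.01};
        int[] lengths = {7, 3500, 20_000};
        double expected = 0.0;
        for (int i = 0; i < lengths.length; i++) {
            expected += probabilities[i] * expansions(initialCapacity, lengths[i]);
        }
        return expected;
    }

    public static void main(String[] args) {
        System.out.println("capacity 16:   " + expectedExpansions(16));
        System.out.println("capacity 4096: " + expectedExpansions(4096));
    }
}
```

Under these assumptions the per-input counts come out as 0, 8 and 11 for a capacity of 16, and 0, 0 and 3 for a capacity of 4096, matching the figures in the lists.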

On the other hand the memory consumption grows:


With the default initial capacity (16 characters):

  • In 60% of the cases the StringBuilder will occupy 16 characters
  • In 39% of the cases the StringBuilder will occupy 4K characters
  • In 1% of the cases the StringBuilder will occupy 32K characters

The expected average memory consumption is 1935 characters.


With an initial capacity of 4096 characters:

  • In 99% of the cases the StringBuilder will occupy 4K characters
  • In 1% of the cases the StringBuilder will occupy 32K characters

The expected average memory consumption is 4383 characters.
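The two averages are plain probability-weighted sums, which the following sketch recomputes. It assumes that "4K" and "32K" mean 4096 and 32768 characters; these are the rounded figures from the lists, and the exact capacities a real StringBuilder ends up with may differ somewhat. The class and method names are hypothetical.

```java
// Sketch: recomputes the expected buffer size from the rounded figures above.
// ASSUMPTION: 4K = 4096 chars and 32K = 32768 chars.
final class ExpectedMemory {

    /** Probability-weighted average of the final buffer sizes, in chars. */
    static double expectedChars(double[] probabilities, int[] finalCapacities) {
        double expected = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            expected += probabilities[i] * finalCapacities[i];
        }
        return expected;
    }

    public static void main(String[] args) {
        // Default capacity: 16 chars (60%), 4K (39%), 32K (1%).
        double withDefault = expectedChars(
                new double[] {0.60, 0.39, 0.01}, new int[] {16, 4096, 32_768});
        // Initial capacity 4096: 4K (99%), 32K (1%).
        double with4k = expectedChars(
                new double[] {0.99, 0.01}, new int[] {4096, 32_768});

        System.out.println("default capacity: " + withDefault);
        System.out.println("capacity 4096:    " + with4k);
        System.out.println("ratio:            " + (with4k / withDefault));
    }
}
```

The sums are 0.60 * 16 + 0.39 * 4096 + 0.01 * 32768 = 1934.72 and 0.99 * 4096 + 0.01 * 32768 = 4382.72, which round to the 1935 and 4383 quoted above.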

This leads me to believe that enlarging the initial buffer to 4K will more than double the memory consumption while cutting the number of buffer expansions by about two orders of magnitude.

The bottom line is: try it! It is not that hard to write a benchmark that processes a million strings of various lengths with different initial capacities. But I believe a bigger buffer might be a good choice.
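A rough starting point for such a benchmark might look like the sketch below. Everything in it is an assumption made for illustration: the class and method names, the fixed random seed, appending one character at a time, and the input count, which is scaled down from a million so that it finishes quickly. Timing with System.nanoTime after a single warm-up pass is crude; a harness such as JMH would likely give more trustworthy numbers.

```java
import java.util.Random;

// Sketch of a naive benchmark; results are indicative only.
final class CapacityBenchmark {

    // Written to after each run so the JIT cannot trivially discard the work.
    static volatile long sink;

    /** Draws input lengths following the question's 60% / 39% / 1% split. */
    static int[] sampleLengths(int count, long seed) {
        Random random = new Random(seed);
        int[] lengths = new int[count];
        for (int i = 0; i < count; i++) {
            double p = random.nextDouble();
            lengths[i] = p < 0.60 ? 7 : (p < 0.99 ? 3500 : 20_000);
        }
        return lengths;
    }

    /** Builds one string per length, char by char; returns total chars built. */
    static long buildAll(int initialCapacity, int[] lengths) {
        long total = 0;
        for (int length : lengths) {
            StringBuilder sb = new StringBuilder(initialCapacity);
            for (int i = 0; i < length; i++) {
                sb.append('x');
            }
            total += sb.toString().length();
        }
        return total;
    }

    /** Wall-clock time of one buildAll pass, in nanoseconds. */
    static long timeNanos(int initialCapacity, int[] lengths) {
        long start = System.nanoTime();
        sink = buildAll(initialCapacity, lengths);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int[] lengths = sampleLengths(20_000, 42L); // scaled down from a million
        for (int capacity : new int[] {16, 4000, 4096}) {
            timeNanos(capacity, lengths); // warm-up pass, result ignored
            long nanos = timeNanos(capacity, lengths);
            System.out.println("capacity " + capacity + ": " + nanos / 1_000_000 + " ms");
        }
    }
}
```

Measuring allocation volume or garbage-collection activity alongside elapsed time would cover the memory side of the trade-off as well.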
