为什么String.indexOf()不使用KMP? [英] Why does String.indexOf() not use KMP?

查看:89
本文介绍了为什么String.indexOf()不使用KMP?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我阅读了 java.lang.String 的源代码,但惊讶地发现 String.indexof()不使用

I read the source code of java.lang.String and I was surprised to find that String.indexof() does not use the Knuth–Morris–Pratt algorithm? As we know, KMP is more effective. So why isn't it used in String.indexOf()?

我周围的人告诉我,对于短字符串KMP来说已经足够了,但是如果您需要性能并且打算与大字符串一起使用,那么它不是一个好选择.但是他没有告诉我细节.

Someone around me told me that for short string KMP is good enough, but if you need performance and you intend to use with large strings then is not a good choice. However he didn't tell me the details.

所以,这是我的问题:

  1. 我们为什么不在 String.indexOf()中使用KMP?
  2. 为什么对于大的Strings,KMP不是一个好的选择?

推荐答案

KMP在较差的情况下具有更好的性能,但实际上需要进行一些前期计算(以生成偏移量表). 还需要初始内存分配,这也可能影响性能.

KMP has better worst-case performance, but actually requires a little bit of up-front computation (to generate the table of offsets). It also requires an initial memory allocation, which could also impact performance.

对于(大概)在较短字符串中进行搜索的常见用例,这实际上可能比原始实现要慢.

For (presumably) common use-cases of searching in relatively short strings, this might actually end up slower than the primitive implementation.

与以下事实捆绑在一起:对于非常庞大的数据集,您可能会使用比简单的 String 更专业的数据结构,这意味着增加的实现(可能还有运行时)成本不值得投资

This, bundled with the fact that for really huge data sets you will probably be using more specialized data structures than a simple String means that the increased implementation (and possibly runtime) cost is not worth investing.

请注意,由于未指定实际算法,因此在将来的Java版本中可能会发生 更改.

Note that this might change in future Java versions, as the actual algorithm is not specified.

这篇关于为什么String.indexOf()不使用KMP?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆