比较字符串(文字和数字)的最快方法 [英] Fastest way to compare strings (literal and numerical)

查看:566
本文介绍了比较字符串(文字和数字)的最快方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个与字符串比较有关的性能问题(在Java中).

I have a performance problem related to string comparison (in Java).

我正在研究一个需要对巨大列表进行排序的项目(Eclipse中的TableViewer).无论如何,我已经将瓶颈确定为要比较的字符串的compareTo()调用.

I'm working on a project that needs to sort a huge list (a TableViewer in Eclipse). Anyway I've pinpointed the bottleneck down to the call to compareTo() for the string to be compared.

有什么方法可以优化字符串比较的性能?我搜索并用Google搜索都无济于事...

Is there any way to optimize the performance of a string compare? I've searched and googled to no avail...

由于该项目严格限于Win32环境,所以我想也许可以利用它...

As the project is strictly limited to a Win32 environment, I was thinking that maybe it would be possible to leverage on that...

任何建议将不胜感激.

我忘了提到我需要对字符串进行数值比较和文字比较.

I forgot to mention that I would need both numerical comparison and literal comparison of the strings.

目标实质上是加快用户界面的速度,因为每次单击表头进行排序时,都不能等待几秒钟.我正在研究以某种方式缓存值以加快比较的速度.由于字符串几乎是静态的,所以我认为这是有可能的.

The goal is essentially to speed up the user interface because it is unacceptable to wait a few seconds each time you click on the header of the table to perform a sort. I'm looking into maybe caching values somehow to speed up the comparison. As the strings are pretty much static I think it would be possible.

我知道很多人对try()-catch()感到不安.实际上,这没什么大问题,因为即使我删除了该代码并仅执行catch块(单个compareTo()),它的执行速度实际上也与原始代码相同.但是,如果我也将compareTo()注释掉;只剩下比较功能(获取标签等)的开销,它快如闪电.因此,我仍然需要一种更好的方式来比较字符串.通过缓存或其他一些魔术.

I know a lot of you have been disturbed by the try()-catch() thing. Actually that is less of a concern because even if I remove that code and only execute the catch-block (a single compareTo()) it still executes at virtually the same speed as the original code. If however I comment out the compareTo() also; leaving me with only the overhead of the compare function (getting labels, etc.) it is lightning fast. So I still need a better way to compare strings. Either by caching or by doing some other magic.

不幸的是,无法更改排序算法-但是我怀疑它的速度太慢,因为它可以非常快速地成功对纯整数进行排序.

Unfortunately it is not possible to change the sorting algorithm - however I doubt that it is that slow because it succeeds in sorting pure integers quite fast.

澄清:

compare函数被实现为TableViewer框架的一部分,用于执行排序操作,这意味着我没有实现特定的排序算法,而是由SWT/JFace实现.我只实现比较功能.

The compare function is implemented as part of the TableViewer framework for performing sort operations which means that I'm not implementing the specific sorting algorithm but rather it is implemented by SWT/JFace. I'm only implementing the compare function.

更有趣的是,用于对双精度字进行排序的代码比字符串比较要快 .与实际文字字符串相比,仅对数字进行排序的速度更快....这使我得出结论,compareTo()方法中发生了一些令人毛骨悚然的事情...

What is further more interesting is the fact that the code for sorting doubles is faster than the string comparison. It is faster to sort columns with only numbers than with actual literal strings.... Which leads me to the conclusion that something fishy is going on in the compareTo() method...

这是功能的核心:

// e1Label and e2Label is Strings to be compared
//

// Be smart about the comparison and use non-lexical comparison if
// possible (i.e. if both strings are actually numbers...)
//
// Warning: This is only "semi-smart" as the sorting might get "a bit"
// messed up if some of the values in a column can be parsed as
// doubles while others can not...
//
try {
    // Try using numeric (double) comparison of label values
    //
    double e1_double = Double.parseDouble(e1Label);
    double e2_double = Double.parseDouble(e2Label);
    rc = Double.compare(e1_double, e2_double);
} catch (NumberFormatException e) {
    // Use lexical comparison if double comparison is not possible
    //
    rc = e1Label.compareToIgnoreCase(e2Label);
}

推荐答案

如果您了解String内容,则可以预先计算并存储其他信息以加快比较.例如,假设您的String仅包含大写字母A-Z.您可以根据前3个字母为String分配等级.例如

If you have knowledge about your String content you can pre-compute and store additional information to speed up the comparison. For example, suppose your Strings only contained capital letters A-Z. You could assign a rank to the String based on say the first 3 letters; e.g.

  • AAA:= 1
  • AAB:= 2
  • ...
  • ABA:= 27

然后,您可以通过首先比较每个String的排名(基于快速int的比较),然后在排名相等的情况下仅执行完整的String比较,来加快compareTo的速度.

Then you could speed up your compareTo by first comparing each String's rank (fast int based comparison) and then only performing a full String compare if the ranks were equal.

这篇关于比较字符串(文字和数字)的最快方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆