Quickly compare a string against a Collection in Java


Question

I am trying to calculate the edit distance between a string and every string in a collection to find the closest match. My problem is that the collection is very large (about 25,000 items). I narrowed the candidates down to strings of similar length, but that still leaves a few thousand strings, and the search is still very slow. Is there a data structure that allows quick lookup of similar strings, or is there another way to address this problem?

Recommended answer

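Sounds like a BK-tree might be what you want. It is a tree index built over a metric such as edit distance, so a lookup can skip subtrees that cannot contain a close match instead of comparing against every string. Here's an article discussing them: http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees. A quick Google search turns up several Java implementations.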
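To make the idea concrete, here is a minimal sketch of a BK-tree in Java. It is not taken from the linked article or from any particular library; the class name `BKTree` and the methods `add`, `search`, and `distance` are hypothetical names chosen for illustration, and Levenshtein distance is assumed as the metric. Each node stores its children keyed by their distance to that node, and the search uses the triangle inequality to visit only children whose key lies within `maxDistance` of the query's distance to the current node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative BK-tree over strings, using Levenshtein distance as the metric. */
class BKTree {

    private static final class Node {
        final String word;
        // Child subtrees, keyed by their edit distance to this node's word.
        final Map<Integer, Node> children = new HashMap<>();

        Node(String word) {
            this.word = word;
        }
    }

    private Node root;

    /** Two-row dynamic-programming Levenshtein distance. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Inserts a word by walking down the edges labelled with its distance to each node. */
    void add(String word) {
        if (root == null) {
            root = new Node(word);
            return;
        }
        Node node = root;
        while (true) {
            int d = distance(word, node.word);
            if (d == 0) {
                return; // word is already in the tree
            }
            Node child = node.children.get(d);
            if (child == null) {
                node.children.put(d, new Node(word));
                return;
            }
            node = child;
        }
    }

    /** Returns every stored word within maxDistance edits of the query. */
    List<String> search(String query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        if (root == null) {
            return matches;
        }
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            int d = distance(query, node.word);
            if (d <= maxDistance) {
                matches.add(node.word);
            }
            // Triangle inequality: only subtrees keyed in [d - max, d + max] can hold matches.
            for (Map.Entry<Integer, Node> e : node.children.entrySet()) {
                int key = e.getKey();
                if (key >= d - maxDistance && key <= d + maxDistance) {
                    pending.push(e.getValue());
                }
            }
        }
        return matches;
    }
}
```

To find the closest match rather than all matches within a fixed radius, one option is to call `search` with a small `maxDistance`, increase it until the result is non-empty, and then pick the candidate with the smallest distance. How much work the tree saves depends on the data and on the radius; small radii prune the most.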
