Locality Sensitive Hash Implementation?


Question

Are there any relatively simple to understand (and simple to implement) locality-sensitive hash examples in C/C++/Java/C#?

I'd like to learn more about the concept and so want to try an implementation on a few text files just to see how it works, so I don't need anything high-performance or anything... just an example of a hash function that returns similar hashes for similar inputs. I can learn more from it by example afterwards. :)

Recommended answer

For strings, you can use an approximate string matching algorithm.

  • Generate a random string to serve as a shared reference
  • For each input string, compute its distance from that shared reference string using an algorithm such as Levenshtein distance: http://www.dotnetperls.com/levenshtein

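If two strings are roughly equidistant from the reference string, there is a chance that they are similar to each other. For a metric such as edit distance, the triangle inequality guarantees that two similar strings can never sit at very different distances from the reference, although two unrelated strings can still happen to land at the same distance. And there you go, you have a locality-sensitive hash implementation for strings.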
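
The idea can be sketched in Java roughly as follows. This is only a minimal illustration under a few assumptions, not a tuned implementation: the class and method names (`ReferenceDistanceHash`, `randomReference`, `hash`) are made up for this example, the reference string is drawn from lowercase ASCII letters, and the seed `42` and length `16` are arbitrary choices.

```java
import java.util.Random;

class ReferenceDistanceHash {
    // Classic dynamic-programming Levenshtein (edit) distance, keeping two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: a reproducible random reference string of lowercase letters.
    static String randomReference(int length, long seed) {
        Random rng = new Random(seed);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    // The "hash" of a string is simply its edit distance to the shared reference.
    static int hash(String s, String reference) {
        return levenshtein(s, reference);
    }

    public static void main(String[] args) {
        String reference = randomReference(16, 42L); // arbitrary length and seed
        for (String s : new String[] {"locality sensitive", "locality sensitiv", "unrelated"}) {
            System.out.println(s + " -> " + hash(s, reference));
        }
    }
}
```

Because edit distance obeys the triangle inequality, the hashes of two strings can differ by at most the edit distance between them, so near-duplicates always receive nearby hash values. The reverse does not hold, so matches found this way are best treated as candidates to verify with a direct comparison.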

You can create different hash buckets for ranges of distances.
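
One way to picture the bucketing, again as a hedged sketch: divide the distance to the reference by a fixed bucket width and group strings by the result. The names `DistanceBuckets`, `bucketOf`, and `group`, and the idea of a fixed `width` parameter, are assumptions made for this example rather than anything prescribed by the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DistanceBuckets {
    // Two-row Levenshtein distance, included so the sketch is self-contained.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Distances 0..width-1 map to bucket 0, width..2*width-1 to bucket 1, and so on.
    static int bucketOf(int distance, int width) {
        return distance / width;
    }

    // Groups the inputs by the bucket of their distance to the reference string.
    static Map<Integer, List<String>> group(List<String> inputs, String reference, int width) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String s : inputs) {
            int key = bucketOf(levenshtein(s, reference), width);
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // The reference is fixed here only to keep the demo deterministic.
        List<String> inputs = Arrays.asList("kitten", "kitchen", "zzzzzzzzzzzzzzzz");
        System.out.println(group(inputs, "kitten", 3));
    }
}
```

A fixed width has a boundary effect: two strings whose distances differ by only one can still fall on opposite sides of a bucket edge, so a lookup may also want to check the neighbouring buckets.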

Edit: You can try other variations of string distance. A simpler algorithm would just return the number of characters the two strings have in common.
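
The answer does not say exactly how common characters should be counted. One possible reading, sketched below, counts them as a multiset intersection, so that each occurrence of a character in one string can be matched at most once in the other. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CommonCharacters {
    // Counts characters shared by a and b, matching each occurrence at most once.
    static int commonCharacters(String a, String b) {
        Map<Character, Integer> remaining = new HashMap<>();
        for (char c : a.toCharArray()) {
            remaining.merge(c, 1, Integer::sum); // tally occurrences in a
        }
        int common = 0;
        for (char c : b.toCharArray()) {
            Integer left = remaining.get(c);
            if (left != null && left > 0) {
                common++;                   // c is still available in a
                remaining.put(c, left - 1); // consume one occurrence
            }
        }
        return common;
    }

    public static void main(String[] args) {
        System.out.println(commonCharacters("hello", "world"));
    }
}
```

Note that this is a similarity score rather than a distance: a larger value means the string shares more characters with the reference. It also ignores character order, so anagrams score the same as exact matches.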
