What is a good strategy to group similar words?

Views: 216 Published: 2015/11/30 14:22:37 Tags: python algorithm redis

Problem description

Say I have a list of movie names with misspellings and small variations, like this:

 "Pirates of the Caribbean: The Curse of the Black Pearl"
 "Pirates of the carribean"
 "Pirates of the Caribbean: Dead Man's Chest"
 "Pirates of the Caribbean trilogy"
 "Pirates of the Caribbean"
 "Pirates Of The Carribean"

How do I group or find such sets of words, preferably using Python and/or Redis?

Recommended answer

Have a look at "fuzzy matching". The thread linked below lists some great tools that calculate the similarity between strings.

I'm particularly fond of the difflib module:

>>> from difflib import get_close_matches
>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
>>> import keyword
>>> get_close_matches('wheel', keyword.kwlist)
['while']
>>> get_close_matches('apple', keyword.kwlist)
[]
>>> get_close_matches('accept', keyword.kwlist)
['except']

Good Python modules for fuzzy string comparison? http://stackoverflow.com/questions/682367/good-python-modules-for-fuzzy-string-comparison
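
To connect difflib back to the movie list in the question, here is a minimal sketch of one possible grouping approach; it is not part of the original answer. It assumes a greedy strategy in which each title is compared to the first member of every existing group using `difflib.SequenceMatcher`. The helper names `normalize`, `similarity`, and `group_similar`, and the 0.9 threshold, are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def normalize(title):
    # Lowercase and collapse whitespace so that case differences do not count.
    return " ".join(title.lower().split())


def similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the normalized strings are identical.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_similar(titles, threshold=0.9):
    # Greedy grouping (an assumption of this sketch): a title joins the first
    # group whose representative (its first member) is similar enough,
    # otherwise it starts a new group.
    groups = []
    for title in titles:
        for group in groups:
            if similarity(title, group[0]) >= threshold:
                group.append(title)
                break
        else:
            groups.append([title])
    return groups


titles = [
    "Pirates of the Caribbean: The Curse of the Black Pearl",
    "Pirates of the carribean",
    "Pirates of the Caribbean: Dead Man's Chest",
    "Pirates of the Caribbean trilogy",
    "Pirates of the Caribbean",
    "Pirates Of The Carribean",
]

for group in group_similar(titles):
    print(group)
```

With this list and threshold, the three short spellings ("Pirates of the carribean", "Pirates of the Caribbean", "Pirates Of The Carribean") land in one group, while the longer titles each stay on their own. Lowering the threshold merges more aggressively, and because the grouping is greedy the result can depend on input order, so treat the threshold as something to tune against real data.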
