A good SQL strategy for fuzzy matching possible duplicates using SQL Server 2005

Views: 24 | Published: 2021/8/26 19:05:43 | Tags: sql-server-2005, fuzzy-search


Problem description

I want to find possible candidate duplicate records in a large database, matching on fields like COMPANYNAME and ADDRESSLINE1.

Example:

For a record with the following COMPANYNAME:

"Acme, Inc."

I would like my query to return other records with these COMPANYNAME values as possible duplicates:

"Acme Corporation"
"Acme, Incorporated"
"Acme"

I know how to do the joins, correlated subqueries, etc. needed to pull the set of data I want, and I know that has been covered here before. I am interested in hearing thoughts on the best way to do the fuzzy searching: should I use full-text indexing, the SOUNDEX function, or something else that I am unaware of? (I am using SQL Server 2005.)

Any help is appreciated!
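
As a rough illustration of the SOUNDEX option raised in the question, the T-SQL sketch below pairs rows whose names sound alike, using the built-in `DIFFERENCE` function (which compares the `SOUNDEX` codes of two strings and returns 0 to 4). The table name `Companies` and the key column `CompanyID` are assumptions made for the example. Only `COMPANYNAME` and `ADDRESSLINE1` come from the question. It is a starting point under those assumptions, not a recommended answer. A SOUNDEX code is built from the first letter and the next few consonants, so this check would likely need to be combined with other comparisons.

```sql
-- Hypothetical schema: Companies(CompanyID, COMPANYNAME, ADDRESSLINE1)
-- Lists candidate duplicate pairs whose company names have closely
-- matching SOUNDEX codes. DIFFERENCE returns 0 (weakest) to 4 (strongest).
SELECT a.CompanyID    AS CompanyID,
       a.COMPANYNAME  AS CompanyName,
       b.CompanyID    AS CandidateID,
       b.COMPANYNAME  AS CandidateName,
       b.ADDRESSLINE1 AS CandidateAddress
FROM   Companies AS a
JOIN   Companies AS b
       ON  a.CompanyID < b.CompanyID   -- skip self-pairs and mirrored pairs
       AND DIFFERENCE(a.COMPANYNAME, b.COMPANYNAME) = 4
ORDER BY a.COMPANYNAME, b.COMPANYNAME;
```

A self-join like this compares every row with every other row, so on a large table it may be worth restricting the join first, for example to rows that share a postal code or a leading character, if such columns are available.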
