Address matching between 2 tables/datasets
Problem description
Hello,
I have 2 tables/datasets with fields such as Name, Age, Gender, and Address. I want to run a match on the Address column, that is, a program that fetches only the addresses that match between the two tables. The problem is that the same address can be entered in multiple ways.
For example:
Table 1 contains:
114 Mary Street
Table 2 contains:
114 Mary St
114 Mary St.
The sample records above refer to the same address, but a query will treat them as different. This needs some kind of algorithm, because the same address can be written in 1000 different ways and can also contain typos.
I have searched a lot for possible solutions. Many people recommend a fuzzy search algorithm, but I am not sure where or how to start.
I am looking for ideas for an effective algorithm. Ideas can be in pseudocode or in your preferred language.
Any help would be highly appreciated.
Thanks
What I have tried:
I have searched for this in different places, but still no luck. Many have recommended a fuzzy search algorithm, but I am not sure where or how to start.
The data is in 2 tables with many records, so it would be very helpful to get a program that returns similar or approximately matching records.
Thanks.
Recommended answer
A few years ago I asked a similar question [^]. There you'll find 2 answers. Please read them, along with all the comments.
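As a possible starting point for the fuzzy-matching idea raised in the question, here is a minimal Python sketch. It is not taken from the linked answers. It assumes two steps: normalize each address (lowercase, strip punctuation, expand abbreviations), then score pairs with `difflib.SequenceMatcher` from the standard library. The `ABBREVIATIONS` map, the function names, and the `0.9` threshold are all illustrative assumptions that would need tuning against real data.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map (an assumption): extend it for your own data.
# Note that "st" can also mean "saint", so real data may need smarter rules.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue", "dr": "drive"}


def normalize(address: str) -> str:
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized addresses."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_addresses(table1, table2, threshold=0.9):
    """Return (address1, address2, score) for every pair scoring >= threshold.

    This compares every pair, so it is O(len(table1) * len(table2)).
    """
    matches = []
    for a in table1:
        for b in table2:
            score = similarity(a, b)
            if score >= threshold:
                matches.append((a, b, score))
    return matches


table1 = ["114 Mary Street"]
table2 = ["114 Mary St", "114 Mary St.", "20 King Road"]
matches = match_addresses(table1, table2)
for a, b, score in matches:
    print(f"{a!r} ~ {b!r} (score {score:.2f})")
```

With the sample data from the question, both `114 Mary St` and `114 Mary St.` normalize to the same string as `114 Mary Street`, while the unrelated address falls below the threshold. Because the pairwise loop is slow on large tables, one common refinement is to compare only records that share a cheap key, such as the house number or postal code.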