Find duplicate addresses in a database, stop users entering them early?


Question

How do I find duplicate addresses in a database, or better yet, stop people while they are still filling in the form? I guess the earlier the better.

Is there a good way of abstracting street, postal code, etc. so that typos and simple attempts to get two registrations can be detected? For example:

Quellenstrasse 66/11 
Quellenstr. 66a-11

I'm talking about German addresses. Thanks!

Recommended answer


Johannes:



@PConroy: This was my initial thought too. The interesting part is finding good transformation rules for the different parts of the address. Any good suggestions?


When we worked on this type of project before, our approach was to take our existing corpus of addresses (150k or so) and apply the most common transformations for our domain (Ireland, so "Dr" -> "Drive", "Rd" -> "Road", etc.). I'm afraid there was no comprehensive online resource for such things at the time, so we ended up compiling a list ourselves, checking sources like the phone book (space is tight there, so addresses are abbreviated in all manner of ways!). As I mentioned earlier, you'd be amazed how many "duplicates" you'll detect after adding only a few common rules!
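To make the idea concrete, here is a minimal Python sketch of rule-based normalization applied to the German example from the question. The rule list, `address_key`, and `find_duplicates` are hypothetical names and rules made up for illustration, not the list we actually used. A real rule set would have to be built from your own address corpus.

```python
import re
from collections import defaultdict

# Hypothetical rules, for illustration only: (regex pattern, replacement).
# They are applied in order to the lower-cased address.
RULES = [
    (r"ß", "ss"),                     # Straße -> Strasse
    (r"str\.?(?=\s|$)", "strasse"),   # Quellenstr. -> quellenstrasse
    (r"(?<=\d)[a-z]\b", ""),          # 66a -> 66 (deliberately loose)
    (r"[\W_]+", " "),                 # punctuation and whitespace runs -> one space
]

def address_key(raw: str) -> str:
    """Reduce an address to a canonical key for duplicate detection."""
    key = raw.lower()
    for pattern, replacement in RULES:
        key = re.sub(pattern, replacement, key)
    return key.strip()

def find_duplicates(addresses):
    """Group addresses by key and return the groups with more than one entry."""
    groups = defaultdict(list)
    for address in addresses:
        groups[address_key(address)].append(address)
    return [group for group in groups.values() if len(group) > 1]

print(address_key("Quellenstrasse 66/11"))  # -> quellenstrasse 66 11
print(address_key("Quellenstr. 66a-11"))    # -> quellenstrasse 66 11
```

Dropping the letter suffix from the house number is an assumption that suits flagging candidates for review. "66" and "66a" can be genuinely different addresses, so a match on this key probably should not reject a registration on its own. The same `address_key` function could be run against the existing table when the form is submitted, which covers the "stop them early" case.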

I recently stumbled across a page with a fairly comprehensive list of address abbreviations, although it's American English, so I'm not sure how useful it would be in Germany! A quick Google search turned up a couple of sites, but they looked like spammy newsletter sign-up traps. That was me searching in English, though, so you may have more luck searching for "German address abbreviations" in German :)
