Open Source Address Scrubber?

Question

I have a set of names and addresses that have been entered into an Excel spreadsheet, but the many people who entered the addresses used many different non-standard formats. I want to scrub the addresses before transferring all of them to my database. Looking around, all I really found in the way of address scrubbers (parsers or formatters) is the one put out by Semaphore. For my purposes, I don't really need all of that, and I don't want to pay the licensing fees for the software. Is there anything out there that is free and/or open source that will do the scrubbing for me?

Recommended answer

Since I work in the mailing business ...

A mailable address is not the same thing as geocoding. One allows the USPS to deliver mail to an address; the other tells you where on earth that point is. The USPS does not geocode its mailable addresses. Geocoding is useful for marking areas/regions of people for targeting.

You're not buying a license to the software, you're buying the data. The post office has lots of rules, especially if you're doing this commercially and trying to get a better rate than first class. See the USPS Domestic Mail Manual for the complete list of rules. The USPS moves ZIP codes, and households between ZIP codes, all the time. The company I work for pays the USPS for its updated mailing list so we can keep our DBs updated. Weekly.

Back to your question. Do you want to change the data into a common format (street -> st), or are you looking for duplicates and want to store only real mailable addresses?

For a common format: you can break the address into pieces, clean up the whitespace and apply a dictionary of terms/translations. Then apply some SQL to find the duplicates. Keep in mind that households (1 main st) are different from persons (john doe, 1 main st).
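As a rough illustration of the common-format steps just described (split the address, clean up whitespace, apply a dictionary of terms, then use SQL to find duplicates), here is a minimal Python sketch. The `ABBREVIATIONS` table, the function names and the `contacts` table are all hypothetical and made up for this example; the dictionary is deliberately tiny, and the token-by-token replacement is naive (it cannot tell "St." meaning "Saint" from "Street", for instance), so treat it as a starting point rather than a real scrubber.

```python
import re
import sqlite3

# Hypothetical, deliberately small dictionary of terms -> abbreviations.
# A real list would be much longer and would need review against your data.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "boulevard": "blvd",
    "drive": "dr",
    "apartment": "apt",
    "suite": "ste",
}


def normalize_address(raw: str) -> str:
    """Lowercase, drop common punctuation, collapse whitespace, abbreviate known terms."""
    cleaned = re.sub(r"[.,#]", " ", raw.lower())
    tokens = cleaned.split()  # split() also collapses runs of whitespace
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def find_duplicate_households(rows):
    """Take (name, raw_address) pairs and return [(normalized_address, count), ...]
    for every normalized address that appears more than once."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE contacts (name TEXT, address TEXT)")
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?)",
        [(name, normalize_address(addr)) for name, addr in rows],
    )
    dupes = conn.execute(
        "SELECT address, COUNT(*) FROM contacts "
        "GROUP BY address HAVING COUNT(*) > 1 ORDER BY address"
    ).fetchall()
    conn.close()
    return dupes


if __name__ == "__main__":
    sample = [
        ("John Doe", "1  Main Street"),
        ("Jane Doe", "1 main st."),
        ("Bob Roe", "2 Oak Avenue"),
    ]
    for address, count in find_duplicate_households(sample):
        print(address, count)
```

Grouping on the address alone finds duplicate households; to find duplicate persons instead, the query would group on both `name` and `address`.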

For the mailable addresses, well, some of you (the readers) won't like this answer, but you want information, and that isn't free. Someone spends time or money to acquire and maintain these lists. So, find a business model to acquire funds for the list, or go to someone who will do it for you. Data and mail management

Realistically, Semaphore is pretty cheap. Just keep in mind that the address DB will have to be updated quarterly, and $19/quarter is not much.

Another address scrubbing product: SAP PostalSoft. I don't know what the data will cost, though.
