Is there a Java parser that can parse addresses like this?


Question

I'm using Java 6 and looking for an automated way to parse addresses. I'm not concerned with whether the addresses exist. The best thing I have found is JGeocoder (v 0.4.1), but JGeocoder is unable to parse addresses like this:

16th Street Theater, Berwyn Cultural Center,  6420 16th St.

Does anyone know of a free Java address parser that is up to the challenge? By "parse" I mean the ability to distinguish street, city, state, postal code, and potentially the venue name (the above venue name is "16th Street Theater, Berwyn Cultural Center").

Recommended answer

Update: see this StackOverflow question.

I work for SmartyStreets, where we parse and process addresses, and we have an answer. This is what we call "SLAP", or Single-Line Address Parsing (or Processing). The formal term is Named Entity Recognition (NER).

I'm not an expert on Java libraries, but I do know that in-house implementations will not live up to expectations. Here are some common reasons that people I've helped have run into difficulty:


  • Google / Yahoo! / Bing Maps web services do not allow automated queries and do not verify the accuracy of the parsed address.

  • In-house code can also only make a best guess, because it has no knowledge of existing addresses (a database) or other official sources. I know you want a library that can do this in-house, but at best it can only guess...

By the way, regular expressions are not the answer. The best regex I've seen for parsing addresses was dynamically generated over hundreds of lines of code and several classes. It was a mess, and it was only correct for the types of addresses you'd expect, not for all the valid (US) formats that actually exist.
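To make the regex point concrete, here is a minimal sketch in Java. The class name `NaiveAddressRegex`, the pattern, and the city/state/ZIP in the first sample input are all hypothetical and filled in purely for illustration; none of them come from JGeocoder or any other library. The pattern only understands the shape "number street, city, ST ZIP", which is exactly the kind of address "you'd expect".

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveAddressRegex {
    // Hypothetical pattern for one expected shape only:
    // "<house number> <street>, <city>, <2-letter state> <5-digit ZIP>[-<4 digits>]"
    private static final Pattern SIMPLE_US = Pattern.compile(
        "^(\\d+)\\s+([^,]+),\\s*([^,]+),\\s*([A-Z]{2})\\s+(\\d{5})(?:-\\d{4})?$");

    /** Returns {number, street, city, state, zip}, or null if the input does not fit the pattern. */
    static String[] parse(String input) {
        Matcher m = SIMPLE_US.matcher(input.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group(1), m.group(2).trim(), m.group(3).trim(), m.group(4), m.group(5)
        };
    }

    public static void main(String[] args) {
        String[] inputs = {
            // Illustrative input; city, state, and ZIP are assumed, not from the question.
            "6420 16th St, Berwyn, IL 60402",
            // The address from the question: venue names first, no city/state/ZIP.
            "16th Street Theater, Berwyn Cultural Center,  6420 16th St."
        };
        for (String input : inputs) {
            String[] parts = parse(input);
            System.out.println(input + " -> "
                + (parts == null ? "no match" : Arrays.toString(parts)));
        }
    }
}
```

The first input splits into five fields, while the address from the question returns `null`: it begins with venue names and has no city, state, or ZIP, so it never fits the expected shape. Each additional format means another pattern or another special case, which is how such code grows into the kind of mess described above.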

This is an incredibly complex task... unless you have the right tools. One of our services is called LiveAddress API. It's similar to Google Maps in that it parses addresses and geocodes them, but it goes a step further by being CASS-Certified and returning only valid addresses, almost no matter the input format.

I encourage you to do some research of your own, but this is probably the most effective and reliable method.
