Extract House Number and Street Name from a String Using Python Regex

Question

I'm new to regex and am trying to use it to split addresses into a house number and a street name.

Example: 123 Main St --> ['123', 'Main St']

It gets slightly complicated because some of my address strings have hyphenated house numbers, in which case I want to take the first number, the one before the hyphen.

Example: 123-127 Main St --> ['123', 'Main St']

Lastly, I need to be able to handle street names that start with a number.

The most complicated example is: 123-127 3rd Ave --> ['123', '3rd Ave']

So far I've been able to extract the house number, including in the hyphenated case, but I'm unsure how to extract the street name that follows the house number pattern.

import re
MyString = '123-127 Main St'
StreetNum = re.findall(r'^\d+', MyString)  # leading digits only: ['123']

Thanks for your help!

Edit: a dash is not the only character that can separate two house numbers in an address. Three situations come up in the data:

1) 123-127 5th St

2) 123 1/2 5th St

3) 123 & 125 5th St

In all three situations, the result should be 123 5th St.

Recommended answer

Hope this is what you're looking for:

(\d+).*?\s+(.+)
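
As a rough sketch, the pattern above could be applied with re.match as shown below. Note that its lazy `.*?` only skips text up to the first whitespace, so it covers the hyphenated case, but for "123 1/2 5th St" and "123 & 125 5th St" the second group would still contain "1/2" or "& 125". The names ANSWER_RE, EXTENDED_RE, and split_address are hypothetical, and EXTENDED_RE is an assumed variation that is not part of the original answer; it simply lists the three separators named in the question.

```python
import re

# Pattern from the answer above.
ANSWER_RE = re.compile(r"(\d+).*?\s+(.+)")

# Hypothetical extension (not from the original answer): optionally skip a
# second number written as a range, a fraction, or an "&" pair.
EXTENDED_RE = re.compile(
    r"""
    (\d+)                # first house number
    (?:
        \s*-\s*\d+       # range:    123-127
      | \s+\d+/\d+       # fraction: 123 1/2
      | \s*&\s*\d+       # pair:     123 & 125
    )?
    \s+(.+)              # street name (may itself start with a digit)
    """,
    re.VERBOSE,
)


def split_address(address, pattern=EXTENDED_RE):
    """Return [house_number, street_name], or None if nothing matches."""
    match = pattern.match(address.strip())
    return [match.group(1), match.group(2)] if match else None


if __name__ == "__main__":
    samples = ["123 Main St", "123-127 3rd Ave", "123 1/2 5th St", "123 & 125 5th St"]
    for text in samples:
        # Print the answer's result next to the extended pattern's result.
        print(text, "->", split_address(text, ANSWER_RE), split_address(text))
```

Real address data may contain other separators or unit suffixes, so the alternation would likely need adjusting to whatever actually appears in the data.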
