Extracting phone numbers from free-form text in Python using regex
Problem description
I have to extract phone numbers from free-form text. How can I do this with a regex in Python?
I found an example that extracts e-mail addresses: https://gist.github.com/dideler/5219706
I implemented the same approach with a phone number regex in place of the e-mail address regex, but I get no output.
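import re

def get_phoneNumber(text):
    # Collect every match into one string, one number per line.
    phone_number = ""
    regex = re.compile(r"((\(\d{3,4}\)|\d{3,4}-)\d{4,9}(-\d{1,5}|\d{0}))|(\d{4,12})")
    for phoneNumber in get_phoneNumbers(text, regex):
        phone_number = phone_number + phoneNumber + "\n"
    return phone_number

def get_phoneNumbers(s, regex):
    # The pattern has capture groups, so re.findall returns one tuple per match;
    # element 0 is the first group (empty when only the last alternative matched).
    return (phoneNumber[0] for phoneNumber in re.findall(regex, s))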
How can I do this?
Recommended answer
This regex matches typical phone numbers from North America.
It matches 3334445555, 333.444.5555, 333-444-5555, 333 444 5555, (333) 444 5555 and all combinations thereof, such as 333 4445555, (333)4445555 or 333444-5555. It does not match the international notation +13334445555, but it does match the domestic part of +1 333 4445555.
\(?\b[2-9][0-9]{2}\)?[-. ]?[2-9][0-9]{2}[-. ]?[0-9]{4}\b
Source: RegexBuddy
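To make the list of matched formats concrete, here is a minimal sketch that runs the pattern above against the sample strings from the answer. The names `PHONE_RE`, `samples`, `first_match` and `results` are illustrative only and do not appear in the original answer.

```python
import re

# Pattern copied verbatim from the answer above.
PHONE_RE = re.compile(r"\(?\b[2-9][0-9]{2}\)?[-. ]?[2-9][0-9]{2}[-. ]?[0-9]{4}\b")

# Sample strings taken from the formats listed in the answer.
samples = [
    "3334445555",
    "333.444.5555",
    "333-444-5555",
    "333 444 5555",
    "(333) 444 5555",
    "333 4445555",
    "(333)4445555",
    "333444-5555",
    "+13334445555",    # international notation without a separator
    "+1 333 4445555",  # international notation with a separator
]

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = PHONE_RE.search(text)
    return m.group() if m else None

# Map each sample to what the pattern finds in it.
results = {s: first_match(s) for s in samples}
for sample, found in results.items():
    print(f"{sample!r:20} -> {found!r}")
```

The first eight samples should be matched in full. `+13334445555` is rejected because the only word boundary before the digits falls in front of the `1`, which `[2-9]` does not accept, while in `+1 333 4445555` the space provides a boundary right before `333`.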
The following Python code iterates over all matches:
import re

# subject is the string to search
for match in re.finditer(r"\(?\b[2-9][0-9]{2}\)?[-. ]?[2-9][0-9]{2}[-. ]?[0-9]{4}\b", subject):
    # match start: match.start()
    # match end (exclusive): match.end()
    # matched text: match.group()
    print(match.start(), match.end(), match.group())
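As one possible way to turn the loop above into the kind of helper the question asks for, the sketch below wraps `re.finditer` in a function that returns the matched strings. The function name `extract_phone_numbers` and the sample text are hypothetical. It relies on `match.group()`, which always returns the whole match, so it avoids the tuple results that `re.findall` produces when a pattern contains several capture groups.

```python
import re

# Pattern copied verbatim from the answer above.
PHONE_RE = re.compile(r"\(?\b[2-9][0-9]{2}\)?[-. ]?[2-9][0-9]{2}[-. ]?[0-9]{4}\b")

def extract_phone_numbers(text):
    """Return every substring of text matched by PHONE_RE, in order of appearance."""
    return [m.group() for m in PHONE_RE.finditer(text)]

# Made-up sample text, used only for illustration.
sample_text = "Call 333-444-5555 or (333) 444 5555; fax: +1 333 4445555."
numbers = extract_phone_numbers(sample_text)
print("\n".join(numbers))
```

For the sample text this yields three entries; the last one contains only the domestic part of the `+1` number, in line with the limitation described in the answer.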
What patterns do you expect to match?