Making regex more specific to exclude certain characters
Question
import re
s = '01.11.11 12/12/1981 1*51*12 . 22|1|13 03-02-1919 1-22-12 or 01-23-18 or 03-23-1984 01.11.18 or 2.2.17 or 02.02.18 or 12.1.16 12.23.1943 01-23-11 not 12.23.192 not 02.02.1'
I have the string s above, and I want to extract all dates whose three parts are separated by one of the following: 1) a period, e.g. 01.11.11, 2) a dash, e.g. 1-22-12, or 3) a forward slash, e.g. 12/12/1981.
To do so, I tried the following:
reg = r'\d{1,2}.\d{1,2}.(?:\d{4}|\d{2})'
r1 = re.findall(reg,s)
It works, but it also returns some unwanted matches such as '1*51*12' and '22|1|13':
['01.11.11',
'12/12/1981',
'1*51*12',
'22|1|13',
'03-02-1919',
'1-22-12',
'01-23-18',
'03-23-1984',
'01.11.18',
'2.2.17',
'02.02.18',
'12.1.16',
'12.23.1943',
'01-23-11',
'12.23.19']
I would like my output to be:
['01.11.11',
'12/12/1981',
'03-02-1919',
'1-22-12',
'01-23-18',
'03-23-1984',
'01.11.18',
'2.2.17',
'02.02.18',
'12.1.16',
'12.23.1943',
'01-23-11',
'12.23.19']
How do I tweak reg to be more specific and get my desired output?
Recommended answer
\b((?:\d{1,2}(?:\.|\/|-)){2}(?:\d{4}|\d{2}))\b
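This regex matches the well-formed dates in your test string and filters out improper years, such as 12.23.192. It lists the allowed separators (period, forward slash, dash) explicitly, whereas the unescaped . in the original pattern matches any character. The \b word boundaries stop a match from ending partway through a longer run of digits, so 12.23.192 is rejected outright rather than returned as the partial match '12.23.19'.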
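To check the pattern against the sample string, here is a minimal sketch using Python's re module. The names pattern and matches are illustrative and do not come from the original answer. The string is split across lines only for readability.

```python
import re

# Sample string from the question (adjacent literals are concatenated).
s = ('01.11.11 12/12/1981 1*51*12 . 22|1|13 03-02-1919 1-22-12 or 01-23-18 '
     'or 03-23-1984 01.11.18 or 2.2.17 or 02.02.18 or 12.1.16 12.23.1943 '
     '01-23-11 not 12.23.192 not 02.02.1')

# Pattern from the answer:
#   \b ... \b          word boundaries around the whole date
#   (?:\d{1,2}SEP){2}  one or two digits plus a separator, twice
#   SEP                an explicit period, forward slash, or dash
#   (?:\d{4}|\d{2})    a four-digit or two-digit year
pattern = re.compile(r'\b((?:\d{1,2}(?:\.|\/|-)){2}(?:\d{4}|\d{2}))\b')

# With a single capturing group, findall returns the captured date strings.
matches = pattern.findall(s)
print(matches)
```

On the sample string this should return the 14 well-formed dates. '1*51*12' and '22|1|13' are excluded because * and | are not in the separator list, and 12.23.192 produces no match at all, so '12.23.19' from the original attempt no longer appears.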