Extract Date and Time from a String Using Regex
Question
I am working on a regex that accepts all possible date and time formats so that I can extract them from a sentence.
Here is my regex:
@"(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:1|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})(?:[\D]*)(?<time>\d{1,2}\:\d{2}\s(?:A|P)M)";
Currently, the regex works well when extracting a time at any position in the sentence, but it extracts a date only when the date is at the beginning of the sentence. Also, if the sentence contains a second date, the regex does not match it, and when text directly follows a date, the match includes that text along with the date.
For example:
Meet me on 31/07/2019 at 3:00 PM to celebrate and then the meeting will be on 03/08/2019 at 12:00 PM.
The regex should match:
1) 31/07/2019
2) 3:00 PM
3) 03/08/2019
4) 12:00 PM
Note: The expected output should be extracted from any part of the sentence (beginning, middle, or end).
Recommended answer
The part of your regex before the \D* + time pattern matches various types of dates, and it must be grouped before any other pattern is added after it. That is, (?<date>DATE1_PATTERN|DATE2_PATTERN|DATEn_PATTERN)\D*(?<time>TIME_PATTERN). Without the group, the \D* and time pattern belong only to the last date alternative. The pattern below also drops the ^ and $ anchors from the original alternatives, since they tie a match to the start or end of the input.
Then, just match and access the named groups:
using System;
using System.Text.RegularExpressions;

var s = "Meet me on 31/07/2019 at 3:00 PM to celebrate and then the meeting will be on 03/08/2019 at 12:00 PM.";
var pattern = @"(?<date>(?:(?:31([-/.])(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:1|30)([-/.])(?:0?[13-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})|(?:29([-/.])(?:0?2|Feb)\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:0?[1-9]|1\d|2[0-8])([-/.])(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2}))\D*(?<time>\d{1,2}:\d{2}\s[AP]M)";
var result = Regex.Matches(s, pattern);
// Each match holds one date/time pair in its named groups.
foreach (Match m in result) {
    Console.WriteLine(m.Groups["date"].Value);
    Console.WriteLine(m.Groups["time"].Value);
}
See the C# demo. Output:
31/07/2019
3:00 PM
03/08/2019
12:00 PM
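
To make the effect of the anchors and the grouping easier to see, here is a minimal sketch that swaps in much simpler stand-in patterns. DatePart and TimePart are hypothetical names introduced for illustration, and DatePart does no calendar validation (it would accept 99/99/2019), so it is not a replacement for the full pattern above. The sketch only contrasts an anchored pattern with a grouped, unanchored one on the same sentence.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Simplified stand-ins for the full patterns (assumption: no calendar validation).
// \1 refers to the separator group; in .NET, unnamed groups are numbered before named ones.
const string DatePart = @"\d{1,2}([-/.])\d{1,2}\1\d{4}";
const string TimePart = @"\d{1,2}:\d{2}\s[AP]M";

// Anchored variant: ^ ties the date to the start of the input.
var anchored = new Regex("^(?<date>" + DatePart + @")\D*(?<time>" + TimePart + ")");
// Grouped, unanchored variant: a date/time pair can match at any position.
var unanchored = new Regex("(?<date>" + DatePart + @")\D*(?<time>" + TimePart + ")");

var text = "Meet me on 31/07/2019 at 3:00 PM to celebrate and then the meeting will be on 03/08/2019 at 12:00 PM.";

int anchoredCount = anchored.Matches(text).Count;
var pairs = unanchored.Matches(text)
    .Cast<Match>()
    .Select(m => (Date: m.Groups["date"].Value, Time: m.Groups["time"].Value))
    .ToList();

Console.WriteLine($"anchored matches: {anchoredCount}");
foreach (var p in pairs)
    Console.WriteLine($"{p.Date} | {p.Time}");
```

With this sentence the anchored pattern finds no match, because the input starts with text rather than a date, while the unanchored pattern finds both date/time pairs. One consequence of this design is that \D* skips any non-digit text between a date and the time that follows it, so a date with no later time would not match at all; if dates can appear alone, the time part would probably need to be made optional.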