Regex to locate valid DateTime parseable strings


Question

I understand that using a regex to parse a DateTime string is inappropriate: DateTime.Parse or TryParse is much better for that, and takes into account local formatting, time zones, and anything else that may be relevant. The problem is that they simply fail if the string has any spurious leading or trailing characters.

So what I need to do is run a regex over the document to find the candidate strings, and then use the appropriate method to parse each matched substring.

For example:

Regex re = new Regex(....);  // Somewhere in there will be (?<datetime>...)
MatchCollection matches = re.Matches(document);
foreach (Match m in matches) {
    DateTime dt;
    // Groups[...] returns a Group; TryParse needs its string Value
    if (DateTime.TryParse(m.Groups["datetime"].Value, out dt))
        OperateOn(dt);
}





I realise I may have to limit the acceptable matches, but if I could cover all the standard DateTime output formats for en-AU, en-GB, and en-US (except the "M" and "Y" formats, as they do not produce a full date), I'd be a happy man.

Recommended answer

What surprises me is how "robust" DateTime.TryParse can be (note: my only experience with DateTime.TryParse is with standard English character encodings). For example:
string strDate = "\t 12 .  12  /2015 14:34";
DateTime realDate;
DateTime.TryParse(strDate, out realDate);

That will give you a plausible result.

I am not clear about the range of possible culture-contexts and char-whatevers you may be working with, but it might be valuable to see if DateTime.TryParse may be as robust in those other contexts.

Using Midi-Mick's test data with DateTime.TryParse:

List<string> test = new List<string>
{
    "29/11/15",
    "29 November 2015 6:27PM",
    "2015-11-29T18:27:45.50+10:00",
    "2015-11-29 18:27:45.50Z",
    "NOV 29, 2015",
    "21:15"
};

private void parseTest()
{
    DateTime testDate;
    
    foreach (string str in test)
    {
        if (DateTime.TryParse(str, out testDate))
        {
            Console.WriteLine(testDate);
        }
        else
        {
            Console.WriteLine("could not parse: {0}", str);
        }
    }
}

/* results of test
    could not parse: 29/11/15
    11/29/2015 6:27:00 PM
    11/29/2015 3:27:45 PM
    11/30/2015 1:27:45 AM
    11/29/2015 12:00:00 AM
    11/30/2015 9:15:00 PM (reflects local time when the test was run, at GMT+7)
*/
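
A likely reason "29/11/15" failed in the run above is that the test machine used a month-first culture (the output format looks like en-US), so 29 was rejected as a month. The following is a minimal sketch, not part of the original answer, of passing an explicit culture to DateTime.TryParse. The helper name TryParseWithCultures is hypothetical, and the sketch assumes the en-US and en-AU culture data are installed on the machine (it will not work in globalization-invariant mode).

```csharp
using System;
using System.Globalization;

public static class CultureAwareParsing
{
    // Hypothetical helper: tries each named culture in order and
    // returns true on the first one that parses the text.
    public static bool TryParseWithCultures(string text, out DateTime result, params string[] cultureNames)
    {
        foreach (string name in cultureNames)
        {
            // Assumes the named culture is available on this machine.
            CultureInfo culture = CultureInfo.GetCultureInfo(name);
            if (DateTime.TryParse(text, culture, DateTimeStyles.None, out result))
            {
                return true;
            }
        }
        result = default(DateTime);
        return false;
    }

    public static void Main()
    {
        DateTime parsed;
        // en-US is tried first; en-AU (day-first) is the fallback.
        if (TryParseWithCultures("29/11/15", out parsed, "en-US", "en-AU"))
        {
            Console.WriteLine(parsed.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
        else
        {
            Console.WriteLine("could not parse");
        }
    }
}
```

The order of the culture names matters: an ambiguous string such as "03/04/15" is accepted by whichever culture is tried first, so the list should start with the culture the document most likely came from.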


This regular expression should take care of most of the variants. If you run into yet another format, you can just add it to the expression:

// Month names use [a-z]: a range of [a-v] would miss names containing 'y' (January, February, May, July).
Regex regex = new Regex(@"([0-9]{2}\s+[a-z]{3,9}\s+[0-9]{4}\s+[0-9]{1,2}:[0-9]{2}(AM|PM))|([a-z]{3,9}\s+[0-9]{2},\s*[0-9]{4})|([0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2})|([0-9]{4}-[0-9]{2}-[0-9]{2}(T| )[0-9]{2}:[0-9]{2}:[0-9]{2})", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Multiline);

MatchCollection matchCollection = regex.Matches( [Target_string] );
foreach (Match match in matchCollection)
{
    DateTime dt;
    if (DateTime.TryParse(match.Value, out dt))
    {
        // Do what you are supposed to do.
    }
    else
    {
        // Well, what should you do if you can't convert a matched string?
    }
}





Tested on

29 Nov 2015 4:15PM
29 November 2015 4:15PM
29/11/15
Nov 29, 2015
November 29, 2015
11/29/15
2015-11-29 16:15:00Z
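
The find-then-validate idea in this thread can be sketched end to end as follows. This is an illustration under stated assumptions, not the answerer's exact code: the class name DateCandidateScanner, the method FindDates, the sample document, and the slightly loosened pattern (one- or two-digit days, two- to four-digit years, a named group "datetime" as in the question) are all made up for the example. The regex only proposes candidates; DateTime.TryParse decides which ones are real dates.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateCandidateScanner
{
    // Loose candidate pattern (an assumption for this sketch).
    private static readonly Regex Candidate = new Regex(
        @"(?<datetime>" +
        @"[0-9]{1,2}\s+[a-z]{3,9}\s+[0-9]{4}\s+[0-9]{1,2}:[0-9]{2}(AM|PM)" + // 29 November 2015 4:15PM
        @"|[a-z]{3,9}\s+[0-9]{1,2},\s*[0-9]{4}" +                             // Nov 29, 2015
        @"|[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}" +                                 // 29/11/15
        @"|[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9]{2}:[0-9]{2}:[0-9]{2}" +       // 2015-11-29 16:15:00
        @")",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns every candidate substring that DateTime.TryParse accepts
    // under the given culture; rejected candidates are silently skipped.
    public static List<DateTime> FindDates(string document, IFormatProvider culture)
    {
        var found = new List<DateTime>();
        foreach (Match match in Candidate.Matches(document))
        {
            DateTime parsed;
            string text = match.Groups["datetime"].Value;
            if (DateTime.TryParse(text, culture, DateTimeStyles.None, out parsed))
            {
                found.Add(parsed);
            }
        }
        return found;
    }

    public static void Main()
    {
        // Hypothetical sample text; "99/99/99" matches the pattern but is not a date.
        string document =
            "Ordered 29 November 2015 4:15PM, shipped Nov 29, 2015, " +
            "logged 2015-11-29 16:15:00, bogus 99/99/99.";

        foreach (DateTime dt in FindDates(document, CultureInfo.InvariantCulture))
        {
            Console.WriteLine(dt.ToString("s", CultureInfo.InvariantCulture));
        }
    }
}
```

Keeping the pattern permissive and letting TryParse reject false positives means a new format only needs one more alternative in the pattern, without duplicating any date-validation logic in the regex.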


Try:
Regex reg = new Regex(@"(\d+)[.](\d+)[.](\d+)");



OR,

var regex = new Regex(@"\b\d{2}\.\d{2}\.\d{4}\b");



OR,

Regex regex = new Regex(
      ";(?<date>.+?)",
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
    );



