Fastest way to tell if a string is a valid date
Question
I am supporting a common library at work that performs many checks of a given string to see if it is a valid date. The Java API, the commons-lang library, and JodaTime all have methods that parse a string and turn it into a date, which tells you whether it is actually a valid date or not. I was hoping there would be a way of doing the validation without actually creating a date object (or a DateTime, as is the case with the JodaTime library). For example, here is a simple piece of example code:
public boolean isValidDate(String dateString) {
    SimpleDateFormat df = new SimpleDateFormat("yyyyMMdd");
    try {
        df.parse(dateString);
        return true;
    } catch (ParseException e) {
        return false;
    }
}
This seems wasteful to me, since we are throwing away the resulting object. According to my benchmarks, about 5% of our time in this common library is spent validating dates. I'm hoping I'm just missing an obvious API. Any suggestions would be great!
UPDATE
Assume that we can always use the same date format (likely yyyyMMdd). I did think about using a regex as well, but then it would need to be aware of the number of days in each month, leap years, etc.
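For a fixed yyyyMMdd format, the month-length and leap-year rules mentioned above are small enough to check with plain arithmetic instead of a regex. The following is only a minimal sketch of that idea, not the code from any of the answers: the class name `DateCheck` and its methods are made up for illustration, and it assumes that every four-digit year is acceptable and that the Gregorian leap-year rule applies to all of them.

```java
// Hypothetical helper class, for illustration only.
final class DateCheck {
    // Days per month in a non-leap year, January first.
    private static final int[] DAYS = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    // Gregorian rule: divisible by 4, except centuries not divisible by 400.
    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Validates a yyyyMMdd string without allocating any date object.
    static boolean isValidDate(String s) {
        if (s == null || s.length() != 8) {
            return false;
        }
        int n = 0;
        for (int i = 0; i < 8; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false; // reject signs, spaces, letters
            }
            n = n * 10 + (c - '0');
        }
        int year = n / 10000;
        int month = (n / 100) % 100;
        int day = n % 100;
        if (month < 1 || month > 12 || day < 1) {
            return false;
        }
        int max = (month == 2 && isLeapYear(year)) ? 29 : DAYS[month - 1];
        return day <= max;
    }
}
```

With this sketch, `DateCheck.isValidDate("20120229")` is accepted while `"20130229"` and `"20131301"` are rejected. If only a limited range of years should count as valid, an extra bounds check on `year` would be needed.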
Results
Parsing a date 10 million times:
Using Java's SimpleDateFormat: ~32 seconds
Using commons-lang DateUtils.parseDate: ~32 seconds
Using JodaTime's DateTimeFormatter: ~3.5 seconds
Using the pure code/math solution by Slanec: ~0.8 seconds
Using precomputed results by Slanec and dfb (minus filling cache): ~0.2 seconds
There were some very creative answers, and I appreciate them! I guess now I just need to decide how much flexibility I need and what I want the code to look like. I'm going to say that dfb's answer is correct because it was purely the fastest, which was my original question. Thanks!
Recommended answer
If you're really concerned about performance and your date format is really that simple, just pre-compute all the valid strings and hash them in memory. The format you have above only has ~8 million valid combinations up to 2050.
Edit by Slanec - reference implementation
This implementation depends on your specific date format. It could be adapted to any other specific date format (just like my first answer, but a bit better).
It builds a set of all dates from 1900 up to (but not including) 2050, stored as Strings (there are 54787 of them), and then checks whether a given string is in that set.
Once the dates set is created, it's fast as hell. A quick microbenchmark showed an improvement by a factor of 10 over my first solution.
import java.util.HashSet;
import java.util.Set;

// All valid yyyyMMdd strings for the years 1900 through 2049, built once at class-load time.
private static final Set<String> dates = new HashSet<String>();

static {
    for (int year = 1900; year < 2050; year++) {
        for (int month = 1; month <= 12; month++) {
            // daysInMonth(year, month) is a helper that is not shown in this answer.
            for (int day = 1; day <= daysInMonth(year, month); day++) {
                dates.add(String.format("%04d%02d%02d", year, month, day));
            }
        }
    }
}

// Validation is a single hash lookup; no date object is created.
public static boolean isValidDate2(String dateString) {
    return dates.contains(dateString);
}
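The snippet above calls daysInMonth, which isn't shown here. For anyone who wants to run it, the following is a self-contained sketch under a few assumptions: the class name `PrecomputedDates`, the `size()` accessor, and this particular `daysInMonth` are my own guesses at what is needed, not the original helper. It also passes `Locale.ROOT` to `String.format` so the generated keys always use ASCII digits regardless of the default locale.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical wrapper class, for illustration only.
final class PrecomputedDates {
    private static final Set<String> DATES = new HashSet<>();

    static {
        // Same range as the answer: 1900 through 2049.
        for (int year = 1900; year < 2050; year++) {
            for (int month = 1; month <= 12; month++) {
                int last = daysInMonth(year, month);
                for (int day = 1; day <= last; day++) {
                    DATES.add(String.format(Locale.ROOT, "%04d%02d%02d", year, month, day));
                }
            }
        }
    }

    // Assumed helper: Gregorian month lengths, month is 1-based.
    static int daysInMonth(int year, int month) {
        switch (month) {
            case 2:
                return ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0) ? 29 : 28;
            case 4: case 6: case 9: case 11:
                return 30;
            default:
                return 31;
        }
    }

    // A single hash lookup; null and malformed strings are simply not in the set.
    static boolean isValidDate(String dateString) {
        return DATES.contains(dateString);
    }

    static int size() {
        return DATES.size();
    }
}
```

With this range, `PrecomputedDates.size()` comes out to 54787, matching the count quoted above. Note that dates outside 1900-2049 are reported as invalid, which is the price of the lookup approach.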
P.S. It can be modified to use Set<Integer> or even Trove's TIntHashSet, which reduces memory usage a lot (and therefore allows a much larger timespan); the performance then drops to a level just below my original solution.
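A rough sketch of the Set<Integer> variant, for anyone curious what that looks like. This is not the code that was benchmarked: the class name `IntKeyedDates` is made up, it uses the JDK's HashSet rather than Trove, and it relies on `java.time.YearMonth` (Java 8+) for month lengths. Each date is stored as the int yyyyMMdd, and the input string is converted to an int by hand so that malformed input never throws.

```java
import java.time.YearMonth;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class, for illustration only.
final class IntKeyedDates {
    private static final Set<Integer> DATES = new HashSet<>();

    static {
        // Same range as the String version: 1900 through 2049.
        for (int year = 1900; year < 2050; year++) {
            for (int month = 1; month <= 12; month++) {
                int last = YearMonth.of(year, month).lengthOfMonth();
                for (int day = 1; day <= last; day++) {
                    DATES.add(year * 10000 + month * 100 + day);
                }
            }
        }
    }

    static boolean isValidDate(String s) {
        if (s == null || s.length() != 8) {
            return false;
        }
        int key = 0;
        for (int i = 0; i < 8; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false; // only plain ASCII digits are accepted
            }
            key = key * 10 + (c - '0');
        }
        return DATES.contains(key); // key is autoboxed for the lookup
    }
}
```

The digit loop and the boxing on lookup are extra work compared with hashing the String directly, which is one plausible reason the int-keyed version was measured as slower; a primitive-int set such as TIntHashSet would avoid the boxing.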