Awk date validation
I have an awk script where I need to validate a large number of lines containing dates.
I'm currently using either a regex-based solution for basic validation (which does not test for leap years) or calling the UNIX date command to validate more accurately. The date command works well, but spawning a system command for every line is expensive in terms of performance.
I was hoping that someone here might be able to suggest a solution that is both accurate and fast.
Here's an example of my data:
20140804024614
20140803190020
20140803163320
20140803083222
20140803170321
20140803234044
20140804011857
20140803204008
20140803160026
20140803140120
Thanks.
Given a whole lot of assumptions about your input file, this is probably all you need to print only the valid dates+times using GNU awk for time functions and gensub():
awk 'strftime("%Y%m%d%H%M%S",mktime(gensub(/(.{4})(..)(..)(..)(..)/,"\\1 \\2 \\3 \\4 \\5 ",""))) == $0' file
It will only work with dates since the epoch (1970-01-01 00:00:00 UTC).
If you need to print some kind of "valid/invalid" message for each date/time:
$ cat file
20140230035900
20140804024614
$
$ awk '{print (strftime("%Y%m%d%H%M%S",mktime(gensub(/(.{4})(..)(..)(..)(..)/,"\\1 \\2 \\3 \\4 \\5 ",""))) == $0 ? "" : "in") "valid:", $0}' file
invalid: 20140230035900
valid: 20140804024614
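The above works by converting the date+time to seconds since the epoch with mktime(), then converting those seconds back to a date+time in the original format with strftime(). mktime() does not reject out-of-range fields but normalizes them (February 30 in a non-leap year such as 2014 comes back as March 2), so if the result is identical to what you started with, the original date was valid.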
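If GNU awk is not available, or dates before the epoch need to pass, one alternative is to check each field's range directly instead of doing the mktime()/strftime() round trip. The following is only a sketch of that different technique, not the approach used above. The names validate_dates, is_valid and days_in_month are made up for illustration, and it assumes one 14-digit YYYYMMDDHHMMSS value per line, applies the Gregorian leap-year rule to every year, and rejects a seconds value of 60.

```shell
# Hypothetical helper: reads YYYYMMDDHHMMSS timestamps on stdin, one per line,
# and prints "valid: ..." or "invalid: ..." for each. No GNU awk extensions.
validate_dates() {
    awk '
    # Number of days in month m of year y (Gregorian leap-year rule).
    function days_in_month(y, m) {
        if (m == 2)
            return (y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)) ? 29 : 28
        return (m == 4 || m == 6 || m == 9 || m == 11) ? 30 : 31
    }
    # Returns 1 if ts is exactly 14 digits and every field is in range.
    function is_valid(ts,    y, m, d, H, M, S) {
        if (ts !~ /^[0-9]+$/ || length(ts) != 14)
            return 0
        y = substr(ts, 1, 4) + 0;  m = substr(ts, 5, 2) + 0
        d = substr(ts, 7, 2) + 0;  H = substr(ts, 9, 2) + 0
        M = substr(ts, 11, 2) + 0; S = substr(ts, 13, 2) + 0
        if (m < 1 || m > 12 || d < 1 || d > days_in_month(y, m))
            return 0
        return (H <= 23 && M <= 59 && S <= 59)
    }
    { printf "%s: %s\n", (is_valid($0) ? "valid" : "invalid"), $0 }
    '
}
```

Running validate_dates < file on the two-line sample file above should print the same invalid/valid lines as the gawk version. Unlike the round trip, this check knows nothing about time zones, so a local time skipped by a daylight-saving change would still be reported as valid.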