用小数秒创建时间戳 [英] Create timestamp with fractional seconds
问题描述
awk
可以使用strftime函数生成时间戳,例如
awk
can generate a timestamp with strftime function, e.g.
$ awk 'BEGIN {print strftime("%Y/%m/%d %H:%M:%S")}'
2019/03/26 08:50:42
但是我需要一个带有小数秒的时间戳,理想情况下可以低至纳秒. gnu date
可以通过%N
元素执行此操作:
But I need a timestamp with fractional seconds, ideally down to nanoseconds. gnu date
can do this with the %N
element:
$ date "+%Y/%m/%d %H:%M:%S.%N"
2019/03/26 08:52:32.753019800
但是与调用strftime
相比,从awk
内部调用date
效率相对较低,并且我需要高性能,因为我正在使用awk
处理许多大文件,并且在处理时需要生成许多时间戳记文件. awk
是否可以有效地生成包括小数秒(理想情况下为纳秒,但可以接受毫秒)的时间戳?
But it is relatively inefficient to invoke date
from within awk
compared to calling strftime
, and I need high performance as I'm processing many large files with awk
and need to generate many timestamps while processing the files. Is there a way that awk
can efficiently generate a timestamp that includes fractional seconds (ideally nanoseconds, but milliseconds would be acceptable)?
添加我要执行的操作的示例:
Adding an example of what I am trying to perform:
awk -v logFile="$logFile" -v outputFile="$outputFile" '
BEGIN {
print "[" strftime("%Y%m%d %H%M%S") "] Starting to process " FILENAME "." >> logFile
}
{
data[$1] += $2
}
END {
print "[" strftime("%Y%m%d %H%M%S") "] Processed " NR " records." >> logFile
for (id in data) {
print id ": " data[id] >> outputFile
}
}
' oneOfManyLargeFiles
推荐答案
如果您确实需要亚秒级计时,则可以调用诸如date
之类的外部命令或读取诸如/proc/uptime
之类的外部系统文件或/proc/rct
无法达到亚秒精度的目的.两种情况都需要大量资源来检索所请求的信息(即时间)
If you are really in need of subsecond timing, then any call to an external command such as date
or reading an external system file such as /proc/uptime
or /proc/rct
defeats the purpose of the subsecond accuracy. Both cases require to many resources to retrieve the requested information (i.e. the time)
由于OP已经使用了GNU awk,因此可以使用动态扩展.动态扩展是通过实现用C或C ++编写的新功能并使用gawk动态加载新功能来向awk添加新功能的一种方式. GNU awk手册中广泛记录了如何编写这些功能. .
Since the OP already makes use of GNU awk, you could make use of a dynamic extension. Dynamic extensions are a way of adding new functionality to awk by implementing new functions written in C or C++ and dynamically loading them with gawk. How to write these functions is extensively written down in the GNU awk manual.
幸运的是,GNU awk 4.2.1附带了一组默认动态库,可以随意加载它们.这些库之一是具有两个简单功能的time
库:
Luckily, GNU awk 4.2.1 comes with a set of default dynamic libraries which can be loaded at will. One of these libraries is a time
library with two simple functions:
the_time = gettimeofday()
以秒为单位返回自1970-01-01 UTC以来经过的时间(以秒为单位).如果该平台上没有时间,请返回-1
并设置ERRNO
. 返回的时间应具有亚秒精度,但实际精度可能会因平台而异.如果标准Cgettimeofday()
系统调用在此平台上可用,则它仅返回该值.否则,如果在MS-Windows上,它将尝试使用GetSystemTimeAsFileTime()
.
the_time = gettimeofday()
Return the time in seconds that has elapsed since 1970-01-01 UTC as a floating-point value. If the time is unavailable on this platform, return-1
and setERRNO
. The returned time should have sub-second precision, but the actual precision may vary based on the platform. If the standard Cgettimeofday()
system call is available on this platform, then it simply returns the value. Otherwise, if on MS-Windows, it tries to useGetSystemTimeAsFileTime()
.
result = sleep(seconds)
尝试睡眠seconds
秒.如果seconds
为负,或尝试睡眠失败,请返回-1
并设置ERRNO
.否则,在指定的时间内睡眠后返回零.请注意,秒可能是浮点(非整数)值.实现细节:根据平台的可用性,此功能尝试使用nanosleep()
或select()
来实现延迟.
result = sleep(seconds)
Attempt to sleep for seconds
seconds. If seconds
is negative, or the attempt to sleep fails, return -1
and set ERRNO
. Otherwise, return zero after sleeping for the indicated amount of time. Note that seconds may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use nanosleep()
or select()
to implement the delay.
源: GNU awk手册
现在可以以相当简单的方式调用此函数:
It is now possible to call this function in a rather straightforward way:
awk '@load "time"; BEGIN{printf "%.6f", gettimeofday()}'
1553637193.575861
为了证明此方法比更经典的实现更快,我使用gettimeofday()
为所有3个实现计时:
In order to demonstrate that this method is faster then the more classic implementations, I timed all 3 implementations using gettimeofday()
:
awk '@load "time"
function get_uptime( a) {
if((getline line < "/proc/uptime") > 0)
split(line,a," ")
close("/proc/uptime")
return a[1]
}
function curtime( cmd, line, time) {
cmd = "date \047+%Y/%m/%d %H:%M:%S.%N\047"
if ( (cmd | getline line) > 0 ) {
time = line
}
else {
print "Error: " cmd " failed" | "cat>&2"
}
close(cmd)
return time
}
BEGIN{
t1=getimeofday(); curtime(); t2=gettimeofday();
print "curtime()",t2-t1
t1=getimeofday(); get_uptime(); t2=gettimeofday();
print "get_uptime()",t2-t1
t1=getimeofday(); gettimeofday(); t2=gettimeofday();
print "gettimeofday()",t2-t1
}'
输出:
curtime() 0.00519109
get_uptime() 7.98702e-05
gettimeofday() 9.53674e-07
虽然curtime()
在加载外部二进制文件时最慢,但令人震惊的是看到awk在处理额外的外部/proc/文件时速度非常快.
While it is evident that curtime()
is the slowest as it loads an external binary, it is rather startling to see that awk is blazingly fast in processing an extra external /proc/ file.
这篇关于用小数秒创建时间戳的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!