用小数秒创建时间戳 [英] Create timestamp with fractional seconds

查看:290
本文介绍了用小数秒创建时间戳的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

awk可以使用strftime函数生成时间戳,例如

awk can generate a timestamp with strftime function, e.g.

$ awk 'BEGIN {print strftime("%Y/%m/%d %H:%M:%S")}'
2019/03/26 08:50:42

但是我需要一个带有小数秒的时间戳,理想情况下可以低至纳秒. gnu date可以通过%N元素执行此操作:

But I need a timestamp with fractional seconds, ideally down to nanoseconds. gnu date can do this with the %N element:

$ date "+%Y/%m/%d %H:%M:%S.%N"
2019/03/26 08:52:32.753019800

但是与调用strftime相比,从awk内部调用date效率相对较低,并且我需要高性能,因为我正在使用awk处理许多大文件,并且在处理时需要生成许多时间戳记文件. awk是否可以有效地生成包括小数秒(理想情况下为纳秒,但可以接受毫秒)的时间戳?

But it is relatively inefficient to invoke date from within awk compared to calling strftime, and I need high performance as I'm processing many large files with awk and need to generate many timestamps while processing the files. Is there a way that awk can efficiently generate a timestamp that includes fractional seconds (ideally nanoseconds, but milliseconds would be acceptable)?

添加我要执行的操作的示例:

Adding an example of what I am trying to perform:

awk -v logFile="$logFile" -v outputFile="$outputFile" '
BEGIN {
   print "[" strftime("%Y%m%d %H%M%S") "] Starting to process " FILENAME "." >> logFile
}
{
    data[$1] += $2
}
END {
    print "[" strftime("%Y%m%d %H%M%S") "] Processed " NR " records." >> logFile
    for (id in data) {
        print id ": " data[id] >> outputFile
    }
}
' oneOfManyLargeFiles

推荐答案

如果您确实需要亚秒级计时,则可以调用诸如date之类的外部命令或读取诸如/proc/uptime之类的外部系统文件或/proc/rct无法达到亚秒精度的目的.两种情况都需要大量资源来检索所请求的信息(即时间)

If you are really in need of subsecond timing, then any call to an external command such as date or reading an external system file such as /proc/uptime or /proc/rct defeats the purpose of the subsecond accuracy. Both cases require to many resources to retrieve the requested information (i.e. the time)

由于OP已经使用了GNU awk,因此可以使用动态扩展.动态扩展是通过实现用C或C ++编写的新功能并使用gawk动态加载新功能来向awk添加新功能的一种方式. GNU awk手册中广泛记录了如何编写这些功能. .

Since the OP already makes use of GNU awk, you could make use of a dynamic extension. Dynamic extensions are a way of adding new functionality to awk by implementing new functions written in C or C++ and dynamically loading them with gawk. How to write these functions is extensively written down in the GNU awk manual.

幸运的是,GNU awk 4.2.1附带了一组默认动态库,可以随意加载它们.这些库之一是具有两个简单功能的time库:

Luckily, GNU awk 4.2.1 comes with a set of default dynamic libraries which can be loaded at will. One of these libraries is a time library with two simple functions:

the_time = gettimeofday() 以秒为单位返回自1970-01-01 UTC以来经过的时间(以秒为单位).如果该平台上没有时间,请返回-1并设置ERRNO. 返回的时间应具有亚秒精度,但实际精度可能会因平台而异.如果标准C gettimeofday()系统调用在此平台上可用,则它仅返回该值.否则,如果在MS-Windows上,它将尝试使用GetSystemTimeAsFileTime().

the_time = gettimeofday() Return the time in seconds that has elapsed since 1970-01-01 UTC as a floating-point value. If the time is unavailable on this platform, return -1 and set ERRNO. The returned time should have sub-second precision, but the actual precision may vary based on the platform. If the standard C gettimeofday() system call is available on this platform, then it simply returns the value. Otherwise, if on MS-Windows, it tries to use GetSystemTimeAsFileTime().

result = sleep(seconds) 尝试睡眠seconds秒.如果seconds为负,或尝试睡眠失败,请返回-1并设置ERRNO.否则,在指定的时间内睡眠后返回零.请注意,秒可能是浮点(非整数)值.实现细节:根据平台的可用性,此功能尝试使用nanosleep()select()来实现延迟.

result = sleep(seconds) Attempt to sleep for seconds seconds. If seconds is negative, or the attempt to sleep fails, return -1 and set ERRNO. Otherwise, return zero after sleeping for the indicated amount of time. Note that seconds may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use nanosleep() or select() to implement the delay.

源: GNU awk手册

现在可以以相当简单的方式调用此函数:

It is now possible to call this function in a rather straightforward way:

awk '@load "time"; BEGIN{printf "%.6f", gettimeofday()}'
1553637193.575861

为了证明此方法比更经典的实现更快,我使用gettimeofday()为所有3个实现计时:

In order to demonstrate that this method is faster then the more classic implementations, I timed all 3 implementations using gettimeofday():

awk '@load "time"
     function get_uptime(   a) {
        if((getline line < "/proc/uptime") > 0)
        split(line,a," ")
        close("/proc/uptime")
        return a[1]
     }
     function curtime(    cmd, line, time) {
        cmd = "date \047+%Y/%m/%d %H:%M:%S.%N\047"
        if ( (cmd | getline line) > 0 ) {
           time = line
        }
        else {
           print "Error: " cmd " failed" | "cat>&2"
        }
        close(cmd)
        return time
      }
      BEGIN{
        t1=getimeofday(); curtime(); t2=gettimeofday();
        print "curtime()",t2-t1
        t1=getimeofday(); get_uptime(); t2=gettimeofday();
        print "get_uptime()",t2-t1
        t1=getimeofday(); gettimeofday(); t2=gettimeofday();
        print "gettimeofday()",t2-t1
      }'

输出:

curtime() 0.00519109
get_uptime() 7.98702e-05
gettimeofday() 9.53674e-07

虽然curtime()在加载外部二进制文件时最慢,但令人震惊的是看到awk在处理额外的外部/proc/文件时速度非常快.

While it is evident that curtime() is the slowest as it loads an external binary, it is rather startling to see that awk is blazingly fast in processing an extra external /proc/ file.

这篇关于用小数秒创建时间戳的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆