Extract historic leap seconds from tzdata

Question

Is there a way to extract the moment of historic leap seconds from the time-zone database that is distributed on most linux distributions? I am looking for a solution in python, but anything that works on the command line would be fine too.

My use case is to convert between GPS time (essentially the number of seconds elapsed since the GPS epoch, 1980-01-06 00:00:00 UTC) and UTC or local time. UTC is adjusted for leap seconds every now and then, while GPS time increases linearly. This is equivalent to converting between UTC and TAI: TAI also ignores leap seconds, so TAI and GPS time always differ by the same constant offset (19 seconds). At work, we use GPS time as the time standard for synchronizing astronomical observations around the world.
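To make the conversion concrete, here is a minimal sketch of what such functions might look like, assuming a hard-coded table of leap seconds. The names `GPS_EPOCH`, `LEAPS`, `utc_to_gps` and `gps_to_utc` are made up for this illustration, and the table only covers leap seconds up to the one at 2017-01-01, so it would need extending by hand, which is exactly the maintenance burden described in this question.

```python
from datetime import datetime, timedelta

# GPS time counts seconds from this instant (expressed in UTC).
GPS_EPOCH = datetime(1980, 1, 6)

# UTC instants immediately after each leap second inserted since the GPS
# epoch. Hard-coded for illustration only; must be extended by hand when
# a new leap second is announced.
LEAPS = [datetime(y, m, 1) for y, m in [
    (1981, 7), (1982, 7), (1983, 7), (1985, 7), (1988, 1), (1990, 1),
    (1991, 1), (1992, 7), (1993, 7), (1994, 7), (1996, 1), (1997, 7),
    (1999, 1), (2006, 1), (2009, 1), (2012, 7), (2015, 7), (2017, 1)]]


def utc_to_gps(utc, leaps=LEAPS):
    """Convert a naive UTC datetime to GPS seconds."""
    # Every leap second already inserted pushes GPS one second ahead of UTC.
    n = sum(1 for d in leaps if utc >= d)
    return (utc - GPS_EPOCH).total_seconds() + n


def gps_to_utc(gps_seconds, leaps=LEAPS):
    """Convert GPS seconds to a naive UTC datetime."""
    n = 0
    for i, d in enumerate(leaps):
        # GPS second count at the first UTC instant after leap number i+1.
        if gps_seconds >= (d - GPS_EPOCH).total_seconds() + i + 1:
            n = i + 1
    return GPS_EPOCH + timedelta(seconds=gps_seconds - n)
```

A round trip such as `gps_to_utc(utc_to_gps(datetime(2013, 1, 1)))` should return the original datetime. The leap second itself (23:59:60) cannot be represented by `datetime`, so this sketch maps it onto the following midnight.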

I have working functions that convert between gps-time and UTC, but I had to hard-code a table of leap seconds, which I get here (the file tzdata2013xx.tar.gz contains a file named leapseconds). I have to update this file by hand every few years when a new leap second is announced. I would prefer to get this information from the standard tzdata, which is automatically updated via system updates several times a year.

I am pretty sure the information is hidden in some binary files somewhere in /usr/share/zoneinfo/. I have been able to extract some of it using struct.unpack (man tzfile gives some info about the format), but I never got it working completely. Are there any standard packages that can access this information? I know about pytz, which seems to get the standard DST information from the same database, but it does not give access to leap seconds. I also found tai64n, but looking at its source code, it just contains a hard-coded table.

EDIT

Inspired by steveha's answer and some code in pytz/tzfile.py, I finally got a working solution (tested on py2.5 and py2.7):

from struct import unpack, calcsize
from datetime import datetime

def print_leap(tzfile = '/usr/share/zoneinfo/right/UTC'):
    with open(tzfile, 'rb') as f:
        # read header
        fmt = '>4s c 15x 6l'
        (magic, format, ttisgmtcnt, ttisstdcnt,leapcnt, timecnt,
            typecnt, charcnt) =  unpack(fmt, f.read(calcsize(fmt)))
        assert magic == 'TZif'.encode('US-ASCII'), 'Not a timezone file'
        print 'Found %i leapseconds:' % leapcnt

        # skip over some uninteresting data
        fmt = '>%(timecnt)dl %(timecnt)dB %(ttinfo)s %(charcnt)ds' % dict(
            timecnt=timecnt, ttinfo='lBB'*typecnt, charcnt=charcnt)
        f.read(calcsize(fmt))

        #read leap-seconds
        fmt = '>2l'
        for i in xrange(leapcnt):
            tleap, nleap = unpack(fmt, f.read(calcsize(fmt)))
            print datetime.utcfromtimestamp(tleap-nleap+1)

with result

In [2]: print_leap()
Found 25 leapseconds:
1972-07-01 00:00:00
1973-01-01 00:00:00
1974-01-01 00:00:00
...
2006-01-01 00:00:00
2009-01-01 00:00:00
2012-07-01 00:00:00

While this does answer my question, I will probably not go with this solution. Instead, I will ship leap-seconds.list with my code, as suggested by Matt Johnson. This seems to be the authoritative list used as a source for tzdata, and is probably updated by NIST twice a year. This means I will have to do the update by hand, but this file is straightforward to parse and includes an expiration date (which tzdata seems to be missing).
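For reference, a rough sketch of how leap-seconds.list could be parsed. It assumes the usual layout of that file: comment lines start with `#`, the expiration time is on a line starting with `#@`, and each data line holds an NTP timestamp (seconds since 1900-01-01) followed by the TAI-UTC offset in effect from that moment. The function name `parse_leap_seconds_list` is hypothetical, and the expiration value in `SAMPLE` is a placeholder, not copied from a real file.

```python
from datetime import datetime, timedelta

# NTP timestamps count seconds from this instant.
NTP_EPOCH = datetime(1900, 1, 1)


def parse_leap_seconds_list(lines):
    """Return (entries, expires) from the lines of a leap-seconds.list file.

    entries: list of (utc_datetime, tai_minus_utc_seconds) tuples
    expires: datetime after which the table should be refreshed, or None
    """
    entries = []
    expires = None
    for line in lines:
        line = line.strip()
        if line.startswith('#@'):
            # Expiration line: '#@' followed by an NTP timestamp.
            expires = NTP_EPOCH + timedelta(seconds=int(line[2:].split()[0]))
        elif line and not line.startswith('#'):
            # Data line: NTP timestamp, TAI-UTC offset, optional comment.
            fields = line.split()
            when = NTP_EPOCH + timedelta(seconds=int(fields[0]))
            entries.append((when, int(fields[1])))
    return entries, expires


# Tiny sample in the assumed layout (expiration value is a placeholder).
SAMPLE = """\
# comment line
#@\t3676924800
2272060800\t10\t# 1 Jan 1972
2287785600\t11\t# 1 Jul 1972
""".splitlines()

entries, expires = parse_leap_seconds_list(SAMPLE)
```

With a real file one would pass the open file object instead of `SAMPLE`, and compare `expires` against the current date to decide whether the bundled copy is stale.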

Solution

I just did man 5 tzfile and computed an offset that would find the leap seconds info, then read the leap seconds info.

You can uncomment the "DEBUG:" print statements to see more of what it finds in the file.

EDIT: program updated to now be correct. It now uses the file /usr/share/zoneinfo/right/UTC and it now finds leap-seconds to print.

The original program wasn't skipping the timezone abbreviation characters, which are documented in the man page but sort of hidden ("...and tt_abbrind serves as an index into the array of timezone abbreviation characters that follow the ttinfo structure(s) in the file.").

import datetime
import struct

TZFILE_MAGIC = 'TZif'.encode('US-ASCII')

def leap_seconds(f):
    """
    Return a list of tuples of this format: (timestamp, number_of_seconds)
        timestamp: a 32-bit timestamp, seconds since the UNIX epoch
        number_of_seconds: total (cumulative) number of leap seconds
            in effect from timestamp onward

    """
    fmt = ">4s c 15x 6l"
    size = struct.calcsize(fmt)
    (tzfile_magic, tzfile_format, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt,
        typecnt, charcnt) =  struct.unpack(fmt, f.read(size))
    #print("DEBUG: tzfile_magic: {} tzfile_format: {} ttisgmtcnt: {} ttisstdcnt: {} leapcnt: {} timecnt: {} typecnt: {} charcnt: {}".format(tzfile_magic, tzfile_format, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt, typecnt, charcnt))

    # Make sure it is a tzfile(5) file
    assert tzfile_magic == TZFILE_MAGIC, (
            "Not a tzfile; file magic was: '{}'".format(tzfile_magic))

    # comments below show struct codes such as "l" for 32-bit long integer
    offset = (timecnt*4  # transition times, each "l"
        + timecnt*1  # indices tying transition time to ttinfo values, each "B"
        + typecnt*6  # ttinfo structs, each stored as "lBB"
        + charcnt*1)  # timezone abbreviation chars, each "c"

    f.seek(offset, 1) # seek offset bytes from current position

    fmt = '>{}l'.format(leapcnt*2)
    #print("DEBUG: leapcnt: {}  fmt: '{}'".format(leapcnt, fmt))
    size = struct.calcsize(fmt)
    data = struct.unpack(fmt, f.read(size))

    lst = [(data[i], data[i+1]) for i in range(0, len(data), 2)]
    assert all(lst[i][0] < lst[i+1][0] for i in range(len(lst)-1))
    assert all(lst[i][1] == lst[i+1][1]-1 for i in range(len(lst)-1))

    return lst

def print_leaps(leap_lst):
    # leap_lst is tuples: (timestamp, num_leap_seconds)
    for ts, num_secs in leap_lst:
        print(datetime.datetime.utcfromtimestamp(ts - num_secs+1))

if __name__ == '__main__':
    zoneinfo_fname = '/usr/share/zoneinfo/right/UTC'
    with open(zoneinfo_fname, 'rb') as f:
        leap_lst = leap_seconds(f)
    print_leaps(leap_lst)
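One caveat worth sketching: the program above reads the version-1 (32-bit) data block. TZif files of version 2 or later carry a second header and a 64-bit data block after the first one, and some builds of tzdata leave the 32-bit block empty, in which case the code above would report no leap seconds. The variant below is only a sketch based on my reading of tzfile(5); `read_header` and `leap_seconds_any_version` are names invented here, and it has not been tested against every tzdata release.

```python
import struct

HEADER_FMT = '>4s c 15x 6l'


def read_header(f):
    """Read one 44-byte TZif header; return (version, counts dict)."""
    (magic, version, isutcnt, isstdcnt, leapcnt, timecnt,
     typecnt, charcnt) = struct.unpack(
        HEADER_FMT, f.read(struct.calcsize(HEADER_FMT)))
    if magic != b'TZif':
        raise ValueError('not a TZif file')
    counts = dict(isutcnt=isutcnt, isstdcnt=isstdcnt, leapcnt=leapcnt,
                  timecnt=timecnt, typecnt=typecnt, charcnt=charcnt)
    return version, counts


def leap_seconds_any_version(f):
    """Return [(timestamp, cumulative_leap_seconds), ...] from a TZif file."""
    version, c = read_header(f)
    if version == b'\x00':
        # Version 1: only the 32-bit block exists.
        time_size, time_code = 4, 'l'
    else:
        # Skip the whole 32-bit block, then read the second header.
        f.seek(c['timecnt'] * 5 + c['typecnt'] * 6 + c['charcnt']
               + c['leapcnt'] * 8 + c['isstdcnt'] + c['isutcnt'], 1)
        version, c = read_header(f)
        time_size, time_code = 8, 'q'
    # Skip transition times, their type indices, ttinfo structs and
    # abbreviation characters to land on the leap-second records.
    f.seek(c['timecnt'] * (time_size + 1) + c['typecnt'] * 6
           + c['charcnt'], 1)
    rec = '>' + time_code + 'l'
    size = struct.calcsize(rec)
    return [struct.unpack(rec, f.read(size)) for _ in range(c['leapcnt'])]
```

The returned list has the same shape as the one from `leap_seconds(f)` above, so it can be passed to `print_leaps` unchanged.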
