Extract historic leap seconds from tzdata
Is there a way to extract the moments of historic leap seconds from the time-zone database that is distributed with most Linux distributions? I am looking for a solution in Python, but anything that works on the command line would be fine too.
My use case is to convert between gps-time (which is basically the number of seconds since the first GPS-satellite was switched on in 1980) and UTC or local time. UTC is adjusted for leap-seconds every now and then, while gps-time increases linearly. This is equivalent to converting between UTC and TAI. TAI also ignores leap-seconds, so TAI and gps-time should always evolve with the same offset. At work, we use gps-time as the time standard for synchronizing astronomical observations around the world.
I have working functions that convert between gps-time and UTC, but I had to hard-code a table of leap seconds, which I took from the tzdata source archive (the file tzdata2013xx.tar.gz contains a file named leapseconds). I have to update this file by hand every few years when a new leap second is announced. I would prefer to get this information from the standard tzdata, which is automatically updated via system updates several times a year.
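For illustration, a conversion built on such a hard-coded table might look like the sketch below. It is not the asker's actual code: the names GPS_EPOCH, LEAP_INSTANTS_UTC, gps_to_utc and utc_to_gps are hypothetical, and the table lists the leap seconds inserted since the GPS epoch (1980-01-06) through 2017, which should be checked against an authoritative list before being relied on.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# GPS time zero, expressed as a (naive) UTC datetime.
GPS_EPOCH = datetime(1980, 1, 6)

# Assumption: hard-coded UTC instants immediately after each leap second
# inserted since the GPS epoch. This is exactly the table that goes stale.
LEAP_INSTANTS_UTC = [datetime(y, m, 1) for y, m in [
    (1981, 7), (1982, 7), (1983, 7), (1985, 7), (1988, 1), (1990, 1),
    (1991, 1), (1992, 7), (1993, 7), (1994, 7), (1996, 1), (1997, 7),
    (1999, 1), (2006, 1), (2009, 1), (2012, 7), (2015, 7), (2017, 1),
]]


def gps_to_utc(gps_seconds, leap_instants=LEAP_INSTANTS_UTC):
    """Convert seconds since the GPS epoch to a naive UTC datetime."""
    # GPS second count at which each leap second has been fully applied.
    thresholds = [(t - GPS_EPOCH).total_seconds() + i + 1
                  for i, t in enumerate(leap_instants)]
    n_leaps = bisect_right(thresholds, gps_seconds)
    # datetime cannot represent 23:59:60, so the leap second itself maps
    # to the same value as the second that follows it.
    return GPS_EPOCH + timedelta(seconds=gps_seconds - n_leaps)


def utc_to_gps(utc, leap_instants=LEAP_INSTANTS_UTC):
    """Convert a naive UTC datetime to seconds since the GPS epoch."""
    n_leaps = bisect_right(leap_instants, utc)
    return (utc - GPS_EPOCH).total_seconds() + n_leaps
```

Both functions take the table as a parameter, so the same logic would work with a list read from tzdata or from leap-seconds.list instead of the hard-coded default.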
I am pretty sure the information is hidden in some binary files somewhere in /usr/share/zoneinfo/. I have been able to extract some of it using struct.unpack (man tzfile gives some info about the format), but I never got it working completely. Are there any standard packages that can access this information? I know about pytz, which seems to get the standard DST information from the same database, but it does not give access to leap seconds. I also found tai64n, but looking at its source code, it just contains a hard-coded table.
EDIT
Inspired by steveha's answer and some code in pytz/tzfile.py, I finally got a working solution (tested on py2.5 and py2.7):
from struct import unpack, calcsize
from datetime import datetime

def print_leap(tzfile='/usr/share/zoneinfo/right/UTC'):
    with open(tzfile, 'rb') as f:
        # read header
        fmt = '>4s c 15x 6l'
        (magic, format, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt,
         typecnt, charcnt) = unpack(fmt, f.read(calcsize(fmt)))
        assert magic == 'TZif'.encode('US-ASCII'), 'Not a timezone file'
        print 'Found %i leapseconds:' % leapcnt
        # skip over some uninteresting data
        fmt = '>%(timecnt)dl %(timecnt)dB %(ttinfo)s %(charcnt)ds' % dict(
            timecnt=timecnt, ttinfo='lBB'*typecnt, charcnt=charcnt)
        f.read(calcsize(fmt))
        # read leap-seconds
        fmt = '>2l'
        for i in xrange(leapcnt):
            tleap, nleap = unpack(fmt, f.read(calcsize(fmt)))
            print datetime.utcfromtimestamp(tleap-nleap+1)
with the result:
In [2]: print_leap()
Found 25 leapseconds:
1972-07-01 00:00:00
1973-01-01 00:00:00
1974-01-01 00:00:00
...
2006-01-01 00:00:00
2009-01-01 00:00:00
2012-07-01 00:00:00
While this does solve my question, I will probably not go for this solution. Instead, I will include leap-seconds.list with my code, as suggested by Matt Johnson. This seems to be the authoritative list used as a source for tzdata, and is probably updated by NIST twice a year. This means I will have to do the update by hand, but this file is straightforward to parse and includes an expiration date (which tzdata seems to be missing).
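A minimal sketch of parsing leap-seconds.list follows. It assumes the layout commonly found in that file: lines starting with "#" are comments, a line starting with "#@" carries the expiration time, and each data line holds a timestamp followed by the TAI-UTC offset in seconds, with all timestamps counted in seconds since 1900-01-01 (the NTP epoch). The function name parse_leap_seconds_list is hypothetical, and the expiration value in SAMPLE is made up for the example.

```python
from datetime import datetime, timedelta

# Assumption: timestamps in leap-seconds.list count seconds from the NTP epoch.
NTP_EPOCH = datetime(1900, 1, 1)


def parse_leap_seconds_list(lines):
    """Return (expires, entries) from an iterable of text lines.

    expires: naive UTC datetime from the '#@' line, or None if absent
    entries: list of (naive UTC datetime, TAI-UTC offset in seconds)
    """
    expires = None
    entries = []
    for line in lines:
        line = line.strip()
        if line.startswith('#@'):
            # Expiration line: '#@' followed by an NTP timestamp.
            expires = NTP_EPOCH + timedelta(seconds=int(line[2:].split()[0]))
        elif line and not line.startswith('#'):
            # Data line: '<NTP timestamp> <TAI-UTC> # trailing comment'.
            fields = line.split('#', 1)[0].split()
            when = NTP_EPOCH + timedelta(seconds=int(fields[0]))
            entries.append((when, int(fields[1])))
    return expires, entries


# Two data lines in the assumed layout; the '#@' value is illustrative only.
SAMPLE = """\
# comment lines are skipped
#@ 2303683200
2272060800 10 # 1 Jan 1972
2287785600 11 # 1 Jul 1972
"""

expires, entries = parse_leap_seconds_list(SAMPLE.splitlines())
```

With a real file, open(path) can be passed directly as the lines argument, and comparing expires against the current date shows whether the bundled copy needs refreshing.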
I just read man 5 tzfile, computed the offset at which the leap-second records start, and then read them.
You can uncomment the "DEBUG:" print statements to see more of what it finds in the file.
EDIT: the program has been updated and is now correct. It now uses the file /usr/share/zoneinfo/right/UTC and it now finds leap seconds to print.
The original program wasn't skipping the timezone abbreviation characters, which are documented in the man page but sort of hidden ("...and tt_abbrind serves as an index into the array of timezone abbreviation characters that follow the ttinfo structure(s) in the file.").
import datetime
import struct

TZFILE_MAGIC = 'TZif'.encode('US-ASCII')

def leap_seconds(f):
    """
    Return a list of tuples of this format: (timestamp, number_of_seconds)
        timestamp: a 32-bit timestamp, seconds since the UNIX epoch
        number_of_seconds: total number of leap seconds applied after timestamp
    """
    fmt = ">4s c 15x 6l"
    size = struct.calcsize(fmt)
    (tzfile_magic, tzfile_format, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt,
     typecnt, charcnt) = struct.unpack(fmt, f.read(size))
    #print("DEBUG: tzfile_magic: {} tzfile_format: {} ttisgmtcnt: {} ttisstdcnt: {} leapcnt: {} timecnt: {} typecnt: {} charcnt: {}".format(tzfile_magic, tzfile_format, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt, typecnt, charcnt))

    # Make sure it is a tzfile(5) file
    assert tzfile_magic == TZFILE_MAGIC, (
        "Not a tzfile; file magic was: '{}'".format(tzfile_magic))

    # comments below show struct codes such as "l" for 32-bit long integer
    offset = (timecnt*4    # transition times, each "l"
              + timecnt*1  # indices tying transition time to ttinfo values, each "B"
              + typecnt*6  # ttinfo structs, each stored as "lBB"
              + charcnt*1) # timezone abbreviation chars, each "c"
    f.seek(offset, 1)  # seek offset bytes from current position

    fmt = '>{}l'.format(leapcnt*2)
    #print("DEBUG: leapcnt: {} fmt: '{}'".format(leapcnt, fmt))
    size = struct.calcsize(fmt)
    data = struct.unpack(fmt, f.read(size))

    # pair up the flat sequence: (timestamp, count), (timestamp, count), ...
    lst = [(data[i], data[i+1]) for i in range(0, len(data), 2)]
    # sanity checks: timestamps increase and the count grows by one each time
    assert all(lst[i][0] < lst[i+1][0] for i in range(len(lst)-1))
    assert all(lst[i][1] == lst[i+1][1]-1 for i in range(len(lst)-1))
    return lst

def print_leaps(leap_lst):
    # leap_lst is tuples: (timestamp, num_leap_seconds)
    for ts, num_secs in leap_lst:
        print(datetime.datetime.utcfromtimestamp(ts - num_secs+1))

if __name__ == '__main__':
    zoneinfo_fname = '/usr/share/zoneinfo/right/UTC'
    with open(zoneinfo_fname, 'rb') as f:
        leap_lst = leap_seconds(f)
    print_leaps(leap_lst)
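The layout arithmetic can be checked without a real zoneinfo file by packing a tiny version-1 block in memory and reading it back. The sketch below is an illustration under stated assumptions, not part of the answer's program: build_synthetic_tzfile and read_leap_records are hypothetical helpers, and the blob is synthetic, holding only one ttinfo entry, a four-byte abbreviation and two leap records whose values are consistent with the first two dates in the output shown in the question.

```python
import io
import struct

# Version-1 header: magic, version byte, 15 reserved bytes, six 32-bit counts
# (ttisgmtcnt, ttisstdcnt, leapcnt, timecnt, typecnt, charcnt).
HEADER = '>4s c 15x 6l'


def build_synthetic_tzfile(leaps):
    """Pack a minimal, synthetic version-1 tzfile block (illustration only)."""
    header = struct.pack(HEADER, b'TZif', b'\0',
                         1, 1, len(leaps), 0, 1, 4)
    ttinfo = struct.pack('>lBB', 0, 0, 0)   # one ttinfo struct, 6 bytes
    abbrev = b'UTC\0'                        # charcnt = 4
    leap_block = b''.join(struct.pack('>2l', t, n) for t, n in leaps)
    indicators = b'\0' + b'\0'               # one std/wall and one UT/local flag
    return header + ttinfo + abbrev + leap_block + indicators


def read_leap_records(f):
    """Skip to the leap-second records and return them as (time, count) pairs."""
    (magic, version, ttisgmtcnt, ttisstdcnt, leapcnt, timecnt,
     typecnt, charcnt) = struct.unpack(HEADER, f.read(struct.calcsize(HEADER)))
    assert magic == b'TZif', 'Not a tzfile'
    # Same skip as in the answer: transition times, their type indices,
    # ttinfo structs, then the abbreviation characters.
    f.seek(timecnt * 4 + timecnt * 1 + typecnt * 6 + charcnt * 1, 1)
    return [struct.unpack('>2l', f.read(8)) for _ in range(leapcnt)]


blob = build_synthetic_tzfile([(78796800, 1), (94694401, 2)])
records = read_leap_records(io.BytesIO(blob))
```

Dropping the charcnt term from the seek makes the read land inside the abbreviation bytes, which reproduces the bug described above.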