pytz timezone conversion performance


Problem Description

I have more than 1 million datetime objects from a database, and I want to convert each of them to a timezone-aware datetime object. Here is my helper function conv_tz:

# dt is a naive Python datetime object; src_tz and dest_tz are pytz.timezone objects
def conv_tz(dt, src_tz, dest_tz):
    if not dt: return None
    sdt = src_tz.localize(dt)
    return sdt.astimezone(dest_tz)
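For context, a minimal sketch of how this helper behaves (the sample date and timezones are illustrative, not from the original post):

```python
from datetime import datetime
import pytz

def conv_tz(dt, src_tz, dest_tz):
    if not dt: return None
    sdt = src_tz.localize(dt)          # attach src_tz, resolving DST for this wall-clock time
    return sdt.astimezone(dest_tz)     # convert to the destination timezone

ny = pytz.timezone('America/New_York')
aware = conv_tz(datetime(2015, 6, 1, 12, 0, 0), ny, pytz.utc)
# June 1 is EDT (UTC-4), so noon in New York is 16:00 UTC
print(aware)  # 2015-06-01 16:00:00+00:00
```

Note that `localize` is the expensive step: it has to consult the tz database to decide which UTC offset applies to that naive wall-clock time.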

Here is the profiler output:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1101475    1.166    0.000   44.440    0.000 ../release/python/lib/dtutil.py:128(conv_tz)
1101475    9.092    0.000   35.656    0.000 /app/python/lib/python3.4/site-packages/pytz/tzinfo.py:244(localize)

Question 1: Is there any way to make it run faster? Each datetime object from the database is assumed to be in pytz.timezone('America/New_York'), and the destination timezone varies by datetime object (i.e., by row in the database).

In fact, after I get the timezone-aware datetime objects, what I really want is to convert them to MATLAB serial date numbers (which are not timezone aware). Here is the to_mat function I use:

_seconds_day = 86400  # seconds per day (module-level constant referenced below)

def to_mat(dt):
    if not dt:  return None
    val = dt.toordinal() + 366  # MATLAB datenum counts days from year 0
    t = dt.time()
    return (val + (((t.hour * 60) + t.minute) * 60 + t.second) / float(_seconds_day)
            + t.microsecond / 1.0e6 / _seconds_day)
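As a quick sanity check of the formula (assuming _seconds_day is 86400): MATLAB's datenum counts days from year 0 of the proleptic calendar, which is why 366 is added to Python's ordinal. MATLAB's datenum('2000-01-01') is 730486, so:

```python
from datetime import datetime

_seconds_day = 86400  # seconds per day

def to_mat(dt):
    if not dt: return None
    val = dt.toordinal() + 366  # shift Python ordinal to MATLAB's year-0 epoch
    t = dt.time()
    return (val + (((t.hour * 60) + t.minute) * 60 + t.second) / float(_seconds_day)
            + t.microsecond / 1.0e6 / _seconds_day)

print(to_mat(datetime(2000, 1, 1)))      # 730486.0
print(to_mat(datetime(2000, 1, 1, 12)))  # 730486.5 (noon = half a day)
```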

I am calling these two functions together for more than 1 million datetime objects:

matdt = dtutil.to_mat(dtutil.conv_tz(dt, pytz.timezone('America/New_York'), dst_tz))

Question 2: Is there a better way to do these conversions together? Here is the profiler line for to_mat, which seems less time-consuming than conv_tz:

3304425    5.067    0.000    5.662    0.000 ../release/python/lib/dtutil.py:8(to_mat)

Environment: CentOS 6 x64 + Python 3.4.3 x64

Answer

Thanks to J.F. Sebastian's comment! Here is what I decided to use, assuming the default timezone of these datetime objects is consistent with the OS timezone:

from datetime import datetime

def conv_tz2(dt, dest_tz):
    if not dt: return None
    # dt.timestamp() interprets the naive dt in the OS local timezone
    return datetime.fromtimestamp(dt.timestamp(), dest_tz)
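The speedup comes from skipping pytz's localize, which resolves DST against the tz database on every call; fromtimestamp does a single epoch-based conversion instead. A minimal sketch of the behavior (the stdlib timezone.utc stands in here as the destination; any pytz timezone works the same way):

```python
from datetime import datetime, timezone

def conv_tz2(dt, dest_tz):
    if not dt: return None
    # dt.timestamp() interprets the naive dt in the OS local timezone,
    # then fromtimestamp() attaches dest_tz in one conversion
    return datetime.fromtimestamp(dt.timestamp(), dest_tz)

dt = datetime(2015, 6, 1, 12, 0, 0)
aware = conv_tz2(dt, timezone.utc)
print(aware.tzinfo)                        # utc
assert aware.timestamp() == dt.timestamp() # same instant, exactly preserved
```

The wall-clock value of the result depends on the machine's OS timezone, which is exactly the assumption stated above.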

It runs in a small fraction of the time of the original conv_tz. Here is a test based on half a million conversions:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
567669    0.664    0.000   23.354    0.000 ../test/test_tz.py:17(conv_tz)
567669    4.831    0.000   18.732    0.000 /app/python/lib/python3.4/site-packages/pytz/tzinfo.py:244(localize)
567669    0.472    0.000    5.786    0.000 ../test/test_tz.py:22(conv_tz2)
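Putting the two pieces together, the original combined call becomes to_mat(conv_tz2(dt, dst_tz)). A self-contained sketch (timezone.utc stands in for the per-row destination timezone):

```python
from datetime import datetime, timezone

_seconds_day = 86400  # seconds per day

def conv_tz2(dt, dest_tz):
    if not dt: return None
    return datetime.fromtimestamp(dt.timestamp(), dest_tz)

def to_mat(dt):
    if not dt: return None
    val = dt.toordinal() + 366  # MATLAB datenum epoch shift
    t = dt.time()
    return (val + (((t.hour * 60) + t.minute) * 60 + t.second) / float(_seconds_day)
            + t.microsecond / 1.0e6 / _seconds_day)

dt = datetime(2000, 1, 1, 12, 0, 0)
matdt = to_mat(conv_tz2(dt, timezone.utc))
print(matdt)  # near 730486.5, shifted by the OS timezone's UTC offset
```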
