Python datetime and pandas give different timestamps for the same date

Question

from datetime import datetime
import pandas as pd

date="2020-02-07T16:05:16.000000000"

#Convert using datetime
t1=datetime.strptime(date[:-3],'%Y-%m-%dT%H:%M:%S.%f')

#Convert using Pandas
t2=pd.to_datetime(date)

#Subtract the dates
print(t1-t2)

#subtract the date timestamps
print(t1.timestamp()-t2.timestamp())

In this example, my understanding is that both datetime and pandas should use timezone-naive dates. Can anyone explain why the difference between the dates is zero, but the difference between the timestamps is not zero? It's off by 5 hours for me, which is my time zone's offset from GMT.

Recommended answer

Naive datetime objects of Python's datetime.datetime class represent local time. This is fairly clear from the docs but can still be a brain-teaser to work with. If you call the timestamp method on one, the wall-clock time is first interpreted as local time, and the returned POSIX timestamp (seconds since the epoch) refers to UTC, as it should.
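
To make the local-time assumption concrete, here is a minimal sketch that uses only the standard library. The variable names (naive, as_local, as_utc, gap) are illustrative and not from the original answer. It checks that a naive datetime's timestamp matches its explicitly localized version, and that reinterpreting the same wall-clock time as UTC shifts the timestamp by exactly the machine's UTC offset.

```python
from datetime import datetime, timedelta, timezone

# Same wall-clock time as in the question, without any tzinfo (naive).
naive = datetime(2020, 2, 7, 16, 5, 16)

# Attach the machine's local time zone; the wall-clock time is unchanged.
as_local = naive.astimezone()

# Reinterpret the same wall-clock time as UTC instead.
as_utc = naive.replace(tzinfo=timezone.utc)

# timestamp() on the naive object assumes local time, so these agree.
local_matches = naive.timestamp() == as_local.timestamp()

# The gap between the UTC and local interpretations is the local UTC offset.
gap = timedelta(seconds=as_utc.timestamp() - naive.timestamp())

print(local_matches)
print(gap, as_local.utcoffset())
```

On a machine set to UTC-5, gap would be minus five hours, which is the discrepancy described in the question.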

Coming from the Python datetime object, the behavior of a naive pandas.Timestamp can be counter-intuitive (and, I think, is less obvious). Although it is derived the same way from a tz-naive string, it does not represent local time: when you call its timestamp method, the wall-clock time is interpreted as UTC. You can verify that by setting the datetime object's time zone to UTC:

from datetime import datetime, timezone
import pandas as pd

date = "2020-02-07T16:05:16.000000000"

t1 = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f')
t2 = pd.to_datetime(date)

print(t1.replace(tzinfo=timezone.utc).timestamp()-t2.timestamp())
# 0.0

Conversely, you can make the pandas.Timestamp timezone-aware, e.g.

t3 = pd.to_datetime(t1.astimezone())
# e.g. Timestamp('2020-02-07 16:05:16+0100', tz='Mitteleuropäische Zeit')

print(t1.timestamp()-t3.timestamp())
# 0.0

My bottom line: if you know which time zone your timestamps represent, work with timezone-aware datetimes, e.g. for UTC

import pytz  # pytz is used here because pandas relied on it internally when this answer was written

t1 = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f').replace(tzinfo=pytz.UTC)
t2 = pd.to_datetime(date, utc=True)

print(t1 == t2)
# True
print(t1-t2)
# 0 days 00:00:00
print(t1.timestamp()-t2.timestamp())
# 0.0
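
The same approach can be sketched for a zone other than UTC. The zone name "America/New_York" is an assumption chosen only because the question mentions a 5-hour offset from GMT; the original post does not name a zone. The sketch uses the standard-library zoneinfo module (Python 3.9+) for the datetime object and Timestamp.tz_localize for pandas.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

import pandas as pd

date = "2020-02-07T16:05:16.000000000"
zone = "America/New_York"  # assumed zone, for illustration only

# Attach the zone to the datetime object (wall-clock time is kept as-is).
t1 = datetime.strptime(date[:-3], "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=ZoneInfo(zone))

# Parse as naive, then declare which zone the wall-clock time belongs to.
t2 = pd.to_datetime(date).tz_localize(zone)

same_instant = t1 == t2
ts_diff = t1.timestamp() - t2.timestamp()

print(same_instant)
print(ts_diff)
print(t2.utcoffset())
```

Once both objects carry the same zone, equality and the timestamp difference no longer depend on the time zone of the machine running the code.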
