Python datetime and pandas give different timestamps for the same date
Problem description
from datetime import datetime
import pandas as pd
date = "2020-02-07T16:05:16.000000000"
# Convert using datetime
t1 = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f')
# Convert using pandas
t2 = pd.to_datetime(date)
# Subtract the dates
print(t1 - t2)
# Subtract the date timestamps
print(t1.timestamp() - t2.timestamp())
In this example, my understanding is that both datetime and pandas should produce timezone-naive dates. Can anyone explain why the difference between the dates is zero, but the difference between the timestamps is not? For me it is off by 5 hours, which is my time zone's offset from GMT.
Recommended answer
Naive datetime objects created with Python's datetime.datetime class represent local time. The docs say as much, but it can still be a brain-teaser to work with. If you call the timestamp method on such an object, its wall-clock time is interpreted as local time, and the returned POSIX timestamp refers to UTC (seconds since the epoch), as it should.
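As a minimal sketch of that behavior, using only the standard library: the same naive wall-clock time yields two different POSIX timestamps depending on whether it is read as local time or as UTC, and the gap should equal the local UTC offset. The variable names are illustrative, and the printed values depend on the machine's time zone setting.

```python
from datetime import datetime, timezone

naive = datetime(2020, 2, 7, 16, 5, 16)

# timestamp() on a naive datetime assumes the wall-clock time is local time
local_ts = naive.timestamp()

# Attaching UTC explicitly treats the same wall-clock time as UTC
utc_ts = naive.replace(tzinfo=timezone.utc).timestamp()

# The local UTC offset in effect on that date, in seconds
offset = naive.astimezone().utcoffset().total_seconds()

print(utc_ts - local_ts)  # same value as the offset printed below
print(offset)
```

On a machine five hours behind UTC, both lines would print -18000.0, which is the 5-hour difference from the question.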
Coming from Python's datetime, the behavior of a naive pandas.Timestamp can be counter-intuitive (and I think it is less obvious). Derived the same way from a tz-naive string, it does not represent local time: if you call its timestamp method, the wall-clock time is treated as UTC. You can verify that by setting the datetime object's time zone to UTC:
from datetime import datetime, timezone
import pandas as pd
date = "2020-02-07T16:05:16.000000000"
t1 = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f')
t2 = pd.to_datetime(date)
print(t1.replace(tzinfo=timezone.utc).timestamp()-t2.timestamp())
# 0.0
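The same check can be sketched from the pandas side alone: a naive pandas.Timestamp and its UTC-localized counterpart should give the same POSIX timestamp regardless of the machine's time zone. This assumes pandas is installed; the variable names are illustrative.

```python
import pandas as pd

naive = pd.Timestamp("2020-02-07T16:05:16")
# tz_localize attaches a time zone without shifting the wall-clock time
aware_utc = naive.tz_localize("UTC")

print(naive.tzinfo)                                 # None
print(naive.timestamp() == aware_utc.timestamp())   # True
print(naive.timestamp())                            # 1581091516.0
```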
The other way around, you can make the pandas.Timestamp timezone-aware, e.g.
t3 = pd.to_datetime(t1.astimezone())
# e.g. Timestamp('2020-02-07 16:05:16+0100', tz='Mitteleuropäische Zeit')
print(t1.timestamp()-t3.timestamp())
# 0.0
My bottom line: if you know that your timestamps represent a certain time zone, work with timezone-aware datetimes, e.g. for UTC:
import pytz  # pytz chosen to match what pandas used internally at the time of writing
t1 = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f').replace(tzinfo=pytz.UTC)
t2 = pd.to_datetime(date, utc=True)
print(t1 == t2)
# True
print(t1-t2)
# 0 days 00:00:00
print(t1.timestamp()-t2.timestamp())
# 0.0
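For the standard-library side, the same idea can be wrapped in a small helper that uses datetime.timezone.utc instead of pytz. This is only a sketch: parse_as_utc is a hypothetical name, and it assumes the input always carries nine fractional digits and that its wall-clock time is meant as UTC.

```python
from datetime import datetime, timezone


def parse_as_utc(text: str) -> datetime:
    """Parse a string with nanosecond digits and mark it as UTC.

    Hypothetical helper: assumes exactly 9 fractional digits and that
    the wall-clock time in the string is meant as UTC.
    """
    # %f accepts at most 6 fractional digits, so drop the last 3
    parsed = datetime.strptime(text[:-3], "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


aware = parse_as_utc("2020-02-07T16:05:16.000000000")
print(aware.isoformat())  # 2020-02-07T16:05:16+00:00
print(aware.timestamp())  # 1581091516.0
```

Because the result is timezone-aware, its timestamp no longer depends on the machine's local time zone.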