Convert pandas column with multiple timezones to single timezone


Problem description

I've got a column in a pandas DataFrame that contains timestamps with timezones. There are two different timezones present in this column, and I need to ensure that there's only one. Here's the output of the end of the column:

260003    2019-05-21 12:00:00-06:00
260004    2019-05-21 12:15:00-06:00
Name: timestamp, Length: 260005, dtype: object

For what it's worth, the timestamps vary between -06:00 and -07:00, and have the following output:

datetime.datetime(2007, 10, 1, 1, 0, tzinfo=tzoffset(None, -21600)) for -06:00
datetime.datetime(2007, 11, 17, 5, 15, tzinfo=tzoffset(None, -25200)) for -07:00

I've been trying to use tz_localize and tz_convert, which have worked fine in the past, but I suppose that data only ever had one timezone. E.g., if I do:

df['timestamp'].dt.tz_localize('MST', ambiguous='infer').dt.tz_convert('MST')

I get:

ValueError: Array must be all same time zone

During handling of the above exception, another exception occurred:

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

Question

Is there a way to convert these to MST? Or any timezone, really? I guess I could break up the DataFrame by timezone (not 100% sure how, but I imagine it's possible) and act on chunks of it, but I figured I'd ask to see if there's a smarter solution out there. Thank you!
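For the chunking idea mentioned above, one way to see which offsets are present and to select the rows for one of them is sketched here. The three sample rows and the names `offset_hours`, `offset_counts`, and `rows_minus_7` are made up for illustration; the real column is assumed to hold timezone-aware `datetime.datetime` objects like the ones shown earlier.

```python
import datetime

import pandas as pd
from dateutil.tz import tzoffset

# Illustrative data only: aware datetimes with two different fixed offsets.
df = pd.DataFrame({"timestamp": [
    datetime.datetime(2007, 10, 1, 1, 0, tzinfo=tzoffset(None, -21600)),
    datetime.datetime(2007, 11, 17, 5, 15, tzinfo=tzoffset(None, -25200)),
    datetime.datetime(2007, 11, 17, 5, 30, tzinfo=tzoffset(None, -25200)),
]})

# UTC offset of each row, expressed in hours (-6.0 or -7.0 here).
offset_hours = df["timestamp"].map(lambda ts: ts.utcoffset().total_seconds() / 3600)

# How many rows carry each offset.
offset_counts = offset_hours.value_counts()

# Rows belonging to one offset, selected with a boolean mask.
rows_minus_7 = df[offset_hours == -7.0]
```

This only inspects the column. It does not convert anything, so it is mainly useful for confirming that mixed offsets are the cause of the error.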

Recommended answer

I tried:

import pandas as pd

# Two timestamp strings with different UTC offsets
df = pd.DataFrame({'timestamp':['2019-05-21 12:00:00-06:00',
                                '2019-05-21 12:15:00-07:00']})
df['timestamp'] = pd.to_datetime(df.timestamp)

df.timestamp.dt.tz_localize('MST')

This works fine and gives:

0   2019-05-21 18:00:00-07:00
1   2019-05-21 19:15:00-07:00
Name: timestamp, dtype: datetime64[ns, MST]

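Is this not what you expected? Note that tz_localize only attaches a timezone to naive timestamps without shifting them: 18:00 and 19:15 are the UTC equivalents of the two inputs, relabeled here as MST.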

Thanks to @G.Anderson's comment, I tried different data with timezone-aware timestamps:

df = pd.DataFrame({'timestamp':[pd.to_datetime('2019-05-21 12:00:00').tz_localize('MST'),
                         pd.to_datetime('2019-05-21 12:15:00').tz_localize('EST')]})

Then

df['timestamp'] = pd.to_datetime(df.timestamp)

did give the same error. Then I added utc=True:

df.timestamp = pd.to_datetime(df.timestamp, utc=True)

# df.timestamp
# 0   2019-05-21 19:00:00+00:00
# 1   2019-05-21 17:15:00+00:00
# Name: timestamp, dtype: datetime64[ns, UTC]

df.timestamp.dt.tz_convert('MST')

works fine and gives:

0   2019-05-21 12:00:00-07:00
1   2019-05-21 10:15:00-07:00
Name: timestamp, dtype: datetime64[ns, MST]
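To tie this back to the question, here is a minimal sketch of the same two steps (parse with utc=True, then tz_convert) applied to the two aware datetime objects shown in the question. The variable names `mixed`, `as_utc`, and `as_mst` are illustrative, and the real column is assumed to hold objects of this kind.

```python
import datetime

import pandas as pd
from dateutil.tz import tzoffset

# The two values from the question: aware datetimes with different fixed offsets.
mixed = pd.Series([
    datetime.datetime(2007, 10, 1, 1, 0, tzinfo=tzoffset(None, -21600)),
    datetime.datetime(2007, 11, 17, 5, 15, tzinfo=tzoffset(None, -25200)),
], name="timestamp")

# Step 1: normalize every value to UTC so the column has a single timezone.
as_utc = pd.to_datetime(mixed, utc=True)

# Step 2: convert the UTC column to the target timezone.
as_mst = as_utc.dt.tz_convert("MST")
```

Because each value is first mapped to its UTC instant, the result represents the same moments in time as the input, only expressed with a single offset. Any other timezone name accepted by tz_convert could be used in place of "MST".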
