Convert pandas timezone-aware DateTimeIndex to naive timestamp, but in certain timezone

Question

You can use the function tz_localize to make a Timestamp or DateTimeIndex timezone aware, but how can you do the opposite: how can you convert a timezone aware Timestamp to a naive one, while preserving its timezone?

An example:

In [82]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10, freq='s', tz="Europe/Brussels")

In [83]: t
Out[83]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels

I could remove the timezone by setting it to None, but then the result is converted to UTC (12 o'clock became 10):

In [86]: t.tz = None

In [87]: t
Out[87]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 10:00:00, ..., 2013-05-18 10:00:09]
Length: 10, Freq: S, Timezone: None

Is there another way I can convert a DateTimeIndex to timezone naive, but while preserving the timezone it was set in?

Some context on the reason I am asking this: I want to work with timezone naive timeseries (to avoid the extra hassle with timezones, and I do not need them for the case I am working on).
But for some reason, I have to deal with a timezone-aware timeseries in my local timezone (Europe/Brussels). As all my other data are timezone naive (but represented in my local timezone), I want to convert this timeseries to naive to further work with it, but it also has to be represented in my local timezone (so just remove the timezone info, without converting the user-visible time to UTC).

I know the time is actually stored internally as UTC and only converted to another timezone when it is displayed, so some kind of conversion has to happen when I want to "delocalize" it. For example, with the Python datetime module you can "remove" the timezone like this:

In [119]: d = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")

In [120]: d
Out[120]: <Timestamp: 2013-05-18 12:00:00+0200 CEST, tz=Europe/Brussels>

In [121]: d.replace(tzinfo=None)
Out[121]: <Timestamp: 2013-05-18 12:00:00> 

So, based on this, I could do the following, but I suppose this will not be very efficient when working with a larger timeseries:

In [124]: t
Out[124]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels

In [125]: pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
Out[125]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: None, Timezone: None

Recommended answer

To answer my own question, this functionality has been added to pandas in the meantime. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone resulting in local time.
See the whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements

So, with the example from above:

In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H',
                          tz= "Europe/Brussels")

In [5]: t
Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'],
                       dtype='datetime64[ns, Europe/Brussels]', freq='H')

Using tz_localize(None) removes the timezone information, resulting in naive local time:

In [6]: t.tz_localize(None)
Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'], 
                      dtype='datetime64[ns]', freq='H')
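
The same call is available outside of a DatetimeIndex. As a minimal sketch (the variable names are illustrative, and the index is built from an explicit list rather than date_range to avoid depending on a particular frequency alias), a Series goes through the .dt accessor and a single Timestamp has the method directly:

```python
import pandas as pd

# Timezone-aware index with the same two values as the example above.
aware = pd.DatetimeIndex(
    ["2013-05-18 12:00:00", "2013-05-18 13:00:00"], tz="Europe/Brussels"
)

# DatetimeIndex: drop the timezone, keep the local wall-clock time.
naive_local = aware.tz_localize(None)

# Series of timezone-aware values: use the .dt accessor.
s = pd.Series(aware)
s_naive = s.dt.tz_localize(None)

# Single Timestamp: the method exists on the scalar as well.
ts = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")
ts_naive = ts.tz_localize(None)

print(naive_local)
print(s_naive)
print(ts_naive)
```

In all three cases the displayed clock time stays at 12:00 and 13:00; only the timezone information is dropped.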

Further, you can also use tz_convert(None) to remove the timezone information while converting to UTC, which yields naive UTC time:

In [7]: t.tz_convert(None)
Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'], 
                      dtype='datetime64[ns]', freq='H')
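
To make the difference between the two calls concrete, here is a small sketch (variable names are illustrative) that puts both results side by side. For these dates the two naive results differ by exactly the UTC offset of Europe/Brussels in summer, and each one can be turned back into the original aware index as long as you remember which convention it follows:

```python
import pandas as pd

aware = pd.DatetimeIndex(
    ["2013-05-18 12:00:00", "2013-05-18 13:00:00"], tz="Europe/Brussels"
)

naive_local = aware.tz_localize(None)  # wall-clock time in Brussels
naive_utc = aware.tz_convert(None)     # same instants, expressed in UTC

# The gap between the two is the UTC offset (2 hours, CEST, for these dates).
offset = naive_local - naive_utc
print(offset)

# Going back: re-attach the zone the naive values are expressed in.
from_local = naive_local.tz_localize("Europe/Brussels")
from_utc = naive_utc.tz_localize("UTC").tz_convert("Europe/Brussels")
print((from_local == aware).all(), (from_utc == aware).all())
```

The naive values themselves no longer record which convention was used, so that has to be tracked separately.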


This is much more performant than the datetime.replace solution:

In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H',
                           tz="Europe/Brussels")

In [32]: %timeit t.tz_localize(None)
1000 loops, best of 3: 233 µs per loop

In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
10 loops, best of 3: 99.7 ms per loop
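
Outside of IPython, a comparable measurement can be sketched with the standard timeit module. This is only an illustration: the repeat counts are arbitrary, the frequency is passed as a Timedelta instead of a string alias, and the absolute numbers will depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

# 10000 hourly, timezone-aware timestamps, as in the timing above.
t = pd.date_range(
    start="2013-05-18 12:00:00",
    periods=10000,
    freq=pd.Timedelta(hours=1),
    tz="Europe/Brussels",
)


def vectorized():
    # One call on the whole index.
    return t.tz_localize(None)


def looped():
    # Element-by-element replacement, as in the question.
    return pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])


# Both approaches should produce the same naive local times.
same_result = bool((vectorized() == looped()).all())

# Average seconds per call for each approach.
vectorized_s = timeit.timeit(vectorized, number=20) / 20
looped_s = timeit.timeit(looped, number=3) / 3

print(f"same result: {same_result}")
print(f"tz_localize(None): {vectorized_s * 1e6:.0f} µs per call")
print(f"replace loop:      {looped_s * 1e3:.1f} ms per call")
```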
