Convert pandas timezone-aware DatetimeIndex to naive timestamps, but in a certain timezone


Problem description

You can use the function tz_localize to make a Timestamp or DatetimeIndex timezone-aware, but how can you do the opposite: how can you convert a timezone-aware Timestamp to a naive one while keeping the local time of its timezone?

An example:

In [82]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10, freq='s', tz="Europe/Brussels")

In [83]: t
Out[83]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels

I could remove the timezone by setting it to None, but then the result is converted to UTC (12 o'clock becomes 10):

In [86]: t.tz = None

In [87]: t
Out[87]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 10:00:00, ..., 2013-05-18 10:00:09]
Length: 10, Freq: S, Timezone: None

Is there another way to convert a DatetimeIndex to timezone-naive while keeping the local time of the timezone it was set in?

Some context on why I am asking this: I want to work with timezone-naive timeseries (to avoid the extra hassle with timezones, which I do not need for the case I am working on).
But for some reason, I have to deal with a timezone-aware timeseries in my local timezone (Europe/Brussels). As all my other data are timezone-naive (but represented in my local timezone), I want to convert this timeseries to naive to work with it further, and it also has to stay in my local timezone (so just remove the timezone info, without converting the user-visible time to UTC).

I know the time is actually stored internally as UTC and only converted to another timezone when it is displayed, so there has to be some kind of conversion when I want to "delocalize" it. For example, with the Python datetime module you can "remove" the timezone like this:

In [119]: d = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")

In [120]: d
Out[120]: <Timestamp: 2013-05-18 12:00:00+0200 CEST, tz=Europe/Brussels>

In [121]: d.replace(tzinfo=None)
Out[121]: <Timestamp: 2013-05-18 12:00:00> 
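For comparison, here is a minimal sketch of the same idea using only the standard library. To keep it self-contained it assumes a fixed +02:00 offset as a stand-in for Europe/Brussels summer time (a simplification of mine, not part of the original session):

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +02:00 offset stands in for Europe/Brussels in summer (CEST).
brussels_summer = timezone(timedelta(hours=2), name="CEST")
aware = datetime(2013, 5, 18, 12, 0, 0, tzinfo=brussels_summer)

# Drop the tzinfo only: the wall-clock time stays at 12:00.
naive_local = aware.replace(tzinfo=None)

# Convert to UTC first, then drop the tzinfo: the wall-clock time becomes 10:00.
naive_utc = aware.astimezone(timezone.utc).replace(tzinfo=None)

print(naive_local)  # 2013-05-18 12:00:00
print(naive_utc)    # 2013-05-18 10:00:00
```

The first variant is the "delocalize" behaviour I am after; the second is what setting the timezone to None gave me above.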

So, based on this, I could do the following, but I suppose this will not be very efficient when working with a larger timeseries:

In [124]: t
Out[124]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels

In [125]: pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
Out[125]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: None, Timezone: None

Recommended answer

To answer my own question: this functionality has been added to pandas in the meantime. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone, resulting in naive local time.
See the whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements

So with my example from above:

In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H',
                          tz= "Europe/Brussels")

In [5]: t
Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'],
                       dtype='datetime64[ns, Europe/Brussels]', freq='H')

Using tz_localize(None) removes the timezone information, resulting in naive local time:

In [6]: t.tz_localize(None)
Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'], 
                      dtype='datetime64[ns]', freq='H')
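The question also mentions single Timestamps, and the same data often sits in a Series column rather than an index. A small sketch for both cases, assuming a reasonably recent pandas where Timestamp.tz_localize and the Series .dt accessor are available (the variable names are only illustrative):

```python
import pandas as pd

# A single timezone-aware Timestamp.
d = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")
d_naive = d.tz_localize(None)  # naive, still 12:00

# A Series of timezone-aware values: go through the .dt accessor.
idx = pd.DatetimeIndex(["2013-05-18 12:00:00", "2013-05-18 13:00:00"])
s = pd.Series(idx.tz_localize("Europe/Brussels"))
s_naive = s.dt.tz_localize(None)  # naive, still 12:00 and 13:00

print(d_naive)  # 2013-05-18 12:00:00
print(s_naive.dt.tz)  # None
```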

Further, you can also use tz_convert(None) to remove the timezone information while converting to UTC, which yields naive UTC time:

In [7]: t.tz_convert(None)
Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'], 
                      dtype='datetime64[ns]', freq='H')
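Combining the two calls gives naive wall-clock time in any timezone you like: convert first, then drop the timezone. The sketch below puts the three variants side by side; Asia/Tokyo is only an illustrative target zone, not something from the original question:

```python
import pandas as pd

t = pd.DatetimeIndex(
    ["2013-05-18 12:00:00", "2013-05-18 13:00:00"]
).tz_localize("Europe/Brussels")

# Drop the timezone, keep Brussels wall-clock time.
naive_local = t.tz_localize(None)

# Convert to UTC and drop the timezone.
naive_utc = t.tz_convert(None)

# Convert to another zone first, then drop the timezone.
naive_tokyo = t.tz_convert("Asia/Tokyo").tz_localize(None)

print(naive_local[0])  # 2013-05-18 12:00:00
print(naive_utc[0])    # 2013-05-18 10:00:00
print(naive_tokyo[0])  # 2013-05-18 19:00:00
```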


This is also much more performant than the datetime.replace solution:

In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H',
                           tz="Europe/Brussels")

In [32]: %timeit t.tz_localize(None)
1000 loops, best of 3: 233 µs per loop

In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
10 loops, best of 3: 99.7 ms per loop
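Outside IPython, a rough way to repeat this comparison is the standard timeit module. This is only a sketch: the helper names are mine, the frequency is passed as a Timedelta to sidestep differences in frequency aliases between pandas versions, and the absolute numbers will depend on your machine and pandas version.

```python
import timeit

import pandas as pd

t = pd.date_range(start="2013-05-18 12:00:00", periods=10000,
                  freq=pd.Timedelta(hours=1), tz="Europe/Brussels")


def vectorized():
    # Single vectorized call on the whole index.
    return t.tz_localize(None)


def looped():
    # Element-by-element replace, then rebuild the index.
    return pd.DatetimeIndex([ts.replace(tzinfo=None) for ts in t])


runs = 5
fast = timeit.timeit(vectorized, number=runs) / runs
slow = timeit.timeit(looped, number=runs) / runs
print(f"tz_localize(None): {fast * 1e3:.3f} ms per call")
print(f"replace loop:      {slow * 1e3:.3f} ms per call")
```

Both functions should produce the same naive values; only the time they take differs.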
