Arithmetic operations on datetime index in pandas

Problem description

In pandas, you can access specific positions of a time series either by classical integer position / row based indexing, or by datetime based indexing. The integer based index can be manipulated using basic arithmetic operations: if I have an integer_index for a time series with a frequency of 12 hours and I want to access the entry exactly one day earlier, I can simply use integer_index - 2. However, real world data are not always perfect, and sometimes rows are missing. In that case this method fails, and it would be helpful to be able to use datetime based indexing and subtract, for example, one day from this index. How can I do this?

Sample script:

# generate a sample time series
import pandas as pd
s = pd.Series(["A", "B", "C", "D", "E"], index=pd.date_range("2000-01-01", periods=5, freq="12h"))
print(s)

2000-01-01 00:00:00    A
2000-01-01 12:00:00    B
2000-01-02 00:00:00    C
2000-01-02 12:00:00    D
2000-01-03 00:00:00    E
Freq: 12H, dtype: object

# these two indices should access the same value ("C")
integer_index = 2
date_index = "2000-01-02 00:00"

print(s.iloc[integer_index])  # prints "C"
print(s[date_index])  # prints "C"

# I can access the value one day earlier by subtracting 2 from the integer index
print(s.iloc[integer_index - 2])  # prints "A"

# how can I subtract one day from the date index?
print(s[date_index - 1])  # raises an error

The background to this question can be found in an earlier submission of mine here:

Fill data gaps with average of data from adjacent days

where user JohnE found a workaround to my problem that uses integer position based indexing. He makes sure that I have equally spaced data by resampling the time series.
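
For context, a minimal sketch of what "resampling to equally spaced data" can look like. It assumes Series.asfreq is used to restore the 12-hour grid; the code of the linked answer is not reproduced here, so the variable names and the dropped row are purely illustrative.

```python
import pandas as pd

full = pd.Series(
    ["A", "B", "C", "D", "E"],
    index=pd.date_range("2000-01-01", periods=5, freq="12h"),
)

# Simulate a gap by dropping one row (hypothetical missing observation).
gappy = full.drop(pd.Timestamp("2000-01-01 12:00"))

# Re-impose the 12-hour grid; the missing timestamp comes back as NaN.
regular = gappy.asfreq("12h")

print(len(gappy))           # 4
print(len(regular))         # 5
print(regular.iloc[2])      # C
print(regular.iloc[2 - 2])  # A, exactly one day earlier
```

Once the grid is regular again, position arithmetic such as integer_index - 2 points at the intended timestamp, although the entry found there may be NaN.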

Solution

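Your datetime index isn't based on strings; it is a DatetimeIndex, which means you can index it with datetime objects rather than with strings that merely look like dates. In the question, date_index is a plain Python string, so date_index - 1 fails with a TypeError before pandas is even involved.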

The code below converts date_index into a datetime object and then uses timedelta(days=1) to subtract "one day" away from it.

# generate a sample time series
import pandas as pd
from datetime import datetime, timedelta

s = pd.Series(["A", "B", "C", "D", "E"], index=pd.date_range("2000-01-01", periods=5, freq="12h"))
print(s)

# these two indices should access the same value ("C")
integer_index = 2
# Converts the string into a datetime object
date_index = datetime.strptime("2000-01-02 00:00", "%Y-%m-%d %H:%M")
print(date_index) # 2000-01-02 00:00:00

print(s.iloc[integer_index])  # prints "C" (position-based lookup)
print(s[date_index])  # prints "C"

print(s.iloc[integer_index - 2])  # prints "A"

one_day = timedelta(days=1)
print(s[date_index - one_day]) # prints "A"
print(date_index - one_day) # 2000-01-01 00:00:00
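
The same idea can also be written with the pandas types pd.Timestamp and pd.Timedelta. The sketch below is an illustration rather than part of the original answer: it removes one row to mimic the missing data mentioned in the question (s_gappy is a made-up name) and uses Series.get so that a missing timestamp yields a default value instead of a KeyError.

```python
import pandas as pd

s = pd.Series(
    ["A", "B", "C", "D", "E"],
    index=pd.date_range("2000-01-01", periods=5, freq="12h"),
)
# Hypothetical gap: the row one day before "C" is missing.
s_gappy = s.drop(pd.Timestamp("2000-01-01 00:00"))

date_index = pd.Timestamp("2000-01-02 00:00")
one_day = pd.Timedelta(days=1)

# On the complete series the label lookup works directly.
value_full = s[date_index - one_day]

# On the series with a gap, s_gappy[date_index - one_day] would raise a
# KeyError; Series.get returns the given default instead.
value_missing = s_gappy.get(date_index - one_day, default=None)

print(value_full)     # A
print(value_missing)  # None
```

Unlike integer_index - 2, the date-based lookup never silently lands on the wrong row when data are missing: it either finds the exact timestamp or reports that it is absent.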
