Arithmetic operations on datetime index in pandas
Problem description
In pandas, you can access specific positions of a time series either by classical integer position (row-based) indexing or by datetime-based indexing. An integer index can be manipulated with basic arithmetic: for example, if I have an integer_index for a time series with a frequency of 12 hours and I want to access the entry exactly one day earlier, I can simply use integer_index - 2. However, real-world data are not always perfect, and sometimes rows are missing. In that case this method fails, and it would be helpful to use datetime-based indexing and subtract, for example, one day from the index. How can I do this?
Sample script:
# generate a sample time series
import pandas as pd
s = pd.Series(["A", "B", "C", "D", "E"], index=pd.date_range("2000-01-01", periods=5, freq="12h"))
print(s)
# 2000-01-01 00:00:00 A
# 2000-01-01 12:00:00 B
# 2000-01-02 00:00:00 C
# 2000-01-02 12:00:00 D
# 2000-01-03 00:00:00 E
# Freq: 12H, dtype: object
# these two indices should access the same value ("C")
integer_index = 2
date_index = "2000-01-02 00:00"
# .iloc makes the positional lookup explicit; plain s[2] on a DatetimeIndex is deprecated in recent pandas
print(s.iloc[integer_index]) # prints "C"
print(s[date_index]) # prints "C"
# I can access the value one day earlier by subtracting 2 from the integer index
print(s.iloc[integer_index - 2]) # prints "A"
# how can I subtract one day from the date index?
print(s[date_index - 1]) # raises a TypeError (str - int)
The background to this question can be found in an earlier submission of mine, "Fill data gaps with average of data from adjacent days", where user JohnE found a workaround to my problem that uses integer-position-based indexing. He makes sure that I have equally spaced data by resampling the time series.
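To make the problem with gaps concrete, here is a minimal sketch of how a missing row shifts integer positions and how putting the series back on an even grid restores them. It is not JohnE's actual code: the name gappy and the dropped timestamp are made up for illustration, and asfreq is used here as one simple way to regularize the index, whereas the original workaround is described as using resampling.

```python
import pandas as pd

s = pd.Series(
    ["A", "B", "C", "D", "E"],
    index=pd.date_range("2000-01-01", periods=5, freq="12h"),
)

# Hypothetical gap: remove the 12:00 row of the first day.
gappy = s.drop(pd.Timestamp("2000-01-01 12:00"))

# Position 2 no longer points at "C", so "position - 2" is no longer
# guaranteed to mean "one day earlier".
print(gappy.iloc[2])

# Reindex onto a regular 12-hour grid; the missing slot becomes NaN.
regular = gappy.asfreq("12h")
print(regular.iloc[2])      # back at the 2000-01-02 00:00 entry
print(regular.iloc[2 - 2])  # the entry exactly one day earlier
```

The cost of this approach is that the series now contains placeholder NaN rows, which downstream code has to tolerate.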
Your datetime index isn't based on strings; it's a DatetimeIndex, meaning you can index it with datetime objects rather than with a string that merely looks like a date.
The code below converts date_index into a datetime object and then uses timedelta(days=1) to subtract one day from it.
# generate a sample time series
import pandas as pd
from datetime import datetime, timedelta
s = pd.Series(["A", "B", "C", "D", "E"], index=pd.date_range("2000-01-01", periods=5, freq="12h"))
print(s)
# these two indices should access the same value ("C")
integer_index = 2
# Converts the string into a datetime object
date_index = datetime.strptime("2000-01-02 00:00", "%Y-%m-%d %H:%M")
print(date_index) # 2000-01-02 00:00:00
# .iloc makes the positional lookup explicit; plain s[2] on a DatetimeIndex is deprecated in recent pandas
print(s.iloc[integer_index]) # prints "C"
print(s[date_index]) # prints "C"
print(s.iloc[integer_index - 2]) # prints "A"
one_day = timedelta(days=1)
print(s[date_index - one_day]) # prints "A"
print(date_index - one_day) # 2000-01-01 00:00:00
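The same arithmetic can also be written with pandas' own pd.Timestamp and pd.Timedelta types instead of the standard-library datetime and timedelta. The sketch below is an alternative to the answer's code, not part of it. It also shows two things the question touches on: shifting the whole index at once, and looking up a timestamp that may be missing. The gappy series and the use of Series.get with its default of None are illustrative choices.

```python
import pandas as pd

s = pd.Series(
    ["A", "B", "C", "D", "E"],
    index=pd.date_range("2000-01-01", periods=5, freq="12h"),
)

date_index = pd.Timestamp("2000-01-02 00:00")
one_day = pd.Timedelta(days=1)

print(s[date_index])            # prints "C"
print(s[date_index - one_day])  # prints "A"

# Arithmetic also works on the whole DatetimeIndex at once.
shifted = s.index - one_day
print(shifted[2])  # the timestamp one day before the third entry

# Hypothetical gap: the row we want to look up is missing.
gappy = s.drop(pd.Timestamp("2000-01-01 00:00"))

# Series.get returns a default (None here) instead of raising KeyError.
missing = gappy.get(date_index - one_day)
print(missing)
```

Using Series.get keeps a missing row from raising an exception, so the caller can decide how to handle the gap.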