Pandas monthly rolling operation
Problem description
I ended up figuring this out while writing the question, so I'll post it anyway and answer it myself in case someone else needs a little help.
Suppose we have a DataFrame, df, containing this data.
import pandas as pd
from io import StringIO
data = StringIO(
"""\
date spendings category
2014-03-25 10 A
2014-04-05 20 A
2014-04-15 10 A
2014-04-25 10 B
2014-05-05 10 B
2014-05-15 10 A
2014-05-25 10 A
"""
)
df = pd.read_csv(data, sep=r"\s+", parse_dates=True, index_col="date")
Goal
For each row, sum the spendings over every row that is within one month of it, ideally using DataFrame.rolling, as it's a very clean syntax.
df = df.rolling("M").sum()
But this raises an exception:
ValueError: <MonthEnd> is a non-fixed frequency
Version: pandas==0.19.2
Recommended answer
Use the "D" offset rather than "M", and specifically use "30D" for 30 days, or approximately one month.
df = df.rolling("30D").sum()
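For anyone who wants to run this end to end, here is a minimal sketch using the sample data from the question. It assumes a recent pandas version and selects the numeric spendings column before rolling, since the string column category cannot be summed; the variable name rolled is only illustrative.

```python
import pandas as pd
from io import StringIO

data = StringIO(
    """\
date spendings category
2014-03-25 10 A
2014-04-05 20 A
2014-04-15 10 A
2014-04-25 10 B
2014-05-05 10 B
2014-05-15 10 A
2014-05-25 10 A
"""
)
df = pd.read_csv(data, sep=r"\s+", parse_dates=True, index_col="date")

# Roll over the numeric column only; "category" holds strings and is left out.
# "30D" is a fixed-length, time-based window over the DatetimeIndex.
rolled = df["spendings"].rolling("30D").sum()
print(rolled)
# 2014-03-25    10.0
# 2014-04-05    30.0
# 2014-04-15    40.0
# 2014-04-25    40.0
# 2014-05-05    30.0
# 2014-05-15    30.0
# 2014-05-25    30.0
```

By default the window is closed on the right only, so a row exactly 30 days earlier is excluded. That is why the 2014-05-05 row does not include the 2014-04-05 spending.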
Initially, I intuitively jumped to using "M" as I figured it stands for one month, but now it's clear why that doesn't work: "M" is the month-end offset, and calendar months vary in length, so it is not a fixed frequency that a time-based rolling window can use.
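As a small illustration of the point about "M", the sketch below prints the lengths of a few calendar months next to a fixed 30-day span. The names month_lengths and fixed_window are just for this example.

```python
import pandas as pd

# Calendar months span different numbers of days, so "one month"
# does not correspond to a single fixed window length.
month_lengths = {}
for day in ["2014-02-15", "2014-03-15", "2014-04-15"]:
    ts = pd.Timestamp(day)
    month_lengths[ts.strftime("%Y-%m")] = ts.days_in_month
print(month_lengths)  # {'2014-02': 28, '2014-03': 31, '2014-04': 30}

# A day-based window, by contrast, always covers the same amount of time.
fixed_window = pd.Timedelta(days=30)
print(fixed_window)
```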