Rolling average of all values of a pandas DataFrame
Problem description
I have a pandas DataFrame and I want to compute, on a rolling basis, the average of all the values: across all columns and all observations in the rolling window.
I have a solution with loops, but it feels very inefficient. Note that my data can contain NaNs, so computing the sum and dividing by the size of the window would not be safe (I want a nanmean).
Is there a better way?
Setup
import numpy as np
import pandas as pd
np.random.seed(1)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=['A', 'B'])
df[df>5] = np.nan # EDIT: add nans
My attempt
n_roll = 2
df_stacked = df.values
roll_avg = {}
for idx in range(n_roll, len(df_stacked)+1):
roll_avg[idx-1] = np.nanmean(df_stacked[idx - n_roll:idx, :].flatten())
roll_avg = pd.Series(roll_avg)
roll_avg.index = df.index[n_roll-1:]
roll_avg = roll_avg.reindex(df.index)
Desired result
roll_avg
Out[33]:
0 NaN
1 5.000000
2 1.666667
3 0.333333
4 1.000000
5 3.000000
6 3.250000
7 3.250000
8 3.333333
9 4.000000
Thanks!
Recommended answer
Here's one NumPy solution that builds the sliding windows with view_as_windows from scikit-image:
from skimage.util.shape import view_as_windows
# Set up the output array, pre-filled with NaN
out = np.full(len(df),np.nan)
# Get sliding windows of length n_roll along axis=0
w = view_as_windows(df.values,(n_roll,1))[...,0]
# Assign NaN-ignoring means, computed over the last two axes, into the output
out[n_roll-1:] = np.nanmean(w, (1,2))
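If scikit-image is not installed, the same idea can be sketched with NumPy's own sliding_window_view (available since NumPy 1.20). This is not part of the original answer. The names windows, out_np and roll_avg_np are illustrative, and the NaN mask is applied with DataFrame.where instead of the in-place assignment from the question, which should produce the same values.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Same data as in the question's setup
np.random.seed(1)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=['A', 'B'])
df = df.where(df <= 5)  # values > 5 become NaN (columns are upcast to float)
n_roll = 2

# Windows of length n_roll along axis 0. The window axis is appended last,
# so the shape is (len(df) - n_roll + 1, n_columns, n_roll).
windows = sliding_window_view(df.to_numpy(), n_roll, axis=0)

# NaN-ignoring mean over the column axis and the window axis
out_np = np.full(len(df), np.nan)
out_np[n_roll - 1:] = np.nanmean(windows, axis=(1, 2))

roll_avg_np = pd.Series(out_np, index=df.index)
print(roll_avg_np)
```

As with the loop version, a window that contains only NaNs would yield NaN, and np.nanmean emits a RuntimeWarning in that case.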
The windows are views into the DataFrame's data rather than copies, so this is memory-efficient:
In [62]: np.shares_memory(df,w)
Out[62]: True
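A pandas-only alternative is also possible. Dividing by the window size is unsafe with NaNs, but dividing the rolling sum of valid values by the rolling count of valid values gives the same result as nanmean. The sketch below is not from the original answer, and row_sum, row_cnt, win_sum, win_cnt and roll_avg_pd are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question's setup
np.random.seed(1)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=['A', 'B'])
df = df.where(df <= 5)  # values > 5 become NaN
n_roll = 2

# Per-row sum of the non-NaN values (all-NaN rows give 0.0)
# and per-row count of non-NaN values
row_sum = df.sum(axis=1)
row_cnt = df.count(axis=1)

# Rolling totals over n_roll rows; the first n_roll - 1 entries are NaN
win_sum = row_sum.rolling(n_roll).sum()
win_cnt = row_cnt.rolling(n_roll).sum()

# Windows with no valid observation get a NaN denominator instead of 0
roll_avg_pd = win_sum / win_cnt.where(win_cnt > 0)
print(roll_avg_pd)
```

This version keeps the DataFrame's index and needs no extra dependency. Whether it is faster than the windowed NumPy approach depends on the data size and was not measured here.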