Pandas - Rolling slope calculation

Question

How can I calculate the slope of each column's rolling (window=60) values, stepping by 5 rows?

I'd like to compute one value every 5 minutes; I don't need a result for every record.

Here is a sample dataframe and the desired result:

df
Time                A    ...      N
2016-01-01 00:00  1.2    ...    4.2
2016-01-01 00:01  1.2    ...    4.0
2016-01-01 00:02  1.2    ...    4.5
2016-01-01 00:03  1.5    ...    4.2
2016-01-01 00:04  1.1    ...    4.6
2016-01-01 00:05  1.6    ...    4.1
2016-01-01 00:06  1.7    ...    4.3
2016-01-01 00:07  1.8    ...    4.5
2016-01-01 00:08  1.1    ...    4.1
2016-01-01 00:09  1.5    ...    4.1
2016-01-01 00:10  1.6    ...    4.1
....

result
Time                A    ...      N
2016-01-01 00:04  xxx    ...    xxx
2016-01-01 00:09  xxx    ...    xxx
2016-01-01 00:14  xxx    ...    xxx
...

Can the df.rolling function be applied to this problem?

It's fine if the window contains NaN, meaning the subset could have fewer than 60 values.

Recommended answer

It seems that what you want is rolling with a specific step size. However, according to the pandas documentation at the time this answer was written, rolling did not support a step size. (pandas 1.5 and later accept a step argument in rolling.)

If the data is not too large, just perform the rolling calculation on all rows and then select the results you need by indexing.

Here's a sample dataset. For simplicity, the time column is represented by an integer index.

import numpy as np
import pandas as pd

# 500 random values in [0, 10) in a single column 'a'
data = pd.DataFrame(np.random.rand(500, 1) * 10, columns=['a'])

            a
0    8.714074
1    0.985467
2    9.101299
3    4.598044
4    4.193559
..        ...
495  9.736984
496  2.447377
497  5.209420
498  2.698441
499  3.438271

Then, roll and calculate the slopes:

def calc_slope(x):
    slope = np.polyfit(range(len(x)), x, 1)[0]
    return slope

# set min_periods=2 to allow subsets less than 60.
# use [4::5] to select the results you need.
result = data.rolling(60, min_periods=2).apply(calc_slope)[4::5]

The result is:

            a
4   -0.542845
9    0.084953
14   0.155297
19  -0.048813
24  -0.011947
..        ...
479 -0.004792
484 -0.003714
489  0.022448
494  0.037301
499  0.027189
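
The question mentions several columns, a time index, and windows that may contain NaN, whereas np.polyfit does not skip NaN values on its own. The following is a minimal sketch of one way to cover that case, not part of the original answer: the helper name nan_safe_slope and the sample frame are hypothetical, and the data is made exactly linear so the slopes are easy to check by eye.

```python
import numpy as np
import pandas as pd

def nan_safe_slope(window: np.ndarray) -> float:
    """Least-squares slope of the window against positions 0..n-1, skipping NaN."""
    x = np.arange(len(window), dtype=float)
    mask = ~np.isnan(window)
    if mask.sum() < 2:  # a line needs at least two points
        return np.nan
    # Keep the original positions of the valid points so gaps stay gaps.
    return np.polyfit(x[mask], window[mask], 1)[0]

# Hypothetical data: 120 one-minute rows, two linear columns, one missing value.
idx = pd.date_range("2016-01-01 00:00", periods=120, freq="min")
df = pd.DataFrame(
    {"A": np.arange(120, dtype=float), "N": 2.0 * np.arange(120) + 1.0},
    index=idx,
)
df.iloc[7, 0] = np.nan

# raw=True passes each window to the function as a NumPy array.
# [4::5] keeps rows 4, 9, 14, ... i.e. 00:04, 00:09, 00:14, ...
slopes = df.rolling(60, min_periods=2).apply(nan_safe_slope, raw=True)[4::5]
print(slopes.head())
```

Because rolling applies the function column by column, the same call handles any number of columns, and the selected rows keep their timestamps as labels.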

Alternatively, you can refer to this post, whose first answer provides a NumPy way to achieve this: step size in pandas.DataFrame.rolling
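
As a rough illustration of what a NumPy-based approach can look like, here is a sketch that is not the code from the linked post. It assumes NumPy 1.20 or later (for sliding_window_view), handles only full windows, and does not skip NaN; the function name stepped_slopes is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def stepped_slopes(values: np.ndarray, window: int, step: int) -> np.ndarray:
    """Slope of every `step`-th full window of length `window`."""
    # Shape (n_windows, window); this is a strided view, so no data is copied.
    windows = sliding_window_view(values, window)[::step]
    x = np.arange(window, dtype=float)
    x_centered = x - x.mean()
    # Closed-form least-squares slope:
    #   sum((x - x_mean) * y) / sum((x - x_mean) ** 2)
    return windows @ x_centered / (x_centered @ x_centered)

# Hypothetical input: an exactly linear series with slope 3.
values = 3.0 * np.arange(500, dtype=float) + 7.0
slopes = stepped_slopes(values, window=60, step=5)
print(slopes.shape)
```

The k-th entry corresponds to the window ending at row 59 + 5k, so it lines up with the full-window rows (59, 64, 69, ...) of the rolling result selected with [4::5]; the shorter windows at the start that min_periods=2 allows are not produced here.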
