Rolling Correlation with Groupby in Pandas

Views: 624 Posted: 2018/5/30 14:19:55 Tags: python pandas group-by correlation


Problem description

Assuming I have a Pandas dataframe similar to the below, how would I get the rolling correlation (for 2 days in this example) between 2 specific columns and group by the 'ID' column? I am familiar with the Pandas rolling_corr() function but I cannot figure out how to combine that with the groupby() clause.

What I have:

    ID  Date   Val1  Val2
    A   1-Jan  45    22
    A   2-Jan  15    66
    A   3-Jan  55    13
    B   1-Jan  41    12
    B   2-Jan  87    45
    B   3-Jan  82    66
    C   1-Jan  33    34
    C   2-Jan  15    67
    C   3-Jan  46    22

What I need:

    ID  Date   Val1  Val2  Rolling_Corr
    A   1-Jan  45    22
    A   2-Jan  15    66    0.1
    A   3-Jan  55    13    0.16
    B   1-Jan  41    12
    B   2-Jan  87    45    0.15
    B   3-Jan  82    66    0.05
    C   1-Jan  33    34
    C   2-Jan  15    67    0.09
    C   3-Jan  46    22    0.11

Thanks!
Solution
You can actually start with the simple approach here: Pandas Correlation Groupby

and then add rolling(3) like this:

    df.groupby('ID')[['Val1', 'Val2']].rolling(3).corr()

I've changed the window from 2 to 3 because you'll only get 1 or -1 with a window size of 2. Unfortunately, that output (not shown) is a bit verbose because it outputs a 2x2 correlation matrix when all you need is a scalar. But with an additional line you can make the output more concise:

    df2 = df.groupby('ID')[['Val1', 'Val2']].rolling(3).corr()
    df2.groupby(level=[0, 1]).last()['Val1']

    ID
    A   0         NaN
        1         NaN
        2   -0.996539
    B   3         NaN
        4         NaN
        5    0.879868
    C   6         NaN
        7         NaN
        8   -0.985529
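
For anyone who wants the result as a Rolling_Corr column on the original frame, as in the "What I need" table, here is a minimal sketch of a different route from the one above. It loops over the groups and calls the Series-level rolling(...).corr(...) on each one, so no 2x2 matrix is produced. The helper name rolling_corr_by_id is made up for illustration and is not part of pandas. The data is copied from the question. The same helper is also run with a window of 2 to show why that window size is not informative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": list("AAABBBCCC"),
    "Date": ["1-Jan", "2-Jan", "3-Jan"] * 3,
    "Val1": [45, 15, 55, 41, 87, 82, 33, 15, 46],
    "Val2": [22, 66, 13, 12, 45, 66, 34, 67, 22],
})


def rolling_corr_by_id(frame, window):
    """Hypothetical helper: rolling Val1/Val2 correlation within each ID."""
    parts = [
        # Series.rolling(window).corr(other) returns one scalar per row.
        group["Val1"].rolling(window).corr(group["Val2"])
        for _, group in frame.groupby("ID")
    ]
    # Each part keeps the original row labels, so the result aligns with frame.
    return pd.concat(parts)


# Window of 3, matching the answer above.
df["Rolling_Corr"] = rolling_corr_by_id(df, 3)

# Window of 2: any two points lie on a straight line, so every
# non-missing value is +1 or -1 (up to floating-point error).
corr_w2 = rolling_corr_by_id(df, 2)

print(df)
print(corr_w2)
```

With a window of 3 only the last row of each ID has a value, because each group has just three rows. Those values should agree with the output shown above.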
