Calculate difference between values in rows by group
Question
I have a df like this:
I need to calculate the difference between consecutive rows of X1, X2, X3, X4 and X5 for each company, but I only know how to calculate the difference over a whole column:
df['dX1'] = df['X1'].shift(-1) - df['X1']
df['dX2'] = df['X2'].shift(-1) - df['X2']
df['dX3'] = df['X3'].shift(-1) - df['X3']
...
This is a bad approach because it takes the difference between X1 on line 74 and X1 on line 73 (and those are two different companies, so that makes no sense).
My question is how to calculate the difference between row values separately for every period and for every company. E.g.
(I could load the CSVs one by one, calculate the diff for each company, and then merge everything into one list, but that would take 2 days because I have 700 CSV files.)
Recommended answer
I think you need DataFrameGroupBy.diff:
cols = ['X1','X2','X3','X4','X5']
df1 = df.groupby('company')[cols].diff(-1)
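
To see how the grouped difference behaves at company boundaries, here is a minimal sketch on made-up data. The sample values and the names `d`, `forward` and `result` are assumptions for illustration only; the column names follow the question. Note that `diff(-1)` returns the current row minus the next row, which has the opposite sign of `shift(-1) - X1` from the question, so the sketch also shows a grouped `shift(-1)` variant that keeps the question's sign.

```python
import pandas as pd

# Hypothetical sample data: two companies stacked in one frame.
# The values are invented purely for illustration.
df = pd.DataFrame({
    "company": ["A", "A", "A", "B", "B"],
    "X1": [1.0, 3.0, 6.0, 10.0, 15.0],
    "X2": [2.0, 2.0, 5.0, 7.0, 4.0],
})

cols = ["X1", "X2"]

# diff(-1) within each company: current row minus the next row.
# The last row of every company has no next row, so it becomes NaN.
d = df.groupby("company")[cols].diff(-1)

# Same sign as the question's df['X1'].shift(-1) - df['X1'],
# but the shift never crosses from one company into the next.
forward = df.groupby("company")[cols].shift(-1) - df[cols]

# Attach the results as dX1, dX2 next to the original columns.
result = df.join(forward.add_prefix("d"))
print(result)
```

Because both `diff` and `shift` are applied per group, the last row of company A is never compared with the first row of company B, which is exactly the problem with the whole-column version.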