Calculate difference between values in rows by group


Question

I have a df like this:

I need to calculate the difference between consecutive rows of X1, X2, X3, X4 and X5 for each company, but I only know how to calculate the difference over a whole column:

df['dX1'] = df['X1'].shift(-1) - df['X1']
df['dX2'] = df['X2'].shift(-1) - df['X2']
df['dX3'] = df['X3'].shift(-1) - df['X3']
...

This is a bad approach because it subtracts X1 on line 73 from X1 on line 74, and those rows belong to two different companies, so the result makes no sense.

My question is how to calculate the difference between row values for every period, separately for every company. E.g.

(I could load the CSVs one by one, calculate the diff for each company, and then merge everything into one list, but that would take 2 days because I have 700 CSV files.)

Recommended answer

I think you need DataFrameGroupBy.diff:

cols = ['X1','X2','X3','X4','X5']
df1 = df.groupby('company')[cols].diff(-1)
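
One detail worth checking: `diff(-1)` computes the current row minus the next row, which is the opposite sign of the `shift(-1) - value` expression in the question. The following is a minimal sketch of both forms, not a definitive implementation. It assumes a toy DataFrame with made-up values, only two of the X columns, and rows already sorted by period within each company.

```python
import pandas as pd

# Hypothetical toy data: two companies, three periods each (values are made up).
df = pd.DataFrame({
    "company": ["A", "A", "A", "B", "B", "B"],
    "X1": [1.0, 3.0, 6.0, 10.0, 15.0, 21.0],
    "X2": [2.0, 2.5, 4.0, 8.0, 7.0, 5.0],
})
cols = ["X1", "X2"]

# The answer's form: current row minus the next row, within each company.
d_answer = df.groupby("company")[cols].diff(-1)

# Same sign as the question's `shift(-1) - value`: next row minus current row,
# with the shift restricted to each company.
d_question = df.groupby("company")[cols].shift(-1) - df[cols]

# Attach the result as dX1, dX2 columns next to the original data.
df = df.join(d_question.add_prefix("d"))
print(df)
```

In both forms the last row of each company is NaN, because there is no following row inside that group to subtract, so values are never mixed across companies.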
