Merge rows of a pandas DataFrame based on a condition

Problem description

I have a DataFrame df that contains a set of events (rows).

import pandas as pd

df = pd.DataFrame(data=[[1, 2, 7, 10],
                        [10, 22, 1, 30],
                        [30, 42, 2, 10],
                        [100, 142, 22, 1],
                        [143, 152, 2, 10],
                        [160, 162, 12, 11]],
                  columns=['Start', 'End', 'Value1', 'Value2'])

 df
Out[15]: 
   Start  End  Value1  Value2
0      1    2       7      10
1     10   22       1      30
2     30   42       2      10
3    100  142      22       1
4    143  152       2      10
5    160  162      12      11

If 2 (or more) consecutive events are <= 10 apart, I would like to merge them into a single event: use the Start of the first event and the End of the last, and sum the values in Value1 and in Value2.

In the example above, df becomes:

 df
Out[15]: 
   Start  End  Value1  Value2
0      1   42      10      50
1    100  162      36      22

Recommended answer

This is entirely possible:

df.groupby(((df.Start - df.End.shift(1)) > 10).cumsum()).agg({'Start': 'min', 'End': 'max', 'Value1': 'sum', 'Value2': 'sum'})

Explanation:

start_end_differences = df.Start - df.End.shift(1)  # shift(1) moves End down one row, so each Start is compared with the previous row's End
threshold_selector = start_end_differences > 10  # boolean Series; True marks a row whose gap to the previous event is more than 10
groups = threshold_selector.cumsum()  # running count of the Trues (each counts as 1): an integer group label per row, starting from 0
df.groupby(groups).agg({'Start': 'min'})  # aggregate per group; shown here for Start only
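
To make the explanation above concrete, here is a minimal sketch that rebuilds the example frame and prints the intermediate series side by side. The variable names gaps and new_group are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[1, 2, 7, 10],
          [10, 22, 1, 30],
          [30, 42, 2, 10],
          [100, 142, 22, 1],
          [143, 152, 2, 10],
          [160, 162, 12, 11]],
    columns=['Start', 'End', 'Value1', 'Value2'])

# Gap between each event's Start and the previous event's End.
# The first row has no predecessor, so its gap is NaN.
gaps = df['Start'] - df['End'].shift(1)

# True where the gap exceeds 10, i.e. where a new merged event begins.
# NaN > 10 evaluates to False, so the first row stays in group 0.
new_group = gaps > 10

# Running count of group starts: one integer label per row.
groups = new_group.cumsum()

print(pd.DataFrame({'gap': gaps, 'new_group': new_group, 'group': groups}))
```

Only the fourth row (Start 100, previous End 42, gap 58) exceeds the threshold, so the first three rows share label 0 and the last three share label 1.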


Here is a generalized solution that remains agnostic of the other columns:

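cols = df.columns.difference(['Start', 'End'])  # every column except Start and End
grps = df.Start.sub(df.End.shift()).gt(10).cumsum()  # same group labels as in the explanation above
gpby = df.groupby(grps)
gpby.agg(dict(Start='min', End='max')).join(gpby[cols].sum())  # min Start, max End, sum of the remaining columns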

   Start  End  Value1  Value2
0      1   42      10      50
1    100  162      36      22
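
If the merge is needed in more than one place, the generalized solution could be wrapped in a small helper. The following is only a sketch: the function name merge_close_events and the parameter max_gap are hypothetical, and it assumes the rows are already sorted by Start and that every column other than Start and End can be summed.

```python
import pandas as pd

def merge_close_events(df, max_gap=10):
    """Merge consecutive rows whose Start is at most max_gap after the previous End.

    Assumption: rows are already sorted by Start, and all columns other than
    Start and End are numeric, so summing them is meaningful.
    """
    # A gap larger than max_gap starts a new group; cumsum turns the flags into labels.
    groups = df['Start'].sub(df['End'].shift()).gt(max_gap).cumsum()
    value_cols = df.columns.difference(['Start', 'End'])
    grouped = df.groupby(groups)
    merged = grouped.agg({'Start': 'min', 'End': 'max'}).join(grouped[value_cols].sum())
    # Replace the group labels with a plain 0..n-1 index.
    return merged.reset_index(drop=True)

events = pd.DataFrame(
    data=[[1, 2, 7, 10],
          [10, 22, 1, 30],
          [30, 42, 2, 10],
          [100, 142, 22, 1],
          [143, 152, 2, 10],
          [160, 162, 12, 11]],
    columns=['Start', 'End', 'Value1', 'Value2'])

result = merge_close_events(events)
print(result)
```

Making the threshold a parameter keeps the hard-coded 10 out of the grouping expression. With max_gap=0 every row in this example stays separate, because all gaps are positive.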
