How to combine values in a row depending on value in another row in pandas


Question

I have a pandas dataframe with several columns (word, start time, stop time, speaker). I want to combine all consecutive values in the 'word' column for as long as the value in the 'speaker' column does not change. In addition, for each combination I want to keep the 'start' value of its first word and the 'stop' value of its last word.

I currently have:

      word  start   stop  speaker
0      but   2.72   2.85        2
1   that's   2.85   3.09        2
2  alright   3.09   3.47        2
3    we'll   8.43   8.69        1
4     have   8.69   8.97        1
5       to   8.97   9.07        1
6     okay   9.19  10.01        2
7     sure  10.02  11.01        2
8    what?  11.02  12.00        1

However, I would like to turn this into:

                 word  start   stop  speaker
0  but that's alright   2.72   3.47        2
1       we'll have to   8.43   9.07        1
2           okay sure   9.19  11.01        2
3               what?  11.02  12.00        1

Recommended answer

We'll use GroupBy.agg with a dict of aggregation functions, one per column:

(df.groupby('speaker', as_index=False, sort=False)
   .agg({'word': ' '.join, 'start': 'min', 'stop': 'max',}))

   speaker                          word  start   stop
0        2  but that's alright okay sure   2.72  11.01
1        1           we'll have to what?   8.43  12.00


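To group by consecutive occurrences instead, use the shift-and-cumsum trick: compare each 'speaker' value with the previous row's value via shift, then take the cumulative sum of the resulting change flags so that every uninterrupted run of one speaker gets its own number. Use that as the second grouper along with 'speaker':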

gp1 = df['speaker'].ne(df['speaker'].shift()).cumsum()

(df.groupby(['speaker', gp1], as_index=False, sort=False)
   .agg({'word': ' '.join, 'start': 'min', 'stop': 'max',}))

   speaker                word  start   stop
0        2  but that's alright   2.72   3.47
1        1       we'll have to   8.43   9.07
2        2           okay sure   9.19  11.01
3        1               what?  11.02  12.00
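
For readers who want to run this end to end, the following is a minimal, self-contained sketch of the same shift-and-cumsum idea. It is a variant rather than the answer's exact code: the DataFrame is rebuilt by hand from the table in the question, the names `changed`, `run_id` and `run` are illustrative, and it uses named aggregation with 'first' and 'last' (instead of 'min' and 'max') on the assumption that the rows are already in time order.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "word": ["but", "that's", "alright", "we'll", "have", "to",
             "okay", "sure", "what?"],
    "start": [2.72, 2.85, 3.09, 8.43, 8.69, 8.97, 9.19, 10.02, 11.02],
    "stop": [2.85, 3.09, 3.47, 8.69, 8.97, 9.07, 10.01, 11.01, 12.00],
    "speaker": [2, 2, 2, 1, 1, 1, 2, 2, 1],
})

# True wherever the speaker differs from the previous row. The first row is
# compared against NaN, so it is always flagged as a change.
changed = df["speaker"].ne(df["speaker"].shift())

# Cumulative sum of the flags: each uninterrupted run of one speaker gets
# its own id. "run" is an illustrative name, not a column of the original data.
run_id = changed.cumsum().rename("run")

result = (
    df.groupby(run_id, sort=False)
      .agg(
          word=("word", " ".join),       # join the words of the run
          start=("start", "first"),      # start time of the first word
          stop=("stop", "last"),         # stop time of the last word
          speaker=("speaker", "first"),  # constant within a run
      )
      .reset_index(drop=True)
)

print(run_id.tolist())  # [1, 1, 1, 2, 2, 2, 3, 3, 4]
print(result)
```

Grouping on the run id alone is sufficient here, because each run contains exactly one speaker by construction; the speaker is carried through as an ordinary aggregated column.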
