Aggregating string columns using pandas GroupBy

Question

I have a DataFrame like the following:

df =

vid   pos      value       sente
1     a         A           21
2     b         B           21
3     b         A           21
3     a         A           21
1     d         B           22
1     a         C           22
1     a         D           22
2     b         A           22
3     a         A           22

Now I want to combine all rows that share the same values of sente and vid into a single row, with the pos and value entries of those rows joined by a space (" "):

df2 =

vid   pos      value       sente
1     a         A           21
2     b         B           21
3     b a       A A         21
1     d a a     B C D       22
2     b         A           22
3     a         A           22

I suppose a modification of this should do the trick:

df2 = df.groupby("sente").agg(lambda x: " ".join(x))

But I can't figure out how to add the second column to the statement.

Recommended answer

Groupers can be passed as a list, so you can group by both columns at once. You can also simplify your solution a bit by dropping the lambda: ' '.join is already a callable and can be passed to agg directly.

df.groupby(['vid', 'sente'], as_index=False, sort=False).agg(' '.join)

   vid  sente    pos  value
0    1     21      a      A
1    2     21      b      B
2    3     21    b a    A A
3    1     22  d a a  B C D
4    2     22      b      A
5    3     22      a      A
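
For reference, the call above can be reproduced end to end with the sketch below. The DataFrame construction is an assumption reconstructed from the table in the question (vid and sente as integers, pos and value as strings); the original post does not show how df was built.

```python
import pandas as pd

# Sample data rebuilt from the question's table (dtypes are an assumption).
df = pd.DataFrame({
    "vid":   [1, 2, 3, 3, 1, 1, 1, 2, 3],
    "pos":   ["a", "b", "b", "a", "d", "a", "a", "b", "a"],
    "value": ["A", "B", "A", "A", "B", "C", "D", "A", "A"],
    "sente": [21, 21, 21, 21, 22, 22, 22, 22, 22],
})

# Group on both key columns; every remaining (string) column is joined with a space.
df2 = df.groupby(["vid", "sente"], as_index=False, sort=False).agg(" ".join)
print(df2)
```

Note that ' '.join only accepts strings, so a non-string column among the aggregated ones would raise a TypeError; converting such a column with astype(str) beforehand is one way around that.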

Some other notes: specifying as_index=False means your groupers will be present as columns in the result (and not as the index, which is the default). Furthermore, sort=False keeps the groups in the order in which they first appear in the data instead of sorting them by the group keys.
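
To make the effect of the two keyword arguments concrete, the following sketch compares the default call with the one used in the answer. The sample data is again an assumption reconstructed from the question's table, and the variable names by_default and flat are purely illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question's table (dtypes are an assumption).
df = pd.DataFrame({
    "vid":   [1, 2, 3, 3, 1, 1, 1, 2, 3],
    "pos":   ["a", "b", "b", "a", "d", "a", "a", "b", "a"],
    "value": ["A", "B", "A", "A", "B", "C", "D", "A", "A"],
    "sente": [21, 21, 21, 21, 22, 22, 22, 22, 22],
})

# Defaults (as_index=True, sort=True): the keys form a MultiIndex sorted by (vid, sente).
by_default = df.groupby(["vid", "sente"]).agg(" ".join)
print(by_default.index.tolist())

# as_index=False keeps the keys as ordinary columns;
# sort=False keeps the groups in order of first appearance.
flat = df.groupby(["vid", "sente"], as_index=False, sort=False).agg(" ".join)
print(list(zip(flat["vid"], flat["sente"])))
```

The first print lists the keys sorted by vid and then sente, while the second lists them in the order the (vid, sente) pairs first occur in df, which matches the row order of the desired df2.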
