Replicating GROUP_CONCAT for pandas.DataFrame
Question
I have a pandas DataFrame df:
+------+---------+
| team | user    |
+------+---------+
| A    | elmer   |
| A    | daffy   |
| A    | bugs    |
| B    | dawg    |
| A    | foghorn |
| B    | speedy  |
| A    | goofy   |
| A    | marvin  |
| B    | pepe    |
| C    | petunia |
| C    | porky   |
+------+---------+
I want to find or write a function that returns the DataFrame I would get in MySQL from the following query:
SELECT
    team,
    GROUP_CONCAT(user)
FROM
    df
GROUP BY
    team
to get the following result:
+------+---------------------------------------+
| team | group_concat(user)                    |
+------+---------------------------------------+
| A    | elmer,daffy,bugs,foghorn,goofy,marvin |
| B    | dawg,speedy,pepe                      |
| C    | petunia,porky                         |
+------+---------------------------------------+
I can think of nasty ways to do this by iterating over rows and adding to a dictionary, but there's got to be a better way.
Answer
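Do the following:
df.groupby('team').apply(lambda x: ','.join(x.user))
to get a Series of strings, or
df.groupby('team').apply(lambda x: list(x.user))
to get a Series of lists of strings.
Here is what the results look like (this session joins with ', ' rather than ',' for readability):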
In [33]: df.groupby('team').apply(lambda x: ', '.join(x.user))
Out[33]:
team
A    elmer, daffy, bugs, foghorn, goofy, marvin
B                            dawg, speedy, pepe
C                                petunia, porky
dtype: object
In [34]: df.groupby('team').apply(lambda x: list(x.user))
Out[34]:
team
A    [elmer, daffy, bugs, foghorn, goofy, marvin]
B                            [dawg, speedy, pepe]
C                                [petunia, porky]
dtype: object
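For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question. It is a variant rather than the exact code above: it selects the user column before aggregating and uses agg instead of apply. The reset_index call and the "group_concat(user)" column label are added here only to mimic the shape of the MySQL result.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "A", "B", "A", "A", "B", "C", "C"],
    "user": ["elmer", "daffy", "bugs", "dawg", "foghorn", "speedy",
             "goofy", "marvin", "pepe", "petunia", "porky"],
})

# Series of comma-joined strings, indexed by team.
joined = df.groupby("team")["user"].agg(",".join)

# Series of lists of strings, indexed by team.
as_lists = df.groupby("team")["user"].agg(list)

# Turn the joined Series back into a two-column DataFrame,
# labelling the value column like the MySQL output (label chosen for illustration).
result = joined.reset_index(name="group_concat(user)")
print(result)
```

Selecting the column first keeps the grouping key out of the function being applied, so the lambda wrapper is no longer needed.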
Note that, in general, any further operations on these types of Series will be slow and are generally discouraged. If there is another way to aggregate without putting a list inside a Series, you should consider using that approach instead.
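To illustrate that advice, suppose the lists were only needed to count the members of each team. That follow-up is hypothetical and not part of the question. A rough sketch of the two routes, using the same example data:

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "A", "B", "A", "A", "B", "C", "C"],
    "user": ["elmer", "daffy", "bugs", "dawg", "foghorn", "speedy",
             "goofy", "marvin", "pepe", "petunia", "porky"],
})

# List-based route: build a Series of lists, then call len on each element.
# The second step runs as a Python-level loop over objects.
via_lists = df.groupby("team")["user"].agg(list).apply(len)

# Direct route: let the groupby compute the count, with no lists involved.
direct = df.groupby("team")["user"].count()

print(via_lists)
print(direct)
```

Both give the same counts on this data. The direct form avoids storing Python lists inside the Series, which is the pattern the note above warns about.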