Pandas Dataframe - How to combine multiple rows into one
Question
I have a dataset in the form:
    A            B
0  30  60538815980
1  30   7410811099
2  26   2238403510
3  26   2006613744
4  26   2006618783
5  26   2006613743
I want to combine the rows where the value of A matches and produce something like this:
            C_1         C_2         C_3         C_4
A
26   2238403510  2006613744  2006618783  2006613743
30  60538815980  7410811099         NaN         NaN
I have tried expressing it in terms of join or merge, but have failed so far. Is there a simple way to express this, or will I have to use apply and create a new DataFrame?
Recommended answer
First, create a groupby object based on column A. Then create a new dataframe df2 that, for each group key n in column A, selects that group's values of column B by row label. The original answer used ix for this lookup; ix was deprecated in pandas 0.20 and removed in 1.0, and .loc does the same job here. Set the index of this dataframe to the key values from the groupby (i.e. the unique values from column A).
Finally, use a list comprehension to rename the columns to C_1, C_2, ..., etc.
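import pandas as pd

df = pd.DataFrame({'A': [30, 30, 26, 26, 26, 26],
                   'B': [60538815980, 7410811099, 2238403510,
                         2006613744, 2006618783, 2006613743]})
gb = df.groupby('A')
# Build one row per group from the B values at that group's row labels.
# .loc replaces the .ix indexer, which was removed in pandas 1.0.
df2 = pd.DataFrame([df.loc[gb.groups[n], 'B'].tolist() for n in gb.groups],
                   index=list(gb.groups))
# Rename the positional columns 0, 1, 2, ... to C_1, C_2, C_3, ...
df2.columns = ["C_" + str(i + 1) for i in df2.columns]
df2.index.name = "A"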
>>> df2
            C_1         C_2         C_3         C_4
A
26   2238403510  2006613744  2006618783  2006613743
30  60538815980  7410811099         NaN         NaN
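The same reshaping can also be written without building the rows by hand. The sketch below is an alternative, not part of the original answer. It numbers the rows within each group using cumcount and then pivots on that number. The names position and wide are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [30, 30, 26, 26, 26, 26],
                   'B': [60538815980, 7410811099, 2238403510,
                         2006613744, 2006618783, 2006613743]})

# Number the rows within each group of A in their original order: 1, 2, 3, ...
position = df.groupby('A').cumcount() + 1

# Spread B across columns keyed by that within-group position,
# then prefix the column labels so they read C_1, C_2, ...
wide = (df.assign(position=position)
          .pivot(index='A', columns='position', values='B')
          .add_prefix('C_'))
wide.columns.name = None  # drop the leftover 'position' axis label

print(wide)
```

Groups with fewer rows than the largest group are padded with NaN, so the columns that contain missing values are stored as floats rather than integers.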