How to stack this specific row on pandas?
Consider the DataFrame below:
import pandas as pd

df_dict = {
    'name': {0: 'john', 1: 'john', 4: 'daphne'},
    'address': {0: 'johns address', 1: 'johns address', 4: 'daphne address'},
    'phonenum1': {0: 7870395, 1: 7870450, 4: 7373209},
    'phonenum2': {0: None, 1: 123450, 4: None},
    'phonenum3': {0: None, 1: 123456, 4: None},
}
df = pd.DataFrame(df_dict)
     name         address  phonenum1  phonenum2  phonenum3
0    john   johns address    7870395        NaN        NaN
1    john   johns address    7870450   123450.0   123456.0
4  daphne  daphne address    7373209        NaN        NaN
How can I unstack the phonenum columns so that rows sharing the same name and address are collapsed into a single row, as shown below?
     name         address  phonenum1  phonenum2  phonenum3  phonenum4
0    john   johns address    7870395    7870450   123450.0   123456.0
4  daphne  daphne address    7373209        NaN        NaN        NaN
You can do it using set_index and stack, then groupby.cumcount per name and address to number the values and derive the new column names, then unstack, followed by reset_index and rename_axis for cosmetic cleanup.
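To illustrate just the numbering step in isolation, here is a minimal sketch of what groupby.cumcount does. The toy frame `long_df` and its values are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Toy long-format data (hypothetical values): one row per phone number.
long_df = pd.DataFrame({
    'name': ['john', 'john', 'john', 'daphne'],
    'phone': [111, 222, 333, 444],
})

# cumcount numbers the rows within each group starting at 0,
# so adding 1 yields 1, 2, 3, ... per name.
long_df['cc'] = long_df.groupby('name').cumcount() + 1

# The running counter becomes the suffix of the future column label.
long_df['col'] = 'phonenum' + long_df['cc'].astype(str)
print(long_df)
```

The counter restarts for every group, which is what lets each name end up with its own phonenum1, phonenum2, and so on once the data is unstacked.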
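df_ = (df.set_index(['name', 'address'])       # keys that identify one person
         .stack()                              # move the phone columns into rows
         .reset_index(level=-1)                # old column labels become a regular column
         .assign(cc=lambda x: x.groupby(level=['name', 'address']).cumcount()+1)  # 1, 2, 3, ... per person
         .set_index('cc', append=True)         # add the counter as the innermost index level
         [0].unstack()                         # keep the values and pivot the counter into columns
         .add_prefix('phonenum')               # 1 -> phonenum1, 2 -> phonenum2, ...
         .reset_index()                        # name and address back to columns
         .rename_axis(columns=None)            # drop the leftover 'cc' label on the columns
      )
print(df_)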
     name         address  phonenum1  phonenum2  phonenum3  phonenum4
0    john   johns address  7870395.0  7870450.0   123450.0   123456.0
1  daphne  daphne address  7373209.0        NaN        NaN        NaN
Because the chain is written with one method per line, you can comment out everything from the second line through the last line before the closing parenthesis, then uncomment the lines one at a time to see what each step does.
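As an alternative to commenting lines in and out, the same idea can be sketched with named intermediate results. This is an illustrative rewrite rather than the answer's exact code: the names `keys`, `step1` through `step4`, `value` and `result` are made up here, and the explicit dropna() is an assumption added because whether stack discards missing values by default may depend on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    'name': {0: 'john', 1: 'john', 4: 'daphne'},
    'address': {0: 'johns address', 1: 'johns address', 4: 'daphne address'},
    'phonenum1': {0: 7870395, 1: 7870450, 4: 7373209},
    'phonenum2': {0: None, 1: 123450, 4: None},
    'phonenum3': {0: None, 1: 123456, 4: None},
})

keys = ['name', 'address']

# Step 1: move the phone columns into rows, one row per phone value.
# dropna() is an explicit safeguard (assumption: stack may keep NaN in some versions).
step1 = df.set_index(keys).stack().dropna()

# Step 2: discard the old column labels and keep only keys plus values.
step2 = step1.reset_index(level=-1, drop=True).rename('value').reset_index()

# Step 3: number the values within each (name, address) group.
step3 = step2.assign(cc=step2.groupby(keys).cumcount() + 1)

# Step 4: pivot the counter back into columns.
step4 = step3.set_index(keys + ['cc'])['value'].unstack()

# Step 5: cosmetic cleanup of the column names and the index.
result = step4.add_prefix('phonenum').reset_index().rename_axis(columns=None)
print(result)
```

Printing each intermediate (step1, step2, ...) shows the shape change at every stage. The row order of the final frame may differ from the input order, since unstack generally returns a sorted index.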