Melt a pandas DataFrame


Problem description

I have a pandas DataFrame like this:

import pandas as pd

df = pd.DataFrame({'custid':[1,2,3,4],
                   'prod1':['jeans','tshirt','jacket','tshirt'],
                   'prod1_hnode1':[1,2,3,2],
                   'prod1_hnode2':[6,7,8,7],
                   'prod2':['tshirt','jeans','jacket','shirt'],
                   'prod2_hnode1':[2,1,3,4],
                   'prod2_hnode2':[7,6,8,7]})

In [54]: df
Out[54]: 
    custid   prod1  prod1_hnode1  prod1_hnode2   prod2  prod2_hnode1  \
0       1   jeans             1             6  tshirt             2   
1       2  tshirt             2             7   jeans             1   
2       3  jacket             3             8  jacket             3   
3       4  tshirt             2             7   shirt             4   

   prod2_hnode2  
0             7  
1             6  
2             8  
3             7  

How can I convert this to the following format:

dfnew = pd.DataFrame({'custid':[1,1,2,2,3,3,4,4],
                      'prod':['prod1','prod2','prod1','prod2','prod1','prod2','prod1','prod2'],
                      'rec':['jeans','tshirt','tshirt','jeans','jacket','jacket','tshirt','shirt'],
                      'hnode1':[1,2,2,1,3,3,2,4],
                      'hnode2':[6,7,7,6,8,8,7,7]})


In [56]: dfnew
Out[56]: 
   custid  hnode1  hnode2   prod     rec
0       1       1       6  prod1   jeans
1       1       2       7  prod2  tshirt
2       2       2       7  prod1  tshirt
3       2       1       6  prod2   jeans
4       3       3       8  prod1  jacket
5       3       3       8  prod2  jacket
6       4       2       7  prod1  tshirt
7       4       4       7  prod2   shirt

Recommended answer

Use:

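  • set_index to move the custid column into the index
  • str.split with expand=True to turn the column names into a MultiIndex
  • rename to replace the NaN labels in the second column level with rec
  • stack to move the first column level (prod1, prod2) into the rows
  • reset_index to turn the index levels back into columns
  • rename and reindex to name the new column prod and put the columns in order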
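import numpy as np
# Move custid into the index so only the product columns remain.
df = df.set_index('custid')
# 'prod1_hnode1' -> ('prod1', 'hnode1'); plain 'prod1' -> ('prod1', NaN).
df.columns = df.columns.str.split('_', expand=True)
# Label the NaN entries in the second column level as 'rec'.
df = df.rename(columns={np.nan:'rec'})
cols = ['custid','hnode1','hnode2','prod','rec']
# Stack the first column level into rows, then restore and order the columns.
df = df.stack(0).reset_index().rename(columns={'level_1':'prod'}).reindex(columns=cols)
print(df)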
   custid  hnode1  hnode2   prod     rec
0       1       1       6  prod1   jeans
1       1       2       7  prod2  tshirt
2       2       2       7  prod1  tshirt
3       2       1       6  prod2   jeans
4       3       3       8  prod1  jacket
5       3       3       8  prod2  jacket
6       4       2       7  prod1  tshirt
7       4       4       7  prod2   shirt
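
For comparison, here is a minimal sketch of a more explicit route to the same layout. It is not part of the answer above: it selects each product's columns, renames them to the shared names, and concatenates the pieces, so it does not rely on renaming NaN column labels or on stack. The names parts, part and long_df are illustrative, and the product prefixes are assumed to be known in advance.

```python
import pandas as pd

# Input frame from the question.
df = pd.DataFrame({'custid': [1, 2, 3, 4],
                   'prod1': ['jeans', 'tshirt', 'jacket', 'tshirt'],
                   'prod1_hnode1': [1, 2, 3, 2],
                   'prod1_hnode2': [6, 7, 8, 7],
                   'prod2': ['tshirt', 'jeans', 'jacket', 'shirt'],
                   'prod2_hnode1': [2, 1, 3, 4],
                   'prod2_hnode2': [7, 6, 8, 7]})

parts = []
for prod in ['prod1', 'prod2']:  # assumed: the prefixes are known up front
    # Take this product's three columns plus the customer id.
    part = df[['custid', prod, f'{prod}_hnode1', f'{prod}_hnode2']].copy()
    # Give every piece the same column names so the pieces line up in concat.
    part.columns = ['custid', 'rec', 'hnode1', 'hnode2']
    # Record which product slot each row came from.
    part['prod'] = prod
    parts.append(part)

# Stack the pieces vertically and interleave them per customer.
long_df = (pd.concat(parts)
             .sort_values(['custid', 'prod'])
             .reset_index(drop=True))
long_df = long_df[['custid', 'hnode1', 'hnode2', 'prod', 'rec']]
print(long_df)
```

The loop is longer than the stack-based one-liner, but each step can be inspected on its own, which may help when the column naming is less regular than in this example.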
