How to get rid of multilevel index after using pivot table pandas?

Question

I have the following data frame (the real data frame is much larger than this one):

sale_user_id    sale_product_id count
1                 1              1
1                 8              1
1                 52             1
1                 312            5
1                 315            1

I then reshaped it so that the values in sale_product_id become the column headers, using the following code:

reshaped_df=id_product_count.pivot(index='sale_user_id',columns='sale_product_id',values='count')

The resulting data frame is:

sale_product_id -1057   1   2   3   4   5   6   8   9   10  ... 98  980 981 982 983 984 985 986 987 99
sale_user_id                                                                                    
1                NaN    1.0 NaN NaN NaN NaN NaN 1.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3                NaN    1.0 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4                NaN    NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

As you can see, we have a multilevel index. What I need is to have sale_user_id in the first column, without multilevel indexing.

I tried the following approach:

reshaped_df.reset_index()

The result looks like this. I still have the sale_product_id label in the header, but I do not need it anymore:

sale_product_id sale_user_id    -1057   1   2   3   4   5   6   8   9   ... 98  980 981 982 983 984 985 986 987 99
0                          1    NaN 1.0 NaN NaN NaN NaN NaN 1.0 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1                          3    NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2                          4    NaN NaN 1.0 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 

I can subset this data frame to get rid of sale_product_id, but I don't think that would be efficient. I am looking for an efficient way to get rid of multilevel indexing while reshaping the original data frame.

Recommended answer

You only need to remove the index name. Use rename_axis (new in pandas 0.18.0):

print (reshaped_df)
sale_product_id  1    8    52   312  315
sale_user_id                            
1                  1    1    1    5    1

print (reshaped_df.index.name)
sale_user_id

print (reshaped_df.rename_axis(None))
sale_product_id  1    8    52   312  315
1                  1    1    1    5    1
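It may help to confirm that the pivoted frame does not actually carry a MultiIndex: the two header rows in the printout come from the names attached to the index and to the columns. The following is a minimal sketch that rebuilds the five sample rows from the question (variable names follow the question; the data is only the sample shown there) and inspects both axes:

```python
import pandas as pd

# The five sample rows from the question.
id_product_count = pd.DataFrame({
    "sale_user_id": [1, 1, 1, 1, 1],
    "sale_product_id": [1, 8, 52, 312, 315],
    "count": [1, 1, 1, 5, 1],
})

reshaped_df = id_product_count.pivot(
    index="sale_user_id", columns="sale_product_id", values="count"
)

# Both axes are single-level; each one simply carries a name.
index_levels = reshaped_df.index.nlevels
column_levels = reshaped_df.columns.nlevels
index_name = reshaped_df.index.name
columns_name = reshaped_df.columns.name

# rename_axis returns a new frame; reshaped_df itself keeps its names.
cleaned = reshaped_df.rename_axis(None)
print(index_levels, column_levels, index_name, columns_name, cleaned.index.name)
```

Because rename_axis returns a new object, the result has to be assigned back (or the name attributes set directly) for the change to persist.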

Another solution that works in pandas below 0.18.0:

reshaped_df.index.name = None
print (reshaped_df)

sale_product_id  1    8    52   312  315
1                  1    1    1    5    1


If needed, also remove the columns name:

print (reshaped_df.columns.name)
sale_product_id

print (reshaped_df.rename_axis(None).rename_axis(None, axis=1))
   1    8    52   312  315
1    1    1    1    5    1

Another solution:

reshaped_df.columns.name = None
reshaped_df.index.name = None
print (reshaped_df)
   1    8    52   312  315
1    1    1    1    5    1

Edit by comment:

You need reset_index with the parameter drop=True:

reshaped_df = reshaped_df.reset_index(drop=True)
print (reshaped_df)
sale_product_id  1    8    52   312  315
0                  1    1    1    5    1

# if you need to reset the index and remove the columns name
reshaped_df = reshaped_df.reset_index(drop=True).rename_axis(None, axis=1)
print (reshaped_df)
   1    8    52   312  315
0    1    1    1    5    1
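Note that drop=True discards the old index instead of turning it into a column, so the sale_user_id values are no longer available in the frame afterwards. A small sketch makes the difference visible; the rows here are made up for illustration and only reuse the question's column names:

```python
import pandas as pd

# Hypothetical rows for illustration (not the asker's real data).
id_product_count = pd.DataFrame({
    "sale_user_id": [1, 1, 3, 4],
    "sale_product_id": [1, 8, 1, 2],
    "count": [1, 1, 1, 1],
})
reshaped_df = id_product_count.pivot(
    index="sale_user_id", columns="sale_product_id", values="count"
)

# drop=True replaces the index with 0..n-1 and throws the user ids away.
dropped = reshaped_df.reset_index(drop=True)

# The default (drop=False) moves the user ids into a regular column.
kept = reshaped_df.reset_index()

has_ids_dropped = "sale_user_id" in dropped.columns
has_ids_kept = "sale_user_id" in kept.columns
print(has_ids_dropped, has_ids_kept)
```

If the user ids are needed as a first column, as the question asks, the default reset_index() is the variant to use.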

Or, if you need to remove only the columns name:

reshaped_df = reshaped_df.rename_axis(None, axis=1)
print (reshaped_df)
              1    8    52   312  315
sale_user_id                         
1               1    1    1    5    1

Edit1:

So if you need to create a new column from the index and remove the columns name:

reshaped_df =  reshaped_df.rename_axis(None, axis=1).reset_index() 
print (reshaped_df)
   sale_user_id  1  8  52  312  315
0             1  1  1   1    5    1
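Putting the pieces together, the reshape and the clean-up can be chained in one expression. The sketch below wraps them in a helper; the name pivot_flat is hypothetical and not part of pandas, and the data is the sample from the question:

```python
import pandas as pd


def pivot_flat(df, index, columns, values):
    """Pivot df, drop the columns name, and move the index into a column.

    pivot_flat is an illustrative helper, not a pandas function.
    """
    wide = df.pivot(index=index, columns=columns, values=values)
    return wide.rename_axis(None, axis=1).reset_index()


# The five sample rows from the question.
id_product_count = pd.DataFrame({
    "sale_user_id": [1, 1, 1, 1, 1],
    "sale_product_id": [1, 8, 52, 312, 315],
    "count": [1, 1, 1, 5, 1],
})

result = pivot_flat(
    id_product_count,
    index="sale_user_id",
    columns="sale_product_id",
    values="count",
)
print(result)
```

The result has sale_user_id as its first ordinary column, a plain 0..n-1 index, and no name on either axis.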
