Reshaping a data frame in pandas

Question

Suppose I have this DataFrame:

df = pd.DataFrame({'n':[0 ,1 ,0 ,0 ,1 ,1 ,0 ,1],'l':[12 ,16 ,92, 77 ,32 ,47, 22, 14], 'cols':['col1','col1','col1','col1','col2','col2','col2','col2']})

This is what I want to get:

col1    col2
l   n   l   n
12  0   32  1
16  1   47  1
92  0   22  0
77  0   14  1

I've been playing around with the set_index and stack/unstack methods, but with no success.

Recommended answer

import pandas as pd

df = pd.DataFrame(
    {'n':[0 ,1 ,0 ,0 ,1 ,1 ,0 ,1],'l':[12 ,16 ,92, 77 ,32 ,47, 22, 14],
     'cols':['col1','col1','col1','col1','col2','col2','col2','col2']})

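# Number the rows within each 'cols' group (0, 1, 2, ...); this becomes the new row index.
df['index'] = df.groupby(['cols']).cumcount()
# Passing values explicitly keeps the top column level in the order l, n shown below,
# regardless of the column order of df.
result = df.pivot(index='index', columns='cols', values=['l', 'n'])
print(result)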
#           l           n      
# cols   col1  col2  col1  col2
# index                        
# 0        12    32     0     1
# 1        16    47     1     1
# 2        92    22     0     0
# 3        77    14     0     1
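
The step that makes the pivot possible is groupby(...).cumcount(), which numbers the rows inside each group starting from 0. As a minimal sketch for inspecting that intermediate result on its own (the name position is only illustrative and not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame(
    {'n': [0, 1, 0, 0, 1, 1, 0, 1],
     'l': [12, 16, 92, 77, 32, 47, 22, 14],
     'cols': ['col1'] * 4 + ['col2'] * 4})

# cumcount restarts at 0 for every distinct value of 'cols',
# so rows that should end up side by side share the same number.
position = df.groupby('cols').cumcount()
print(position.tolist())
# [0, 1, 2, 3, 0, 1, 2, 3]
```

Rows 0 and 4 of df both receive position 0, which is why they land in the same row after the pivot.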

If you care about the order of the labels in the MultiIndex columns, you could use stack and unstack to exactly reproduce the result you posted:

result = result.stack(level=0).unstack(level=1)
print(result)

# cols   col1     col2   
#           l  n     l  n
# index                  
# 0        12  0    32  1
# 1        16  1    47  1
# 2        92  0    22  0
# 3        77  0    14  1
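
As an alternative to the stack/unstack round trip, the two column levels can also be swapped and then sorted. This is a sketch of a different technique from the one in the answer, and it assumes the pivot is given values=['l', 'n'] so that the starting column order is predictable; the name alt is illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {'n': [0, 1, 0, 0, 1, 1, 0, 1],
     'l': [12, 16, 92, 77, 32, 47, 22, 14],
     'cols': ['col1'] * 4 + ['col2'] * 4})

df['index'] = df.groupby('cols').cumcount()
result = df.pivot(index='index', columns='cols', values=['l', 'n'])

# swaplevel puts 'cols' on top of the l/n level; sort_index then groups
# the columns so that each of col1 and col2 is followed by its l and n.
alt = result.swaplevel(axis=1).sort_index(axis=1)
print(list(alt.columns))
# [('col1', 'l'), ('col1', 'n'), ('col2', 'l'), ('col2', 'n')]
```

Because sort_index orders labels lexicographically, this only reproduces the posted layout when alphabetical order happens to be the order you want, as it is here.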


When looking for a solution, it is often useful to think backwards.

Start with the desired DataFrame and ask yourself what operation might produce it. In this case, the operation that came to mind was DataFrame.pivot. The question then becomes: what DataFrame, call it something, is needed so that

desired = something.pivot(index='index', columns='cols') 

By looking at other examples of pivot in action, it became clear that something had to equal

   cols   l  n  index
0  col1  12  0      0
1  col1  16  1      1
2  col1  92  0      2
3  col1  77  0      3
4  col2  32  1      0
5  col2  47  1      1
6  col2  22  0      2
7  col2  14  1      3

Then you see whether you can find a way to massage df into something or, working backwards again, massage something into df. From this point of view, the missing link became apparent: something has an index column that df lacks.
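
To see why that column matters, it can help to compare the pivot with and without it. The following is a rough sketch, not part of the original answer; the names sparse and dense are illustrative, and values=['l', 'n'] is passed only to make the column order explicit:

```python
import pandas as pd

df = pd.DataFrame(
    {'n': [0, 1, 0, 0, 1, 1, 0, 1],
     'l': [12, 16, 92, 77, 32, 47, 22, 14],
     'cols': ['col1'] * 4 + ['col2'] * 4})

# Without a helper column, pivot falls back to the existing row labels 0..7,
# so every output row holds values for only one of col1/col2.
sparse = df.pivot(columns='cols', values=['l', 'n'])
print(sparse.shape)                     # (8, 4)
print(int(sparse.isna().sum().sum()))   # 16

# With the per-group counter as the index, rows from col1 and col2 line up.
something = df.assign(index=df.groupby('cols').cumcount())
dense = something.pivot(index='index', columns='cols', values=['l', 'n'])
print(dense.shape)                      # (4, 4)
print(int(dense.isna().sum().sum()))    # 0
```

The only difference between the two calls is the row label used for alignment, which is exactly the piece that df was missing.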
