Reshaping a data frame in pandas
Question
Suppose I have this DataFrame:
df = pd.DataFrame({'n': [0, 1, 0, 0, 1, 1, 0, 1],
                   'l': [12, 16, 92, 77, 32, 47, 22, 14],
                   'cols': ['col1', 'col1', 'col1', 'col1', 'col2', 'col2', 'col2', 'col2']})
This is what I want to get:
  col1     col2
   l  n     l  n
  12  0    32  1
  16  1    47  1
  92  0    22  0
  77  0    14  1
I've been playing around with the set_index and stack/unstack methods, but with no success...
Recommended answer
import pandas as pd
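df = pd.DataFrame(
    {'n': [0, 1, 0, 0, 1, 1, 0, 1], 'l': [12, 16, 92, 77, 32, 47, 22, 14],
     'cols': ['col1', 'col1', 'col1', 'col1', 'col2', 'col2', 'col2', 'col2']})
# Number the rows within each 'cols' group (0, 1, 2, 3) to serve as the pivot index.
df['index'] = df.groupby(['cols']).cumcount()
# One row per 'index' value; the remaining columns are spread across 'cols'.
result = df.pivot(index='index', columns='cols')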
print(result)
#          l         n
# cols  col1 col2 col1 col2
# index
# 0       12   32    0    1
# 1       16   47    1    1
# 2       92   22    0    0
# 3       77   14    0    1
If you care about the order of the labels in the MultiIndex columns, you could use stack and unstack to exactly reproduce the result you posted:
result = result.stack(level=0).unstack(level=1)
print(result)
# cols  col1    col2
#          l  n    l  n
# index
# 0       12  0   32  1
# 1       16  1   47  1
# 2       92  0   22  0
# 3       77  0   14  1
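As an alternative to the stack/unstack round trip, the same column layout can probably be reached by swapping the two column levels and sorting them. This is a minimal sketch, not part of the original answer; the name alt is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'n': [0, 1, 0, 0, 1, 1, 0, 1],
    'l': [12, 16, 92, 77, 32, 47, 22, 14],
    'cols': ['col1'] * 4 + ['col2'] * 4,
})
df['index'] = df.groupby('cols').cumcount()
result = df.pivot(index='index', columns='cols')

# 'alt' is a hypothetical name used only in this sketch.
# swaplevel puts the 'cols' level on top; sort_index(axis=1) then groups
# the columns as (col1, l), (col1, n), (col2, l), (col2, n).
alt = result.swaplevel(axis=1).sort_index(axis=1)
print(alt)
```

Sorting the columns lexicographically happens to match the desired order here because 'l' sorts before 'n'; with other column names an explicit reindex might be needed.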
When looking for a solution, it is often useful to think backwards.
Start with the desired DataFrame and ask yourself what operation might produce it. In this case, the operation that came to mind was pd.pivot. Then the question becomes: what DataFrame, something, is needed so that
desired = something.pivot(index='index', columns='cols')
By looking at other examples of pivot in action, it became clear that something had to equal
   cols   l  n  index
0  col1  12  0      0
1  col1  16  1      1
2  col1  92  0      2
3  col1  77  0      3
4  col2  32  1      0
5  col2  47  1      1
6  col2  22  0      2
7  col2  14  1      3
Then you see if you can find a way to massage df into something, or, again working backwards, massage something into df. From this point of view, the missing link became apparent in this case: something has an index column that df lacked.
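The missing index column can be inspected on its own. The sketch below shows what groupby(...).cumcount() produces for this data and how attaching it turns df into something; the variable name counter is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'n': [0, 1, 0, 0, 1, 1, 0, 1],
    'l': [12, 16, 92, 77, 32, 47, 22, 14],
    'cols': ['col1'] * 4 + ['col2'] * 4,
})

# cumcount numbers the rows within each 'cols' group, restarting at 0
# for every new group. 'counter' is a hypothetical name for this sketch.
counter = df.groupby('cols').cumcount()
print(counter.tolist())  # [0, 1, 2, 3, 0, 1, 2, 3]

# Attaching the counter as an 'index' column gives the frame that pivot needs:
# each (index, cols) pair now identifies exactly one row.
something = df.assign(index=counter)
desired = something.pivot(index='index', columns='cols')
print(desired.shape)  # (4, 4)
```

Without this column, df has nothing that pairs the first col1 row with the first col2 row, which is why pivot alone cannot line them up.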