Python Pandas: Transpose or Stack?

Question

Hello, I have an example data frame below. I am having trouble obtaining the desired result through transpose.

import pandas as pd

x = ('P', 'P', 'O', 'DNP', 'D')
y = ('O', 'O', 'D', 'DNP', 'DNP')
z = ('P', 'P', 'O', 'U', 'DNP')
a = ('O', 'O', 'D', 'DNP', 'DNP')
b = ('P', 'DNP', 'O', 'U', 'DNP')
ID = ['ID1', 'ID2', 'ID3', 'ID4', 'ID5']
# zip returns an iterator on Python 3; list() materializes the rows first
df = pd.DataFrame(list(zip(ID, a, b, x, y, z)), columns=['id', 'a', 'b', 'x', 'y', 'z'])

    id    a    b    x    y    z
0  ID1    O    P    P    O    P
1  ID2    O  DNP    P    O    P
2  ID3    D    O    O    D    O
3  ID4  DNP    U  DNP  DNP    U
4  ID5  DNP  DNP    D  DNP  DNP

A simple df.transpose() produces:

      0    1    2    3    4
id  ID1  ID2  ID3  ID4  ID5
a     O    O    D  DNP  DNP
b     P  DNP    O    U  DNP
x     P    P    O  DNP    D
y     O    O    D  DNP  DNP
z     P    P    O    U  DNP

The desired output is as follows:

   ID1    a    O
   ID1    b    P
   ID1    x    P
   ID1    y    O
   ID1    z    P
   ID2    a    O
   ID2    b    DNP
   ID2    x    P
   ID2    y    O
   ID2    z    P

and so on for the remaining ids. I appreciate any help!

Recommended answer

You could use pd.melt:

In [23]: pd.melt(df, id_vars=['id'], var_name='colvals', value_name='DOPU')
Out[23]: 
     id colvals DOPU
0   ID1       a    O
1   ID2       a    O
2   ID3       a    D
...
21  ID2       z    P
22  ID3       z    O
23  ID4       z    U
24  ID5       z  DNP
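Note that melt groups the rows by the original column (all of a, then all of b, and so on), while the desired output groups them by id. A minimal sketch of one way to reorder, assuming it is acceptable to sort on the two key columns (the names long_df and long_by_id are only illustrative, and the alphabetical sort happens to match the original column order a, b, x, y, z in this example):

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5'],
    'a': ['O', 'O', 'D', 'DNP', 'DNP'],
    'b': ['P', 'DNP', 'O', 'U', 'DNP'],
    'x': ['P', 'P', 'O', 'DNP', 'D'],
    'y': ['O', 'O', 'D', 'DNP', 'DNP'],
    'z': ['P', 'P', 'O', 'U', 'DNP'],
})

# Wide to long: one row per (id, original column) pair
long_df = pd.melt(df, id_vars=['id'], var_name='colvals', value_name='DOPU')

# melt orders rows column by column; sort so each id's rows sit together
long_by_id = (long_df
              .sort_values(['id', 'colvals'])
              .reset_index(drop=True))

print(long_by_id.head(10))
```

If the original columns were not already in alphabetical order, sorting on colvals would change their relative order, in which case the stack approach keeps the column order as-is.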


Or, alternatively, you could set id as the index before calling stack:

In [21]: df.set_index('id').stack()
Out[21]: 
id    
ID1  a      O
     b      P
     x      P
     y      O
     z      P
...         
ID5  a    DNP
     b    DNP
     x      D
     y    DNP
     z    DNP
dtype: object

stack moves the column labels into the index. Since the desired result has id values in the index as well, it is natural to use set_index to move the id column into the index first, and then to call stack.
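To make the shape of that result concrete, here is a small sketch that inspects it. It rebuilds the same example frame; the variable name stacked is only illustrative. The result is a Series whose index has two levels, the id and the former column label, so individual values can be looked up by an (id, column) pair:

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5'],
    'a': ['O', 'O', 'D', 'DNP', 'DNP'],
    'b': ['P', 'DNP', 'O', 'U', 'DNP'],
    'x': ['P', 'P', 'O', 'DNP', 'D'],
    'y': ['O', 'O', 'D', 'DNP', 'DNP'],
    'z': ['P', 'P', 'O', 'U', 'DNP'],
})

stacked = df.set_index('id').stack()

print(type(stacked).__name__)      # Series
print(stacked.index.nlevels)       # 2 levels: id, then the old column label
print(stacked.loc[('ID2', 'b')])   # DNP
print(stacked.loc['ID4'])          # all five values for ID4, indexed by column label
```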

Call reset_index to move the index levels back into DataFrame columns:

In [164]: df.columns.name = 'colvals'
In [165]: df.set_index('id').stack().reset_index()
Out[165]: 
     id colvals    0
0   ID1       a    O
1   ID1       b    P
2   ID1       x    P
3   ID1       y    O
4   ID1       z    P
...
20  ID5       a  DNP
21  ID5       b  DNP
22  ID5       x    D
23  ID5       y  DNP
24  ID5       z  DNP
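The value column above is labeled 0 because the stacked Series has no name. As a possible variation (not part of the original answer), the sketch below names both the label column and the value column in one chain, using rename_axis instead of assigning to df.columns.name, and passing name= to reset_index. The names colvals and DOPU are borrowed from the melt example, and tidy is just an illustrative variable name:

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5'],
    'a': ['O', 'O', 'D', 'DNP', 'DNP'],
    'b': ['P', 'DNP', 'O', 'U', 'DNP'],
    'x': ['P', 'P', 'O', 'DNP', 'D'],
    'y': ['O', 'O', 'D', 'DNP', 'DNP'],
    'z': ['P', 'P', 'O', 'U', 'DNP'],
})

tidy = (df.set_index('id')
          .rename_axis(columns='colvals')   # names the level that stack creates
          .stack()
          .reset_index(name='DOPU'))        # names the value column instead of 0

print(tidy.head(10))
```

Unlike setting df.columns.name, this leaves the original df untouched, and the rows come out grouped by id in the order the question asks for.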
