How to pivot pandas DataFrame column to create binary "value table"?

Problem description

I have the following pandas DataFrame:

import pandas as pd
df = pd.read_csv("filename.csv")

df 
     A   B         C         D        E    
0    a  0.469112 -0.282863 -1.509059  cat  
1    c -1.135632  1.212112 -0.173215  dog   
2    e  0.119209 -1.044236 -0.861849  dog   
3    f -2.104569 -0.494929  1.071804  bird   
4    g -2.224569 -0.724929  2.234213  elephant
...

I would like to create more columns based on the identity of the categorical values in column E, so that the DataFrame looks like this:

 df 
         A   B         C         D        cat    dog     bird    elephant ....    
    0    a  0.469112 -0.282863 -1.509059  -1      0       0       0
    1    c -1.135632  1.212112 -0.173215   0     -1       0       0
    2    e  0.119209 -1.044236 -0.861849   0     -1       0       0
    3    f -2.104569 -0.494929  1.071804   0      0      -1       0
    4    g -2.224569 -0.724929  2.234213   0      0       0      -1
    ...

That is, I want to pivot the values of column E into a binary matrix: one column per distinct value of E, with 1 where the row has that value and 0 everywhere else. (Here I would like -1 instead of 1, i.e. a "negative binary matrix".)

I'm not sure which function in pandas best does this: maybe pandas.DataFrame.unstack()?

Any insight is appreciated!

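Recommended answer

Use pd.concat, drop, and get_dummies. get_dummies builds one indicator column per distinct value of E, and multiplying by -1 turns each 1 into -1:

pd.concat([df.drop(columns='E'), pd.get_dummies(df['E'], dtype=int).mul(-1)], axis=1)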
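As a rough end-to-end sketch of the answer, the snippet below rebuilds the five sample rows from the question in memory instead of reading filename.csv. The names dummies, order, and result are illustrative only. Two things in it go beyond the original one-liner and are assumptions on my part: dtype=int is passed so the indicators are integers rather than booleans, and the indicator columns are reordered by first appearance in E, because get_dummies otherwise returns them in sorted order.

```python
import pandas as pd

# Stand-in for pd.read_csv("filename.csv"): the five sample rows from the question.
df = pd.DataFrame({
    "A": ["a", "c", "e", "f", "g"],
    "B": [0.469112, -1.135632, 0.119209, -2.104569, -2.224569],
    "C": [-0.282863, 1.212112, -1.044236, -0.494929, -0.724929],
    "D": [-1.509059, -0.173215, -0.861849, 1.071804, 2.234213],
    "E": ["cat", "dog", "dog", "bird", "elephant"],
})

# One 0/1 indicator column per distinct value of E.
# dtype=int requests integer output instead of booleans.
dummies = pd.get_dummies(df["E"], dtype=int)

# Optional step (not in the original answer): order the indicator columns
# by first appearance in E instead of the sorted order get_dummies returns.
order = list(df["E"].dropna().unique())
dummies = dummies[order]

# Turn each 1 into -1 and attach the indicators to the remaining columns.
result = pd.concat([df.drop(columns="E"), dummies.mul(-1)], axis=1)
print(result)
```

If the column order does not matter, the reordering step can be dropped and the rest reduces to the one-liner above.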
