Create multiple columns from the list of values in another column

Views: 84 | Published: 2020/5/23 23:40:31 | Tags: python parsing pandas split dataframe


Problem description

I have a dataframe that looks like this:

Groupe       Id   MotherName   FatherName    Field
Advanced    56    Laure         James        English-107,Economics, Management, History, Philosophy
Middle      11    Ann           Nicolas      Web-development, Java-2
Advanced    6     Helen         Franc        Literature, English-2
Beginner    43    Laure         James        Mathematics, History, Philosophy, Literature
Middle      14    Naomi         Franc        Java-2, Management, English-107

For further work with the data, I need to split the Field column and replace it with multiple columns that look like this:

Id  English-107  Economics  Management  History  Web-development  Java-2  Literature  English-2  Mathematics  Philosophy
56            1          1           1        1                0       0           0          0            0           1
11            0          0           0        0                1       1           0          0            0           0

These columns could then be appended to the initial dataframe. I don't know how to do this, because basic splitting like

pd.DataFrame(df.Field.str.split(',',1).tolist())

doesn't solve my problem: I need columns based not just on the position in the list, but on every unique value in the list. Do you have any idea how I can approach this?

Recommended answer

You can use concat and str.get_dummies:

print(pd.concat([df['Id'], df['Field'].str.get_dummies(sep=",")], axis=1))
   Id  Economics  English-107  English-2  History  Java-2  Literature  \
0  56          1            1          0        1       0           0   
1  11          0            0          0        0       1           0   
2   6          0            0          1        0       0           1   
3  43          0            0          0        1       0           1   
4  14          0            1          0        0       1           0   

   Management  Mathematics  Philosophy  Web-development  
0           1            0           1                0  
1           0            0           0                1  
2           0            0           0                0  
3           0            1           1                0  
4           1            0           0                0
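For reference, here is a minimal self-contained sketch of the same idea that also appends the indicator columns to the original frame, as the question asks. It assumes a trimmed copy of the question's data (only Groupe, Id and Field). Because the Field values as printed mix "," and ", ", the whitespace around the commas is normalized first; that step is an addition, not part of the answer above. The names fields, dummies and result are illustrative.

```python
import pandas as pd

# Trimmed copy of the question's data (Groupe, Id and Field only).
df = pd.DataFrame({
    "Groupe": ["Advanced", "Middle", "Advanced", "Beginner", "Middle"],
    "Id": [56, 11, 6, 43, 14],
    "Field": [
        "English-107,Economics, Management, History, Philosophy",
        "Web-development, Java-2",
        "Literature, English-2",
        "Mathematics, History, Philosophy, Literature",
        "Java-2, Management, English-107",
    ],
})

# Assumption: separators may be "," or ", ". Normalize them so that
# " Management" and "Management" do not become two different columns.
fields = df["Field"].str.replace(r"\s*,\s*", ",", regex=True).str.strip()

# One 0/1 indicator column per unique value found in Field.
dummies = fields.str.get_dummies(sep=",")

# Replace the Field column with the indicator columns.
result = pd.concat([df.drop(columns="Field"), dummies], axis=1)
print(result)
```

Dropping Field before concatenating keeps the other original columns in place, so the indicators end up appended to the initial dataframe rather than returned as a separate table.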

If you need counts rather than 0/1 indicators, you can use pivot_table (I added one extra Economics string to the first row for testing):

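# Split into one value per cell, stack into a long Series,
# then count each value within each original row (index level 0).
df1 = (df['Field'].str.split(',', expand=True)
                  .stack()
                  .groupby(level=0)
                  .value_counts()
                  .reset_index())
# a = original row number, b = field value, c = count
df1.columns = ['a', 'b', 'c']
print(df1.pivot_table(index='a', columns='b', values='c').fillna(0))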
b  Economics  English-107  English-2  History  Java-2  Literature  Management  \
a                                                                               
0          2            1          0        1       0           0           1   
1          0            0          0        0       1           0           0   
2          0            0          1        0       0           1           0   
3          0            0          0        1       0           1           0   
4          0            1          0        0       1           0           1   

b  Mathematics  Philosophy  Web-development  
a                                            
0            0           1                0  
1            0           0                1  
2            0           0                0  
3            1           1                0  
4            0           0                0
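As an alternative sketch of the counting approach (not the method used in the answer above), the same table can be built with Series.explode and unstack, assuming pandas 0.25 or later. The sample frame below is a shortened version of the question's data with a duplicated Economics entry in the first row; the names values, counts and result are illustrative.

```python
import pandas as pd

# Shortened sample data; the first row contains "Economics" twice.
df = pd.DataFrame({
    "Id": [56, 11, 6],
    "Field": [
        "English-107,Economics, Management, Economics",
        "Web-development, Java-2",
        "Literature, English-2",
    ],
})

# Split each string into a list, put one value per row (the original
# row index is repeated), and strip stray spaces around the values.
values = df["Field"].str.split(",").explode().str.strip()

# Count each value per original row, then pivot the values into columns.
counts = values.groupby(level=0).value_counts().unstack(fill_value=0)

# Attach the counts to the Id column by row index.
result = df[["Id"]].join(counts)
print(result)
```

Using unstack(fill_value=0) fills the missing combinations with zeros directly, so no separate fillna step is needed.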
