Converting a Pandas Dataframe column into one hot labels

Question

I have a pandas dataframe similar to this:

  Col1   ABC
0  XYZ    A
1  XYZ    B
2  XYZ    C

By using the pandas get_dummies() function on column ABC, I can get this:

  Col1   A   B   C
0  XYZ   1   0   0
1  XYZ   0   1   0
2  XYZ   0   0   1
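
For reference, a minimal sketch of one way to produce the table above. The frame construction and the names `df` and `wide` are assumptions for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the question
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})

# One indicator column per distinct value of ABC, joined back onto Col1.
# dtype=int forces 0/1 integers (newer pandas versions default to booleans).
wide = df[["Col1"]].join(pd.get_dummies(df["ABC"], dtype=int))
print(wide)
```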

What I need instead is something like this, where the ABC column has a list / array datatype:

  Col1    ABC
0  XYZ    [1,0,0]
1  XYZ    [0,1,0]
2  XYZ    [0,0,1]

I tried using the get_dummies function and then combining all the resulting columns into the single column I wanted. I found a lot of answers explaining how to combine multiple columns as strings, such as "Combine two columns of text in dataframe in pandas/python", but I cannot figure out a way to combine them as a list.

The question "How do I one-hot encode one column of a pandas dataframe?" introduced the idea of using sklearn's OneHotEncoder, but I couldn't get it to work.

One more thing: all the answers I came across had solutions where the column names had to be typed manually when combining them. Is there a way to use DataFrame.iloc or a slicing mechanism to combine columns into a list?
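
The positional approach asked about here can be sketched as follows. This is only an illustration under assumptions: the example frame is rebuilt by hand, the names `wide`, `combined` and `result` are hypothetical, and the indicator columns are assumed to start at position 1.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the question
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})
wide = df[["Col1"]].join(pd.get_dummies(df["ABC"], dtype=int))

# Slice the indicator columns by position (everything from column 1 onward),
# so no column name has to be typed, then turn each row into a Python list.
combined = wide.iloc[:, 1:].to_numpy().tolist()

result = wide[["Col1"]].copy()
result["ABC"] = combined  # one list per row, stored as an object column
print(result)
```

Slicing by position keeps working when the set of categories changes, but it relies on the indicator columns sitting in a known position range.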

Recommended answer

Here is an alternative using pandas:

In [370]: df['new'] = df['ABC'].str.get_dummies().values.tolist()

In [371]: df
Out[371]:
  Col1 ABC        new
0  XYZ   A  [1, 0, 0]
1  XYZ   B  [0, 1, 0]
2  XYZ   C  [0, 0, 1]
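
The same one-liner as a self-contained script, for anyone who wants to run it outside an IPython session. The frame construction and the extra `labels` variable are assumptions added for illustration. Note that `Series.str.get_dummies()` splits each string on `"|"` by default, so this sketch assumes the labels contain no `"|"` character.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the question
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})

# Indicator frame with one column per distinct label
dummies = df["ABC"].str.get_dummies()

# Keep the column order so position i in each list can be mapped back to labels[i]
labels = dummies.columns.tolist()

# Convert the 2-D indicator array into one Python list per row
df["new"] = dummies.to_numpy().tolist()
print(labels)
print(df)
```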
