How to convert one-hot encodings into integers?

Question

I have a NumPy array data set with shape (100, 10). Each row is a one-hot encoding. I want to convert it into an ndarray with shape (100,), so that each row vector becomes an integer denoting the index of its nonzero entry. Is there a quick way of doing this using NumPy or TensorFlow?

Solution

As pointed out by Franck Dernoncourt, since a one-hot encoding has a single 1 and the rest are zeros, you can use argmax for this particular example. In general, if you want to find a value in a NumPy array, you'll probably want to consult numpy.where. See also this Stack Exchange question:

Is there a NumPy function to return the first index of something in an array?
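
The argmax approach mentioned above might look like the following minimal sketch. It reuses the small 3x4 example array from this answer rather than the (100, 10) data set in the question, and the variable name labels is only illustrative. Because each row contains a single 1, the position of the row maximum is the index of that 1.

```python
import numpy as np

# Small one-hot example; the question's real data would have shape (100, 10).
a = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1]])

# axis=1 takes the argmax along each row, giving one index per row.
labels = np.argmax(a, axis=1)

print(labels)        # [1 0 3]
print(labels.shape)  # (3,)
```

Note that argmax returns 0 for a row that is all zeros, so this relies on every row being a valid one-hot vector.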

Since a one-hot vector is a vector with all 0s and a single 1, you can do something like this:

>>> import numpy as np
>>> a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
>>> [np.where(r==1)[0][0] for r in a]
[1, 0, 3]

This just builds a list containing, for each row, the index of the 1. The [0][0] indexing just discards the structure returned by np.where (a tuple containing an array), which is more than you asked for.

For any particular row, you just want to index into a. For example, in the zeroth row the 1 is found at index 1.

>>> np.where(a[0]==1)[0][0]
1
