Vectorize Sequences explanation

Views: 135 Published: 2020/4/25 10:26:56 Tags: python python-3.x deep-learning keras


Question

While studying Deep Learning with Python, I can't understand the following simple piece of code, which encodes integer sequences into a binary matrix.

import numpy as np
from keras.datasets import imdb

def vectorize_sequences(sequences, dimension=10000):
    # Create an all-zero matrix of shape (len(sequences), dimension)
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.  # set specific indices of results[i] to 1s
    return results

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

x_train = vectorize_sequences(train_data)

And the output of x_train[0] is something like:

x_train[0]
array([0., 1., 1., ..., 0., 0., 0.])

Can someone shed some light on why there are 0.'s in the x_train array when only 1.'s are being assigned in each iteration i? I mean, shouldn't they all be 1's?

Recommended answer

The for loop here does not process the whole matrix. As you can see, it enumerates the elements of sequences, so it loops over only one dimension (the rows), and in each row it touches only the columns named by that element. Let's take a simple example:

t = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
r = np.zeros((len(t), 10))

Output of r:

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

Then we modify elements the same way your code does:

for i, s in enumerate(t):
    r[i, s] = 1.

array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])

You can see that the for loop modified only len(t) elements, namely those at index [i, s] (in this case (0, 1), (1, 2), (2, 3), and so on). Every other entry was never assigned, so it keeps its initial value of 0.
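The example above uses a single integer s per row. In the question's code each sequence is a list of integers, so results[i, sequence] = 1. sets several columns of row i at once and leaves the rest at 0. Below is a minimal sketch of that case; toy_sequences and the small dimension=10 are made-up stand-ins for the IMDB data, not part of the original post.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10):
    # One row per sequence, one column per possible index, all zeros to start.
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # `sequence` is a list of ints, so this assigns 1 to every listed
        # column of row i; columns not listed are never touched.
        results[i, sequence] = 1.
    return results

# Hypothetical toy data standing in for the IMDB reviews (assumption).
toy_sequences = [[1, 2, 5], [0, 2, 2, 9]]
x_toy = vectorize_sequences(toy_sequences)
print(x_toy)
```

Row 0 ends up with 1s only in columns 1, 2 and 5, and row 1 only in columns 0, 2 and 9 (the repeated 2 just sets the same cell twice). With dimension=10000 and reviews that contain far fewer distinct words than that, most entries of x_train[0] stay 0 for the same reason.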
