Trying to understand CNNs for NLP tutorial using Tensorflow


Problem description

I am following this tutorial in order to understand CNNs in NLP. There are a few things which I don't understand despite having the code in front of me. I hope somebody can clear a few things up here.

The first rather minor thing is the sequence_length parameter of the TextCNN object. In the example on GitHub this is just 56, which I think is the maximum length of all sentences in the training data. This means that self.input_x is a 56-dimensional vector which simply contains, for each word of a sentence, its index in the dictionary.
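
For concreteness, a minimal sketch of that padding step (the toy vocabulary and sentence below are made up for illustration, not taken from the tutorial's preprocessing):

vocab = {"<PAD>": 0, "i": 1, "like": 2, "this": 3, "movie": 4}  # hypothetical vocabulary
sequence_length = 10  # the tutorial's data yields 56 here

sentence = "i like this movie".split()
input_x = [vocab[w] for w in sentence]
input_x += [vocab["<PAD>"]] * (sequence_length - len(input_x))  # pad with index 0
print(input_x)  # [1, 2, 3, 4, 0, 0, 0, 0, 0, 0]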

This list goes into tf.nn.embedding_lookup(W, self.input_x), which will return a matrix consisting of the word embeddings of the words given by self.input_x. According to this answer this operation is similar to indexing with numpy:

import numpy as np

matrix = np.random.random([1024, 64])  # e.g. 1024 embedding vectors of size 64
ids = np.array([0, 5, 17, 33])         # row indices to look up
print(matrix[ids])                     # returns the four corresponding rows

But the problem here is that self.input_x most of the time looks like [1 3 44 25 64 0 0 0 0 0 0 0 .. 0 0]. So am I correct if I assume that tf.nn.embedding_lookup ignores the value 0?

Another thing I don't get is how tf.nn.embedding_lookup is working here:

# Embedding layer
with tf.device('/cpu:0'), tf.name_scope("embedding"):
    # W is the embedding matrix, initialized uniformly at random in [-1, 1)
    W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    # Look up the embedding vector for every word index in the batch
    self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
    # Add a channel dimension so the result can be fed to conv2d
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

I assume that self.embedded_chars is the matrix which is the actual input to the CNN, where each row represents the word embedding of one word. But how can tf.nn.embedding_lookup know about those indices given by self.input_x?

The last thing which I don't understand here is

W is our embedding matrix that we learn during training. We initialize it using a random uniform distribution. tf.nn.embedding_lookup creates the actual embedding operation. The result of the embedding operation is a 3-dimensional tensor of shape [None, sequence_length, embedding_size].
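
To make those shapes concrete, here is a minimal check (written for TensorFlow 2.x eager execution rather than the tutorial's TF 1.x graph code; the variable names only mirror the tutorial's):

import numpy as np
import tensorflow as tf

vocab_size, embedding_size, sequence_length = 1000, 128, 56
W = tf.Variable(tf.random.uniform([vocab_size, embedding_size], -1.0, 1.0))

# A batch of 2 padded index vectors, analogous to self.input_x
input_x = np.zeros([2, sequence_length], dtype=np.int32)

embedded_chars = tf.nn.embedding_lookup(W, input_x)
print(embedded_chars.shape)           # (2, 56, 128) -> [None, sequence_length, embedding_size]

embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)
print(embedded_chars_expanded.shape)  # (2, 56, 128, 1), the extra axis is the "channel" for conv2d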

Does this mean that we are actually learning the word embeddings here? The tutorial states at the beginning:

We will not use pre-trained word2vec vectors for our word embeddings. Instead, we learn embeddings from scratch.

But I don't see a line of code where this is actually happening. The code of the embedding layer does not look as if anything is being trained or learned - so where is it happening?

Recommended answer

Answer to question 1 (So am I correct if I assume that tf.nn.embedding_lookup ignores the value 0?):

The 0s in the input vector are indices of the 0th symbol in the vocabulary, which is the PAD symbol. I don't think they get ignored when the lookup is performed; the 0th row of the embedding matrix will be returned.
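
A small sketch of that behaviour (assuming TensorFlow 2.x eager execution; the tiny matrix is made up for illustration):

import tensorflow as tf

W = tf.Variable(tf.random.uniform([5, 3], -1.0, 1.0))  # tiny embedding matrix, vocabulary of 5
ids = tf.constant([1, 3, 0, 0, 0])                      # a "sentence" padded with 0s

looked_up = tf.nn.embedding_lookup(W, ids)
# The padding positions are not ignored: they simply receive W's 0th row
print(tf.reduce_all(tf.equal(looked_up[2], W[0])).numpy())  # True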

Answer to question 2 (But how can tf.nn.embedding_lookup know about those indices given by self.input_x?):

The size of the embedding matrix is [V * E], where V is the size of the vocabulary and E is the dimension of the embedding vector. The 0th row of the matrix is the embedding vector for the 0th element of the vocabulary, the 1st row is the embedding vector for the 1st element, and so on. From the input vector x we get the indices of the words in the vocabulary, and those indices are used to index the embedding matrix.

Answer to question 3 (Does this mean that we are actually learning the word embeddings here?):

Yes, we are actually learning the embedding matrix. In the embedding layer, in the line W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W"), W is the embedding matrix, and in TensorFlow trainable=True by default for a variable. So W will also be a learned parameter. To use a pre-trained model, set trainable=False.
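
A minimal sketch of the two alternatives (TF 2.x style; pretrained_vectors is a placeholder name for vectors you would load yourself, e.g. from word2vec):

import numpy as np
import tensorflow as tf

vocab_size, embedding_size = 1000, 128

# Learned from scratch: trainable=True is the default, so the optimizer updates W
W_learned = tf.Variable(
    tf.random.uniform([vocab_size, embedding_size], -1.0, 1.0), name="W")

# Pre-trained and frozen: initialize from loaded vectors and set trainable=False
pretrained_vectors = np.random.rand(vocab_size, embedding_size).astype(np.float32)  # stand-in data
W_fixed = tf.Variable(pretrained_vectors, trainable=False, name="W_pretrained")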

For a detailed explanation of the code you can follow this blog post: https://agarnitin86.github.io/blog/2016/12/23/text-classification-cnn
