How can I select a row from a SparseTensor in TensorFlow?


Question

Say I have two SparseTensors as follows:

[[1, 0, 0, 0],
 [2, 0, 0, 0],
 [1, 2, 0, 0]]

[[1.0, 0, 0, 0],
 [1.0, 0, 0, 0],
 [0.3, 0.7, 0, 0]]

I want to extract the first two rows from them. I need both the indices and the values of the non-zero entries as SparseTensors so that I can pass the result to tf.nn.embedding_lookup_sparse. How can I do this?

My application is this: I want to use word embeddings, which is quite straightforward in TensorFlow. But now I want to use sparse embeddings: common words have their own embeddings, while the embedding of a rare word is a sparse linear combination of the embeddings of common words. So I need two cookbooks that describe how the sparse embeddings are composed, one holding the indices and one holding the weights. In the example above, the cookbook says that the embedding of the first word consists of its own embedding with weight 1.0, and the second word is similar. For the last word, it says that the embedding is a linear combination of the embeddings of the first two words, with weights 0.3 and 0.7 respectively. I need to extract a row and then feed its indices and weights to tf.nn.embedding_lookup_sparse to obtain the final embeddings. How can I do that in TensorFlow?
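To make the cookbook idea concrete, here is a small plain-Python sketch of the weighted combination described above. It is only an illustration: the 2-dimensional embedding table and its values are made up, and the ids 1 and 2 are taken from the third row of the first matrix. As far as I know, this weighted sum corresponds to what tf.nn.embedding_lookup_sparse computes with combiner="sum"; because the weights here add up to 1.0, a "mean" combiner would give the same result.

```python
# Hypothetical embedding table for the common words (values are made up).
embeddings = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
}


def combine(ids, weights, table):
    """Return the weighted sum of the embeddings named by ids."""
    dim = len(next(iter(table.values())))
    out = [0.0] * dim
    for word_id, weight in zip(ids, weights):
        for k in range(dim):
            out[k] += weight * table[word_id][k]
    return out


# First row of the cookbooks: the word is its own embedding with weight 1.0.
common = combine([1], [1.0], embeddings)
# Third row of the cookbooks: 0.3 * embedding(1) + 0.7 * embedding(2).
rare = combine([1, 2], [0.3, 0.7], embeddings)
print(common, rare)
```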

Or do I need to work around it, i.e. preprocess my data and handle the cookbooks outside of TensorFlow?

Recommended answer

I checked in with one of the engineers here who knows more about this area, and here is what he passed on:

I am not sure whether we have an efficient implementation of this, but here is a not-so-optimal implementation using the dynamic_partition and gather ops.

import tensorflow as tf  # written against the TensorFlow 1.x graph-mode API (tf.Session)

def sparse_slice(indices, values, needed_row_ids):
  # Number of non-zero entries, i.e. rows of the [N, 2] indices matrix.
  num_rows = tf.shape(indices)[0]
  # 1 where an entry's row id equals the requested row, 0 otherwise.
  partitions = tf.cast(tf.equal(indices[:,0], needed_row_ids), tf.int32)
  # Partition 1 holds the positions of the matching entries.
  rows_to_gather = tf.dynamic_partition(tf.range(num_rows), partitions, 2)[1]
  slice_indices = tf.gather(indices, rows_to_gather)
  slice_values = tf.gather(values, rows_to_gather)
  return slice_indices, slice_values

with tf.Session().as_default():
  indices = tf.constant([[0,0], [1, 0], [2, 0], [2, 1]])
  values = tf.constant([1.0, 1.0, 0.3, 0.7], dtype=tf.float32)
  needed_row_ids = tf.constant([1])
  slice_indices, slice_values = sparse_slice(indices, values, needed_row_ids)
  print(slice_indices.eval(), slice_values.eval())
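For readers who want to see what the partition step does without running TensorFlow, the following plain-Python trace mirrors the single-row example. The helper `dynamic_partition` below is a stand-in written for this illustration (it only handles 1-D inputs) and is not the TensorFlow op itself.

```python
def dynamic_partition(data, partitions, num_partitions):
    """Stand-in for tf.dynamic_partition on 1-D inputs (illustration only)."""
    out = [[] for _ in range(num_partitions)]
    for item, p in zip(data, partitions):
        out[p].append(item)
    return out


indices = [[0, 0], [1, 0], [2, 0], [2, 1]]
values = [1.0, 1.0, 0.3, 0.7]
needed_row = 1

# 1 where the entry's row id matches the requested row, else 0.
partitions = [int(idx[0] == needed_row) for idx in indices]
# Partition 1 collects the positions of the matching entries.
rows_to_gather = dynamic_partition(range(len(indices)), partitions, 2)[1]
# "Gather" the matching index pairs and values by position.
slice_indices = [indices[i] for i in rows_to_gather]
slice_values = [values[i] for i in rows_to_gather]
print(slice_indices, slice_values)  # [[1, 0]] [1.0]
```

Only the entry at position 1 has row id 1, so a single index pair and a single value are returned. The row id is still the original one (1), not renumbered.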

Update:

The engineer also sent an example that handles multiple rows. Thanks for pointing that out!

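import tensorflow as tf  # written against the TensorFlow 1.x graph-mode API (tf.Session)

def sparse_slice(indices, values, needed_row_ids):
  # Lay the requested row ids out as a [1, K] row vector for broadcasting.
  needed_row_ids = tf.reshape(needed_row_ids, [1, -1])
  num_rows = tf.shape(indices)[0]
  # Compare each entry's row id ([N, 1]) against every requested id ([1, K]).
  matches = tf.equal(tf.reshape(indices[:,0], [-1, 1]), needed_row_ids)
  # 1 where the entry belongs to any requested row, 0 otherwise.
  partitions = tf.cast(tf.reduce_any(matches, 1), tf.int32)
  rows_to_gather = tf.dynamic_partition(tf.range(num_rows), partitions, 2)[1]
  slice_indices = tf.gather(indices, rows_to_gather)
  slice_values = tf.gather(values, rows_to_gather)
  return slice_indices, slice_values

with tf.Session().as_default():
  indices = tf.constant([[0,0], [1, 0], [2, 0], [2, 1]])
  values = tf.constant([1.0, 1.0, 0.3, 0.7], dtype=tf.float32)
  needed_row_ids = tf.constant([0, 2])
  # The source listing is cut off here; the call and print mirror the first example.
  slice_indices, slice_values = sparse_slice(indices, values, needed_row_ids)
  print(slice_indices.eval(), slice_values.eval())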
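Note that the sliced indices above keep their original row ids. If the selected rows are meant to form a new, smaller SparseTensor, the rows may need to be renumbered from 0 and given a matching dense shape. The answer does not cover this step, so the plain-Python sketch below is an assumption about what that post-processing could look like. The function name `select_rows` is hypothetical, and the sketch assumes a 2-D tensor and unique requested row ids.

```python
def select_rows(indices, values, dense_shape, needed_row_ids):
    """Keep entries in the needed rows and renumber those rows from 0.

    Hypothetical helper: assumes a 2-D sparse tensor in COO form and
    unique ids in needed_row_ids.
    """
    # Map each original row id to its position in the output.
    new_row = {row: i for i, row in enumerate(needed_row_ids)}
    kept = []
    for (row, col), value in zip(indices, values):
        if row in new_row:
            kept.append(([new_row[row], col], value))
    # Restore row-major order in case needed_row_ids was not ascending.
    kept.sort(key=lambda pair: pair[0])
    out_indices = [idx for idx, _ in kept]
    out_values = [value for _, value in kept]
    out_shape = [len(needed_row_ids), dense_shape[1]]
    return out_indices, out_values, out_shape


indices = [[0, 0], [1, 0], [2, 0], [2, 1]]
values = [1.0, 1.0, 0.3, 0.7]
result = select_rows(indices, values, [3, 4], [0, 2])
print(result)  # ([[0, 0], [1, 0], [1, 1]], [1.0, 0.3, 0.7], [2, 4])
```

The three returned lists have the shape of the indices, values and dense_shape arguments that a SparseTensor is built from, so the same renumbering could be applied to both the id tensor and the weight tensor before the lookup.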
