How TensorFlow deals with large Variables which cannot be stored in one box


Problem description

I want to train a DNN model on training data with more than one billion feature dimensions, so the weight matrix of the first layer will have shape (1,000,000,000, 512). At 4 bytes per float32 entry that is roughly 2 TB, far too large to be stored in one box.

Is there currently any way to deal with such a large variable, for example by partitioning the large weight matrix across multiple boxes?

Thanks Olivier and keveman. Let me add more detail about my problem. The examples are very sparse and all features are binary values: 0 or 1. The parameter weight looks like tf.Variable(tf.truncated_normal([1000000000, 512], stddev=0.1)).

The solution keveman gave seems reasonable; I will update with results after trying it.

Recommended answer

The answer to this question depends greatly on what operations you want to perform on the weight matrix.

The typical way to handle such a large number of features is to treat the 512-dimensional vector per feature as an embedding. If each example in the data set has only one of the 1 billion features, then you can use the tf.nn.embedding_lookup function to look up the embeddings for the features present in a mini-batch of examples. If each example has more than one feature, but presumably only a handful of them, then you can use tf.nn.embedding_lookup_sparse to look up the embeddings.
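
As a rough sketch of both lookup paths (TensorFlow 1.x API, with a toy vocabulary size so the variable fits in memory; the feature ids are invented for illustration):

    import tensorflow as tf  # TensorFlow 1.x, matching the API names in this answer

    num_features = 1000   # toy size; the question's real vocabulary is 1 billion
    embedding_dim = 512

    embeddings = tf.Variable(
        tf.truncated_normal([num_features, embedding_dim], stddev=0.1))

    # Case 1: each example carries exactly one feature id.
    feature_ids = tf.constant([3, 17, 42])          # a mini-batch of 3 examples
    acts = tf.nn.embedding_lookup(embeddings, feature_ids)   # shape [3, 512]

    # Case 2: each example carries a handful of binary features (a SparseTensor).
    sp_ids = tf.SparseTensor(
        indices=[[0, 0], [0, 1], [1, 0]],  # (example, position-within-example)
        values=[3, 17, 42],                # the feature ids that are "on"
        dense_shape=[2, 2])
    # sp_weights=None treats every present feature as weight 1 (binary features);
    # combiner="sum" adds up the embeddings of the features in each example.
    sparse_acts = tf.nn.embedding_lookup_sparse(
        embeddings, sp_ids, sp_weights=None, combiner="sum")  # shape [2, 512]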

In both these cases, your weight matrix can be distributed across many machines. That is, the params argument to both of these functions is a list of tensors. You would shard your large weight matrix and place the shards on different machines. Please look at tf.device and the primer on distributed execution (https://www.tensorflow.org/deploy/distributed) to understand how data and computation can be distributed across many machines.
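
A minimal sketch of such sharding, assuming a cluster with parameter-server tasks named /job:ps/task:0 through /job:ps/task:3 (the job/task names and shard sizes are assumptions for illustration):

    import tensorflow as tf  # TensorFlow 1.x

    embedding_dim = 512
    num_shards = 4
    rows_per_shard = 250  # toy size; in practice roughly 1e9 / num_shards rows

    # Pin each shard of the weight matrix to a different parameter server.
    shards = []
    for i in range(num_shards):
        with tf.device("/job:ps/task:%d" % i):
            shards.append(tf.Variable(
                tf.truncated_normal([rows_per_shard, embedding_dim], stddev=0.1),
                name="weights_shard_%d" % i))

    feature_ids = tf.constant([3, 999, 512])
    # Given a list of params, embedding_lookup routes each id to its shard; the
    # default "mod" strategy sends an id to shard id % num_shards.
    acts = tf.nn.embedding_lookup(shards, feature_ids, partition_strategy="mod")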

If you really want to perform some dense operation on the weight matrix, say, multiply the matrix by another matrix, that is still conceivable, although there is no ready-made recipe in TensorFlow for it. You would still shard the weight matrix across machines, but you would have to manually construct a sequence of matrix multiplies over the distributed blocks of the weight matrix and combine the results.
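
One hand-rolled way such a blocked multiply could look, computing y = x · W with W sharded into contiguous row blocks (the device names, shapes, and layout are assumptions, not a TensorFlow recipe):

    import tensorflow as tf  # TensorFlow 1.x

    embedding_dim = 512
    num_shards = 4
    rows_per_shard = 250   # toy size; the real matrix has ~1e9 rows in total

    # Row blocks of the big weight matrix W, one block per machine.
    w_blocks = []
    for i in range(num_shards):
        with tf.device("/job:ps/task:%d" % i):
            w_blocks.append(tf.Variable(
                tf.truncated_normal([rows_per_shard, embedding_dim], stddev=0.1)))

    # Dense input batch x with one column per feature; y = x @ W.
    x = tf.placeholder(tf.float32, [None, num_shards * rows_per_shard])
    # Split x column-wise so each slice lines up with one row block of W.
    x_slices = tf.split(x, num_shards, axis=1)

    # Compute each partial product next to its shard, then sum the partials.
    partials = []
    for i in range(num_shards):
        with tf.device("/job:ps/task:%d" % i):
            partials.append(tf.matmul(x_slices[i], w_blocks[i]))
    y = tf.add_n(partials)   # shape [batch, 512]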

