Optimizing TensorFlow for many small matrix-vector multiplications


Question

To build a capsule network training script, I need to compute many small matrix-vector multiplications. Each weight matrix is at most 20 by 20.
There are more than 900 weight matrices.

I'm curious whether tf.matmul or tf.linalg.matvec is the better option for this. Could anybody give me a hint on how to optimize the training script?

Recommended answer

Looking at the notebook that you are referring to, it seems you have the following parameters:

batch_size = 50
caps1_n_caps = 1152
caps1_n_dims = 8
caps2_n_caps = 10
caps2_n_dims = 16

And then you have a tensor w with shape (caps1_n_caps, caps2_n_caps, caps2_n_dims, caps1_n_dims) (in the notebook it has an initial dimension of size 1, which I am skipping) and another tensor caps1_output with shape (batch_size, caps1_n_caps, caps1_n_dims). You need to combine them to produce caps2_predicted with shape (batch_size, caps1_n_caps, caps2_n_caps, caps2_n_dims).

In the notebook they tile the tensors so that they can be multiplied with tf.linalg.matmul, but you can compute the same result without any tiling by using tf.einsum:

import tensorflow as tf

batch_size = 50
caps1_n_caps = 1152
caps1_n_dims = 8
caps2_n_caps = 10
caps2_n_dims = 16
# One (caps2_n_dims x caps1_n_dims) weight matrix per pair of capsules (i, j).
w = tf.zeros((caps1_n_caps, caps2_n_caps, caps2_n_dims, caps1_n_dims), dtype=tf.float32)
# One caps1_n_dims vector per batch element b and first-layer capsule i.
caps1_output = tf.zeros((batch_size, caps1_n_caps, caps1_n_dims), dtype=tf.float32)
# Contract over l (caps1_n_dims): each matrix w[i, j] multiplies the vector caps1_output[b, i].
caps2_predicted = tf.einsum('ijkl,bil->bijk', w, caps1_output)
print(caps2_predicted.shape)
# (50, 1152, 10, 16)
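To make the equivalence with tiling concrete, here is a minimal sketch that checks the index logic of the subscripts 'ijkl,bil->bijk' (a contraction over l, the caps1_n_dims axis) against an explicit tile-then-matmul computation. It uses NumPy only as a lightweight stand-in, on the assumption that np.einsum and tf.einsum interpret this subscript string the same way. The reduced sizes and the names w_tiled, u_tiled, via_tiling and via_einsum are chosen for illustration and do not come from the notebook.

```python
import numpy as np

# Small stand-in sizes (assumed for illustration) so the check runs quickly.
batch_size, caps1_n_caps, caps1_n_dims = 3, 6, 8
caps2_n_caps, caps2_n_dims = 10, 16

rng = np.random.default_rng(0)
w = rng.standard_normal((caps1_n_caps, caps2_n_caps, caps2_n_dims, caps1_n_dims))
caps1_output = rng.standard_normal((batch_size, caps1_n_caps, caps1_n_dims))

# Tiling approach: repeat w along a new batch axis -> shape (b, i, j, k, l).
w_tiled = np.tile(w[np.newaxis], (batch_size, 1, 1, 1, 1))
# Repeat each input vector for every second-layer capsule, as a column
# vector -> shape (b, i, j, l, 1).
u_tiled = np.tile(
    caps1_output[:, :, np.newaxis, :, np.newaxis],
    (1, 1, caps2_n_caps, 1, 1),
)
# Batched matrix multiplication, then drop the trailing size-1 axis.
via_tiling = np.matmul(w_tiled, u_tiled)[..., 0]

# Einsum approach: no tiling, contraction over l only.
via_einsum = np.einsum('ijkl,bil->bijk', w, caps1_output)

same_values = np.allclose(via_tiling, via_einsum)
print(via_einsum.shape, same_values)
```

The einsum form never materializes the tiled copies of w, which is where the memory saving comes from when batch_size and caps1_n_caps are large.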


I'm not sure if I have understood exactly what you want, but you say you want to compute something like:

$\hat{u}_{ij} = W_{ij} \, u_i$

For a collection of matrices W and vectors u. Assuming you have 900 matrices and 900 vectors, where each matrix is 20×20 and each vector has size 20, you can represent them as two tensors: ws, with shape (900, 20, 20), and us, with shape (900, 20). The result us_hat, with shape (900, 20), can then be computed as:

us_hat = tf.linalg.matvec(ws, us)
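For a batch of 900 independent matrix-vector products, the following sketch compares two batched formulations against a plain per-matrix loop. It again uses NumPy as a stand-in for checking the index logic; the assumption is that tf.einsum with the same subscripts, or tf.linalg.matmul on a vector expanded to a column, would behave the same way in TensorFlow. The names looped, via_einsum and via_matmul are illustrative.

```python
import numpy as np

n, d = 900, 20
rng = np.random.default_rng(0)
ws = rng.standard_normal((n, d, d))  # 900 matrices of size 20 x 20
us = rng.standard_normal((n, d))     # 900 vectors of size 20

# Reference: one matrix-vector product at a time.
looped = np.stack([ws[i] @ us[i] for i in range(n)])

# Batched form 1: einsum, contracting the last axis of each matrix
# with the matching vector.
via_einsum = np.einsum('nij,nj->ni', ws, us)

# Batched form 2: treat each vector as a (d, 1) column, use batched
# matmul, then drop the trailing size-1 axis.
via_matmul = np.matmul(ws, us[..., np.newaxis])[..., 0]

all_agree = np.allclose(looped, via_einsum) and np.allclose(looped, via_matmul)
print(via_einsum.shape, all_agree)
```

Note that an elementwise product such as ws * us[..., np.newaxis] keeps the shape (900, 20, 20) and does not sum over the last axis, so it is not a matrix-vector product.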
