Initializing a symmetric Theano dmatrix from its upper triangle

Question

I'm trying to fit a Theano model that is parametrized in part by a symmetric matrix A. In order to enforce the symmetry of A, I want to be able to construct A by passing in just the values in the upper triangle.

The equivalent numpy code might look something like this:

import numpy as np

def make_symmetric(p, n):
    # Fill the upper triangle, then mirror it through the transposed view.
    A = np.empty((n, n), p.dtype)
    A[np.triu_indices(n)] = p
    A.T[np.triu_indices(n)] = p
    return A

# output matrix will be (n, n)
n = 4

# parameter vector
P = np.arange(n * (n + 1) // 2, dtype=float)

print(make_symmetric(P, n))
# [[ 0.  1.  2.  3.]
#  [ 1.  4.  5.  6.]
#  [ 2.  5.  7.  8.]
#  [ 3.  6.  8.  9.]]

However, since symbolic tensor variables don't support item assignment, I'm struggling to find a way to do this in Theano.

The closest thing I could find is theano.tensor.diag, which allows me to construct a symbolic matrix from its diagonal:

import theano
from theano import tensor as te

P = te.dvector('P')
D = te.diag(P)
get_D = theano.function([P], D)

print(get_D(np.arange(1, 5)))
# [[ 1.  0.  0.  0.]
#  [ 0.  2.  0.  0.]
#  [ 0.  0.  3.  0.]
#  [ 0.  0.  0.  4.]]

Whilst there is also a theano.tensor.triu function, it cannot be used to construct a matrix from its upper triangle; instead it returns a copy of an array with the elements below the diagonal zeroed.

Is there any way to construct a Theano symbolic matrix from its upper triangle?

Recommended answer

You could use theano.tensor.triu, add the result to its transpose, then subtract the diagonal (which would otherwise be counted twice).

Copy-and-paste code:

import numpy as np
import theano
import theano.tensor as T
theano.config.floatX = 'float32'

mat = T.fmatrix()
sym1 = T.triu(mat) + T.triu(mat).T
diag = T.diag(T.diagonal(mat))
sym2 = sym1 - diag

f_sym1 = theano.function([mat], sym1)
f_sym2 = theano.function([mat], sym2)

m = np.arange(9).reshape(3, 3).astype(np.float32)

print(m)
# [[ 0.  1.  2.]
#  [ 3.  4.  5.]
#  [ 6.  7.  8.]]
print(f_sym1(m))
# [[  0.   1.   2.]
#  [  1.   8.   5.]
#  [  2.   5.  16.]]
print(f_sym2(m))
# [[ 0.  1.  2.]
#  [ 1.  4.  5.]
#  [ 2.  5.  8.]]

Does this help? This approach would require a full matrix to be passed, but would ignore everything below the diagonal and symmetrize using the upper triangle.

We can also take a look at the derivative of this function. In order not to deal with a multidimensional output, we can, for example, look at the gradient of the sum of the matrix entries:

sum_grad = T.grad(cost=sym2.sum(), wrt=mat)
f_sum_grad = theano.function([mat], sum_grad)

print(f_sum_grad(m))
# [[ 1.  2.  2.]
#  [ 0.  1.  2.]
#  [ 0.  0.  1.]]

This reflects the fact that the entries strictly above the diagonal appear twice in the symmetrized matrix, the diagonal entries appear once, and the entries below the diagonal are ignored.
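
The same gradient pattern can be checked without Theano. The following is a minimal NumPy sketch, not part of the original answer: `symmetrized_sum` and `numerical_gradient` are hypothetical helper names, and the gradient is approximated by central finite differences rather than computed symbolically.

```python
import numpy as np

def symmetrized_sum(m):
    """Sum of triu(m) + triu(m).T - diag(m), mirroring sym2.sum()."""
    upper = np.triu(m)
    return (upper + upper.T - np.diag(np.diag(m))).sum()

def numerical_gradient(f, m, eps=1e-4):
    """Central finite-difference gradient of the scalar f w.r.t. each entry of m."""
    grad = np.zeros_like(m)
    for idx in np.ndindex(*m.shape):
        step = np.zeros_like(m)
        step[idx] = eps
        grad[idx] = (f(m + step) - f(m - step)) / (2 * eps)
    return grad

m = np.arange(9, dtype=np.float64).reshape(3, 3)
grad = numerical_gradient(symmetrized_sum, m)

# Pattern described above: 2 strictly above the diagonal, 1 on it, 0 below.
expected = 2 * np.triu(np.ones((3, 3)), k=1) + np.eye(3)
print(np.round(grad, 6))
```

Because the symmetrization is linear in the entries of the input, the finite-difference estimate should agree with `expected` up to floating-point rounding.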

Update: You can do normal indexing:

n = 4
num_triu_entries = n * (n + 1) // 2

triu_index_matrix = np.zeros([n, n], dtype=int)
triu_index_matrix[np.triu_indices(n)] = np.arange(num_triu_entries)
triu_index_matrix[np.triu_indices(n)[::-1]] = np.arange(num_triu_entries)

triu_vec = T.fvector()
triu_mat = triu_vec[triu_index_matrix]

f_triu_mat = theano.function([triu_vec], triu_mat)

print(f_triu_mat(np.arange(1, num_triu_entries + 1).astype(np.float32)))

# [[  1.   2.   3.   4.]
#  [  2.   5.   6.   7.]
#  [  3.   6.   8.   9.]
#  [  4.   7.   9.  10.]]
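
As a sanity check of the index-matrix idea, here is a minimal plain-NumPy sketch (no Theano involved; `build_triu_index_matrix` is a hypothetical helper name, not something from the answer). It confirms that the lookup yields a symmetric matrix and that reading the upper triangle back recovers the packed vector.

```python
import numpy as np

def build_triu_index_matrix(n):
    """Return an (n, n) integer matrix in which entries (i, j) and (j, i)
    both hold the position of (i, j), i <= j, in the packed upper triangle."""
    idx = np.zeros((n, n), dtype=int)
    rows, cols = np.triu_indices(n)
    positions = np.arange(n * (n + 1) // 2)
    idx[rows, cols] = positions
    idx[cols, rows] = positions
    return idx

n = 4
packed = np.arange(1, n * (n + 1) // 2 + 1, dtype=np.float32)

# Integer-array indexing of the packed vector builds the full matrix.
sym = packed[build_triu_index_matrix(n)]

# Round trip: the upper triangle of the result is the packed vector again.
recovered = sym[np.triu_indices(n)]
print(sym)
```

Since the index matrix depends only on `n`, it can be built once and reused for every parameter vector of the same size.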


Update: To do all of this dynamically, one way is to write a symbolic version of triu_index_matrix. This can be done with some shuffling of aranges, though I may be overcomplicating it.

n = T.iscalar()
# Use floor division so the index expressions stay integer-typed.
n_triu_entries = (n * (n + 1)) // 2
r = T.arange(n)

tmp_mat = r[np.newaxis, :] + (n_triu_entries - n - (r * (r + 1)) // 2)[::-1, np.newaxis]
triu_index_matrix = T.triu(tmp_mat) + T.triu(tmp_mat).T - T.diag(T.diagonal(tmp_mat))

triu_vec = T.fvector()
sym_matrix = triu_vec[triu_index_matrix]

f_triu_index_matrix = theano.function([n], triu_index_matrix)
f_dynamic_sym_matrix = theano.function([triu_vec, n], sym_matrix)

print(f_triu_index_matrix(5))
# [[ 0  1  2  3  4]
#  [ 1  5  6  7  8]
#  [ 2  6  9 10 11]
#  [ 3  7 10 12 13]
#  [ 4  8 11 13 14]]
print(f_dynamic_sym_matrix(np.arange(1., 16.).astype(np.float32), 5))
# [[  1.   2.   3.   4.   5.]
#  [  2.   6.   7.   8.   9.]
#  [  3.   7.  10.  11.  12.]
#  [  4.   8.  11.  13.  14.]
#  [  5.   9.  12.  14.  15.]]
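
To see why the arange shuffling works, the same closed-form expression can be written in plain NumPy and compared against the np.triu_indices construction. This is a sketch under that assumption only; the names `closed_form_index_matrix` and `reference_index_matrix` are hypothetical and used purely for illustration.

```python
import numpy as np

def closed_form_index_matrix(n):
    """NumPy transcription of the symbolic expression for the index matrix."""
    r = np.arange(n)
    n_entries = n * (n + 1) // 2
    # offsets[i] + j is the packed position of entry (i, j) for j >= i.
    offsets = (n_entries - n - r * (r + 1) // 2)[::-1]
    tmp = r[np.newaxis, :] + offsets[:, np.newaxis]
    upper = np.triu(tmp)
    # Mirror the upper triangle and remove the doubled diagonal.
    return upper + upper.T - np.diag(np.diag(tmp))

def reference_index_matrix(n):
    """Index matrix built directly from np.triu_indices."""
    idx = np.zeros((n, n), dtype=int)
    rows, cols = np.triu_indices(n)
    positions = np.arange(n * (n + 1) // 2)
    idx[rows, cols] = positions
    idx[cols, rows] = positions
    return idx

# Compare both constructions for a few sizes.
matches = [
    np.array_equal(closed_form_index_matrix(k), reference_index_matrix(k))
    for k in range(1, 8)
]
print(all(matches))
```

In row i the packed upper triangle starts at position i*n - i*(i+1)/2 + i, which is exactly offsets[i] + i, so adding the column index j to the row offset gives the packed position of each entry on or above the diagonal.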
