random.multivariate_normal on a dask array?


Question

I've been struggling to find a way to make this calculation work in a dask workflow.

I have code that uses the np.random.multivariate_normal function, and while many functions of this kind are available on dask arrays, this one apparently is not. So I attempted to create my own, based on an example provided in the dask documentation.

Here is my attempt, which raises errors that I am having difficulty understanding. I have also provided random input variables to make it easy to reproduce:

import numpy as np
from dask.distributed import Client
import dask.array as da

def mvn(mu, sigma, n, blocksize):
    chunks = ((blocksize,) * (n // blocksize),
              (blocksize,) * (n // blocksize))

    name = 'mvn'   # unique identifier

    dsk = {(name, i, j): (np.random.multivariate_normal(mu,sigma, blocksize))
                         if i == j else
                         (np.zeros, (blocksize, blocksize))
             for i in range(n // blocksize)
             for j in range(n // blocksize)}

    dtype = np.random.multivariate_normal(0).dtype  # take dtype default from numpy

    return da.Array(dsk, name, chunks, dtype)

n = 10000
A = da.random.normal(0, 1, size=(n,n), chunks=(1000, 1000))
sigma = da.dot(A,A.transpose())
mu = 4.0*da.ones(n, chunks = 1000)
R =  da.numpy.random.mvn(mu, sigma, n, chunks=(100))

Any suggestions, or am I so far off the mark here that I should abandon all hope? Thanks!

Recommended answer

If you have a cluster to run this on, you can use my answer from this post, copied here for reference:

A workaround for now is to use a Cholesky decomposition. Note that any positive-definite covariance matrix C can be expressed as C = G*G', where G is lower triangular. It then follows that x = G*y is correlated as specified in C if y is standard normal, since cov(x) = G*cov(y)*G' = G*G' = C (see this excellent post on Mathematics Stack Exchange). In code:

Numpy

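import numpy as np

n_dim = 4
size = 100000
# build a random symmetric covariance matrix
A = np.random.randn(n_dim, n_dim)
covm = A.dot(A.T)

# draw correlated samples directly with NumPy
x = np.random.multivariate_normal(mean=np.zeros(len(covm)), cov=covm, size=size)

# verify that the sample covariance is close to covm
np.cov(x, rowvar=False)
covm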

Dask

import dask.array as da

# create covariance matrix (n_dim and size as defined in the NumPy snippet above)
A = da.random.standard_normal(size=(n_dim, n_dim), chunks=(2, 2))
covm = A.dot(A.T)

# get the lower-triangular Cholesky factor
L = da.linalg.cholesky(covm, lower=True)

# draw standard normal samples
sn = da.random.standard_normal(size=(size, n_dim), chunks=(100, 100))

# correct for correlation; x has shape (n_dim, size)
x = L.dot(sn.T)
x.shape

# verify: the two results should be close
covm.compute()
da.cov(x, rowvar=True).compute()
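
The snippets above draw zero-mean samples, while the question uses a non-zero mean (mu = 4.0). As a minimal sketch, the same Cholesky approach could be wrapped in a helper that also adds the mean. The function name mvn_cholesky, its parameters, and the example covariance matrix are all hypothetical and not part of dask; the sketch assumes the covariance matrix is positive definite and stored with square chunks.

```python
import numpy as np
import dask.array as da


def mvn_cholesky(mean, cov, size, chunks):
    """Hypothetical helper (not a dask API): draw `size` samples from
    N(mean, cov) as a dask array of shape (size, n_dim)."""
    n_dim = cov.shape[0]
    # Lower-triangular factor with cov = L @ L.T.
    # Assumes cov is positive definite and has square chunks.
    L = da.linalg.cholesky(cov, lower=True)
    # Independent standard-normal draws, one row per sample.
    z = da.random.standard_normal(size=(size, n_dim), chunks=(chunks, n_dim))
    # Row-wise form of x = mean + L @ z; the mean broadcasts over the rows.
    return da.asarray(mean) + z.dot(L.T)


# Illustrative inputs: a fixed, strictly diagonally dominant (hence
# positive-definite) covariance matrix and a constant mean of 4.0.
cov_np = np.array([[2.0, 0.5, 0.0, 0.0],
                   [0.5, 2.0, 0.5, 0.0],
                   [0.0, 0.5, 2.0, 0.5],
                   [0.0, 0.0, 0.5, 2.0]])
cov = da.from_array(cov_np, chunks=(2, 2))
mu = np.full(4, 4.0)

samples = mvn_cholesky(mu, cov, size=200_000, chunks=50_000)

# Both statistics should be close to the requested mean and covariance.
sample_mean = samples.mean(axis=0).compute()
sample_cov = da.cov(samples, rowvar=False).compute()
```

Here the samples are laid out one per row, so the result can be chunked along the sample axis and compared against NumPy's output with rowvar=False.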
