从任意多元函数有效采样 [英] Efficiently sample from arbitrary multivariate function

查看：114 发布时间：2020/10/22 19:20:30 python distribution sampling

本文介绍了从任意多元函数有效采样的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想从Python中的任意函数中采样。

在

以下函数以提案的多元正态变量实现MH

  def metropolis_hastings（target_density，size = 500000）：
 burnin_size = 10000 
 size + = burnin_size 
 x0 = np.array（[[0，0]]）
 xt = x0 
样本= [] 
对于范围（大小）中的i：
 xt_candidate = np.array（[np.random.multivariate_normal（xt [0]，np.eye（2））]）
 accept_prob =（target_density（xt_candidate））/（target_density（xt））
如果np.random.uniform（0，1）< accept_prob：
 xt = xt_候选
个samples.append（xt）
个samples = np.array（samples [burnin_size：]）
个samples = np.reshape（samples，[samples。 shape [0]，2]）
返回样本

运行MH并绘制样本

  samples = metropolis_hastings（density1）
 plt.hexbin（samples [：，0]，samples [：，1]， cmap ='rainbow'）
 plt.gca（）。set_aspect（'equal'，Adjustable ='box'）
 plt.xlim（[-3，3]）
 plt.ylim （[-3，3]）
 plt.show（）

签出我的这个仓库以获取详细信息。

I would like to sample from an arbitrary function in Python.

In Fast arbitrary distribution random sampling it was stated that one could use inverse transform sampling and in Pythonic way to select list elements with different probability it was mentioned that one should use inverse cumulative distribution function. As far as I undestand those methods only work the univariate case. My function is multivariate though and too complex that any of the suggestions in https://stackoverflow.com/a/48676209/4533188 would apply.

Prinliminaries: My function is based on Rosenbrock's banana function, which value we can get the value of the function with

import scipy.optimize
scipy.optimize.rosen([1.1,1.2])

(here [1.1,1.2] is the input vector) from scipy, see https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.optimize.rosen.html.

Here is what I came up with: I make a grid over my area of interest and calculate for each point the function value. Then I sort the resulting data frame by the value and make a cumulative sum. This way we get "slots" which have different sizes - points which have large function values have larger slots than points with small function values. Now we generate random values and look into which slot the random value falls into. The row of the data frame is our final sample.

Here is the code:

import scipy.optimize
from itertools import product
from dfply import *

nb_of_samples = 50
nb_of_grid_points = 30

rosen_data = pd.DataFrame(array([item for item in product(*[linspace(fm[0], fm[1], nb_of_grid_points) for fm in zip([-2,-2], [2,2])])]), columns=['x','y'])
rosen_data['z'] = [np.exp(-scipy.optimize.rosen(row)**2/500) for index, row in rosen_data.iterrows()]
rosen_data = rosen_data >> \
    arrange(X.z) >> \
    mutate(z_upperbound=cumsum(X.z)) >> \
    mutate(z_upperbound=X.z_upperbound/np.max(X.z_upperbound))
value = np.random.sample(1)[0]

def get_rosen_sample(value):
    return (rosen_data >> mask(X.z_upperbound >= value) >> select(X.x, X.y)).iloc[0,]

values = pd.DataFrame([get_rosen_sample(s) for s in np.random.sample(nb_of_samples)])

This works well, but I don't think it is very efficient. What would be a more efficient solution to my problem?

I read that Markov chain Monte Carlo might helping, but here I am in over my head for now on how to do this in Python.

解决方案

I was in a similar situation, so, I implemented a rudimentary version of Metropolis-Hastings (which is an MCMC method) to sample from a bivariate distribution. An example follows.

Say, we want to sample from the following denisty:

def density1(z):
    z = np.reshape(z, [z.shape[0], 2])
    z1, z2 = z[:, 0], z[:, 1]
    norm = np.sqrt(z1 ** 2 + z2 ** 2)
    exp1 = np.exp(-0.5 * ((z1 - 2) / 0.8) ** 2)
    exp2 = np.exp(-0.5 * ((z1 + 2) / 0.8) ** 2)
    u = 0.5 * ((norm - 4) / 0.4) ** 2 - np.log(exp1 + exp2)
    return np.exp(-u)

which looks like this

The following function implements MH with multivariate normal as the proposal

def metropolis_hastings(target_density, size=500000):
    burnin_size = 10000
    size += burnin_size
    x0 = np.array([[0, 0]])
    xt = x0
    samples = []
    for i in range(size):
        xt_candidate = np.array([np.random.multivariate_normal(xt[0], np.eye(2))])
        accept_prob = (target_density(xt_candidate))/(target_density(xt))
        if np.random.uniform(0, 1) < accept_prob:
            xt = xt_candidate
        samples.append(xt)
    samples = np.array(samples[burnin_size:])
    samples = np.reshape(samples, [samples.shape[0], 2])
    return samples

Run MH and plot samples

samples = metropolis_hastings(density1)
plt.hexbin(samples[:,0], samples[:,1], cmap='rainbow')
plt.gca().set_aspect('equal', adjustable='box')
plt.xlim([-3, 3])
plt.ylim([-3, 3])
plt.show()

Check out this repo of mine for details.

这篇关于从任意多元函数有效采样的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

从任意多元函数有效采样 [英] Efficiently sample from arbitrary multivariate function

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录关闭

从任意多元函数有效采样 [英] Efficiently sample from arbitrary multivariate function

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭