Bayesian Probabilistic Matrix Factorization (BPMF) with PyMC3: PositiveDefiniteError using `NUTS`


Problem description

I've implemented the Bayesian Probabilistic Matrix Factorization (BPMF) algorithm using pymc3 in Python. I also implemented its precursor, Probabilistic Matrix Factorization (PMF). See my previous question for a reference to the data used here.

I'm having trouble drawing MCMC samples using the NUTS sampler. I initialize the model parameters using the MAP estimate from PMF, and the hyperparameters using Gaussian random draws scattered around 0. However, I get a PositiveDefiniteError when setting up the step object for the sampler. I've verified that the MAP estimate from PMF is reasonable, so I suspect the problem lies in how the hyperparameters are initialized. Here is the PMF model:

import pymc3 as pm
import numpy as np
import pandas as pd
import theano
import scipy as sp
import scipy.optimize  # ensures sp.optimize is loaded before it is used below

data = pd.read_csv('jester-dense-subset-100x20.csv')    
n, m = data.shape
test_size = m // 10  # integer division so the result can be used as a slice index
train_size = m - test_size

train = data.copy()
train.iloc[:, train_size:] = np.nan  # remove test set data
train[train.isnull()] = train.mean().mean()  # mean value imputation
train = train.values

test = data.copy()
test.iloc[:, :train_size] = np.nan  # remove train set data
test = test.values

# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2  # fixed precision for likelihood function
dim = 10  # dimensionality

# Specify the model.
with pm.Model() as pmf:
    pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
                        shape=(n, dim), testval=np.random.randn(n, dim)*.01)
    pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
                        shape=(m, dim), testval=np.random.randn(m, dim)*.01)
    pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
                      tau=alpha, observed=train)

    # Find mode of posterior using optimization
    start = pm.find_MAP(fmin=sp.optimize.fmin_powell)

Here is the BPMF model:

import logging

n, m = data.shape
dim = 10  # dimensionality
beta_0 = 1  # scaling factor for lambdas; unclear on its use
alpha = np.ones((n,m)) * 2  # fixed precision for likelihood function

logging.info('building the BPMF model')
std = .05  # how much noise to use for model initialization
with pm.Model() as bpmf:
    # Specify user feature matrix
    lambda_u = pm.Wishart(
        'lambda_u', n=dim, V=np.eye(dim), shape=(dim, dim),
        testval=np.random.randn(dim, dim) * std)
    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
        testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u, shape=(n, dim),
        testval=np.random.randn(n, dim) * std)

    # Specify item feature matrix
    lambda_v = pm.Wishart(
        'lambda_v', n=dim, V=np.eye(dim), shape=(dim, dim),
        testval=np.random.randn(dim, dim) * std)
    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
        testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v, shape=(m, dim),
        testval=np.random.randn(m, dim) * std)

    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=theano.tensor.dot(U, V.T), tau=alpha,
        observed=train)

# `start` is the start dictionary obtained from running find_MAP for PMF.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]

with bpmf:
    step = pm.NUTS(scaling=start)

At the last line, I get the following error:

PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [   0    2   ...  2206  2207  ]

As I understand it, I can't use find_MAP with models that have hyperpriors, such as BPMF. This is why I'm attempting to initialize with the MAP values from PMF, which uses point estimates for the parameters on U and V rather than parameterized hyperpriors.
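
One check that can be run independently of PyMC3 is whether the test values given to the Wishart variables are valid precision matrices at all. The sketch below is only an illustration, not a diagnosis of the error: it uses plain NumPy, the helper name `is_positive_definite` is made up for this example, and the fixed seed is arbitrary. It shows that a matrix built as `np.random.randn(dim, dim) * std` is not symmetric positive definite, while the identity (the same scale matrix `V` passed to the Wishart in the question) is.

```python
import numpy as np


def is_positive_definite(a):
    """Return True if `a` is symmetric and admits a Cholesky factorization.

    Hypothetical helper for illustration; it is not part of PyMC3.
    """
    a = np.asarray(a, dtype=float)
    # np.linalg.cholesky only reads one triangle, so check symmetry explicitly.
    if not np.allclose(a, a.T):
        return False
    try:
        np.linalg.cholesky(a)
        return True
    except np.linalg.LinAlgError:
        return False


dim = 10
std = 0.05
rng = np.random.RandomState(0)  # arbitrary seed, only for reproducibility

# Same recipe as the Wishart testval in the question: unconstrained Gaussian noise.
noisy_testval = rng.randn(dim, dim) * std

# A simple candidate that is a valid precision matrix: the identity.
identity_testval = np.eye(dim)

print(is_positive_definite(noisy_testval))
print(is_positive_definite(identity_testval))
```

Whether a valid test value alone makes `pm.NUTS(scaling=start)` succeed is not established here; the check only rules out one obviously invalid input.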

Recommended answer

Unfortunately, the Wishart distribution in PyMC3 is non-functional. I recently added a warning here: https://github.com/pymc-devs/pymc3/commit/642f63973ec9f807fb6e55a0fc4b31bdfa1f261e

See here for more discussion of this tricky distribution: https://github.com/pymc-devs/pymc3/issues/538

You could confirm that this is the source of the error by fixing the covariance matrix. If it is, I'd try using the LKJ prior distribution: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/LKJ_correlation.py
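
For context, an LKJ prior is placed on a correlation matrix rather than on the covariance matrix itself, so the covariance is assembled from a correlation matrix and a separate vector of per-dimension standard deviations. The sketch below shows only that assembly step in plain NumPy, with made-up values for `sigma` and `corr` and a hypothetical helper name; it does not use the PyMC3 API, for which the linked example is the reference.

```python
import numpy as np


def covariance_from_correlation(sigma, corr):
    """Build Sigma = diag(sigma) @ corr @ diag(sigma).

    Hypothetical helper for illustration; it is not part of PyMC3.
    """
    sigma = np.asarray(sigma, dtype=float)
    corr = np.asarray(corr, dtype=float)
    # Element-wise form of diag(sigma) @ corr @ diag(sigma).
    return corr * np.outer(sigma, sigma)


# Made-up standard deviations and a made-up valid correlation matrix.
sigma = np.array([1.0, 2.0, 0.5])
corr = np.array([[1.0, 0.3, 0.0],
                 [0.3, 1.0, -0.2],
                 [0.0, -0.2, 1.0]])

cov = covariance_from_correlation(sigma, corr)

# The models in the question parameterize MvNormal by precision (`tau`),
# which is the inverse of the covariance.
tau = np.linalg.inv(cov)

# Raises LinAlgError if cov is not positive definite.
chol = np.linalg.cholesky(cov)
```

Because the correlation matrix and the standard deviations are handled separately, each can be constrained on its own, which is what makes this parameterization easier to keep positive definite than an unconstrained matrix.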
