Markov chain stationary distributions with scipy.sparse?


Problem description

I have a Markov chain given as a large sparse scipy matrix A. (I've constructed the matrix in scipy.sparse.dok_matrix format, but converting to another format or constructing it as a csc_matrix is fine.)

I'd like to know any stationary distribution p of this matrix, which is an eigenvector for the eigenvalue 1. All entries in this eigenvector should be positive and add up to 1, in order to represent a probability distribution.

This means I want any solution of the system (A-I) p = 0, p.sum() = 1 (where I = scipy.sparse.eye(*A.shape) is the identity matrix), but (A-I) will not be of full rank, and the whole system may even be under-determined. In addition, eigenvectors with negative entries can be generated, which cannot be normalized into valid probability distributions. Preventing negative entries in p would be nice.
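
To make that system concrete, here is a minimal dense sketch: the 2-state matrix is made up for illustration and is taken column-stochastic so that (A-I) p = 0 reads exactly as above (for a row-stochastic matrix one would use A.T). It stacks the normalization row onto (A-I) and solves the over-determined system by least squares.

import numpy as np

# Toy column-stochastic transition matrix (illustration only).
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
n = A.shape[0]

# Stack the normalization row sum(p) = 1 onto (A - I) p = 0.
M = np.vstack([A - np.eye(n), np.ones((1, n))])
b = np.zeros(n + 1)
b[-1] = 1.0

p, *_ = np.linalg.lstsq(M, b, rcond=None)
print(p)             # ~[2/3, 1/3] for this toy chain
print(A.dot(p) - p)  # ~[0, 0]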

  • Using scipy.sparse.linalg.eigen.eigs is not a solution: it does not permit specifying the additive constraint. (Normalizing does not help if the eigenvector contains negative entries.) Furthermore, it deviates quite a bit from the true results and sometimes has trouble converging, behaving worse than scipy.linalg.eig. (I also used shift-invert mode, which improves finding the kind of eigenvalue I want, but not its quality; without it the call is even more overkill, since I am only interested in one particular eigenvalue, 1.) A rough sketch of such a call is shown after this list.

  • Converting to a dense matrix and using scipy.linalg.eig is not a solution: apart from the negative-entry problem, the matrix is too large.

  • Using scipy.sparse.spsolve is not an obvious solution: the matrix is either not square (when combining the additive constraint with the eigenvector condition) or not of full rank (when trying to specify them separately in some way), and sometimes neither.
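
For reference, a shift-invert call of the kind described in the first bullet might look roughly like the sketch below. The 3-state matrix is made up for illustration, the question's exact call is not shown, and the shift is taken slightly off 1 so that the factorization used by shift-invert stays nonsingular.

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigs

# Toy column-stochastic matrix standing in for A (illustration only).
A = sparse.csc_matrix(np.array([[0.5, 0.2, 0.1],
                                [0.3, 0.7, 0.3],
                                [0.2, 0.1, 0.6]]))

# Shift-invert around the target eigenvalue 1 (shift chosen near, not at, 1).
vals, vecs = eigs(A, k=1, sigma=0.99, which='LM')
p = vecs[:, 0].real
p /= p.sum()        # only yields a probability vector if all entries share a sign
print(vals.real, p) # eigenvalue ~1, p ~ [0.25, 0.5, 0.25] for this toy chain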

Is there a good way to numerically obtain a stationary state of a Markov chain given as a sparse matrix in Python? If there is a way to get an exhaustive list (and possibly also nearly stationary states), that is appreciated, but not necessary.

Recommended answer

Several articles summarizing possible approaches can be found via Google Scholar; here is one: http://www.ima.umn.edu/preprints/pp1992/932.pdf

What is done below combines the suggestion by @Helge Dietert above to first split the chain into strongly connected components with approach #4 from the paper linked above.

import numpy as np
import time

# NB. Scipy >= 0.14.0 probably required
import scipy
from scipy.sparse.linalg import gmres, spsolve
from scipy.sparse import csgraph
from scipy import sparse 


def markov_stationary_components(P, tol=1e-12):
    """
    Split the chain first to connected components, and solve the
    stationary state for the smallest one
    """
    n = P.shape[0]

    # 0. Drop zero edges
    P = P.tocsr()
    P.eliminate_zeros()

    # 1. Separate to connected components
    n_components, labels = csgraph.connected_components(P, directed=True, connection='strong')

    # The labels also contain decaying components that need to be skipped
    index_sets = []
    for j in range(n_components):
        indices = np.flatnonzero(labels == j)
        other_indices = np.flatnonzero(labels != j)

        Px = P[indices,:][:,other_indices]
        if Px.max() == 0:
            index_sets.append(indices)
    n_components = len(index_sets)

    # 2. Pick the smallest one
    sizes = [indices.size for indices in index_sets]
    min_j = np.argmin(sizes)
    indices = index_sets[min_j]

    print("Solving for component {0}/{1} of size {2}".format(min_j, n_components, indices.size))

    # 3. Solve stationary state for it
    p = np.zeros(n)
    if indices.size == 1:
        # Simple case
        p[indices] = 1
    else:
        p[indices] = markov_stationary_one(P[indices,:][:,indices], tol=tol)

    return p


def markov_stationary_one(P, tol=1e-12, direct=False):
    """
    Solve stationary state of Markov chain by replacing the first
    equation by the normalization condition.
    """
    if P.shape == (1, 1):
        return np.array([1.0])

    n = P.shape[0]
    dP = P - sparse.eye(n)
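    # Replace the first equation of (P^T - I) p = 0 with the normalization
    # condition sum(p) = 1, so the right-hand side becomes the first unit
    # vector (approach #4 in the paper linked above).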
    A = sparse.vstack([np.ones(n), dP.T[1:,:]])
    rhs = np.zeros((n,))
    rhs[0] = 1

    if direct:
        # Requires that the solution is unique
        return spsolve(A, rhs)
    else:
        # GMRES does not care whether the solution is unique or not, it
        # will pick the first one it finds in the Krylov subspace
        p, info = gmres(A, rhs, tol=tol)
        if info != 0:
            raise RuntimeError("gmres didn't converge")
        return p


def main():
    # Random transition matrix (connected)
    n = 100000
    np.random.seed(1234)
    P = sparse.rand(n, n, 1e-3) + sparse.eye(n)
    P = P + sparse.diags([1, 1], [-1, 1], shape=P.shape)

    # Disconnect several components
    P = P.tolil()
    P[:1000,1000:] = 0
    P[1000:,:1000] = 0

    P[10000:11000,:10000] = 0
    P[10000:11000,11000:] = 0
    P[:10000,10000:11000] = 0
    P[11000:,10000:11000] = 0

    # Normalize
    P = P.tocsr()
    P = P.multiply(sparse.csr_matrix(1/P.sum(1).A))

    print("*** Case 1")
    doit(P)

    print("*** Case 2")
    P = sparse.csr_matrix(np.array([[1.0, 0.0, 0.0, 0.0],
                                    [0.5, 0.5, 0.0, 0.0],
                                    [0.0, 0.0, 0.5, 0.5],
                                    [0.0, 0.0, 0.5, 0.5]]))
    doit(P)

def doit(P):
    assert isinstance(P, sparse.csr_matrix)
    assert np.isfinite(P.data).all()

    print("Construction finished!")

    def check_solution(method):
        print("

-- {0}".format(method.__name__))
        start = time.time()
        p = method(P)
        print("time: {0}".format(time.time() - start))
        print("error: {0}".format(np.linalg.norm(P.T.dot(p) - p)))
        print("min(p)/max(p): {0}, {1}".format(p.min(), p.max()))
        print("sum(p): {0}".format(p.sum()))

    check_solution(markov_stationary_components)


if __name__ == "__main__":
    main()

EDIT: spotted a bug: csgraph.connected_components also returns purely decaying components, which need to be filtered out.
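
As a quick check of why that filtering is needed, running the component split by hand on the 4x4 "Case 2" matrix from main() above (a sketch reusing the same scipy calls) shows that state 1 leaks probability into state 0 and is therefore purely decaying, while {0} and {2, 3} are closed classes:

import numpy as np
from scipy import sparse
from scipy.sparse import csgraph

P = sparse.csr_matrix(np.array([[1.0, 0.0, 0.0, 0.0],
                                [0.5, 0.5, 0.0, 0.0],
                                [0.0, 0.0, 0.5, 0.5],
                                [0.0, 0.0, 0.5, 0.5]]))

n_components, labels = csgraph.connected_components(P, directed=True,
                                                    connection='strong')
for j in range(n_components):
    indices = np.flatnonzero(labels == j)
    other_indices = np.flatnonzero(labels != j)
    # A component is purely decaying if it sends probability mass outside.
    decaying = P[indices, :][:, other_indices].max() > 0
    print(indices, "decaying" if decaying else "recurrent")
# Expected: [1] is decaying (it leaks to state 0); [0] and [2 3] are
# recurrent and each carries a stationary distribution.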
