Preventing scipy eigenvectors differing from computer to computer

Question

Following up on this question about how to find the Markov steady state, I'm now running into the problem that it works perfectly on my lab computer, but it doesn't work on any other computer. Specifically, it always finds the correct number of near-one eigenvalues, and thus which nodes are attractor nodes, but it doesn't consistently find all of them and they aren't grouped properly. For example, using the 64x64 transition matrix below, the computers on which it doesn't work always produce one of three different wrong collections of attractors at random. On the smaller matrix M1 below, all computers tested get the same, correct result for the attractor groups and stationary distribution.

All the machines tested are running Win7x64 and WinPython-64bit-2.7.9.4. One computer always gets it right, three others always get it wrong in the same ways. Based on several posts I found like this and this, this sounds like it might be caused by differences in the floating point accuracy of the calculations. Unfortunately I don't know how to fix this; I mean, I don't know how to alter the code for pulling the left eigenvectors from the matrix in order to force a specific accuracy that all computers can handle (and I don't think it has to be very accurate for this purpose).

That's just my current best guess for how the results could differ. If you have a better idea of why this is happening and how to stop it from happening then that's great too.

If there is a way to make scipy consistent from run-to-run and computer to computer, I don't think it would depend on the details of my method, but because it was requested, here it is. Both of the matrices have 3 attractors. In M1 the first one [1,2] is an orbit of two states, the other two [7] and [8] are equilibria. M2 is a 64x64 transition matrix with equilibria at [2] and [26] as well as an orbit using [7,8].

But instead of finding that set of attractors, it sometimes reports [[26],[2],[26]] and sometimes [[2,7,8,26],[2],[26]] and sometimes ... it's not getting the same answer each run, and it's never getting [[2],[7,8],[26]] (in any order).

import numpy as np
import scipy.linalg

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
              [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

M2 = np.genfromtxt('transitionMatrix1.csv', delimiter=',')

# For easy switching
M = M2
# Confirm the matrix is a valid Markov transition matrix
#print np.sum(M,axis=1)
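
The commented-out row-sum check above can be made explicit. The following is a minimal sketch, assuming a row-stochastic convention (each row sums to 1); the helper name `is_row_stochastic`, its tolerance, and the small `example` matrix are illustrative and not part of the original post.

```python
import numpy as np

def is_row_stochastic(matrix, tol=1e-12):
    """Return True if every entry is non-negative and every row sums to 1."""
    matrix = np.asarray(matrix, dtype=float)
    nonnegative = np.all(matrix >= 0.0)
    # Compare row sums to 1 with an absolute tolerance only
    rows_sum_to_one = np.allclose(matrix.sum(axis=1), 1.0, rtol=0.0, atol=tol)
    return bool(nonnegative and rows_sum_to_one)

# Small illustrative matrix (hypothetical, chosen only for the demonstration)
example = np.array([[0.2, 0.8],
                    [1.0, 0.0]])
print(is_row_stochastic(example))
```

A matrix loaded from a CSV file may carry rounding in its last digits, so the tolerance may need loosening for such input.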

The rest is the same code from the previous question, included here for your convenience.

# compute the eigenvalues and a separate array whose columns are the left eigenvectors
theEigenvalues, leftEigenvectors = scipy.linalg.eig(M, right=False, left=True)
# for stationary distribution the eigenvalues and vectors are always real, and this speeds it up a bit
theEigenvalues = theEigenvalues.real
#print(theEigenvalues)
leftEigenvectors = leftEigenvectors.real
#print(leftEigenvectors)
# tolerance used both for "eigenvalue is close enough to one" and "entry is close enough to zero"
# ...1e-15 was too low to find one of the actual eigenvalues
tolerance = 1e-10
# create a filter to collect the eigenvalues that are near enough to one
mask = abs(theEigenvalues - 1) < tolerance
# apply that filter
theEigenvalues = theEigenvalues[mask]
# keep only the eigenvectors whose eigenvalues are near one
leftEigenvectors = leftEigenvectors[:, mask]
# convert all the tiny and negative values to zero to isolate the actual stationary distributions
leftEigenvectors[leftEigenvectors < tolerance] = 0
# normalize each distribution by the sum of the eigenvector columns
attractorDistributions = leftEigenvectors / leftEigenvectors.sum(axis=0, keepdims=True)
# multiply by M once; a true left eigenvector with eigenvalue one is unchanged by this
attractorDistributions = np.dot(attractorDistributions.T, M).T
# convert the column vectors into row vectors (lists) for each attractor
attractorDistributions = attractorDistributions.T
print(attractorDistributions)
# a list of the states in any attractor with the stationary distribution within THAT attractor
#theSteadyStates = np.sum(attractorDistributions, axis=1)
#print(theSteadyStates)
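
As a cross-check of the masking step alone, the sketch below counts the near-one eigenvalues of M1 using the same tolerance. It uses `np.linalg.eigvals` instead of `scipy.linalg.eig`, which is a substitution made here because only the eigenvalues are needed (left and right eigenvalues of a matrix coincide). The question describes M1 as having 3 attractors, so a count of 3 is what one would expect.

```python
import numpy as np

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
               [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

tolerance = 1e-10
eigenvalues = np.linalg.eigvals(M1)
# abs() also handles any complex values returned by the solver
near_one = np.abs(eigenvalues - 1.0) < tolerance
number_of_attractors = int(near_one.sum())
print(number_of_attractors)
```

This only confirms how many attractors there are. It says nothing about which vectors the solver returns for them.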

Recommended answer

The unfortunate answer is that there is no way to fix the seed for scipy, and therefore no way to force it to output consistent values. This also means that there is no way for it to reliably produce correct answers because only one answer is correct. My attempts to get a definitive answer or fix from the scipy people were completely dismissed, but somebody may find some wisdom in those words when facing this issue.

As a concrete example of the problem, when you run the code above you may sometimes get the following set of eigenvectors, supposedly representing the steady states of each of the attractors in the system. My home computer always produces this result (which is different from my laptop and lab computer). As stated in the question, the correct attractors are [[2],[7,8],[26]]. The equilibria of [2] and [26] are correctly identified, but the distribution for [7,8] instead returns a non-valid probability distribution over [2,26]. The correct answer is [0.19835, 0.80164] over [7,8] respectively. My lab computer correctly finds that solution, but so far six other computers have failed to do so.
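
One possible reading of such output is offered here as an assumption, not a diagnosis confirmed by the post. When eigenvalue 1 is repeated, any linear combination of the per-attractor stationary vectors is also a left eigenvector with eigenvalue 1, so a solver is free to return a mixed basis. The sketch below illustrates this on M1 with hand-computed stationary vectors. The weights and the helper `is_left_fixed` are hypothetical, and state indices are 0-based.

```python
import numpy as np

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
               [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

def is_left_fixed(v, M, tol=1e-12):
    """True if v M equals v, i.e. v is a left eigenvector with eigenvalue 1."""
    return bool(np.allclose(v.dot(M), v, rtol=0.0, atol=tol))

# Stationary distribution of the two-state orbit (states 0 and 1)
pi_orbit = np.zeros(8)
pi_orbit[[0, 1]] = [5.0 / 9.0, 4.0 / 9.0]
# Stationary distribution of the equilibrium at state 6
pi_eq = np.zeros(8)
pi_eq[6] = 1.0

# A mixture with a negative weight is still fixed by M1 ...
mixed = pi_eq - 0.25 * pi_orbit
print(is_left_fixed(mixed, M1))

# ... but clipping negatives to zero and renormalizing, as the question's
# code does, turns it into a copy of pi_eq and loses the orbit entirely.
clipped = np.where(mixed < 1e-10, 0.0, mixed)
clipped = clipped / clipped.sum()
print(np.allclose(clipped, pi_eq))
```

Under this reading, duplicated or merged attractor groups could come from the clipping step acting on a mixed basis. Whether that is what happened on the machines described here is not established by the post.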

What this means is that (unless there is some other unidentified error in my code) scipy.linalg is worthless for finding steady states of Markov models. Even though it works some of the time, it cannot be relied upon to provide the correct answer, and therefore should be avoided completely...at least for Markov model steady states, and probably for everything to do with eigenvectors. It just doesn't work.

I will post code on how to reliably generate the stationary distribution of a Markov model without using scipy if anybody asks a question about it. It runs a bit slower, but it's always the same and always correct.
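
The author's own code is not included in the post. As a rough sketch of one way to avoid an eigen-decomposition (not the author's method), the example below first finds the closed communicating classes by reachability, then solves a small linear system per class with NumPy. The function names `closed_classes` and `stationary_distribution` are hypothetical, state indices are 0-based, and the sketch assumes that absent transitions are stored as exact zeros.

```python
import numpy as np

def closed_classes(P):
    """Return the closed communicating classes of P as lists of state indices."""
    n = P.shape[0]
    # reach[i, j] is True when j is reachable from i in zero or more steps
    reach = (P > 0) | np.eye(n, dtype=bool)
    for k in range(n):  # Warshall's transitive closure
        reach |= reach[:, [k]] & reach[[k], :]
    classes, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        # States that i can reach and that can reach i back
        members = np.flatnonzero(reach[i] & reach[:, i])
        seen.update(members.tolist())
        # The class is closed if nothing outside it is reachable from i
        if np.array_equal(members, np.flatnonzero(reach[i])):
            classes.append(members.tolist())
    return classes

def stationary_distribution(P, members):
    """Solve pi (B - I) = 0 with sum(pi) = 1 on the block B of one closed class."""
    k = len(members)
    block = P[np.ix_(members, members)]
    A = np.vstack([block.T - np.eye(k), np.ones((1, k))])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    pi_block = np.linalg.lstsq(A, b, rcond=None)[0]
    pi = np.zeros(P.shape[0])
    pi[members] = pi_block
    return pi

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
               [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

classes = closed_classes(M1)
distributions = [stationary_distribution(M1, c) for c in classes]
print(classes)
```

Because the class structure is derived from which entries are non-zero and not from floating point eigenvectors, the grouping should not vary between machines. A matrix read from a file with tiny spurious non-zero entries would need thresholding first.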

[[ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.25707958  1.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.06867772  0.          1.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
...
 [ 0.          0.          0.        ]]
