Tracking eigenvectors of a 1-parameter family of matrices


Problem description


My problem is this: I'm attempting a spectral decomposition of a random process via a (truncated) Karhunen-Loeve transform, but my covariance matrix is actually a 1-parameter family of matrices, and I need a way to estimate / visualize how my random process depends on this parameter. To do this, I need a way to track the eigenvectors produced by numpy.linalg.eigh().

To give you an idea of my issue, here's a sample toy problem: Suppose I have a set of points {xs}, and a random process R with covariance C(x,y) = 1/(1+a*(x-y)^2) which depends on the parameter a. For a random sample of grid points in the range [0,1] and a given choice of a (say, a=1), I can populate my covariance matrix and implement the Karhunen-Loeve transform using:

import numpy

num_x = 1000
xs = numpy.random.uniform(size=num_x)      # random grid points in [0, 1]
z = numpy.random.standard_normal(num_x)    # i.i.d. standard-normal coefficients Z_i
a0 = 1.0

def cov(x, y, a=a0):
    return 1.0 / (1.0 + a * (x - y)**2)    # ** for powers; ^ is XOR in Python

cov_m = numpy.array([[cov(x, y) for x in xs] for y in xs])
w, v = numpy.linalg.eigh(cov_m)            # eigenvalues ascending; eigenvectors in columns
R = numpy.dot(v, z * w**0.5)               # one realization of the process

This gives me a realization of R, with values defined at each of my random grid points xs. What I need to be able to do, however, is take a specific realization (meaning a specific choice of my grid xs and my random coefficients z) and track how R varies with respect to the parameter a in my covariance function.

This would be easy to do if I could compute the covariance matrix symbolically and specify a after the fact. However, for large matrices this is not a plausible option. The alternative is to find a way to keep track of each eigenvector returned by numpy.linalg.eigh(). Unfortunately, numpy returns them sorted by eigenvalue, smallest first; this means that, when I vary a, the eigenvectors get reordered unpredictably, and the dot product numpy.dot(v, z*w**0.5) no longer assigns the same coefficient to the same eigenvector.

Is there a way around this?

(This is a cross-post from ASKSAGE. My implementation is using Sage, but as the question is not Sage specific and as there seems to be more activity here I thought I'd repost. My apologies if cross-posting is not acceptable; if so please delete this.)

EDIT: Based on the conversation below, I see I need to add more detail about the nature of this problem.

The idea behind a Karhunen-Loeve transform is to decompose a random process R(x) spectrally, as follows:

R(x) = \sum_{i} Z_i \lambda_i^{1/2} \phi^{(i)}(x),

where each Z_i is an i.i.d. random variable with the standard normal distribution, each \phi^{(i)}(x) is a non-random function of x determined by the eigenvectors of the covariance matrix, and each \lambda_i is the eigenvalue corresponding to \phi^{(i)}.

To demonstrate the dependence of R on the parameter a, I need to be able to uniquely assign each \phi^{(i)} a coefficient Z_i. It doesn't matter how I make this assignment, since all of the Z's are identically distributed, but the assignment needs to be unique, and it cannot depend on the corresponding value of \lambda_i, since that value depends on the parameter a.

The analytic solution is easy: just compute the eigenvectors and eigenvalues for arbitrary a, choose a specific realization of R by specifying my Z's, and watch how that realization of R depends on a. (For the toy model, R will generally become more rapidly varying as a increases.)

The problem here is how to implement this numerically. Numpy apparently scrambles the eigenvectors as it does its computation, so there's no way to uniquely tag them. I'm guessing that the only way to do this is to dig into the underlying code and specifically implement some type of arbitrary tagging function.

To put the problem succinctly: I need a way of ordering the eigenvectors produced by numpy which doesn't depend upon the magnitude of the associated eigenvalue.

Is there a way to do this?

Update: I've managed to find a partial answer to this problem, but I'll need to do some more research for the full answer. It looks like I'm going to have to implement my own algorithm for this.

For matrices of the type I'm considering, the Lanczos algorithm (for a given choice of initial vector) is deterministic and the steps do not depend on choice of my parameter. That gives me a symmetric, tridiagonal matrix to solve eigenvalues for.
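To illustrate, here is a minimal sketch of that first stage, assuming a fixed all-ones starting vector (my choice, not dictated by the problem) and omitting the reorthogonalization a production implementation would need:

import numpy

def lanczos_tridiag(A, m=None):
    # Deterministic Lanczos tridiagonalization of a real symmetric matrix A.
    # The starting vector is fixed (all ones), so the iteration makes no
    # random choices; no reorthogonalization is performed, and breakdown
    # (a zero off-diagonal) is not handled.
    n = A.shape[0]
    m = n if m is None else m
    alpha = numpy.zeros(m)       # diagonal of the tridiagonal matrix
    beta = numpy.zeros(m - 1)    # off-diagonal
    q = numpy.ones(n) / numpy.sqrt(n)
    q_prev = numpy.zeros(n)
    b = 0.0
    for j in range(m):
        r = A.dot(q) - b * q_prev
        alpha[j] = numpy.dot(q, r)
        r = r - alpha[j] * q
        if j < m - 1:
            b = numpy.linalg.norm(r)
            beta[j] = b
            q_prev, q = q, r / b
    return alpha, beta

The resulting diagonals (alpha, beta) can then be handed to a tridiagonal eigensolver such as scipy.linalg.eigh_tridiagonal.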

Divide-and-conquer might work here. It seems like I could implement a version of it which would allow me to track the eigenvectors independent of the associated eigenvalue. The "divide" part, at least, can be implemented in a deterministic, parameter-independent way, but I'll need to learn much more about the "conquer" part of the algorithm to know for sure.

Solution

After a bit of research, I've managed to come up with two partial answers to this problem.

The first is that, for a real symmetric matrix with no zero eigenvalues (it might also be necessary to require that the matrix be non-degenerate), it should be feasible to construct an algorithm for the eigenpair problem that generates the eigenpairs in a fixed order, independent of the choice of matrix. Given a constant starting vector, the Lanczos algorithm will produce a tridiagonal matrix for an arbitrary real symmetric matrix in a deterministic way. The "divide" part of the divide-and-conquer algorithm is similarly deterministic, which means that the only part of the algorithm in which the number of iterations depends on the values of the matrix elements is the "conquer" portion, solving the secular equation:

1 + \sum_{j=1}^{m} w_j^2 / (d_j - \lambda) = 0

Thus, for each 2x2 block, the problem boils down to how to order the two roots of the secular equation in a way that doesn't actually depend on the values of the original matrix.
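For concreteness, here is a minimal sketch of that "conquer" step, solving the secular equation with scipy.optimize.brentq; it assumes strictly sorted poles d_j and nonzero weights w_j, and the helper name secular_roots is mine. The point is that each root is bracketed by a fixed pair of adjacent poles, which is what makes a value-independent ordering of the roots conceivable:

import numpy
from scipy.optimize import brentq

def secular_roots(d, w):
    # Roots of 1 + sum_j w_j^2 / (d_j - lam) = 0, for strictly increasing
    # poles d[0] < ... < d[-1] and nonzero weights w. Exactly one root lies
    # in each interval (d[j], d[j+1]), plus one to the right of d[-1].
    d = numpy.asarray(d, dtype=float)
    w = numpy.asarray(w, dtype=float)
    f = lambda lam: 1.0 + numpy.sum(w**2 / (d - lam))
    eps = 1e-10 * (d[-1] - d[0])    # keep the brackets off the poles
    roots = [brentq(f, d[j] + eps, d[j + 1] - eps) for j in range(len(d) - 1)]
    upper = d[-1] + numpy.sum(w**2) + 1.0    # f is provably positive out here
    roots.append(brentq(f, d[-1] + eps, upper))
    return numpy.array(roots)

Labelling each root by the pole interval that brackets it, rather than by its magnitude, is exactly the kind of eigenvalue-independent ordering asked for above.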

The second partial solution is much easier to implement, but more prone to fail. In retrospect, it's also obvious.

Two eigenvectors of the same real symmetric matrix belonging to different eigenvalues are always orthogonal to each other. Thus, if the eigenvectors vary smoothly as a function of a single parameter a, then:

v_i(a) \cdot v_j(a + da) = \delta_{ij} + O(da)

This gives a natural mapping between eigenvectors as the parameter a varies.

This is similar to the idea David Zwicker and jorgeca suggested of measuring global distance between pairs of eigenvectors, but much easier to implement. However, implementations of this will be prone to failure in regions where the eigenvectors are rapidly varying or if the change in the parameter a is too large.
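Here is a minimal sketch of that matching step (the helpers cov_matrix and match_eigenvectors are my own names, not library functions): at each step of a sweep in a, the new eigenvector columns are reordered and sign-fixed by maximum absolute overlap with the previous ones, and the eigenvalues are permuted to match, so the same coefficient z_i stays attached to the same mode:

import numpy

def cov_matrix(xs, a):
    # Covariance matrix for the toy kernel C(x, y) = 1/(1 + a*(x - y)**2).
    diff = xs[:, None] - xs[None, :]
    return 1.0 / (1.0 + a * diff**2)

def match_eigenvectors(v_old, v_new):
    # For each column of v_old, pick the column of v_new with the largest
    # absolute overlap, flipping its sign so the overlap is positive.
    # Assumes the step in a is small enough that the overlaps are near 0
    # or 1; otherwise the argmax assignment may fail to be a permutation.
    overlaps = v_old.T.dot(v_new)    # overlaps[i, j] = v_i(a) . v_j(a + da)
    perm = numpy.argmax(numpy.abs(overlaps), axis=1)
    signs = numpy.sign(overlaps[numpy.arange(len(perm)), perm])
    return perm, v_new[:, perm] * signs

def kl_realization(v, w, z):
    # R = V (z * sqrt(w)); clipping guards against tiny negative
    # eigenvalues produced by roundoff.
    return v.dot(z * numpy.sqrt(numpy.clip(w, 0.0, None)))

# Track one realization of R (fixed xs and z) across a sweep in a.
num_x = 200
xs = numpy.random.uniform(size=num_x)
z = numpy.random.standard_normal(num_x)
a_values = numpy.linspace(0.5, 2.0, 16)    # an arbitrary sweep for illustration
w0, v0 = numpy.linalg.eigh(cov_matrix(xs, a_values[0]))
realizations = [kl_realization(v0, w0, z)]
for a in a_values[1:]:
    w1, v1 = numpy.linalg.eigh(cov_matrix(xs, a))
    perm, v1 = match_eigenvectors(v0, v1)
    realizations.append(kl_realization(v1, w1[perm], z))
    v0 = v1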

Also, the question of what happens at crossings of eigenvalues is interesting, because at each such crossing the system becomes degenerate. However, within the set of allowed eigenvectors spanning the degeneracy, there will be two which satisfy the dot product condition and which can be used as the basis spanning the degeneracy, thus maintaining the mapping.

This is, of course, assuming that it is correct to treat the eigenvectors as smooth continuous functions on the parameter space, which (as jorgeca pointed out) I'm not certain can be assumed.

