How do I calculate the covariance matrix without any built-in functions or loops in MATLAB?

Question

Is it possible to find the covariance of a matrix without using any built-in functions or loops in MATLAB? I'm completely clueless about the idea of solving this problem.

I was thinking of something like:

cov(x,y) = 1/(n-1) .* (x*y)

However, I don't think this will work. Any ideas?

Solution

Here's a great example of how to numerically compute the covariance matrix: http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc541.htm. However, let's put this in this post for the sake of completeness. I'm a bit confused by what you mean by "built-in" functions, because the covariance requires that you sum over the columns of a matrix. If you can't use any built-in functions to sum these elements, then I don't see how you can do this without for loops. Edit: I figured out how to do it without built-in functions or loops, but you need to use size to determine how many rows the matrix has, unless you specify this as a constant in your function.

Numerically, you compute the covariance matrix like so:

$$C_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (A_{ki} - \mu_i)(A_{kj} - \mu_j)$$

Here n is the number of rows of A and \mu_i is the mean of column i. Essentially, to get the entry in the ith row and jth column of your covariance matrix, you subtract the mean of column i from column i, subtract the mean of column j from column j, and multiply the two results element by element. You then add these products up and divide by n - 1. Dividing by n - 1 rather than n gives what is known as the unbiased estimator. You'll also notice that this matrix is symmetric, because even if you flip the order around (i.e. looking at column j first, then column i), the answer is the same. I'm assuming you can't use mean from MATLAB either, so let's do this from first principles.

First, compute a row vector that holds the mean of every column. Since sum is also a built-in function, you can get the column sums without it by multiplying a row vector of 1s with your matrix A. The output will be a row vector that contains the sum of each column. As such, do this:

one_vector(1:size(A,1)) = 1;
mu = (one_vector * A) / size(A,1);

The trick with the first line of code is that we are dynamically creating an array that has the same length as the number of rows in your matrix A, and we fill it entirely with 1s. Note that you could have used ones, but you said you can't use any built-in functions. After the second line, mu contains the mean of every column.
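As a quick sanity check, here is a minimal sketch of the ones-vector trick. The matrix is the sample data used further down in this answer, and col_sums is just an illustrative name. I also reset one_vector to [] first, which is my own precaution so that a longer vector left over in the workspace cannot sneak in.

```matlab
% Sample data (same matrix as the worked example later in this answer)
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];

n = size(A,1);              % number of observations (rows)
one_vector = [];            % start empty so no stale entries survive
one_vector(1:n) = 1;        % indexed assignment grows it into a 1-by-n row of ones

col_sums = one_vector * A;  % (1-by-n) * (n-by-3) gives the 1-by-3 column sums
mu = col_sums / n;          % column means, approximately [4.1 2.08 0.584]
```

The product works because each entry of one_vector * A is the dot product of a row of ones with one column of A, which is exactly that column's sum.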

Now, let's pre-process the data by subtracting the corresponding mean from every column, since that's what the definition says to do. To do this without any built-in functions, duplicate mu once for each row of A by indexing it with one_vector: every entry of one_vector is 1, so using it as the row index selects the first (and only) row of mu over and over. Therefore:

A_mean_subtract = A - mu(one_vector, :);
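To see what that indexing expression produces on its own, here is a small sketch. The values of mu and the name mu_tiled are only for illustration.

```matlab
mu = [4.1 2.08 0.584];      % example column means (1-by-3)
one_vector = [1 1 1 1 1];   % five numeric ones, one per row of the data

% Using a vector of 1s as the row index selects row 1 of mu five times,
% which tiles mu into a 5-by-3 matrix without calling repmat.
mu_tiled = mu(one_vector, :);
```

Note that this relies on one_vector holding the numeric value 1. A logical vector would be interpreted as a mask rather than as a list of row numbers.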

Here's where it gets a bit tricky (and cool). If we transpose the mean-subtracted matrix, the rows become the columns and the columns become the rows. If we take this transpose and multiply it by the mean-subtracted matrix itself, entry (i, j) of the result is the sum of products between mean-subtracted column i and mean-subtracted column j. That's the first part of our covariance calculation. We then divide by n - 1. Therefore, our covariance is simply:

covA = (A_mean_subtract.' * A_mean_subtract) / (size(A,1) - 1);
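In matrix notation, the three lines of code can be summarised as follows. The symbols X, C and \mathbf{1} (an n-by-1 column of ones) are my own shorthand and do not appear in the code.

```latex
\mu = \tfrac{1}{n}\,\mathbf{1}^{\top} A, \qquad
X = A - \mathbf{1}\,\mu, \qquad
C = \tfrac{1}{n-1}\, X^{\top} X
\quad\Longrightarrow\quad
C_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (A_{ki} - \mu_i)(A_{kj} - \mu_j)
```

The last step is just the definition of a matrix product: entry (i, j) of X^{\top} X sums X_{ki} X_{kj} over the rows k.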

Here's a quick example, as well as what is seen on that website I showed you above. Supposing A was this:

A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63]

A =

    4.0000    2.0000    0.5000
    4.2000    2.1000    0.5900
    3.9000    2.0000    0.5800
    4.3000    2.1000    0.6200
    4.1000    2.2000    0.6300

Running through the above code, this is what we get:

covA =

    0.0250    0.0075    0.0042
    0.0075    0.0070    0.0034
    0.0042    0.0034    0.0026

You'll see that this also matches with the cov function in MATLAB too:

>> cov(A)

ans =

    0.0250    0.0075    0.0042
    0.0075    0.0070    0.0034
    0.0042    0.0034    0.0026
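Putting the steps together, the whole calculation can be run as one short script. This is only a sketch: max_diff is an illustrative name, and cov is called at the end purely to check the result, not to compute it.

```matlab
% Sample data from the example above
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];

n = size(A,1);                             % number of observations
one_vector = [];                           % start empty, then fill with n ones
one_vector(1:n) = 1;
mu = (one_vector * A) / n;                 % column means via a matrix product
A_mean_subtract = A - mu(one_vector, :);   % subtract each column's mean
covA = (A_mean_subtract.' * A_mean_subtract) / (n - 1);

% Largest absolute difference from the built-in, used here only for checking
max_diff = max(max(abs(covA - cov(A))));
```

If everything is wired up correctly, max_diff should be on the order of floating-point round-off.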

Bit of a hint

If you type edit cov at your MATLAB command prompt, you can actually see how they compute the covariance matrix without any for loops, and it is essentially the same answer I gave you :)

If you want to do this more efficiently

Assuming you can use sum and bsxfun, we can do this more efficiently and in fewer lines of code. First, compute your mean vector like we did above, this time using sum:

mu = sum(A) / size(A,1);

Now, to subtract each column's mean from your matrix A, you can use bsxfun to do the subtraction for you:

A_mean_subtract = bsxfun(@minus, A, mu);

Now, compute your covariance matrix like you did before:

covA = (A_mean_subtract.' * A_mean_subtract) / (size(A,1) - 1);

You should get exactly the same result as we saw before.
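As a side note, MATLAB R2016b and later support implicit expansion, so the subtraction can also be written without bsxfun. The sketch below assumes such a release. The names X_bsx and X_imp are illustrative, and I use sum(A,1) rather than sum(A) so that a single-row input is still summed down its columns.

```matlab
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];

n = size(A,1);
mu = sum(A,1) / n;                    % column means, summing along dimension 1

X_bsx = bsxfun(@minus, A, mu);        % explicit expansion, also works on older releases
X_imp = A - mu;                       % implicit expansion (R2016b and later)

covA = (X_imp.' * X_imp) / (n - 1);   % same final step as before
```

Both forms perform the same element-wise subtraction, so which one to use is mostly a question of which MATLAB versions you need to support.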

Minor note on stability

We are computing the covariance between two columns straight from the definition. However, it has been shown that using the definition directly can lead to numerical instability for certain types of data. Wikipedia has a page that goes through various algorithms for computing the covariance between two length-n vectors that are more stable.
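As a rough illustration of how much the choice of formula can matter, here is a sketch of my own (not taken from the Wikipedia page) that shifts the sample data by a large constant. Covariance should not change under such a shift. The offset value and all variable names are assumptions made for the demonstration. It compares the mean-subtracting form used in this answer with the one-pass shortcut that subtracts n times the product of means from the raw sum of products.

```matlab
A = [4 2 0.5; 4.2 2.1 0.59; 3.9 2.0 0.58; 4.3 2.1 0.62; 4.1 2.2 0.63];
offset = 1e8;                 % large constant shift; the true covariance is unchanged
B = A + offset;
n = size(B,1);

% Form used in this answer: subtract the column means first, then multiply
mu = sum(B,1) / n;
X = bsxfun(@minus, B, mu);
cov_two_pass = (X.' * X) / (n - 1);

% One-pass shortcut: raw sum of products minus n times the product of means
cov_one_pass = (B.' * B - n * (mu.' * mu)) / (n - 1);

% Reference: covariance of the unshifted data
mu0 = sum(A,1) / n;
X0 = bsxfun(@minus, A, mu0);
cov_ref = (X0.' * X0) / (n - 1);

err_two_pass = max(max(abs(cov_two_pass - cov_ref)));
err_one_pass = max(max(abs(cov_one_pass - cov_ref)));
```

In double precision I would expect err_two_pass to stay tiny, while err_one_pass is typically far larger because two numbers of size around 1e16 are subtracted to recover a result of size around 0.01. The exact figures depend on the machine and MATLAB version.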
