Interpreting (and comparing) output from numpy.correlate
Question
I have looked at this question, but it hasn't really given me any answers.
Essentially, how can I determine whether a strong correlation exists using np.correlate? I expect the same output as I get from MATLAB's xcorr with the coeff option, which I can understand (1 is a strong correlation at lag l and 0 is no correlation at lag l), but np.correlate produces values greater than 1, even when the input vectors have been normalised between 0 and 1.
Example input:
import numpy as np
x = np.random.rand(10)
y = np.random.rand(10)
np.correlate(x, y, 'full')
This gives the following output:
array([ 0.15711279, 0.24562736, 0.48078652, 0.69477838, 1.07376669,
1.28020871, 1.39717118, 1.78545567, 1.85084435, 1.89776181,
1.92940874, 2.05102884, 1.35671247, 1.54329503, 0.8892999 ,
0.67574802, 0.90464743, 0.20475408, 0.33001517])
How can I tell what is a strong correlation and what is a weak one if I don't know what the maximum possible correlation value is?
Another example:
In [10]: x = [0,1,2,1,0,0]
In [11]: y = [0,0,1,2,1,0]
In [12]: np.correlate(x, y, 'full')
Out[12]: array([0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0])
Edit: This was a badly asked question, but the marked answer does answer what was asked. I think it is important to note what I found whilst digging around in this area: you cannot compare outputs from cross-correlation. In other words, it would not be valid to use the outputs from cross-correlation to say that signal x is better correlated to signal y than to signal z. Cross-correlation does not provide this kind of information.
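One way to see why the raw numbers are hard to compare is the small sketch below. It reuses the x and y from the second example and only illustrates that the raw output grows with signal amplitude; it is not a general statement about every use of cross-correlation.

```python
import numpy as np

x = np.array([0, 1, 2, 1, 0, 0])
y = np.array([0, 0, 1, 2, 1, 0])

raw = np.correlate(x, y, 'full')
# Same waveform as x, but with 10 times the amplitude.
raw_scaled = np.correlate(10 * x, y, 'full')

# The peak is 10 times larger although the shapes are no more alike,
# so the size of a raw peak says little about similarity on its own.
print(raw.max(), raw_scaled.max())
```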
Recommended answer
numpy.correlate is under-documented. I think that we can make sense of it, though. Let's start with your sample case:
>>> import numpy as np
>>> x = [0,1,2,1,0,0]
>>> y = [0,0,1,2,1,0]
>>> np.correlate(x, y, 'full')
array([0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0])
Those numbers are the cross-correlations for each of the possible lags. To make that clearer, let's put the lag numbers above the correlations:
>>> np.concatenate((np.arange(-5, 6)[None,...], np.correlate(x, y, 'full')[None,...]), axis=0)
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[ 0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0]])
Here, we can see that the cross-correlation reaches its peak at a lag of -1. If you look at x and y above, that makes sense: if one shifts y to the left by one place, it matches x exactly.
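To make the lag bookkeeping concrete, here is a minimal sketch that recomputes the same numbers with explicit loops and then reads off the peak lag. The helper lagged_products is a hypothetical name introduced only for illustration. It assumes real-valued inputs and the convention c[k] = sum over n of a[n + k] * v[n], which matches the output shown above.

```python
import numpy as np

x = np.array([0, 1, 2, 1, 0, 0])
y = np.array([0, 0, 1, 2, 1, 0])

def lagged_products(a, v):
    """Hypothetical helper: c[k] = sum_n a[n + k] * v[n] for every lag k."""
    lags = np.arange(-(len(v) - 1), len(a))  # -5 .. 5 for two length-6 inputs
    out = []
    for k in lags:
        total = 0
        for n in range(len(v)):
            if 0 <= n + k < len(a):  # skip terms that fall outside a
                total += a[n + k] * v[n]
        out.append(total)
    return lags, np.array(out)

lags, manual = lagged_products(x, y)
full = np.correlate(x, y, 'full')

peak_lag = lags[np.argmax(full)]   # lag at which the correlation is largest
shifted_y = np.roll(y, peak_lag)   # a negative lag shifts y to the left

print(peak_lag, np.array_equal(manual, full), np.array_equal(shifted_y, x))
```

Note that np.roll wraps values around the ends; that is harmless here only because the wrapped entries are zeros.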
To verify this, let's try again, this time shifting y further:
>>> y = [0, 0, 0, 0, 1, 2]
>>> np.concatenate((np.arange(-5, 6)[None,...], np.correlate(x, y, 'full')[None,...]), axis=0)
array([[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
[ 0, 2, 5, 4, 1, 0, 0, 0, 0, 0, 0]])
Now, the correlation peaks at a lag of -3, meaning that the best match between x and y occurs when y is shifted to the left by 3 places.
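As for the values greater than 1: np.correlate returns raw sums of products and does no scaling. A common way to get values bounded by 1 is to divide by the square root of the product of the two signals' energies. As far as I understand, this is what the coeff option of xcorr does, but treat that as an assumption and check the MATLAB documentation. The helper name normalized_xcorr below is hypothetical.

```python
import numpy as np

def normalized_xcorr(a, v):
    """Hypothetical helper: full cross-correlation divided by sqrt(sum(a**2) * sum(v**2))."""
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    scale = np.sqrt(np.dot(a, a) * np.dot(v, v))  # zero if either signal is all zeros
    lags = np.arange(-(len(v) - 1), len(a))
    return lags, np.correlate(a, v, 'full') / scale

x = [0, 1, 2, 1, 0, 0]
y = [0, 0, 1, 2, 1, 0]
lags, c = normalized_xcorr(x, y)

# After scaling, every value lies in [-1, 1] (Cauchy-Schwarz inequality).
print(lags[np.argmax(c)], c.max())
```

This sketch does not subtract the mean, so it is not a Pearson correlation coefficient. For inputs that are all non-negative, such as np.random.rand(10), every lag will still give a positive value, so you may want to remove the mean from each signal before correlating.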