Find the best lag from the numpy.correlate output
Problem Description
I am using the following code to do auto-correlation on data_1 and data_2:
result = numpy.correlate(data_1, data_2, mode='full')
The result is also a time series. I also normalized result to result1:
from sklearn.preprocessing import StandardScaler

result1 = StandardScaler().fit_transform(result.astype('float32').reshape(-1, 1))
Here is the plot; data_1 is black, data_2 is red, and result1 is green:
I know there is a lag between data_1 and data_2, so I am wondering what's the best way to find the lag? Thanks!
Recommended Answer
numpy.correlate does not center the data, so one should do that prior to calling the method:
corr = np.correlate(data_1 - np.mean(data_1),
                    data_2 - np.mean(data_2),
                    mode='full')
This only changes corr by a constant, but it is still a reasonable thing to do: uncorrelated shifts will show up as 0.
Second, your chart with all three series on one horizontal scale doesn't seem helpful; with mode='full' the length of the correlation array is about twice the length of the original arrays.
Picking the maximum of corr with corr.argmax() is a reasonable thing to do. One just has to be aware of how the index works here. With mode='full', the 0th index of corr corresponds to the shift k in the formula sum_n a[n+k] * conj(v[n]) being 1 - len(a), meaning a is moved so far to the left that there is just one element of overlap between the shifted a and v. So, subtracting len(a) - 1 from this index gives the actual shift of a with respect to v.
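This index arithmetic can be checked on a tiny example where the shift is known in advance (a sketch with made-up arrays): here a is v delayed by exactly 2 samples, so the recovered lag should be 2.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 0.0, 0.0])
a = np.array([0.0, 0.0, 1.0, 2.0, 3.0])  # v shifted right by 2 samples

corr = np.correlate(a - a.mean(), v - v.mean(), mode='full')
# Index 0 of corr corresponds to shift 1 - len(a); undo that offset:
lag = corr.argmax() - (len(a) - 1)
print(lag)  # 2
```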
A made-up example:
import numpy as np
import matplotlib.pyplot as plt
data_1 = np.sin(np.linspace(0, 10, 100))
data_1 += np.random.uniform(size=data_1.shape) # noise
data_2 = np.cos(np.linspace(0, 7, 70))
data_2 += np.random.uniform(size=data_2.shape) # noise
corr = np.correlate(data_1 - np.mean(data_1),
                    data_2 - np.mean(data_2),
                    mode='full')
plt.plot(corr)
plt.show()
lag = corr.argmax() - (len(data_1) - 1)
print(lag)
plt.plot(data_1, 'r*')
plt.plot(data_2, 'b*')
plt.show()
Here the lag is printed as -14 or -15 (depending on the random noise), which on this scale means -1.4 or -1.5. This is reasonable, as sin is trailing cos by pi/2, or about 1.57. In other words, moving the red dots to the left by 14-15 elements maximizes the match with the blue dots.
Data:

Correlation:
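To convert the sample lag back into units on the horizontal axis (a sketch reusing the example's sampling grid), multiply by the sample spacing of data_1, which for np.linspace(0, 10, 100) is 10/99, about 0.101:

```python
import numpy as np

x = np.linspace(0, 10, 100)   # the grid data_1 was sampled on
dt = x[1] - x[0]              # sample spacing, 10/99 ~ 0.101

lag_samples = -15             # one of the lags printed by the example above
lag_time = lag_samples * dt
print(lag_time)               # about -1.52, close to the expected -pi/2
```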