Numpy Correlate is not providing an offset

Problem description

I am trying to look at astronomical spectra using Python, and I'm using numpy.correlate to try and find a radial velocity shift. I'm comparing each spectrum I have to one template spectrum. The problem that I'm encountering is that, no matter which spectra I use, numpy.correlate states that the maximal value of the correlation function occurs with a shift of zero pixels, i.e. the spectra already line up, which is very clearly not true. Here is some of the relevant code:

corr = np.correlate(temp_data, imag_data, mode='same')
ax1.plot(delta_data, corr, c='g')
ax1.plot(delta_data, 100*temp_data, c='b')
ax1.plot(delta_data, 100*imag_data, c='r')

The output of this code is shown here:

[Image: What I Have]

Note that the cross-correlation function peaks at an offset of zero pixels despite the template (blue) and observed (red) spectra clearly showing an offset. What I would expect to see would be something a bit like (albeit not exactly like; this is merely the closest representation I could produce):

[Image: What I Want]

Here I have introduced an artificial offset of 50 pixels in the template data, and they more or less line up now. What I would like, for a case like this, is for a peak to appear at an offset of 50 pixels rather than at zero (I don't care if the spectra at the bottom appear lined up; that is merely for visual representation). However, despite several hours of work and research online, I can't find anyone who even describes this problem, let alone a solution. I've attempted to use SciPy's correlate and Matplotlib's xcorr, and both show this same thing (although I'm led to believe that they are essentially the same function).

Why is the cross-correlation not acting the way I expect, and how do I get it to act in a useful way?

Solution

The issue you're experiencing is probably that your spectra are not zero-centered; their RMS value looks to be about 100 in whichever units you're plotting. This matters because the convolution/cross-correlation functions have to pad your spectra with zeros in order to compute the response in "same" mode. So even though your signals are most similar at an offset of around 50 samples, whenever the two signals are not perfectly aligned you're integrating the product over only their overlap and discarding the non-overlapping values, since they're multiplied by zero. Because your spectra are not zero-mean, their correlation therefore grows nearly linearly with the size of the overlap, which is largest at zero offset.

Notice that your cross-correlation result looks like a triangular pulse, which is what you would expect from the cross-correlation of two square pulses (cf. Convolution of a Rectangular "Pulse" With Itself). That's because your spectra, once padded, look like a step from zero up to a plateau of slightly noisy values around 100: effectively a rectangular pulse with Gaussian noise on top. You can try correlating with mode='full' to see the entire response of the two spectra. Note also that with mode='valid' you should get only one value in return: your two spectra are exactly the same length, so there is only one offset (zero!) at which they line up entirely.
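To make the three modes concrete, here is a minimal sketch that correlates two featureless arrays sitting at a level of 100. The arrays `a` and `b` and the length `n` are illustrative assumptions standing in for your spectra, not your actual data. It prints the output shape for each mode and checks that the 'full' result is exactly the triangle described above, peaking at zero lag.

```python
import numpy as np

n = 250                      # assumed length, chosen only for illustration
a = np.full(n, 100.0)        # flat stand-in for one spectrum
b = np.full(n, 100.0)        # flat stand-in for the other

full = np.correlate(a, b, mode="full")    # every lag from -(n-1) to n-1
same = np.correlate(a, b, mode="same")    # the central n lags of `full`
valid = np.correlate(a, b, mode="valid")  # complete overlap only

print(full.shape, same.shape, valid.shape)

# With constant inputs each lag just counts overlapping samples,
# so the result is a triangle: 100 * 100 * (number of overlapping samples).
lags = np.arange(-(n - 1), n)
triangle = 100.0 * 100.0 * (n - np.abs(lags))
print(np.allclose(full, triangle))
print(lags[np.argmax(full)])
```

Any small feature riding on top of such a plateau has to compete with this triangle, which is why the maximum stays pinned at zero lag.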

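To sidestep this issue, you can try either subtracting the RMS value from the spectra so that they are zero-centered (in practice, subtracting each spectrum's mean, which is close to the RMS for signals like these), or padding both spectra on either side with their own length of samples at the RMS value.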
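As a rough sketch of the first option, the helper below subtracts each spectrum's mean and then reads the shift off the peak of the 'full' cross-correlation. The function name `estimate_shift`, the synthetic spectra, and the assumption that both arrays have the same length are all mine, introduced for illustration; they are not part of your code or of NumPy.

```python
import numpy as np

def estimate_shift(template, observed):
    """Estimate the pixel shift of `observed` relative to `template`.

    Hypothetical helper: assumes both arrays have the same length.
    A positive result means features in `observed` sit at larger
    pixel indices than the same features in `template`.
    """
    t = np.asarray(template, dtype=float)
    o = np.asarray(observed, dtype=float)
    if t.shape != o.shape:
        raise ValueError("this sketch assumes equal-length spectra")
    # Remove the continuum level so the zero padding no longer dominates.
    t = t - t.mean()
    o = o - o.mean()
    xcorr = np.correlate(o, t, mode="full")
    lags = np.arange(-(len(t) - 1), len(t))  # lag for each element of xcorr
    return int(lags[np.argmax(xcorr)])

# Synthetic spectra: continuum near 100 with two absorption lines,
# displaced by 50 pixels in the "observed" spectrum.
rng = np.random.default_rng(1)   # arbitrary fixed seed
pix = np.arange(500)

def line(center, depth):
    return depth * np.exp(-((pix - center) ** 2) / (2 * 3.0 ** 2))

template = 100 - line(200, 30) - line(320, 20) + rng.normal(0, 0.5, pix.size)
observed = 100 - line(250, 30) - line(370, 20) + rng.normal(0, 0.5, pix.size)

print(estimate_shift(template, observed))
```

Using mode='full' here avoids having to work out which lags 'same' keeps, at the cost of a longer output array.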

Edit: In response to your questions in the comments, I thought I'd include a graphic to make the point I'm trying to describe a little clearer.

Say we have two vectors of values, not entirely unlike your spectra, each with some large offset from zero.

# Generate two noisy, but correlated series
import numpy as np

t = np.linspace(0,250,250)
f = 10*np.exp(-((t-90)**2)/8) + np.random.randn(250) + 40   # spike near t=90
g = 10*np.exp(-((t-180)**2)/8) + np.random.randn(250) + 40  # spike near t=180

f has a spike around t=90, and g has a spike around t=180. So we expect the correlation of g and f to have a spike around a lag of 90 timesteps (or frequency bins, or whatever the argument of the functions you're correlating is).

But in order to get an output that is the same shape as our inputs, as in np.correlate(g,f,mode='same'), we have to "pad" g on either side with half its length in zeros (zeros are what np.correlate assumes; to use other values you would have to pad the arrays yourself). If we don't pad g (as in np.correlate(g,f,mode='valid')), we will only get one value in return (the correlation with zero offset), because f and g are the same length, and there is no room to shift one of the signals relative to the other.
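One practical consequence of that padding is that an index in the 'same' output is not itself a lag. The sketch below is an illustration under the assumption that both inputs have the same length n; the helper name `same_mode_lags` is hypothetical. It uses two unit impulses placed 90 samples apart to show the mapping from output index to lag.

```python
import numpy as np

def same_mode_lags(n):
    """Lag for each index of np.correlate(a, v, 'same') when len(a) == len(v) == n.

    Hypothetical helper for illustration: zero lag lands at index n // 2.
    """
    return np.arange(n) - n // 2

n = 250
f_imp = np.zeros(n)
g_imp = np.zeros(n)
f_imp[90] = 1.0    # impulse where f has its spike
g_imp[180] = 1.0   # impulse where g has its spike

xc = np.correlate(g_imp, f_imp, mode="same")
peak_index = int(np.argmax(xc))
peak_lag = int(same_mode_lags(n)[peak_index])
print(peak_index, peak_lag)
```

For n = 250 the zero-lag element sits at index 125, so a lag of 90 appears at index 215 of the output.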

When you calculate the correlation of g and f after that padding, you find that it peaks when the non-zero portions of the signals align completely, that is, when there is no offset between the original f and g. This is because the RMS value of the signals is so much higher than zero: the value of the correlation depends much more strongly on the number of elements overlapping at this high RMS level than on the relatively small fluctuations each function has around it. We can remove this large contribution to the correlation by subtracting the RMS level from each series. In the graph below, the gray line on the right shows the cross-correlation of the two series before zero-centering, and the teal line shows the cross-correlation after. The gray line is, like your first attempt, triangular, following the overlap of the two non-zero signals. The teal line better reflects the correlation between the fluctuations of the two signals, as we desired.

import matplotlib.pyplot as plt

# Cross-correlate before and after removing the baseline level of about 40
xcorr = np.correlate(g,f,'same')
xcorr_rms = np.correlate(g-40,f-40,'same')
fig, axes = plt.subplots(5,2,figsize=(18,18),gridspec_kw={'width_ratios':[5,2]})
for n, axis in enumerate(axes):
    # Position of f relative to the zero-padded g; the panel title shows offset-125
    offset = (0,75,125,215,250)[n]
    fp = np.pad(f,[offset,250-offset],mode='constant',constant_values=0.)
    gp = np.pad(g,[125,125],mode='constant',constant_values=0.)

    # Left panel: the padded signals, with the overlap region shaded blue
    axis[0].plot(fp,color='purple',lw=1.65)
    axis[0].plot(gp,color='orange',lw=1.65)
    axis[0].axvspan(max(125,offset),min(375,offset+250),color='blue',alpha=0.06)
    axis[0].axvspan(0,max(125,offset),color='brown',alpha=0.03)
    axis[0].axvspan(min(375,offset+250),500,color='brown',alpha=0.03)
    if n==0:
        axis[0].legend(['f','g'])
    axis[0].set_title('offset={}'.format(offset-125))

    # Right panel: both cross-correlations, with the current offset marked
    axis[1].plot(xcorr/(40*40),color='gray')
    axis[1].plot(xcorr_rms,color='teal')
    axis[1].axvline(offset,color='maroon',lw=5,alpha=0.5)
    if n == 0:
        axis[1].legend([r"$g \star f$",r"$g' \star f'$","offset"],loc='upper left')

plt.show()
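If you would rather check this numerically than read it off a plot, the following sketch regenerates f and g with a fixed seed (my addition, so that the run is repeatable) and compares where the cross-correlation peaks before and after subtracting each series' mean. The conversion from index to lag assumes equal-length inputs in 'same' mode.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, an arbitrary choice
t = np.linspace(0, 250, 250)
f = 10 * np.exp(-((t - 90) ** 2) / 8) + rng.standard_normal(250) + 40
g = 10 * np.exp(-((t - 180) ** 2) / 8) + rng.standard_normal(250) + 40

# For equal-length inputs in 'same' mode, zero lag sits at index len // 2.
lags = np.arange(250) - 250 // 2

raw = np.correlate(g, f, mode="same")
centered = np.correlate(g - g.mean(), f - f.mean(), mode="same")

raw_lag = int(lags[np.argmax(raw)])
centered_lag = int(lags[np.argmax(centered)])
print("peak lag without centering:", raw_lag)
print("peak lag after subtracting the mean:", centered_lag)
```

Without centering, the peak should sit at zero lag because the triangle from the baseline dominates; after centering it should land close to the 90-sample separation between the two spikes, give or take a sample or two of noise.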
