用numpy计算向量的自相关 [英] Computing autocorrelation of vectors with numpy

查看：163 发布时间：2020/5/18 23:38:35 python numpy statistics stochastic-process

本文介绍了用numpy计算向量的自相关的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在努力提出一种使用numpy来计算一组3D向量中的自相关函数的有效方法.

I'm struggling to come up with a non-obfuscating, efficient way of using numpy to compute a self correlation function in a set of 3D vectors.

我在3d空间中有一组向量，保存在数组中

I have a set of vectors in a 3d space, saved in an array

a = array([[ 0.24463039,  0.58350592,  0.77438803],
       [ 0.30475903,  0.73007075,  0.61165238],
       [ 0.17605543,  0.70955876,  0.68229821],
       [ 0.32425896,  0.57572195,  0.7506    ],
       [ 0.24341381,  0.50183697,  0.83000565],
       [ 0.38364726,  0.62338687,  0.68132488]])

它们的自相关函数定义为

their self correlation function is defined as

如果上面的图像不可用，该公式也会显示在下面: C(t，{v} n)= \ frac 1 {nt} \ sum {i = 0} ^ {n-1-t} \ vec v_i \ cdot \ vec v_ {i + t }

in case the image above doesn't stay available, the formula is also printed below: C(t,{v}n) = \frac 1{n-t}\sum{i=0}^{n-1-t}\vec v_i\cdot\vec v_{i+t}

我正在努力以一种高效，无混淆的方式对此进行编码.我可以使用两个嵌套的for循环来显式地进行计算，但这很慢.通过使用numpy中的一个嵌入式函数，有一种快速的方法，但是它们似乎在使用完全不同的相关函数定义.此处已解决了类似的问题，如何使用numpy.进行自相关?，但是不能处理向量.你知道我该怎么解决吗?

I'm struggling to code this up in an efficient way, non-obfuscating way. I can compute this expliticly with two nested for loops, but that's slow. There is a fast way of doing it by using one of the embedded functions in numpy, but they seem to be using an entirely different definition of correlation function. A similar problem has been solved here, How can I use numpy.correlate to do autocorrelation? but that doesn't handle vectors. Do you know how can I solve this?

推荐答案

NumPy例程用于一维数组.作为最小"的改进，请对归一化步骤使用向量化操作(在前至后一行中使用np.arange):

The NumPy routines are for 1D arrays. As a "minimal" improvement, use a vectorized operation for the normalisation step (use of np.arange in the before-to-last line):

def vector_autocorrelate(t_array):
  n_vectors = len(t_array)
  # correlate each component indipendently
  acorr = np.array([np.correlate(t_array[:,i],t_array[:,i],'full') for i in xrange(3)])[:,n_vectors-1:]
  # sum the correlations for each component
  acorr = np.sum(acorr, axis = 0)
  # divide by the number of values actually measured and return
  acorr /= (n_vectors - np.arange(n_vectors))
  return acorr

对于更大的数组，应该考虑使用傅里叶变换算法进行关联.如果您对此库感兴趣，请查看库 tidynamics 的示例(免责声明:我写了该库，它仅取决于NumPy.

For larger array sizes, you should consider using the Fourier Transform algorithm for correlation. Check out the examples of the library tidynamics if you are interested in that (disclaimer: I wrote the library, it depends only on NumPy).

作为参考，以下是NumPy代码(我为测试编写的代码)，您的代码vector_autocorrelate和tidynamics的时间安排.

For reference, here are the timings for a NumPy-code (that I wrote for testing), your code vector_autocorrelate and tidynamics.

size = [2**i for i in range(4, 17, 2)]

np_time = []
ti_time = []
va_time = []
for s in size:
    data = np.random.random(size=(s, 3))
    t0 = time.time()
    correlate = np.array([np.correlate(data[:,i], data[:,i], mode='full') for i in range(data.shape[1])])[:,:s]
    correlate = np.sum(correlate, axis=0)/(s-np.arange(s))
    np_time.append(time.time()-t0)
    t0 = time.time()
    correlate = tidynamics.acf(data)
    ti_time.append(time.time()-t0)
    t0 = time.time()
    correlate = vector_autocorrelate(data)
    va_time.append(time.time()-t0)

您可以看到结果:

print("size", size)
print("np_time", np_time)
print("va_time", va_time)
print("ti_time", ti_time)

大小[16、64、256、1024、4096、16384、65536]

size [16, 64, 256, 1024, 4096, 16384, 65536]

np_time [0.00023794174194335938，0.0002703666687011719，0.0002713203430175781， 0.001544952392578125、0.0278470516204834、0.36094141006469727、6.922360420227051]

np_time [0.00023794174194335938, 0.0002703666687011719, 0.0002713203430175781, 0.001544952392578125, 0.0278470516204834, 0.36094141006469727, 6.922360420227051]

va_time [0.00021696090698242188，0.0001690387725830078，0.000339508056640625，0.0014629364013671875，0.024930953979492188，0.34442687034606934，7.005630731582642]

va_time [0.00021696090698242188, 0.0001690387725830078, 0.000339508056640625, 0.0014629364013671875, 0.024930953979492188, 0.34442687034606934, 7.005630731582642]

ti_time [0.0011148452758789062，0.0008449554443359375，0.0007512569427490234， 0.0010488033294677734、0.0026645660400390625、0.007939338684082031、0.048232316970825195]

ti_time [0.0011148452758789062, 0.0008449554443359375, 0.0007512569427490234, 0.0010488033294677734, 0.0026645660400390625, 0.007939338684082031, 0.048232316970825195]

或绘制它们

plt.plot(size, np_time)
plt.plot(size, va_time)
plt.plot(size, ti_time)
plt.loglog()

对于非常小的数据序列，"N ** 2"算法不可用.

For anything but very small data series, the "N**2" algorithm is unusable.

这篇关于用numpy计算向量的自相关的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

用numpy计算向量的自相关 [英] Computing autocorrelation of vectors with numpy

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

用numpy计算向量的自相关 [英] Computing autocorrelation of vectors with numpy

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭