Power Spectrum and Autocorrelation of Data in Numpy


Question

I am interested in computing the power spectrum of a system of particles (~100,000) in 3D space with Python. What I have found so far is a group of functions in Numpy (fft, fftn, ...) that compute the discrete Fourier transform, the squared absolute value of which is the power spectrum. My question is about how my data should be represented, and truthfully it may be fairly simple to answer.

The data structure I have is an array of shape (n, 3), n being the number of particles, with the columns holding the x, y, and z coordinates of the n particles. I believe the function I should be using is fftn(), which takes the discrete Fourier transform of an n-dimensional array, but it says nothing about the format. How should the data be represented as a data structure to be fed into fftn?

Here is what I've tried so far to test the function:

import numpy as np
import random
import matplotlib.pyplot as plt

DATA = np.zeros((100,3))

for i in range(len(DATA)):
    DATA[i,0] = random.uniform(-1,1)
    DATA[i,1] = random.uniform(-1,1)
    DATA[i,2] = random.uniform(-1,1)

FFT = np.fft.fftn(DATA)
PS = abs(FFT)**2

plt.plot(PS)
plt.show()

The array named DATA is a mock array; the real one will ultimately be 100,000 by 3 in shape. The output of the code gives me something like:

[Plot of PS from the code above; image not included]

As you can see, I think this is giving me three 1D power spectra (one for each column of my data), but really I'd like a power spectrum as a function of radius.

Does anybody have any advice, or know of alternative methods or packages, to compute the power spectrum? (I'd even settle for the two-point autocorrelation function.)

Recommended answer

It doesn't quite work the way you are setting it out...

You need a function, let's call it f(x, y, z), that describes the density of mass in space. In your case, you can consider the galaxies as point masses, so you will have a delta function centered at the location of each galaxy. It is for this function that you can calculate the three-dimensional autocorrelation, from which you could calculate the power spectrum.
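In symbols, the relations being used can be sketched as follows. The notation is introduced here for illustration and is not part of the original answer: it assumes N equal-mass points at positions r_i, writes C for the autocorrelation and P for the power spectrum, and uses the convention that the Fourier transform carries the factor exp(-i k·r).

```latex
\begin{align}
f(\mathbf{r}) &= \sum_{i=1}^{N} \delta^{3}(\mathbf{r} - \mathbf{r}_i) \\
C(\mathbf{s}) &= \int f(\mathbf{r})\, f(\mathbf{r} + \mathbf{s})\, d^{3}r \\
P(\mathbf{k}) &= \bigl| \hat{f}(\mathbf{k}) \bigr|^{2}
             = \int C(\mathbf{s})\, e^{-i \mathbf{k} \cdot \mathbf{s}}\, d^{3}s
\end{align}
```

The last line is why the power spectrum and the autocorrelation can be obtained from each other with a single (inverse) Fourier transform.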

If you want to use numpy to do that for you, you are first going to have to discretize your function. A possible mock example would be:

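import numpy as np
import matplotlib.pyplot as plt

# Discretized density: number of points in each cell of a 100^3 grid.
space = np.zeros((100, 100, 100), dtype=np.uint8)

# 1000 mock points at random integer grid positions.
x, y, z = np.random.randint(100, size=(3, 1000))
# np.add.at accumulates repeated indices; `space[x, y, z] += 1` would not.
np.add.at(space, (x, y, z), 1)

# Power spectrum: squared magnitude of the 3D discrete Fourier transform.
space_ps = np.abs(np.fft.fftn(space))
space_ps *= space_ps

# Autocorrelation: inverse transform of the power spectrum,
# normalized so that the zero-lag value is 1.
space_ac = np.fft.ifftn(space_ps).real.round()
space_ac /= space_ac[0, 0, 0]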

space_ac now holds the three-dimensional autocorrelation function of the data set. This is not quite what you are after: to get your one-dimensional correlation function you have to average the values on spherical shells around the origin:

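# Squared distance to the origin along one axis, accounting for wrap-around.
dist = np.minimum(np.arange(100), np.arange(100, 0, -1))
dist *= dist
# Distance of every cell to the origin.
dist_3d = np.sqrt(dist[:, None, None] + dist[:, None] + dist)
# Group cells by distance; ravel so the inverse indices are 1D for bincount.
distances, inverse = np.unique(dist_3d.ravel(), return_inverse=True)
# Mean autocorrelation over all cells at each distance.
values = np.bincount(inverse, weights=space_ac.ravel()) / np.bincount(inverse)

# Skip the zero-lag point, which is 1 by construction.
plt.plot(distances[1:], values[1:])
plt.show()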
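The same shell-averaging idea can be applied in Fourier space to get the power spectrum as a function of radius that the question asks for, starting from real (n, 3) coordinates instead of integer grid indices. The following is only a sketch under stated assumptions: the function name `radial_power_spectrum` and its parameters `n_grid` and `box` are hypothetical, particles are binned into cells with `np.histogramdd`, shells are defined by rounding |k| to the nearest integer, and k is expressed in units of the fundamental mode of the box. It inherits the periodic wrap-around of the discrete Fourier transform.

```python
import numpy as np


def radial_power_spectrum(positions, n_grid=64, box=(-1.0, 1.0)):
    """Shell-averaged power spectrum of points given as an (n, 3) array.

    Hypothetical helper for illustration; `box` is the (low, high) extent
    assumed to be the same along x, y and z.
    """
    # Count particles per cell: this is the discretized density f(x, y, z).
    density, _ = np.histogramdd(positions, bins=n_grid, range=[box] * 3)

    # 3D power spectrum: squared magnitude of the discrete Fourier transform.
    power = np.abs(np.fft.fftn(density)) ** 2

    # Integer wavenumbers along one axis, then |k| for every cell.
    k_axis = np.fft.fftfreq(n_grid, d=1.0 / n_grid)
    kx, ky, kz = np.meshgrid(k_axis, k_axis, k_axis, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2)

    # Assign each cell to the nearest integer |k| shell and sum per shell.
    shell = np.rint(k_mag).astype(int).ravel()
    counts = np.bincount(shell)
    sums = np.bincount(shell, weights=power.ravel())

    # Keep only shells up to the Nyquist wavenumber, which are fully sampled.
    k_max = n_grid // 2
    k = np.arange(k_max + 1)
    return k, sums[: k_max + 1] / counts[: k_max + 1]


# Mock data with the same layout as the question: one row per particle.
rng = np.random.default_rng(0)
positions = rng.uniform(-1.0, 1.0, size=(1000, 3))
k, power_1d = radial_power_spectrum(positions, n_grid=32)
```

The k = 0 entry is just the squared number of particles, so it is usually left out of a plot, for example `plt.plot(k[1:], power_1d[1:])`.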

There is another issue with doing things yourself this way: when you compute the power spectrum as above, mathematically it is as if your three-dimensional array wrapped around at the borders, i.e. point [99, y, z] is a neighbour of [0, y, z]. So your autocorrelation could show two very distant galaxies as close neighbours. The simplest way to deal with this is to make your array twice as large along every dimension, padding with extra zeros, and then discard the extra data.
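A minimal sketch of the zero-padding approach described above, assuming a small 16-cell grid and 200 mock points to keep it fast (both numbers are arbitrary choices for illustration). The array is padded to twice its size with `np.pad`, the autocorrelation is computed through the power spectrum as before, and only the lags that fit inside the original grid are kept.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16  # assumed grid size, kept small for the example

# Mock density grid, built the same way as in the answer's example.
space = np.zeros((n, n, n))
x, y, z = rng.integers(n, size=(3, 200))
np.add.at(space, (x, y, z), 1)

# Pad with n zeros at the end of every axis: shape becomes (2n, 2n, 2n).
padded = np.pad(space, [(0, n)] * 3)

# Autocorrelation of the padded array via its power spectrum.
padded_ps = np.abs(np.fft.fftn(padded)) ** 2
padded_ac = np.fft.ifftn(padded_ps).real

# Discard the extra data: lags 0..n-1 along each axis are now free of
# wrap-around contributions.
linear_ac = padded_ac[:n, :n, :n]

# For comparison, the periodic autocorrelation without padding.
circular_ac = np.fft.ifftn(np.abs(np.fft.fftn(space)) ** 2).real
```

Both results agree at zero lag, but at any other lag `circular_ac` also counts pairs that are only close across the border of the array, while `linear_ac` does not.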

Alternatively, you could use scipy.ndimage.filters.correlate (available as scipy.ndimage.correlate in recent SciPy releases) with mode='constant' to do all the dirty work for you.
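One way this could look, as a sketch rather than a recommendation for full-size data: the grid is correlated with itself using `scipy.ndimage.correlate`, and `mode="constant"` with `cval=0.0` treats everything outside the array as empty, so no wrap-around occurs. The 5-cell grid and 20 mock points are assumptions chosen to keep the example tiny, because the cost of a direct correlation grows very quickly with grid size.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
n = 5  # deliberately tiny and odd, so the zero-lag cell is the exact center

# Mock density grid with a few points in it.
small = np.zeros((n, n, n))
x, y, z = rng.integers(n, size=(3, 20))
np.add.at(small, (x, y, z), 1)

# Correlate the grid with itself; cells outside the array count as zero.
ac_direct = ndimage.correlate(small, small, mode="constant", cval=0.0)

# The zero-lag value sits at the central cell of the output.
c = n // 2
zero_lag = ac_direct[c, c, c]

# Normalize so that the zero-lag value is 1, as in the FFT-based version.
ac_normalized = ac_direct / zero_lag
```

Note that the output has the same shape as the input, so it only covers lags from -(n // 2) to n // 2 along each axis, with lag zero in the middle rather than at index [0, 0, 0].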
