Finding the correspondence of data from one data set in the other
Question
I have a catalogue of data that I want to use in my MCMC code. Speed is crucial, because the lookup must not slow down my Markov chain Monte Carlo sampling.
The problem:
In the catalogue, the first and second columns hold two parameters called ra and dec, which are sky coordinates:
import numpy as np

data = np.loadtxt('Final.Cluster.Shear.NegligibleShotNoise.Redshift.cat')
ra = data[:, 0]   # first column
dec = data[:, 1]  # second column
The seventh and eighth columns hold the X and Y positions, i.e. the grid coordinates; they are points in a grid space:
Xpos=data[:,6]
Ypos=data[:,7]
The function I have written needs to be called about a million times. I will pass one pair of Xcenter and Ycenter positions (for example Xcenter=200.6, Ycenter=310.9) as inputs to the function, and I want to find the corresponding points in the ra and dec columns. However, the inputs might not correspond exactly to any entry in the catalogue. In that case I want to interpolate, obtaining ra and dec coordinates based on the real ra and dec entries in the catalogue.
Recommended answer
This is a perfect case for the scipy.spatial.cKDTree class, which can query all the points at once:
from scipy.spatial import cKDTree
k = cKDTree(data[:, 6:8])  # build the KD-tree from the Xpos and Ypos columns
xyCenters = np.array([[200.6, 310.9],
                      [300, 300],
                      [400, 400]])
print(k.query(xyCenters))
# (array([ 1.59740195, 1.56033234, 0.56352196]),
#  array([ 2662, 22789, 5932]))
Here [2662, 22789, 5932] are the indices of the catalogue entries closest to each of the three points given in xyCenters, and the first array holds the corresponding distances. You can use these indices to get your ra and dec values very efficiently using np.take():
dists, indices = k.query(xyCenters)
myra = np.take(ra, indices)
mydec = np.take(dec, indices)
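The lookup above returns the ra and dec of the single nearest catalogue entry. If an interpolated value is wanted instead, one possible approach (not part of the original answer) is to query several neighbours and average them with inverse-distance weights. The sketch below is only an illustration under stated assumptions: the catalogue is replaced by a synthetic grid, and the function name interp_radec, the choice k=4, and the eps guard are all hypothetical. It also ignores the wrap-around of ra at 0/360 degrees, which may matter for real sky coordinates.

```python
import numpy as np
from scipy.spatial import cKDTree

# Synthetic stand-in for the catalogue (assumption): a regular grid on which
# ra and dec are simple linear functions of X and Y.
gx, gy = np.meshgrid(np.arange(0.0, 50.0), np.arange(0.0, 50.0))
Xpos, Ypos = gx.ravel(), gy.ravel()
ra = 10.0 + 0.01 * Xpos
dec = -5.0 + 0.02 * Ypos

# Build the tree once, outside the function that the sampler calls.
tree = cKDTree(np.column_stack([Xpos, Ypos]))


def interp_radec(xy, k=4, eps=1e-12):
    """Inverse-distance-weighted ra/dec at query points xy of shape (n, 2)."""
    xy = np.atleast_2d(np.asarray(xy, dtype=float))
    dists, idx = tree.query(xy, k=k)
    # For k=1 the result is 1-D, so force the shape (n, k) in every case.
    dists = dists.reshape(len(xy), -1)
    idx = idx.reshape(len(xy), -1)
    # Clamp zero distances so an exact hit dominates instead of dividing by 0.
    weights = 1.0 / np.maximum(dists, eps)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * ra[idx]).sum(axis=1), (weights * dec[idx]).sum(axis=1)


# First point lies between grid nodes, second one sits exactly on a node.
my_ra, my_dec = interp_radec([[20.5, 31.5], [20.0, 31.0]])
print(my_ra, my_dec)
```

Because the tree is built only once, each call inside the sampler costs just the neighbour query plus a few array operations. Passing many centres in one call, as in the answer above, should be cheaper still than calling the function once per point.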