Finding the correspondence of data from one data set in the other

Question

I have a catalogue of data that I want to use in my MCMC code. Speed is crucial: the lookup must not slow down my Markov chain Monte Carlo sampling. The problem: the first and second columns of the catalogue hold two parameters, ra and dec, which are sky coordinates:

import numpy as np

# Load the catalogue; columns 0 and 1 hold the sky coordinates
data = np.loadtxt('Final.Cluster.Shear.NegligibleShotNoise.Redshift.cat')
ra = data[:, 0]
dec = data[:, 1]

The seventh and eighth columns hold the X and Y positions, i.e. the grid coordinates of points in a grid space:

Xpos=data[:,6]
Ypos=data[:,7]

The function I have written needs to be called about a million times. Its inputs are one pair of Xcenter and Ycenter positions (for example Xcenter=200.6, Ycenter=310.9), and I want to find the corresponding point in the ra and dec columns. However, the inputs may not match any actual X and Y entry in the catalogue. In that case I want to interpolate, and obtain ra and dec for the input position based on the real ra and dec entries in the catalogue.

Recommended answer

This is a perfect case for the scipy.spatial.cKDTree() class, which can query all the points at once:

import numpy as np
from scipy.spatial import cKDTree

k = cKDTree(data[:, 6:8]) # creating the KDtree using the Xpos and Ypos

xyCenters = np.array([[200.6, 310.9],
                      [300, 300],
                      [400, 400]])
print(k.query(xyCenters))
# (array([ 1.59740195,  1.56033234,  0.56352196]),
#  array([ 2662, 22789,  5932]))

where [2662, 22789, 5932] are the indices of the catalogue rows closest to each of the three points given in xyCenters, and the first array holds the corresponding distances. You can use these indices to get your ra and dec values very efficiently using np.take():

dists, indices = k.query(xyCenters)
myra = np.take(ra, indices)
mydec = np.take(dec, indices)
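
The lookup above returns the ra and dec of the single nearest catalogue entry, which is not quite the interpolation the question asks for. One possible way to interpolate, sketched below, is to query the k nearest neighbours and average their ra and dec with inverse-distance weights. This is an assumption about what kind of interpolation is acceptable, not part of the original answer; the helper name interpolate_ra_dec, the choice k=4 and the synthetic 10 x 10 grid standing in for the catalogue are all hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def interpolate_ra_dec(tree, ra, dec, xy_centers, k=4, eps=1e-12):
    """Inverse-distance-weighted ra/dec at each query point.

    Hypothetical helper, not from the original answer. Requires k >= 2 so
    that tree.query returns 2-D arrays of shape (n_queries, k).
    """
    xy_centers = np.atleast_2d(xy_centers)
    dists, idx = tree.query(xy_centers, k=k)
    # Clamp distances so an exact match gets a huge weight instead of
    # causing a division by zero.
    weights = 1.0 / np.maximum(dists, eps)
    weights /= weights.sum(axis=1, keepdims=True)
    ra_interp = (weights * ra[idx]).sum(axis=1)
    dec_interp = (weights * dec[idx]).sum(axis=1)
    return ra_interp, dec_interp


# Synthetic stand-in for the catalogue (assumption): a regular grid whose
# ra and dec vary linearly with X and Y.
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
xy = np.column_stack([gx.ravel(), gy.ravel()])
ra = 10.0 + 0.01 * xy[:, 0]
dec = -5.0 + 0.02 * xy[:, 1]

# Build the tree once, outside the sampling loop, and reuse it per call.
tree = cKDTree(xy)

# First query sits between grid points, second one hits a grid point exactly.
ra_i, dec_i = interpolate_ra_dec(tree, ra, dec, [[2.5, 3.5], [3.0, 4.0]])
print(ra_i, dec_i)
```

Building the tree is the expensive step, so it should happen once before the MCMC run; each call inside the chain then only pays for the query. Note that a plain weighted average of ra is only meaningful if the catalogue does not straddle the 0/360 degree wrap-around.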
