Extract interpolated values from a 2D array based on a large set of xy points

Problem description

I have a reasonably large 1000 x 4000 pixel xr.DataArray returned from an OpenDataCube query, and a large set (> 200,000) of xy point values. I need to sample the array to return a value under each xy point, and return interpolated values (e.g. if the point lands halfway between a 0 and a 1.0 pixel, the value returned should be 0.5).

xr.interp lets me easily sample interpolated values, but it returns a huge matrix of every combination of all the x and y values, rather than just the values for each xy point itself. I've tried using np.diagonal to extract just the xy point values, but this is slow, very quickly runs into memory issues and feels inefficient given I still need to wait for every combination of values to be interpolated via xr.interp.
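To make the grid behaviour concrete, here is a minimal sketch on a tiny array. The names `small`, `xs`, `ys` and `grid` are made up for illustration, and it assumes scipy is installed, since `interp` relies on it. Passing plain NumPy arrays pairs every x with every y, so n points produce an n x n result (for 200,000 points that would be 4 x 10^10 values), and only the diagonal holds the values that were actually wanted.

```python
import numpy as np
import xarray as xr

# Small stand-in for the real raster: 4 rows (y) by 5 columns (x),
# with values equal to 5 * y + x so the interpolated results are easy to check
small = xr.DataArray(
    np.arange(20, dtype=np.float32).reshape(4, 5),
    coords={"y": [0.0, 1.0, 2.0, 3.0], "x": [0.0, 1.0, 2.0, 3.0, 4.0]},
    dims=["y", "x"],
)

# Three sample points given as plain NumPy arrays (illustrative values)
xs = np.array([0.5, 1.5, 3.0])
ys = np.array([0.5, 2.0, 2.5])

# Plain arrays are treated as independent axes, so every x is paired with every y
grid = small.interp(x=xs, y=ys)
print(grid.shape)  # 3 points in, 3 x 3 values out

# Only the diagonal corresponds to the original (x, y) pairs
print(np.diagonal(grid.values))
```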

Reproducible example

Using just 10,000 sample points (ideally, I need something that can scale to > 200,000 or more):

import numpy as np
import xarray as xr

# Create sample array
width, height = 1000, 4000
val_array = xr.DataArray(data=np.random.randint(0, 10, size=(height, width)).astype(np.float32),
                         coords={'x': np.linspace(3000, 5000, width),
                                 'y': np.linspace(-3000, -5000, height)}, dims=['y', 'x'])

# Create sample points
n = 10000
x_points = np.random.randint(3000, 5000, size=n)
y_points = np.random.randint(-5000, -3000, size=n)

Current approach

%%timeit

# ATTEMPT 1
np.diagonal(val_array.interp(x=x_points, y=y_points).squeeze().values)
32.6 s ± 1.01 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

Does anyone know of a faster or more memory efficient way to achieve this?

Recommended answer

To avoid the full grid, you need to introduce a new dimension.

x = xr.DataArray(x_points, dims='z')
y = xr.DataArray(y_points, dims='z')
val_array.interp(x=x, y=y)

This gives you an array along just the new z dimension:

<xarray.DataArray (z: 10000)>
array([4.368132, 2.139781, 5.693636, ..., 3.7505  , 3.713589, 2.28494 ])
Coordinates:
    x        (z) int64 4647 4471 4692 3942 3468 ... 3040 3993 3027 4427 3749
    y        (z) int64 -3744 -4074 -3634 -3289 -3221 ... -4195 -4131 -4814 -3362
Dimensions without coordinates: z

36.9 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
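As a sanity check, the following sketch compares the pointwise result with the diagonal of the full grid on a deliberately small array, where the grid is still cheap to compute. The array size, the seed and the names `small`, `xs`, `ys`, `pts` and `diag` are assumptions chosen for illustration, and scipy needs to be installed for `interp` to work.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Small array laid out like the one in the question (descending y, ascending x)
small = xr.DataArray(
    rng.random((40, 30)),
    coords={"y": np.linspace(-3000, -5000, 40),
            "x": np.linspace(3000, 5000, 30)},
    dims=["y", "x"],
)

# A handful of sample points inside the coordinate range
n = 50
xs = rng.uniform(3000, 5000, n)
ys = rng.uniform(-5000, -3000, n)

# Pointwise: both indexers share the new dimension "z", so they are paired up
pts = small.interp(x=xr.DataArray(xs, dims="z"),
                   y=xr.DataArray(ys, dims="z"))

# Full grid followed by the diagonal, as in the question's first attempt
diag = np.diagonal(small.interp(x=xs, y=ys).values)

same = np.allclose(pts.values, diag)
print(pts.dims, pts.shape, same)
```

If the two agree, the shared-dimension call can replace the `np.diagonal` approach directly, without ever building the n x n intermediate.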

See "Advanced Interpolation" in the xarray documentation.
