Numpy histogram on multi-dimensional array

Problem description

Given an np.array of shape (n_days, n_lat, n_lon), I'd like to compute a histogram with fixed bins for each lat-lon cell (i.e. the distribution of daily values).

A simple solution to the problem is to loop over the cells and invoke np.histogram for each cell::

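import numpy as np

n_days, n_lat, n_lon = 365, 20, 30  # example sizes (assumed, not from the question)
bins = np.linspace(0, 1.0, 10)      # 10 edges -> 9 bins
n_bins = len(bins) - 1
B = np.random.rand(n_days, n_lat, n_lon)
H = np.zeros((n_bins, n_lat, n_lon), dtype=np.int32)
for lat in range(n_lat):
    for lon in range(n_lon):
        # histogram of the daily values of one lat-lon cell
        H[:, lat, lon] = np.histogram(B[:, lat, lon], bins=bins)[0]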

But this is quite slow. Is there a more efficient solution that does not involve a loop?

I looked into np.searchsorted to get the bin indices for each value in B and then use fancy indexing to update H::

bin_indices = bins.searchsorted(B)
H[bin_indices.ravel(), idx[0], idx[1]] += 1  # where idx is an index grid given by np.indices
# note: code not tested

But this does not work because the in-place add operator (+=) doesn't seem to support multiple updates of the same cell.
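The duplicate-index problem described here is what np.add.at is meant for: it performs an unbuffered in-place add, so an index that occurs several times is incremented several times. The following is only a sketch of that idea, not the asker's code. The array sizes, the random data and the names lat_idx and lon_idx are assumptions made for illustration, and the clipping step assumes every value lies inside the bin range (np.histogram would drop out-of-range values instead).

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_lat, n_lon = 50, 4, 5          # hypothetical sizes
bins = np.linspace(0, 1.0, 10)           # 10 edges -> 9 bins
n_bins = len(bins) - 1
B = rng.random((n_days, n_lat, n_lon))   # values in [0, 1)

# Bin index of every value, using np.histogram's half-open convention
# [edge_i, edge_i+1). Clipping puts a value equal to the last edge into
# the last bin; it would also pull out-of-range values into the edge bins.
bin_indices = np.clip(np.searchsorted(bins, B, side="right") - 1, 0, n_bins - 1)

# Index grids with the same shape as B, one per lat/lon axis.
_, lat_idx, lon_idx = np.indices(B.shape)

H = np.zeros((n_bins, n_lat, n_lon), dtype=np.int32)
# Unbuffered add: each repeated (bin, lat, lon) triple contributes its own +1.
np.add.at(H, (bin_indices, lat_idx, lon_idx), 1)
```

Every value is counted exactly once, so H.sum() equals B.size, and H can be checked against the looped np.histogram version on a small array.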

thx, Peter

Recommended answer

You can use numpy.apply_along_axis to eliminate the loop.

# np.histogram returns (counts, edges); keep only the counts so the result is a regular array
hist = np.apply_along_axis(lambda x: np.histogram(x, bins=bins)[0], 0, B)
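Note that np.apply_along_axis removes the loop from the source code but still calls the function once per cell in Python, so it is unlikely to be much faster than the explicit loop. A fully vectorized alternative, sketched below as one possible approach rather than part of the original answer, is to compute a bin index for every value, shift it by a per-cell offset, and count everything with a single np.bincount call. The sizes, the random data and the names flat, offsets and n_cells are illustrative assumptions, and the sketch assumes all values lie inside the bin range.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_lat, n_lon = 50, 4, 5          # hypothetical sizes
bins = np.linspace(0, 1.0, 10)           # 10 edges -> 9 bins
n_bins = len(bins) - 1
B = rng.random((n_days, n_lat, n_lon))   # values in [0, 1)

# Flatten the lat/lon grid so each column is one cell's time series.
flat = B.reshape(n_days, -1)             # shape (n_days, n_cells)
n_cells = flat.shape[1]

# Bin index per value, same half-open convention as np.histogram.
idx = np.clip(np.searchsorted(bins, flat, side="right") - 1, 0, n_bins - 1)

# Give each cell its own block of n_bins counters, then count in one pass.
offsets = np.arange(n_cells) * n_bins    # shape (n_cells,), broadcast over days
counts = np.bincount((idx + offsets).ravel(), minlength=n_cells * n_bins)

# (n_cells, n_bins) -> (n_bins, n_lat, n_lon), the layout of H in the question.
hist = counts.reshape(n_cells, n_bins).T.reshape(n_bins, n_lat, n_lon)
```

The offset trick works because cell c only ever produces indices in the range [c * n_bins, (c + 1) * n_bins), so the counters of different cells never overlap.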
