Efficiently calculate histogram of a 3D numpy array along an axis with different bin edges


Problem description

I have a 3D numpy array, denoted `data`, of shape N x R x C, i.e. N samples, R rows and C columns. I would like to obtain a histogram along the column axis for each combination of sample and row. However, the bin edges (see the `bins` argument of `numpy.histogram`) differ from row to row but are shared across samples. Each row has an edge sequence of the same fixed length: S + 1 edges for S bins, as in the code below. For example, for the 1st sample (`data[0]`), the bin edge sequence for its 1st row differs from that for its 2nd row, but is the same as that for the 1st row of the 2nd sample (`data[1]`). All the bin edge sequences are therefore stored in a 2D numpy array of shape R x (S + 1), denoted `bin_edges`.
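To make the layout concrete, here is a small illustration with made-up sizes. The toy values of N, R, C and S and the evenly spaced edges are arbitrary assumptions, not taken from the post; S is taken to be the number of bins, so each row has S + 1 edges.

```python
import numpy as np

N, R, C, S = 2, 3, 8, 4  # arbitrary toy sizes (assumption)
rng = np.random.default_rng(0)
data = rng.standard_normal((N, R, C))

# One edge sequence per row: S bins need S + 1 edges.
bin_edges = np.stack(
    [np.linspace(-3.0, 3.0, S + 1) * (r + 1) for r in range(R)]
)  # shape (R, S + 1)

# Row 0 of sample 0 and row 0 of sample 1 share bin_edges[0] ...
h00, _ = np.histogram(data[0, 0], bins=bin_edges[0])
h10, _ = np.histogram(data[1, 0], bins=bin_edges[0])
# ... while row 1 of any sample uses a different sequence, bin_edges[1].
h01, _ = np.histogram(data[0, 1], bins=bin_edges[1])
```

The desired result collects one such length-S count vector for every (sample, row) pair, giving an array of shape N x R x S.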

My question is: how can these histograms be computed efficiently?

Using `numpy.histogram`, I came up with a solution that works but is fairly slow, shown in the code snippet below.

```python
"""
Get dummy data

    N: number of samples
    R: number of rows (or kernels)
    C: number of columns (or pixels)
    S: number of bins
"""
import numpy as np

N, R, C, S = 100, 50, 1000, 10
data = np.random.randn(N, R, C)

# for each row/kernel, pool pixels of all samples
poolsamples = np.swapaxes(data, 0, 1).reshape(R, -1)
# use quantiles as bin edges (S bins -> S + 1 edges per row)
percentiles = np.linspace(0, 100, num=(S + 1))
bin_edges = np.transpose(np.percentile(poolsamples, percentiles, axis=1))

"""
A working but slow solution of getting histograms along column
"""
hist = np.empty((N, R, S))
for idx in range(R):
    bin_edges_i = bin_edges[idx, :]
    # one np.histogram call per sample for this row
    counts = np.apply_along_axis(
        lambda a: np.histogram(a, bins=bin_edges_i)[0],
        1, data[:, idx, :])
    hist[:, idx, :] = counts
```
Possible directions
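One direction that might be worth exploring (a sketch, not taken from the original post): keep the loop over the R rows, but replace the per-sample `np.histogram` calls with a single `np.searchsorted` over all samples of that row, followed by one `np.bincount`. The function name `hist_along_columns` is hypothetical. The sketch assumes `bin_edges` has shape R x (S + 1) with increasing edges in each row and that `data` contains no NaNs. It is intended to follow the `np.histogram` convention that the last bin includes its right edge and that values outside the edges are dropped.

```python
import numpy as np

def hist_along_columns(data, bin_edges):
    """Histogram data (N, R, C) along its last axis with per-row edges (R, S + 1).

    Hypothetical helper; returns integer counts of shape (N, R, S).
    """
    N, R, C = data.shape
    S = bin_edges.shape[1] - 1
    hist = np.zeros((N, R, S), dtype=np.intp)
    # Offset per sample so that each sample gets its own block of S slots.
    sample_offset = S * np.arange(N)[:, None]  # shape (N, 1)
    for r in range(R):
        edges = bin_edges[r]
        x = data[:, r, :]  # shape (N, C)
        # Bin index i such that edges[i] <= x < edges[i + 1].
        idx = np.searchsorted(edges, x, side="right") - 1
        # The last bin is closed on the right, as in np.histogram.
        idx[x == edges[-1]] = S - 1
        # Drop values that fall outside [edges[0], edges[-1]].
        valid = (idx >= 0) & (idx < S)
        flat = (idx + sample_offset)[valid]
        hist[:, r, :] = np.bincount(flat, minlength=N * S).reshape(N, S)
    return hist
```

With the dummy data above, `hist_along_columns(data, bin_edges)` should give the same counts as the loop-based version, though as integers rather than floats. Whether it is actually faster depends on N, R, C and S, so it is worth timing both on the real data.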
