Numpy: Finding minimum and maximum values from associations through binning

Problem description

This is a question derived from this post. So, some of the introduction of the problem will be similar to that post.

Let's say result is a 2D array and values is a 1D array. values holds some values associated with each element in result. The mapping of an element in values to result is stored in x_mapping and y_mapping. A position in result can be associated with different values. Now, I have to find the minimum and maximum of the values grouped by associations.

An example for better clarification.

min_result array:

[[0, 0],
[0, 0],
[0, 0],
[0, 0]]

max_result array:

[[0, 0],
[0, 0],
[0, 0],
[0, 0]]

values array:

[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.]

Note: Here result arrays and values have the same number of elements. But it might not be the case. There is no relation between the sizes at all.

x_mapping and y_mapping have mappings from 1D values to 2D result(both min and max). The sizes of x_mapping, y_mapping and values will be the same.

x_mapping-[0, 1, 0, 0, 0, 0, 0, 0]

y_mapping-[0, 3, 2, 2, 0, 3, 2, 1]

Here, the 1st value (values[0]) and the 5th value (values[4]) both have x as 0 and y as 0 (x_mapping[0], y_mapping[0] and x_mapping[4], y_mapping[4]) and are hence associated with result[0, 0]. If we compute the minimum and maximum of this group, we get 1 and 5 respectively. So, min_result[0, 0] will have 1 and max_result[0, 0] will have 5.

Note that if there is no association at all then the default value for result will be zero.

import numpy as np

x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 1])
values = np.array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.], dtype=np.float32)
max_result = np.zeros([4, 2], dtype=np.float32)
min_result = np.zeros([4, 2], dtype=np.float32)
# seed each mapped cell with one of its own values so the "<" test below can work
min_result[-y_mapping, x_mapping] = values
for i in range(values.size):
    x = x_mapping[i]
    y = y_mapping[i]
    # maximum
    if values[i] > max_result[-y, x]:
        max_result[-y, x] = values[i]
    # minimum
    if values[i] < min_result[-y, x]:
        min_result[-y, x] = values[i]

min_result

[[1., 0.],
[6., 2.],
[3., 0.],
[8., 0.]]

max_result

[[5., 0.],
[6., 2.],
[7., 0.],
[8., 0.]]

Failed solutions

#1

min_result = np.zeros([4, 2], dtype=np.float32)
np.minimum.reduceat(values, [-y_mapping, x_mapping], out=min_result)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-126de899a90e> in <module>()
1 min_result = np.zeros([4, 2], dtype=np.float32)
----> 2 np.minimum.reduceat(values, [-y_mapping, x_mapping], out=min_result)

ValueError: object too deep for desired array

#2

min_result = np.zeros([4, 2], dtype=np.float32)
np.minimum.reduceat(values, lidx, out= min_result)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-07e8c75ccaa5> in <module>()
1 min_result = np.zeros([4, 2], dtype=np.float32)
----> 2 np.minimum.reduceat(values, lidx, out= min_result)

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (4,2)->(4,) (8,)->() (8,)->(8,) 

#3

lidx = ((-y_mapping) % 4) * 2 + x_mapping #from mentioned post
min_result = np.zeros([8], dtype=np.float32)
np.minimum.reduceat(values, lidx, out= min_result).reshape(4,2)

[[1., 4.],
[5., 5.],
[1., 3.],
[5., 7.]]
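
A possible reading of why attempt #3 produces wrong numbers: np.ufunc.reduceat interprets its index argument as slice boundaries into values, not as group labels. The sketch below reuses the arrays from the question and only spells out which slices get reduced; the variable name wrong is introduced here for illustration.

```python
import numpy as np

x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 1])
values = np.array([1., 2., 3., 4., 5., 6., 7., 8.], dtype=np.float32)

# Linear index of each value's target cell in the flattened (4, 2) result
lidx = ((-y_mapping) % 4) * 2 + x_mapping   # [0, 3, 4, 4, 0, 2, 4, 6]

# reduceat reduces values[lidx[i]:lidx[i+1]] for each i.
# Where lidx[i] >= lidx[i+1] it simply returns values[lidx[i]];
# the last entry reduces values[lidx[-1]:].
# So the slices are values[0:3], values[3:4], values[4], values[4],
# values[0:2], values[2:4], values[4:6], values[6:].
wrong = np.minimum.reduceat(values, lidx).reshape(4, 2)
print(wrong)
```

Because lidx is neither sorted nor unique, the slices have nothing to do with the groups, which is why the data has to be sorted by lidx before reduceat can be applied.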

Question

How to use np.minimum.reduceat and np.maximum.reduceat for solving this problem? I'm looking for a solution that is optimised for runtime.

I'm using Numpy version 1.14.3 with Python 3.5.2

Recommended answer

Approach #1

Again, the most intuitive approach is numpy.ufunc.at. Since these reductions are performed against the values already present in the output, we need to initialize the mapped output positions with the maximum of values for the minimum reduction and with the minimum of values for the maximum reduction. Hence, the implementation would be -

min_result[-y_mapping, x_mapping] = values.max()
max_result[-y_mapping, x_mapping] = values.min()

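# pass the indices as a tuple of arrays; a list is not treated as a
# multi-dimensional index in recent NumPy versions
np.minimum.at(min_result, (-y_mapping, x_mapping), values)
np.maximum.at(max_result, (-y_mapping, x_mapping), values)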
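
For reference, here is a self-contained sketch of this approach on the sample data from the question. It assumes the outputs start as zero arrays of shape (4, 2), as in the question, and the name idx is introduced here only for brevity.

```python
import numpy as np

x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 1])
values = np.array([1., 2., 3., 4., 5., 6., 7., 8.], dtype=np.float32)

min_result = np.zeros((4, 2), dtype=np.float32)
max_result = np.zeros((4, 2), dtype=np.float32)

# Tuple of index arrays: (row indices, column indices)
idx = (-y_mapping, x_mapping)

# Seed only the mapped cells; unmapped cells keep the default 0
min_result[idx] = values.max()
max_result[idx] = values.min()

# Unbuffered in-place reductions: repeated indices are all applied
np.minimum.at(min_result, idx, values)
np.maximum.at(max_result, idx, values)

print(min_result)
print(max_result)
```

On this input the two arrays match the min_result and max_result shown in the question.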

Approach #2

To leverage np.ufunc.reduceat, we need to sort data -

m,n = max_result.shape
out_dtype = max_result.dtype
lidx = ((-y_mapping)%m)*n + x_mapping

sidx = lidx.argsort()
idx = lidx[sidx]
val = values[sidx]

m_idx = np.flatnonzero(np.r_[True,idx[:-1] != idx[1:]])
unq_ids = idx[m_idx]

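# output arrays (assumed here to start as zeros, the default for unmapped cells)
max_result_out = np.zeros((m, n), dtype=out_dtype)
min_result_out = np.zeros((m, n), dtype=out_dtype)

max_result_out.flat[unq_ids] = np.maximum.reduceat(val, m_idx)
min_result_out.flat[unq_ids] = np.minimum.reduceat(val, m_idx)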
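
The sort-then-reduceat steps can be wrapped into one function, as in the sketch below. The function name binned_min_max and its shape parameter are hypothetical, not part of the original answer. It assumes values is non-empty and that unmapped cells should stay at zero.

```python
import numpy as np

def binned_min_max(values, x_mapping, y_mapping, shape):
    """Per-cell min and max of `values`, grouped by (-y, x) target cell."""
    m, n = shape
    # Linear index of each value's target cell; % m wraps the negative rows
    lidx = ((-y_mapping) % m) * n + x_mapping

    # Sort so that values sharing a cell become contiguous
    sidx = lidx.argsort()
    idx = lidx[sidx]
    val = values[sidx]

    # Start position of each run of equal cell indices
    m_idx = np.flatnonzero(np.r_[True, idx[:-1] != idx[1:]])
    unq_ids = idx[m_idx]

    min_out = np.zeros(shape, dtype=values.dtype)
    max_out = np.zeros(shape, dtype=values.dtype)
    # Sorted, strictly increasing starts make each reduceat slice one group
    min_out.flat[unq_ids] = np.minimum.reduceat(val, m_idx)
    max_out.flat[unq_ids] = np.maximum.reduceat(val, m_idx)
    return min_out, max_out

x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 1])
values = np.array([1., 2., 3., 4., 5., 6., 7., 8.], dtype=np.float32)

min_out, max_out = binned_min_max(values, x_mapping, y_mapping, (4, 2))
print(min_out)
print(max_out)
```

On the sample data this reproduces the min_result and max_result arrays from the question.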
