Resample 2D numpy array to arbitrary dimensions


Problem description



I am looking for a way to rescale a numpy 2D array to arbitrary dimensions in such a way that each cell in the rescaled array contains a weighted mean of all the cells that it (partially) covers.

I have found several methods to do this if the new dimensions are multiples of the original dimensions. For example, given a 4x4 array, this can be rescaled into a 2x2 array where the first cell is the mean of the 4 top-left cells in the original etc. But none of these methods seem to work for example when going from a 4x4 array to a 3x3 array.

This image illustrates what I'd like to do in the case of going from 4x4 (black grid) to 3x3 (red grid): https://www.dropbox.com/s/iutym4frcphcef2/regrid.png?dl=0

Cell (0,0) in the smaller array covers the entire cell (0,0) and parts of cells (1,0), (0,1) and (1,1). I'd like the new cell to contain the mean of these cells weighted by the areas of the yellow, green, blue and orange regions.
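As a concrete check of those weights, here is a minimal sketch for the 4x4 to 3x3 case. It assumes both grids span the unit square and uses exact fractions; `overlap_1d` is a hypothetical helper name, not something from numpy or scipy. Because the cells are axis-aligned rectangles, each 2D area weight is the product of two 1D overlaps.

```python
from fractions import Fraction

def overlap_1d(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(Fraction(0), min(a1, b1) - max(a0, b0))

n_old, n_new = 4, 3
# New cell 0 along one axis spans [0, 1/3]
new_lo, new_hi = Fraction(0), Fraction(1, n_new)

# Fraction of the new cell's width covered by each old cell along one axis
w = [overlap_1d(new_lo, new_hi, Fraction(k, n_old), Fraction(k + 1, n_old)) * n_new
     for k in range(n_old)]

# 2D weights for new cell (0,0) are products of the 1D weights
weights = {(i, j): w[i] * w[j] for i in range(2) for j in range(2)}
for cell, weight in sorted(weights.items()):
    print(cell, weight)
```

Under these assumptions the weights come out as 9/16 for cell (0,0), 3/16 each for (0,1) and (1,0), and 1/16 for (1,1), which sum to 1.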

Is there a way to do this with numpy/scipy? Is there a name for this type of regridding (that would help when searching for a method)?

Solution

Here you go:

It uses the Interval package to easily calculate the overlaps of the cells of the different grids, so you'll need to grab that.
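If installing the Interval package is not an option, the overlap of two axis-aligned rectangles can probably be computed with plain `min`/`max` instead. This is only a sketch: `overlap_area` is a hypothetical name, and it assumes the same `((x0, y0), (x1, y1))` rectangle layout that the `overlap` function in the listing below expects.

```python
def overlap_area(rect1, rect2):
    """Area shared by two axis-aligned rectangles given as ((x0, y0), (x1, y1))."""
    (ax0, ay0), (ax1, ay1) = rect1
    (bx0, by0), (bx1, by1) = rect2
    width = min(ax1, bx1) - max(ax0, bx0)
    height = min(ay1, by1) - max(ay0, by0)
    # Disjoint or merely touching rectangles share no area
    if width <= 0 or height <= 0:
        return 0.0
    return width * height

# Two 2x2 squares offset by (1, 1) share a 1x1 region
print(overlap_area(((0, 0), (2, 2)), ((1, 1), (3, 3))))
```

Unlike the Interval-based version, this returns zero for rectangles that do not intersect, so it does not rely on the caller skipping those pairs first.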

from matplotlib import pyplot
import numpy
from interval import Interval

def overlap(rect1, rect2):
  """Calculate the overlap between two rectangles"""
  xInterval = Interval(rect1[0][0], rect1[1][0]) & Interval(rect2[0][0], rect2[1][0])
  yInterval = Interval(rect1[0][1], rect1[1][1]) & Interval(rect2[0][1], rect2[1][1])
  area = (xInterval.upper_bound - xInterval.lower_bound) * (yInterval.upper_bound - yInterval.lower_bound)
  return area


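def meanInterp(data, m, n):
  """Resample data onto an m x n grid, each new cell being the area-weighted mean of the cells it overlaps"""
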
  newData = numpy.zeros((m,n))
  mOrig, nOrig = data.shape

  hBoundariesOrig, vBoundariesOrig = numpy.linspace(0,1,mOrig+1), numpy.linspace(0,1,nOrig+1)
  hBoundaries, vBoundaries = numpy.linspace(0,1,m+1), numpy.linspace(0,1,n+1)

  for iOrig in range(mOrig):
    for jOrig in range(nOrig):
      for i in range(m):
        if hBoundaries[i+1] <= hBoundariesOrig[iOrig]: continue
        if hBoundaries[i] >= hBoundariesOrig[iOrig+1]: break
        for j in range(n):
          if vBoundaries[j+1] <= vBoundariesOrig[jOrig]: continue
          if vBoundaries[j] >= vBoundariesOrig[jOrig+1]: break

          boxCoords = ((hBoundaries[i], vBoundaries[j]),(hBoundaries[i+1], vBoundaries[j+1]))
          origBoxCoords = ((hBoundariesOrig[iOrig], vBoundariesOrig[jOrig]),(hBoundariesOrig[iOrig+1], vBoundariesOrig[jOrig+1]))

          # hBoundaries[1] * vBoundaries[1] is the area of one new cell, so this accumulates a weighted mean
          newData[i][j] += overlap(boxCoords, origBoxCoords) * data[iOrig][jOrig] / (hBoundaries[1] * vBoundaries[1])

  return newData



fig = pyplot.figure()
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)

m1, n1 = 37,59
m2, n2 = 10,13

dataGrid1 = numpy.random.rand(m1, n1)
dataGrid2 = meanInterp(dataGrid1, m2, n2)

mat1 = ax1.matshow(dataGrid1, cmap="YlOrRd")
mat2 = ax2.matshow(dataGrid2, cmap="YlOrRd")

#make both plots square
ax1.set_aspect(float(n1)/float(m1))
ax2.set_aspect(float(n2)/float(m2))



pyplot.show()

Here are a couple of examples with differing grids:

Down sampling is possible too.

After having done this, I'm pretty sure all I've done is some form of image sampling. If you're looking to do this on large arrays, you'll need to make things a bit more efficient, as the nested Python loops will be pretty slow.
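On the efficiency point, one possible speed-up (a different technique from the loops above, not a drop-in copy of them) is to exploit the fact that the overlap area of two axis-aligned cells is the product of a row overlap and a column overlap. The whole resampling then reduces to two matrix multiplications. This is a minimal sketch assuming both grids cover the same extent with equally sized cells; `overlap_matrix` and `mean_resample` are hypothetical names.

```python
import numpy as np

def overlap_matrix(n_old, n_new):
    """(n_new, n_old) matrix; entry [i, k] is the fraction of new cell i covered by old cell k."""
    old = np.linspace(0.0, 1.0, n_old + 1)
    new = np.linspace(0.0, 1.0, n_new + 1)
    # Pairwise 1D overlap between every new cell and every old cell
    lo = np.maximum(new[:-1, None], old[None, :-1])
    hi = np.minimum(new[1:, None], old[None, 1:])
    # Negative values mean no overlap; scale by n_new to divide by the new cell width
    return np.clip(hi - lo, 0.0, None) * n_new

def mean_resample(data, m, n):
    """Area-weighted mean resampling of a 2D array onto an m x n grid."""
    data = np.asarray(data, dtype=float)
    rows = overlap_matrix(data.shape[0], m)
    cols = overlap_matrix(data.shape[1], n)
    return rows @ data @ cols.T

small = mean_resample(np.random.rand(37, 59), 10, 13)
print(small.shape)
```

Each row of an overlap matrix sums to 1, so every output cell is a weighted mean of the input cells it covers. The same function should handle both shrinking and enlarging the grid.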
