通过numpy的数组迭代,然后在另一个数组索引值 [英] Iterating through a numpy array and then indexing a value in another array

查看:1313
本文介绍了通过numpy的数组迭代,然后在另一个数组索引值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我努力让这个code工作,我想通过一个numpy的数组迭代,并根据结果,指数在另一numpy的数组中的值,然后保存在基于价值的新位置。

I am struggling to get this code to work I want to iterate through an numpy array and based on the result, index to a value in another numpy array and then save that in a new position based on that value.

     # Convert the sediment transport and the flow direction rasters into Numpy arrays
    sediment_transport_np = arcpy.RasterToNumPyArray(sediment_transport_convert, '#', '#', '#', -9999)
    flow_direction_np = arcpy.RasterToNumPyArray(flow_direction_convert, '#', '#', '#', -9999)

    [rows,cols]= sediment_transport_np.shape
    elevation_change = np.zeros((rows,cols), np.float)

   # Main body for calculating elevation change        

    # Attempt 1
    for [i, j], flow in np.ndenumerate(flow_direction_np):
        if flow == 32:
            elevation_change[i, j]  = sediment_transport_np[i - 1, j - 1]
        elif flow == 16:
            elevation_change[i, j] = sediment_transport_np[i, j - 1]
        elif flow == 8:
            elevation_change[i, j] = sediment_transport_np[i + 1, j - 1]
        elif flow == 4:
            elevation_change[i, j] = sediment_transport_np[i + 1, j]
        elif flow == 64:
            elevation_change[i, j] = sediment_transport_np[i - 1, j]
        elif flow == 128:
            elevation_change[i, j] = sediment_transport_np[i - 1, j + 1]
        elif flow == 1:
            elevation_change[i, j] = sediment_transport_np[i, j + 1]
        elif flow == 2:
            elevation_change[i, j] = sediment_transport_np[i + 1, j + 1]

转换的numpy的阵列回到光栅

    elevation_change_raster = arcpy.NumPyArrayToRaster(elevation_change, bottom_left_corner, raster_cell_width, raster_cell_height, -9999)

    elevation_change_raster.save(output_raster)

我得到的错误是:

The error I get is:

行书elevation_change ...

Running script elevation_change...

回溯(最近通话最后一个):文件,行606,在执行IndexError:指数(655)超出范围(0℃=指数< 655)的尺寸0

Traceback (most recent call last): File "", line 606, in execute IndexError: index (655) out of range (0<=index<655) in dimension 0

未能执行(elevation_change)

Failed to execute (elevation_change)

推荐答案

这个错误是因为你试图超越 sediment_transport 栅格的界限指标(例如第i + 1和j + 1的部分)。现在,你正在试图获得,当你在网格的边界是不存在的价值。此外,它不是引发错误,但你抓住当前相对边缘,当你在I = 0或j = 0(由于I-1和J-1的部分)。

Cause of the Problem

The error is because you're trying to index beyond the bounds of the sediment_transport grid (e.g. the i+1 and j+1 portions). Right now, you're trying to get a value that doesn't exist when you're at a boundary of the grid. Also, it's not raising an error, but you're currently grabbing the opposite edge when you're at i=0 or j=0 (due to the i-1 and j-1 parts).

您提到,您想 elevation_change 的值是在边界0(这当然似乎是合理的)。另一种常见的边界条件是包装的价值观和抓住从相对边缘的值。它可能没有多大意义,在这种情况下,但我会在一两个例子说明它,因为它很容易用一些方法来实现。

You mentioned that you wanted the values of elevation_change to be 0 at the boundaries (which certainly seems reasonable). Another common boundary condition is to "wrap" the values and grab a value from the opposite edge. It probably doesn't make much sense in this case, but I'll show it in a couple of examples because it's easy to implement with some of the methods.

这是诱人的,只是捕获异常并将其值设置为0。例如:

It's tempting to just catch the exception and set the value to 0. For example:

for [i, j], flow in np.ndenumerate(flow_direction_np):
    try:
        if flow == 32:
            ...
        elif ...
            ...
    except IndexError:
        elevation_change[i, j] = 0

但是,这种方法实际上是不正确。负索引是有效的,并且将返回格的相对边缘。因此,这将基本上实现在右侧和电网的底部边缘的零的边界条件,并在左和顶部边缘的环绕式边界条件

However, this approach is actually incorrect. Negative indexing is valid, and will return the opposite edge of the grid. Therefore, this would basically implement a "zero" boundary condition on the right and bottom edges of the grid, and a "wrap-around" boundary condition on the left and top edges.

在零的边界条件的情况下,有一个很简单的方法来避免索引问题:垫在 sediment_transport 电网零。这样,如果我们的索引超出了原有电网的优势,我们会得到一个0(或任何恒定值要垫阵列)。

In the case of "zero" boundary conditions, there's a very simple way to avoid indexing problems: Pad the sediment_transport grid with zeros. This way, if we index beyond the edge of the original grid, we'll get a 0. (Or whatever constant value you'd like to pad the array with.)

附注:这是使用 numpy.pad 的理想场所。然而,它在V1.7溶液。我要在这里用它来跳过,因为OP提到的ArcGIS和圆弧不与先进的最新版本numpy的出货。

Side note: This is the perfect place to use numpy.pad. However, it was added in v1.7. I'm going to skip using it here, as the OP mentions ArcGIS, and Arc doesn't ship with an up-to-date version of numpy.

例如:

padded_transport = np.zeros((rows + 2, cols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport  
# The two lines above could be replaced with:
#padded_transport = np.pad(sediment_transport, 1, mode='constant')

for [i, j], flow in np.ndenumerate(flow_direction):
    # Need to take into account the offset in the "padded_transport"
    r, c = i + 1, j + 1

    if flow == 32:
        elevation_change[i, j] = padded_transport[r - 1, c - 1]
    elif flow == 16:
        elevation_change[i, j] = padded_transport[r, c - 1]
    elif flow == 8:
        elevation_change[i, j] = padded_transport[r + 1, c - 1]
    elif flow == 4:
        elevation_change[i, j] = padded_transport[r + 1, c]
    elif flow == 64:
        elevation_change[i, j] = padded_transport[r - 1, c]
    elif flow == 128:
        elevation_change[i, j] = padded_transport[r - 1, c + 1]
    elif flow == 1:
        elevation_change[i, j] = padded_transport[r, c + 1]
    elif flow == 2:
        elevation_change[i, j] = padded_transport[r + 1, c + 1]

DRY(不要重复自己)

我们可以用更简洁写这篇code有点字典

elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16:  (0, -1), 
          8:   (1, -1),
          4:   (1,  0),
          64: (-1,  0),
          128:(-1,  1),
          1:   (0,  1),
          2:   (1,  1)}

padded_transport = np.zeros((nrows + 2, ncols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport  

for [i, j], flow in np.ndenumerate(flow_direction):
    # Need to take into account the offset in the "padded_transport"
    r, c = i + 1, j + 1
    # This also allows for flow_direction values not listed above...
    dr, dc = lookup.get(flow, (0,0))
    elevation_change[i,j] = padded_transport[r + dr, c + dc]

此时,它的原始阵列的位多余的垫。实施不同的边界条件,如果您使用的填充是非常容易 numpy.pad ,但我们可以刚出直接写的逻辑:

At this point, it's a bit superfluous to pad the original array. Implement different boundary conditions by padding is very easy if you use numpy.pad, but we could just write the logic out directly:

elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16:  (0, -1), 
          8:   (1, -1),
          4:   (1,  0),
          64: (-1,  0),
          128:(-1,  1),
          1:   (0,  1),
          2:   (1,  1)}

for [i, j], flow in np.ndenumerate(flow_direction):
    dr, dc = lookup.get(flow, (0,0))
    r, c = i + dr, j + dc
    if not ((0 <= r < nrows) & (0 <= c < ncols)):
        elevation_change[i,j] = 0
    else:
        elevation_change[i,j] = sediment_transport[r, c]

矢量化的计算

通过迭代在python numpy的阵列是原因,我不会深入到这里相当缓慢。因此,也有在numpy的实现该更有效的方法。诀窍是使用 numpy.roll 与布尔索引一起。

有关环绕式的边界条件,它是那样简单:

For "wrap-around" boundary conditions, it's as simple as:

elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16:  (0, -1), 
          8:   (1, -1),
          4:   (1,  0),
          64: (-1,  0),
          128:(-1,  1),
          1:   (0,  1),
          2:   (1,  1)}

for value, (row, col) in lookup.iteritems():
    mask = flow_direction == value
    shifted = np.roll(mask, row, 0)
    shifted = np.roll(shifted, col, 1)
    elevation_change[mask] = sediment_transport[shifted]

return elevation_change

如果你不熟悉numpy的,这可能看起来有点像希腊。有两个部分此。首先是使用布尔索引。由于这是什么做一个简单的例子:

If you're not familiar with numpy, this probably looks a bit like greek. There are two parts to this. The first is using boolean indexing. As a quick example of what this does:

In [1]: import numpy as np

In [2]: x = np.arange(9).reshape(3,3)

In [3]: x
Out[3]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [4]: mask = np.array([[False, False, True],
...                      [True, False, False],
...                      [True, False, False]])


In [5]: x[mask]
Out[5]: array([2, 3, 6])

正如你所看到的,如果我们的索引具有相同形状的布尔网格阵列,它是真正的价值,将被退回。同样,您可以设置的值这个样子。

As you can see, if we index an array with a boolean grid of the same shape, the values where it is True, will be returned. Similarly, you can set values this way.

接下来的诀窍是 numpy.roll 。这将值由给定的量在一个方向上移动。他们将环绕式的边缘。

The next trick is numpy.roll. This will shift the values by a given amount in one direction. They'll "wrap-around" at the edges.

In [1]: import numpy as np

In [2]: np.array([[0,0,0],[0,1,0],[0,0,0]])
Out[2]: 
array([[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]])

In [3]: x = _

In [4]: np.roll(x, 1, axis=0)
Out[4]: 
array([[0, 0, 0],
       [0, 0, 0],
       [0, 1, 0]])

In [5]: np.roll(x, 1, axis=1)
Out[5]: 
array([[0, 0, 0],
       [0, 0, 1],
       [0, 0, 0]])

希望这使得有点感觉,在任何速度。

Hopefully that makes a bit of sense, at any rate.

要实现(使用 numpy.pad 或任意边界条件),我们会做这样的事零的边界条件:

To implement "zero" boundary conditions (or arbitrary boundary conditions using numpy.pad), we'd do something like this:

def vectorized(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    lookup = {32: (-1, -1),
              16:  (0, -1), 
              8:   (1, -1),
              4:   (1,  0),
              64: (-1,  0),
              128:(-1,  1),
              1:   (0,  1),
              2:   (1,  1)}

    # Initialize an array for the "shifted" mask
    shifted = np.zeros((nrows+2, ncols+2), dtype=bool)

    # Pad "sediment_transport" with zeros
    # Again, `np.pad` would be better and more flexible here, as it would
    # easily allow lots of different boundary conditions...
    tmp = np.zeros((nrows+2, ncols+2), sediment_transport.dtype)
    tmp[1:-1, 1:-1] = sediment_transport
    sediment_transport = tmp

    for value, (row, col) in lookup.iteritems():
        mask = flow_direction == value

        # Reset the "shifted" mask
        shifted.fill(False)
        shifted[1:-1, 1:-1] = mask

        # Shift the mask by the right amount for the given value
        shifted = np.roll(shifted, row, 0)
        shifted = np.roll(shifted, col, 1)

        # Set the values in elevation change to the offset value in sed_trans
        elevation_change[mask] = sediment_transport[shifted]

    return elevation_change

优势向矢量

在矢量的版本快很多,但是会使用更多的内存。

Advantage to Vectorization

The "vectorized" version is much faster, but will use more RAM.

对于1000通过1000网:

For a 1000 by 1000 grid:

In [79]: %timeit vectorized(flow_direction, sediment_transport)
10 loops, best of 3: 170 ms per loop

In [80]: %timeit iterate(flow_direction, sediment_transport)
1 loops, best of 3: 5.36 s per loop

In [81]: %timeit lookup(flow_direction, sediment_transport)
1 loops, best of 3: 3.4 s per loop

这些结果是从比较用随机生成的数据如下实现:

These results are from comparing the following implementations with randomly generated data:

import numpy as np

def main():
    # Generate some random flow_direction and sediment_transport data...
    nrows, ncols = 1000, 1000
    flow_direction = 2 ** np.random.randint(0, 8, (nrows, ncols))
    sediment_transport = np.random.random((nrows, ncols))

    # Make sure all of the results return the same thing...
    test1 = vectorized(flow_direction, sediment_transport)
    test2 = iterate(flow_direction, sediment_transport)
    test3 = lookup(flow_direction, sediment_transport)
    assert np.allclose(test1, test2)
    assert np.allclose(test2, test3)


def vectorized(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    sediment_transport = np.pad(sediment_transport, 1, mode='constant')
    lookup = {32: (-1, -1),
              16:  (0, -1), 
              8:   (1, -1),
              4:   (1,  0),
              64: (-1,  0),
              128:(-1,  1),
              1:   (0,  1),
              2:   (1,  1)}

    for value, (row, col) in lookup.iteritems():
        mask = flow_direction == value
        shifted = np.pad(mask, 1, mode='constant')
        shifted = np.roll(shifted, row, 0)
        shifted = np.roll(shifted, col, 1)
        elevation_change[mask] = sediment_transport[shifted]

    return elevation_change

def iterate(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    padded_transport = np.pad(sediment_transport, 1, mode='constant')  

    for [i, j], flow in np.ndenumerate(flow_direction):
        r, c = i + 1, j + 1
        if flow == 32:
            elevation_change[i, j] = padded_transport[r - 1, c - 1]
        elif flow == 16:
            elevation_change[i, j] = padded_transport[r, c - 1]
        elif flow == 8:
            elevation_change[i, j] = padded_transport[r + 1, c - 1]
        elif flow == 4:
            elevation_change[i, j] = padded_transport[r + 1, c]
        elif flow == 64:
            elevation_change[i, j] = padded_transport[r - 1, c]
        elif flow == 128:
            elevation_change[i, j] = padded_transport[r - 1, c + 1]
        elif flow == 1:
            elevation_change[i, j] = padded_transport[r, c + 1]
        elif flow == 2:
            elevation_change[i, j] = padded_transport[r + 1, c + 1]
    return elevation_change

def lookup(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    lookup = {32: (-1, -1),
              16:  (0, -1), 
              8:   (1, -1),
              4:   (1,  0),
              64: (-1,  0),
              128:(-1,  1),
              1:   (0,  1),
              2:   (1,  1)}

    for [i, j], flow in np.ndenumerate(flow_direction):
        dr, dc = lookup.get(flow, (0,0))
        r, c = i + dr, j + dc
        if not ((0 <= r < nrows) & (0 <= c < ncols)):
            elevation_change[i,j] = 0
        else:
            elevation_change[i,j] = sediment_transport[r, c]

    return elevation_change

if __name__ == '__main__':
    main()

这篇关于通过numpy的数组迭代,然后在另一个数组索引值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆