How to program a stencil with Dask


Question

Scientists often simulate a system's dynamics using a stencil, that is, by convolving a mathematical operator over a grid. This operation commonly consumes a lot of computational resources. Here is a good explanation of the idea.

In NumPy, the canonical way of programming a 2D 5-point stencil is as follows:

# Skip the border cells so that i-1, i+1, j-1 and j+1 stay inside the grid.
for i in range(1, rows - 1):
    for j in range(1, cols - 1):
        grid[i, j] = (grid[i, j] + grid[i-1, j] + grid[i+1, j] + grid[i, j-1] + grid[i, j+1]) / 5

Or, more efficiently, using slicing:

grid[1:-1,1:-1] = ( grid[1:-1,1:-1] + grid[0:-2,1:-1] + grid[2:,1:-1] + grid[1:-1,0:-2] + grid[1:-1,2:] ) / 5

However, if the grid is really big it won't fit in memory, and if the convolution operation is really complicated it will take a very long time. Parallel programming techniques are used to overcome these problems, or simply to get the result faster. Tools like Dask let scientists program these simulations themselves, in an almost transparently parallel manner. Currently, Dask doesn't support item assignment, so how can I program a stencil with Dask?

Recommended answer

Nice question. You're correct that dask.array provides parallel computing but doesn't support item assignment. We can solve stencil computations by writing a function that operates on one block of NumPy data at a time, and then mapping that function across our array with slightly overlapping boundaries.

You should write a function that takes a NumPy array and returns a new NumPy array with the stencil applied. It should not modify the original array.

import numpy as np

def apply_stencil(x):
    out = np.empty_like(x)  # new array, so the input x is left untouched
    ...  # do arbitrary computations on out
    return out
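
As a concrete illustration, here is a minimal sketch of such a function for the 5-point average from the question. The stencil body and the choice to leave border cells unchanged are assumptions made for this example, not part of the original answer.

```python
import numpy as np

def apply_stencil(x):
    """Return a new array with a 5-point average applied to the interior cells."""
    # astype() copies by default, so x is never modified; border cells keep their values.
    out = x.astype(float)
    out[1:-1, 1:-1] = (
        x[1:-1, 1:-1] + x[:-2, 1:-1] + x[2:, 1:-1] + x[1:-1, :-2] + x[1:-1, 2:]
    ) / 5
    return out

grid = np.zeros((3, 3))
grid[1, 1] = 5.0
result = apply_stencil(grid)  # grid itself still holds 5.0 at its center
```

Because every value on the right-hand side is read from x and written to out, each cell is computed from the old grid only, which matches the behavior of the slicing version.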



Map a function with overlapping regions

Dask arrays operate in parallel by breaking an array into disjoint chunks of smaller arrays. Stencil computations need a little overlap between neighboring chunks, because the cells on a chunk's edge depend on cells stored in the adjacent chunk. Fortunately, this is handled by the dask.array.ghost module (renamed dask.array.overlap in later Dask releases), and by the dask.array.map_overlap method in particular.

In fact, the example in the map_overlap docstring is a 1D finite-difference computation, in which each element has its predecessor subtracted from it:

>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
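
The same pattern extends to the 2D 5-point stencil from the question. The following is a minimal sketch rather than a definitive implementation: the grid size, the chunk size and the boundary="reflect" choice are assumptions picked for illustration.

```python
import numpy as np
import dask.array as da

def apply_stencil(x):
    """5-point average on the interior of one (halo-padded) block; x is not modified."""
    out = x.copy()
    out[1:-1, 1:-1] = (
        x[1:-1, 1:-1] + x[:-2, 1:-1] + x[2:, 1:-1] + x[1:-1, :-2] + x[1:-1, 2:]
    ) / 5
    return out

# Hypothetical 8x8 grid split into four 4x4 chunks.
grid_np = np.random.default_rng(0).random((8, 8))
grid = da.from_array(grid_np, chunks=(4, 4))

# depth=1 gives every chunk a one-cell halo taken from its neighbors;
# boundary="reflect" pads the outer edges of the whole array.
# The halo is trimmed away again after apply_stencil has run.
result = grid.map_overlap(apply_stencil, depth=1, boundary="reflect").compute()
```

Since each block arrives padded by one cell, the block's interior is exactly the original chunk, so every cell of the grid gets updated. Cells on the outer edge of the whole array are computed from the padded values, unlike the NumPy slicing version, which leaves them untouched. To advance several time steps, the map_overlap call can be applied repeatedly to its own result before calling compute.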
