dask performance apply along axis


Question

I am trying to compute the linear trend over time on a large high resolution ocean model dataset using dask.

I have followed this example (Applying a function along an axis of a dask array) and found the syntax of apply_along_axis easier.

I am currently using dask.array.apply_along_axis to wrap a numpy function that operates on 1-dimensional arrays, and then package the resulting dask array into an xarray DataArray. Watching top -u <username> suggests that the computation is not executed in parallel (~100% CPU use).

Should I expect better performance from map_blocks? Or are there any suggestions on how to improve the performance of apply_along_axis? Any tips are highly appreciated.

import numpy as np
import xarray as xr
import dask.array as dsa

def _lin_trend(y):
    """Fit a straight line to a 1-D series; returns [slope, intercept]."""
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)



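def linear_trend(da, dim, name='parameter'):
    da = da.copy()
    axis_num = da.get_axis_num(dim)

    # the fitted dimension is replaced by a length-2 dimension `name`
    dims = list(da.dims)
    dims[axis_num] = name
    # keep only coordinates that do not depend on the fitted dimension
    coords = {k: v for k, v in da.coords.items() if dim not in v.dims}
    coords[name] = ['slope', 'intercept']

    dsk = da.data
    # fit along the axis that corresponds to `dim`
    dsk_trend = dsa.apply_along_axis(_lin_trend, axis_num, dsk)
    out = xr.DataArray(dsk_trend, dims=dims, coords=coords)
    return out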
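
On the map_blocks question, here is a minimal sketch rather than a benchmark. np.polyfit accepts a 2-D y and fits every column in one call, so each block can be fitted without a Python-level loop over gridpoints. The sketch assumes the time axis is axis 0 and sits in a single chunk; _slope_block and the toy array are hypothetical names and data, not part of the code above.

```python
import numpy as np
import dask.array as dsa


def _slope_block(block):
    """Slope of a least-squares line along axis 0 for each gridpoint in a block.

    Hypothetical helper; assumes the whole time axis is inside this block.
    """
    nt = block.shape[0]
    x = np.arange(nt)
    # polyfit fits each column of a 2-D y independently, so flatten the
    # non-time axes into columns, fit once, and restore the spatial shape.
    flat = block.reshape(nt, -1)
    slope = np.polyfit(x, flat, 1)[0]
    return slope.reshape(block.shape[1:])


# Toy data: a known slope of 2 at every gridpoint, time on axis 0.
nt, ny, nx = 20, 4, 6
t = np.arange(nt)[:, None, None]
values = 2.0 * t + np.ones((nt, ny, nx))
data = dsa.from_array(values, chunks=(nt, 2, 3))  # time kept in one chunk

# drop_axis=0 tells dask that each output block has lost the time axis.
trend = dsa.map_blocks(_slope_block, data, dtype=float, drop_axis=0)
result = trend.compute()
```

Whether this beats apply_along_axis on the real dataset is untested here; if reading from disk dominates the run time, neither variant may help much.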

Recommended answer

I think that ultimately the performance is limited by the filesystem I am working on. To answer your question though, my dataset has the following shape:

<xarray.Dataset>
Dimensions:         (st_edges_ocean: 51, st_ocean: 50, time: 101, xt_ocean: 3600, yt_ocean: 2700)
Coordinates:
  * xt_ocean        (xt_ocean) float64 -279.9 -279.8 -279.7 -279.6 -279.5 ...
  * yt_ocean        (yt_ocean) float64 -81.11 -81.07 -81.02 -80.98 -80.94 ...
  * st_ocean        (st_ocean) float64 5.034 15.1 25.22 35.36 45.58 55.85 ...
  * st_edges_ocean  (st_edges_ocean) float64 0.0 10.07 20.16 30.29 40.47 ...
  * time            (time) float64 3.634e+04 3.671e+04 3.707e+04 3.744e+04 ...

So it is rather large and takes a long time to read from disk. I have rechunked it so that the time dimension is a single chunk:

dask.array<concatenate, shape=(101, 50, 2700, 3600), dtype=float64, 
chunksize=(101, 1, 270, 3600)>

That did not make a big difference to performance: the function still takes about 20 hours to finish, including reading from and writing to disk. I am currently only chunking in time, e.g.

dask.array<concatenate, shape=(101, 50, 2700, 3600), dtype=float64, 
chunksize=(1, 1, 2700, 3600)>
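
To put the two chunk layouts above into numbers, the following sketch does the arithmetic from the shapes in the dask reprs, assuming 8 bytes per float64 value. It works out to roughly 393 GB in total, about 785 MB per chunk (500 chunks) with time in a single chunk, and about 78 MB per chunk (5050 chunks) when chunking in time. The helper name chunk_stats is made up for this illustration.

```python
from math import prod

ITEMSIZE = 8  # bytes per float64 value

shape = (101, 50, 2700, 3600)          # (time, st_ocean, yt_ocean, xt_ocean)
chunk_full_time = (101, 1, 270, 3600)  # time in a single chunk
chunk_per_step = (1, 1, 2700, 3600)    # one chunk per time step and level


def chunk_stats(shape, chunks):
    """Return (bytes per chunk, number of chunks) for a regular chunking."""
    nbytes = prod(chunks) * ITEMSIZE
    # Ceiling division, in case a chunk size does not divide the axis evenly.
    nchunks = prod(-(-s // c) for s, c in zip(shape, chunks))
    return nbytes, nchunks


total_bytes = prod(shape) * ITEMSIZE
full_time_bytes, full_time_count = chunk_stats(shape, chunk_full_time)
per_step_bytes, per_step_count = chunk_stats(shape, chunk_per_step)

print(f"total:           {total_bytes / 1e9:.0f} GB")
print(f"time in one:     {full_time_bytes / 1e6:.0f} MB x {full_time_count} chunks")
print(f"chunked in time: {per_step_bytes / 1e6:.0f} MB x {per_step_count} chunks")
```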

I was interested in the relative performance of both methods and ran a test on my laptop.

import xarray as xr
import numpy as np
from scipy import stats
import dask.array as dsa

slope = 10
intercept = 5
t = np.arange(250)
x = np.arange(10)
y = np.arange(500)
z = np.arange(200)
chunks = {'x':10, 'y':10}

noise = np.random.random([len(x), len(y), len(z), len(t)])
ones = np.ones_like(noise)
time = ones*t
data = (time*slope+intercept)+noise
da = xr.DataArray(data, dims=['x', 'y', 'z', 't'],
                 coords={'x':('x', x),
                        'y':('y', y),
                        'z':('z', z),
                        't':('t', t)})
da = da.chunk(chunks)
da
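
As a quick sanity check of what a fit on this synthetic data should return, the sketch below builds a single time series the same way (line plus uniform noise) and fits it. Because the noise is uniform on [0, 1) with mean 0.5, the fitted slope should come out close to 10 and the intercept close to 5.5 rather than 5. The seeded generator is an assumption added only to make the sketch repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

slope, intercept = 10, 5
t = np.arange(250)
# One gridpoint's time series, built like the test data: line plus U[0, 1) noise.
series = t * slope + intercept + rng.random(len(t))

fit_slope, fit_intercept = np.polyfit(t, series, 1)
print(fit_slope, fit_intercept)
```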

I then defined a set of private functions (using both linregress and polyfit to calculate the slope of a time series), as well as different implementations using dask.array.apply_along_axis and xarray.apply_ufunc.

def _calc_slope_poly(y):
    """ufunc to be used by linear_trend"""
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)[0]


def _calc_slope(y):
    '''returns the slope from a linear regression fit of x and y'''
    x = np.arange(len(y))
    return stats.linregress(x, y)[0]

def linear_trend_along(da, dim):
    """computes linear trend over 'dim' from the da.
       Returns a dask array holding the slope of the least square fit
       (via linregress) for each gridpoint; the 'dim' axis is reduced away.
    """
    da = da.copy()
    axis_num = da.get_axis_num(dim)
    trend = dsa.apply_along_axis(_calc_slope, axis_num, da.data)
    return trend

def linear_trend_ufunc(obj, dim):
    trend = xr.apply_ufunc(_calc_slope, obj,
                           vectorize=True,
                           input_core_dims=[[dim]],
                           output_core_dims=[[]],
                           output_dtypes=[float],
                           dask='parallelized')

    return trend

def linear_trend_ufunc_poly(obj, dim):
    trend = xr.apply_ufunc(_calc_slope_poly, obj,
                           vectorize=True,
                           input_core_dims=[[dim]],
                           output_core_dims=[[]],
                           output_dtypes=[float],
                           dask='parallelized')

    return trend

def linear_trend_along_poly(da, dim):
    """computes linear trend over 'dim' from the da.
       Returns a dask array holding the slope of the least square fit
       (via polyfit) for each gridpoint; the 'dim' axis is reduced away.
    """
    da = da.copy()
    axis_num = da.get_axis_num(dim)
    trend = dsa.apply_along_axis(_calc_slope_poly, axis_num, da.data)
    return trend

trend_ufunc = linear_trend_ufunc(da, 't')
trend_ufunc_poly = linear_trend_ufunc_poly(da, 't')
trend_along = linear_trend_along(da, 't')
trend_along_poly = linear_trend_along_poly(da, 't')

Timing the computation seems to indicate that the apply_along_axis method might be marginally faster. Using polyfit instead of linregress seems to have quite a big influence, though. I am not sure why it is so much faster, but perhaps this is of interest to you.

%%timeit 
print(trend_ufunc[1,1,1].data.compute())

4.89 s ± 180 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit 
trend_ufunc_poly[1,1,1].compute()

2.74 s ± 182 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit 
trend_along[1,1,1].compute()

4.58 s ± 193 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
trend_along_poly[1,1,1].compute()

2.64 s ± 65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
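
All four variants above call a Python function once per gridpoint. As a possible alternative that was not timed in this post, the least-squares slope has a closed form, sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))**2), which NumPy can evaluate for every gridpoint at once. The sketch below is an illustration under that assumption; slope_along_axis is a hypothetical helper, and the result is cross-checked against np.polyfit on one gridpoint.

```python
import numpy as np


def slope_along_axis(values, axis):
    """Least-squares slope against the index 0..n-1 along `axis`.

    Hypothetical helper: evaluates the closed-form slope and broadcasts
    over all other axes instead of looping over gridpoints.
    """
    y = np.moveaxis(np.asarray(values, dtype=float), axis, -1)
    x = np.arange(y.shape[-1], dtype=float)
    x_anom = x - x.mean()
    y_anom = y - y.mean(axis=-1, keepdims=True)
    return (y_anom * x_anom).sum(axis=-1) / (x_anom ** 2).sum()


rng = np.random.default_rng(0)
small = 10 * np.arange(50) + 5 + rng.random((3, 4, 50))  # time on the last axis

vectorized = slope_along_axis(small, axis=-1)
# Cross-check one gridpoint against a polyfit-based slope.
reference = np.polyfit(np.arange(50), small[1, 2], 1)[0]
```

Since xr.apply_ufunc moves the core dimension to the last axis, a function of this kind could in principle be passed to it with axis=-1 and without vectorize=True; that combination has not been benchmarked here.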
