@distributed似乎可以正常工作，函数返回很奇怪 [英] @distributed seems to work, function return is wonky

查看：51 发布时间：2021/5/28 18:45:29 julia distributed

本文介绍了@distributed似乎可以正常工作，函数返回很奇怪的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在学习如何在Julia中进行并行计算.我在3x嵌套的 for 循环的开始处使用 @sync @distributed 来并行化事物(请参见底部的代码).从行 println(errCmp [row，col])中，我可以看到数组 errCmp 的所有元素都已打印出来.例如

I'm just learning how to do parallel computing in Julia. I'm using @sync @distributed at the start of a 3x nested for loop to parallelize things (see code at bottom). From the line println(errCmp[row, col]) I can watch all the elements of the array errCmp be printed out. E.g.

From worker 3:    2.351134946074191e9
From worker 4:    2.3500830193505473e9
From worker 5:    2.3502416529551845e9
From worker 2:    2.3509105625656652e9
From worker 3:    2.3508352842971106e9
From worker 4:    2.3497049296121807e9
From worker 5:    2.35048428351797e9
From worker 2:    2.350742582031195e9
From worker 3:    2.350616273660934e9
From worker 4:    2.349709546599313e9

但是，当函数返回时， errCmp 是我在乞讨时预分配的零数组.

However, when the function returns, errCmp is the array of zeros I pre-allocate at the begging.

我想念一些收集所有东西的闭门词吗?

Am I missing some closing term to collect everything?

function optimizeDragCalc(df::DataFrame)
    paramGrid = [cd*AoM for cd = range(1e-3, stop = 0.01, length = 50), AoM = range(2e-4, stop = 0.0015, length = 50)]
    errCmp    = zeros(size(paramGrid))
    # totalSize = size(paramGrid, 1) * size(paramGrid, 2) * size(df.time, 1)
    @sync @distributed for row = 1:size(paramGrid, 1)
        for col = 1:size(paramGrid, 2)
            # Run the propagation here
            BC = 1/paramGrid[row, col]
            slns, _ = propWholeTraj(df, BC)
            for time = 1:size(df.time, 1)
                errDF = propError(slns[time], df, time)
                errCmp[row, col] += sum(errDF.totalErr)
            end # time
            # println("row: ", row, " of ",size(paramGrid, 1),"   col: ", col, " of ", size(paramGrid, 2))
            println(errCmp[row, col])
        end # col
    end # row
    # plot(heatmap(z = errCmp))
    return errCmp, paramGrid
end
errCmp, paramGrid = @time optimizeDragCalc(df)

推荐答案

您没有提供最低限度的工作示例，但我想这可能很难.这是我的MWE.让我们假设我们要使用 Distributed 来计算 Array 列的总和:

You did not provide a minimal working example but I guess it might be hard. So here is mine MWE. Let us assume that we want to use Distributed to calculate sums of Array's columns:

using Distributed
addprocs(2)
@everywhere using StatsBase
data = rand(1000,2000)
res = zeros(2000)
@sync @distributed for col = 1:size(data)[2]
    res[col] = StatsBase.mean(data[:,col])
    # does not work!
    # ... because data is created locally and never returned!
end

为了更正上面的代码，您需要提供一个聚合函数(我故意简化示例-可以进行进一步的优化).

In order to correct the above code you need to provide an aggregator function (I keep the example intentionally simplified - a further optimization is possible).

using Distributed
addprocs(2)
@everywhere using Distributed,StatsBase
data = rand(1000,2000)    
@everywhere function t2(d1,d2)
    append!(d1,d2)
    d1
end
res = @sync @distributed (t2) for col = 1:size(data)[2]
    [(myid(),col, StatsBase.mean(data[:,col]))]
end

现在让我们看一下输出.我们可以看到，有些值是在worker 2 上计算的，而其他值是在worker 3 上计算的:

Now let us see the output. We can see that some of the values have been calculated on worker 2 while others on worker 3:

julia> res
2000-element Array{Tuple{Int64,Int64,Float64},1}:
 (2, 1, 0.49703681326230276)
 (2, 2, 0.5035341367791002)
 (2, 3, 0.5050607022354537)
 ⋮
 (3, 1998, 0.4975699181976122)
 (3, 1999, 0.5009498778934444)
 (3, 2000, 0.499671315490524)

其他可能的改进/修改:

Further possible improvements/modifications:

使用 @spawnat 在远程进程(而不是主进程并发送它们)上生成值
使用 SharedArray -这可以在工作人员之间自动分配数据.根据我的经验，需要非常仔细的编程.
使用 ParallelDataTransfer.jl 在工作人员之间发送数据.非常易于使用，对于大量消息而言效率不高.
始终考虑使用Julia线程机制(在某些情况下，它使工作更轻松-再次取决于问题)

use @spawnat to generate values at remote processes (instead of the master process and sending them)
use SharedArray - this allows to automatically distribute data among workers. From my experience requires very careful programming.
use ParallelDataTransfer.jl to send data among workers. Very easy to use, not efficient for huge number of messages.
always consider Julia threading mechanism (in some scenarios it makes life easier - again depends on the problem)

这篇关于@distributed似乎可以正常工作，函数返回很奇怪的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

@distributed似乎可以正常工作，函数返回很奇怪 [英] @distributed seems to work, function return is wonky

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

@distributed似乎可以正常工作，函数返回很奇怪 [英] @distributed seems to work, function return is wonky

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭