How and When to Use @async and @sync in Julia

Question

I have read the documentation for the @async and @sync macros but still cannot figure out how and when to use them, nor can I find many resources or examples for them elsewhere on the internet.

My immediate goal is to find a way to set several workers to do work in parallel and then wait until they have all finished before proceeding in my code. This post: Waiting for a task to be completed on remote processor in Julia (http://stackoverflow.com/questions/32143159/waiting-for-a-task-to-be-completed-on-remote-processor-in-julia/32148849#32148849) contains one successful way to accomplish this. I had thought it should be possible using the @async and @sync macros, but my initial failures to accomplish this made me wonder whether I properly understand how and when to use these macros.

Recommended answer

According to the documentation under ?@async, "@async wraps an expression in a Task." What this means is that for whatever falls within its scope, Julia will start this task running but then proceed to whatever comes next in the script without waiting for the task to complete. Thus, for instance, without the macro you will get:

julia> @time sleep(2)
  2.005766 seconds (13 allocations: 624 bytes)

But with the macro, you get:

julia> @time @async sleep(2)
  0.000021 seconds (7 allocations: 657 bytes)
Task (waiting) @0x0000000112a65ba0

julia> 

Julia thus allows the script to proceed (and the @time macro to fully execute) without waiting for the task (in this case, sleeping for two seconds) to complete.
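The same behavior can be seen without the REPL timing output. The following is a minimal sketch, not part of the original answer, that keeps the Task returned by @async and inspects it. The variable names are made up for illustration; istaskdone and fetch are standard Base functions.

```julia
# @async returns the Task right away; the body has not run yet.
t = @async begin
    sleep(0.5)
    :finished          # value of the task, retrieved later with fetch
end

done_right_away = istaskdone(t)    # checked before the task has had a chance to run
value = fetch(t)                   # blocks until the task ends and returns its value
done_after_fetch = istaskdone(t)

println((done_right_away, value, done_after_fetch))
```

Keeping the Task in a variable is one way to wait for a single piece of work later on, without wrapping anything in @sync.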

The @sync macro, by contrast, will "Wait until all dynamically-enclosed uses of @async, @spawn, @spawnat and @parallel are complete." (according to the documentation under ?@sync). Thus, we see:

julia> @time @sync @async sleep(2)
  2.002899 seconds (47 allocations: 2.986 KB)
Task (done) @0x0000000112bd2e00

In this simple example, then, there is no point in combining a single instance of @async with @sync. Where @sync becomes useful is when you have @async applied to multiple operations that you want to start all at once, without waiting for each one to complete before starting the next.

For example, suppose we have multiple workers and we'd like to start each of them working on a task simultaneously and then fetch the results from those tasks. An initial (but incorrect) attempt might be:

addprocs(2)
@time begin
    a = cell(nworkers())
    for (idx, pid) in enumerate(workers())
        a[idx] = remotecall_fetch(pid, sleep, 2)
    end
end
## 4.011576 seconds (177 allocations: 9.734 KB)

The problem here is that the loop waits for each remotecall_fetch() operation to finish, i.e. for each process to complete its work (in this case sleeping for 2 seconds) before continuing to start the next remotecall_fetch() operation. In terms of a practical situation, we're not getting the benefits of parallelism here, since our processes aren't doing their work (i.e. sleeping) simultaneously.

We can correct this, however, by using a combination of the @async and @sync macros:

@time begin
    a = cell(nworkers())
    @sync for (idx, pid) in enumerate(workers())
        @async a[idx] = remotecall_fetch(pid, sleep, 2)
    end
end
## 2.009416 seconds (274 allocations: 25.592 KB)

Now, if we count each step of the loop as a separate operation, we see that there are two separate operations preceded by the @async macro. The macro allows each of these to start up, and the code to continue (in this case to the next step of the loop) before each finishes. But, the use of the @sync macro, whose scope encompasses the whole loop, means that we won't allow the script to proceed past that loop until all of the operations preceded by @async have completed.
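The code in this answer appears to target an older Julia release: cell() and the remotecall_fetch(pid, f, args...) argument order are not available in Julia 1.x. As a hedged sketch of the same pattern for Julia 1.x, assuming the Distributed standard library and two local worker processes (the names results and elapsed are illustrative, not from the original):

```julia
using Distributed

addprocs(2)   # start two local worker processes

# Untyped result slots, one per worker (stands in for cell(nworkers())).
results = Vector{Any}(undef, nworkers())

elapsed = @elapsed begin
    @sync for (idx, pid) in enumerate(workers())
        # Julia 1.x order: function first, then the worker id.
        @async results[idx] = remotecall_fetch(sleep, pid, 2)
    end
end

println("elapsed: ", round(elapsed; digits = 2), " s")
println(results)
```

Because @sync encloses the loop, the println calls run only after both remotecall_fetch calls have returned.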

We can get an even clearer understanding of how these macros operate by tweaking the above example and seeing how its behavior changes. For instance, suppose we have just the @async without the @sync:

@time begin
    a = cell(nworkers())
    for (idx, pid) in enumerate(workers())
        println("sending work to $pid")
        @async a[idx] = remotecall_fetch(pid, sleep, 2)
    end
end
## 0.001429 seconds (27 allocations: 2.234 KB)

Here, the @async macro allows us to continue in our loop even before each remotecall_fetch() operation finishes executing. But, for better or worse, we have no @sync macro to prevent the code from continuing past this loop until all of the remotecall_fetch() operations finish.

Nevertheless, each remotecall_fetch() operation is still running in parallel, even once we go on. We can see that because if we wait for two seconds, then the array a, containing the results, will contain:

sleep(2)
julia> a
2-element Array{Any,1}:
 nothing
 nothing

(The "nothing" element is the result of a successful fetch of the results of the sleep function, which does not return any values)
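Rather than sleeping for a fixed two seconds and hoping the work is done, the state of the results can be checked directly. The sketch below is an illustration under assumptions, not the answer's method: it uses plain local tasks in place of remote calls so that it runs without worker processes, and the names results, tasks, and assigned_immediately are made up.

```julia
results = Vector{Any}(undef, 2)
tasks = Task[]

for idx in 1:2
    # No @sync here: the loop moves on without waiting for the task body.
    t = @async begin
        sleep(0.5)
        results[idx] = idx^2
    end
    push!(tasks, t)
end

# Right after the loop nothing has been written yet, so the slots are #undef.
assigned_immediately = [isassigned(results, i) for i in 1:2]

# Waiting on the stored tasks is an explicit alternative to a fixed sleep.
foreach(wait, tasks)

println(assigned_immediately)
println(results)
```

Collecting the tasks and waiting on them gives the same guarantee that @sync provides, but at a point of your choosing later in the script.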

We can also see that the two remotecall_fetch() operations start at essentially the same time because the print commands that precede them also execute in rapid succession (output from these commands not shown here). Contrast this with the next example where the print commands execute at a 2 second lag from each other:

If we put the @async macro on the whole loop (instead of just on its inner step), then again our script will continue immediately without waiting for the remotecall_fetch() operations to finish. Now, however, we only allow the script to continue past the loop as a whole. We don't allow each individual step of the loop to start before the previous one has finished. As such, unlike in the example above, two seconds after the script proceeds past the loop, the results array still has one element that is #undef, indicating that the second remotecall_fetch() operation has not yet completed.

@time begin
    a = cell(nworkers())
    @async for (idx, pid) in enumerate(workers())
        println("sending work to $pid")
        a[idx] = remotecall_fetch(pid, sleep, 2)
    end
end
# 0.001279 seconds (328 allocations: 21.354 KB)
# Task (waiting) @0x0000000115ec9120
## This also allows us to continue to

sleep(2)

a
2-element Array{Any,1}:
    nothing
 #undef    

And, not surprisingly, if we put the @sync and @async right next to each other, each remotecall_fetch() runs sequentially (rather than simultaneously), and we don't continue in the code until each has finished. In other words, this is, I believe, essentially equivalent to having neither macro in place, just as sleep(2) behaves essentially identically to @sync @async sleep(2):

@time begin
    a = cell(nworkers())
    @sync @async for (idx, pid) in enumerate(workers())
        a[idx] = remotecall_fetch(pid, sleep, 2)
    end
end
# 4.019500 seconds (4.20 k allocations: 216.964 KB)
# Task (done) @0x0000000115e52a10
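The effect of where @async is placed can be reproduced without any worker processes. This is a small sketch under that assumption, with local sleep calls standing in for the remote work and illustrative variable names; exact timings will vary by machine.

```julia
# Variant 1: one task wraps the whole loop, so the sleeps run back to back.
t_sequential = @elapsed begin
    @sync @async for i in 1:2
        sleep(0.5)
    end
end

# Variant 2: one task per iteration, so the sleeps overlap.
t_concurrent = @elapsed begin
    @sync for i in 1:2
        @async sleep(0.5)
    end
end

println("whole loop in one task: ", round(t_sequential; digits = 2), " s")
println("one task per iteration: ", round(t_concurrent; digits = 2), " s")
```

In both variants @sync blocks until everything is done; only the second one lets the iterations overlap.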

Note also that it is possible to have more complicated operations inside the scope of the @async macro. The documentation gives an example containing an entire loop within the scope of @async.

Update: Recall that the help for the @sync macro states that it will "Wait until all dynamically-enclosed uses of @async, @spawn, @spawnat and @parallel are complete." For the purposes of what counts as "complete", it matters how you define the tasks within the scope of the @sync and @async macros. Consider the example below, which is a slight variation on one of the examples given above:

@time begin
    a = cell(nworkers())
    @sync for (idx, pid) in enumerate(workers())
        @async a[idx] = remotecall(pid, sleep, 2)
    end
end
## 0.172479 seconds (93.42 k allocations: 3.900 MB)

julia> a
2-element Array{Any,1}:
 RemoteRef{Channel{Any}}(2,1,3)
 RemoteRef{Channel{Any}}(3,1,4)

The earlier example took roughly 2 seconds to execute, indicating that the two tasks ran in parallel and that the script waited for each to complete before proceeding. This example, however, finishes in far less time. The reason is that, for the purposes of @sync, the remotecall() operation has "finished" once it has sent the worker the job to do. (Note that the resulting array, a, contains only RemoteRef objects, which just indicate that something is going on with a particular process whose result could in theory be fetched at some point in the future.) By contrast, the remotecall_fetch() operation has only "finished" when it gets the message from the worker that its task is complete.
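In Julia 1.x, remotecall returns a Future rather than a RemoteRef, but the distinction described here still applies: holding the reference is not the same as the work being done. A hedged sketch, assuming the Distributed standard library and two local workers, with illustrative names (futures, elapsed_wait, fetched):

```julia
using Distributed

addprocs(2)   # start two local worker processes

# remotecall returns a Future as soon as the request has been sent.
futures = [remotecall(sleep, pid, 2) for pid in workers()]

# Waiting on each Future is what actually blocks until the worker is done.
elapsed_wait = @elapsed foreach(wait, futures)

# fetch retrieves the stored result; sleep returns nothing.
fetched = map(fetch, futures)

println("time spent waiting: ", round(elapsed_wait; digits = 2), " s")
println(fetched)
```

If the goal is for @sync itself to cover the remote work, one option is to wrap the wait in the task, for example @async wait(remotecall(sleep, pid, 2)), so that the task only counts as complete once the worker has finished.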

Thus, if you are looking for ways to ensure that certain operations with workers have completed before moving on in your script (as is discussed, for instance, in this post: Waiting for a task to be completed on remote processor in Julia, http://stackoverflow.com/questions/32143159/waiting-for-a-task-to-be-completed-on-remote-processor-in-julia?lq=1), it is necessary to think carefully about what counts as "complete" and how you will measure and then operationalize that in your script.
