Asynchronous command dispatch in interactive R

Question

I'm wondering whether this is possible (it probably isn't) using one of the parallel processing backends in R. I've tried a few Google searches and come up with nothing.

The general problem I have at the moment:


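  • I have some large objects that take about half an hour to load.
  • I want to generate a series of plots of the data (which takes several minutes).
  • I want to go and do other things with the data while this happens (without changing the underlying data, though!).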

Ideally I would be able to dispatch the command from the interactive session and not have to wait for it to return (so I can go do other things while I wait for the plot to render). Is this possible, or is this a case of wishful thinking?

Recommended answer

To expand on Dirk's answer, I suggest that you use the "snow" API in the parallel package. The mcparallel function would seem to be perfect for this (if you're not using Windows), but it doesn't work well for graphics operations because of its use of fork. The problem with the "snow" API is that it doesn't officially support asynchronous operations. However, it's rather easy to do if you don't mind cheating by using non-exported functions. If you look at the code for clusterCall, you can figure out how to submit tasks asynchronously:

> library(parallel)
> clusterCall
function (cl = NULL, fun, ...) 
{
    cl <- defaultCluster(cl)
    for (i in seq_along(cl)) sendCall(cl[[i]], fun, list(...))
    checkForRemoteErrors(lapply(cl, recvResult))
}

So you just use sendCall to submit a task, and recvResult to wait for the result. Here's an example of that using the bigmemory package, as suggested by Dirk.

You can create a "big matrix" using functions such as big.matrix or as.big.matrix. You'll probably want to do that efficiently, but I'll just convert a matrix z using as.big.matrix:

library(bigmemory)
big <- as.big.matrix(z)

Now I'll create a cluster and connect each of the workers to big using describe and attach.big.matrix:

cl <- makePSOCKcluster(2)
worker.init <- function(descr) {
  library(bigmemory)
  big <<- attach.big.matrix(descr)
  X11()  # use "quartz()" on a Mac; "windows()" on Windows
  NULL
}
clusterCall(cl, worker.init, describe(big))

In addition to attaching to the big matrix, this also opens a graphics window on each worker.

To call persp on the first cluster worker, we use sendCall:

parallel:::sendCall(cl[[1]], function() {persp(big[]); NULL}, list())

This returns almost immediately, although it may take a while until the plot appears. At this point, you can submit tasks to the other cluster worker, or do something else that is completely unrelated. Just make sure that you read the result before submitting another task to the same worker:

r1 <- parallel:::recvResult(cl[[1]])
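
The same submit-then-collect pattern can be tried without bigmemory or a graphics device. The following is a minimal sketch, assuming a two-worker PSOCK cluster; slow_square is a made-up stand-in for the slow plotting call, and sendCall and recvResult are still non-exported functions whose behaviour may differ between R releases.

```r
library(parallel)

cl <- makePSOCKcluster(2)

# Hypothetical stand-in for a slow task such as rendering a plot.
slow_square <- function(x) {
  Sys.sleep(1)
  x^2
}

# Submit one task to each worker; both calls return without waiting.
parallel:::sendCall(cl[[1]], slow_square, list(3))
parallel:::sendCall(cl[[2]], slow_square, list(4))

# The interactive session is free to do unrelated work here.
local_total <- sum(1:10)

# Read each pending result before giving that worker another task.
r1 <- parallel:::recvResult(cl[[1]])
r2 <- parallel:::recvResult(cl[[2]])

stopCluster(cl)
```

Because each worker has its own connection, the two tasks run side by side, and each recvResult call blocks only until its own worker has finished.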

Of course, this is all very error-prone and not at all pretty, but you could write some functions to make it easier. Just keep in mind that non-exported functions such as these can change with any new release of R.
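
One way such helper functions might look is sketched below. The name async_worker and its submit/collect interface are invented for illustration and are not part of the parallel package; the sketch assumes that worker-side errors come back as objects inheriting from "try-error", which is what checkForRemoteErrors tests for. It only guards against the mistake mentioned above, submitting a second task before the first result has been read.

```r
library(parallel)

# Hypothetical helper: wrap one cluster node and remember whether a result
# is still pending, so a second submit cannot overtake the first.
async_worker <- function(node) {
  busy <- FALSE
  submit <- function(fun, ...) {
    if (busy) stop("collect the pending result before submitting again")
    parallel:::sendCall(node, fun, list(...))  # non-exported; may change
    busy <<- TRUE
    invisible(NULL)
  }
  collect <- function() {
    if (!busy) stop("no task is pending on this worker")
    value <- parallel:::recvResult(node)       # blocks until the task is done
    busy <<- FALSE
    # Assumption: remote errors arrive as "try-error" objects.
    if (inherits(value, "try-error")) stop(as.character(value))
    value
  }
  list(submit = submit, collect = collect)
}

cl <- makePSOCKcluster(1)
w <- async_worker(cl[[1]])

w$submit(function(n) { Sys.sleep(1); sum(seq_len(n)) }, 100)
local_result <- mean(1:10)      # the main session stays free in the meantime
remote_result <- w$collect()

stopCluster(cl)
```

Keeping the busy flag inside a closure ties the bookkeeping to one node, so each worker can be wrapped and used independently.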

Note that it is perfectly possible and legitimate to execute a task on a specific worker or set of workers by subsetting the cluster object. For example:

clusterEvalQ(cl[1], persp(big[]))

This will send the task to the first worker while the others do nothing. But of course, this is synchronous, so you can't do anything on the other cluster workers until this task finishes. The only way that I know to send the tasks asynchronously is to cheat.
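
For work that involves no plotting, the mcparallel function mentioned at the start of this answer offers the same submit-now, collect-later behaviour through exported functions only. The following is a minimal sketch, assuming a Unix-alike system (mcparallel relies on fork and is not usable on Windows); the summing task is a placeholder for a real computation.

```r
library(parallel)

# Fork a child process that evaluates the expression; returns immediately.
# Assumption: a Unix-alike system and a task that does no plotting.
job <- mcparallel({
  Sys.sleep(1)          # stand-in for a slow computation
  sum(seq_len(100))
})

# The interactive session is free to do other work here.
local_value <- max(c(2, 7, 5))

# Block until the child finishes; the result is a list with one element per job.
forked_result <- mccollect(job)[[1]]
```

Since the child is a fork of the interactive session, it can read the large objects already loaded there without reloading them, which fits the read-only use described in the question.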
