How can I get the result of a Dask compute on a different machine than the one that submitted it?


Question

I am using Dask behind a Django server. My basic setup is summarised here: https://github.com/MoonVision/django-dask-demo/ and the Dask client can be found here: https://github.com/MoonVision/django-dask-demo/blob/master/demo/daskmanager/daskmanager.py

I want to be able to separate the saving of a task from the server that submitted it, for robustness and scalability. I would also like more detailed information about the processing status of the task. Right now the future status is always pending, even if the task is processing. Having a rough estimate of percent complete would also be great.

Right now, if the web server were to die, the client would be deleted and the task would stop, because no client would still be holding the future. I can get around this by using fire_and_forget, but then I have no way to save the task status and result when it completes.

Ways I see to track the status and save the result after a fire_and_forget:


  1. I could have a scheduler plugin that sends all transitions to an AMQP server (RabbitMQ). I like the robustness, being able to subscribe to certain messages output by the scheduler, and knowing that every message will be processed. I'm not sure how I could get the result itself with this method. I could manually add a node to the end of every graph to save the result, but I would rather have it happen behind the scenes.

  2. Run get_task_stream on a separate server, or use it in some other way. With this, it seems I could miss some messages if the server were to go down, so it looks like a worse version of option 1.

  3. Other options?

What would be the best way to do this?

Edit: I just tested this, and it seems that when the client that submitted a task shuts down, all futures it created are moved from processing to forgotten, even if fire_and_forget was called.

Recommended answer

You probably want to look at Dask's coordination primitives, such as Queues and Pub/Sub. My guess is that putting your futures into a queue would solve your problem.

https://docs.dask.org/en/latest/futures.html#coordination-primitives
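As a rough illustration of the queue suggestion, here is a minimal sketch in which one client submits a task and puts its future into a named Queue, and a second client (which could live in another process or on another machine) pulls the future out and waits for the result. The queue name, the square task, and the helper functions submit_work and collect_result are hypothetical names chosen for this example; they do not come from the question's repository. Whether the task survives the submitting client shutting down before the result is collected is something to verify on your own Dask version.

```python
from dask.distributed import Client, Queue

# Hypothetical queue name; both sides only need to agree on it.
QUEUE_NAME = "pending-results"


def square(x):
    """Stand-in for the real task."""
    return x * x


def submit_work(client, value):
    """Submitting side (e.g. the Django process): start a task and share its future."""
    future = client.submit(square, value)
    # The queue is stored on the scheduler, so other clients can fetch the future.
    Queue(QUEUE_NAME, client=client).put(future)
    return future.key


def collect_result(client, timeout=30):
    """Collecting side (a separate client): fetch the shared future and wait for it."""
    future = Queue(QUEUE_NAME, client=client).get(timeout=timeout)
    return future.key, future.result(timeout=timeout)
```

In practice, submit_work would be called with the web server's client and collect_result from a separate worker process connected to the same scheduler address, which could then save the result to a database.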
