How can I get the result of a Dask compute on a different machine than the one that submitted it?
Question
I am using Dask behind a Django server. My basic setup is summarised here: https://github.com/MoonVision/django-dask-demo/ and the Dask client can be found here: https://github.com/MoonVision/django-dask-demo/blob/master/demo/daskmanager/daskmanager.py
I want to be able to separate the saving of a task's result from the server that submitted it, for robustness and scalability. I would also like more detailed information about the processing status of the task: right now the future's status is always pending, even while the task is processing. Having a rough estimate of percent complete would also be great.
Right now, if the web server were to die, the client would get deleted and the task would stop, as no client would still be holding the future. I can get around this by using fire_and_forget, but then I have no way to save the task status and result when it completes.
Ways I see to track the status and save the result after a fire_and_forget:
1. I could have a scheduler plugin that sends all transitions to an AMQP server (RabbitMQ). I like the robustness, being able to subscribe to certain messages that are output by the scheduler, and knowing every message will be processed. I'm not sure how I could get the result itself with this method. I could manually add a node to the end of every graph to save the result, but I would rather have it happen behind the scenes.
2. Run get_task_stream on a separate server, or use it in some other way. With this, it seems I could miss some messages if the server were to go down, so it seems like a worse version of option 1.
Other options?
What is the best approach?
Edit: I just tested this, and it seems that when the client that submitted a task shuts down, all futures it created are moved from processing to forgotten, even when fire_and_forget is called.
Recommended answer
You probably want to look at Dask's coordination primitives like Queues and Pub/Sub. My guess is that putting your futures into a queue would solve your problem.
https://docs.dask.org/en/latest/futures.html#coordination-primitives
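As a rough illustration of the queue idea, here is a minimal sketch. It assumes the web server and a separate "saver" process both connect to the same scheduler and agree on a queue name. The names slow_square, submit_job, collect_one and "pending-results" are made up for this example and do not come from the linked repository. Whether the task survives the submitting client shutting down is worth verifying on your own Dask version, given the behaviour described in the edit to the question.

```python
from dask.distributed import Client, Queue


def slow_square(x):
    # Stand-in for the real task (hypothetical).
    return x * x


def submit_job(client, x, queue_name="pending-results"):
    """Web-server side: submit work and park the future in a named queue."""
    future = client.submit(slow_square, x)
    # The named queue lives on the scheduler, so any client that opens a
    # Queue with the same name can retrieve this future later.
    Queue(queue_name, client=client).put(future)
    return future.key


def collect_one(client, queue_name="pending-results", timeout=10):
    """Saver side: take the next future from the queue and wait for it."""
    future = Queue(queue_name, client=client).get(timeout=timeout)
    # Returns the task key and its computed result; saving it is up to you.
    return future.key, future.result(timeout=timeout)
```

In a real deployment, submit_job would run in the Django process and collect_one in a loop on another machine, each with its own Client pointed at the shared scheduler address. Status updates such as percent complete are not covered here; Pub/Sub from the same documentation page is one possible place to look for those.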