如何在单个线程中运行dask.distributed集群? [英] How do I run a dask.distributed cluster in a single thread?

查看:84
本文介绍了如何在单个线程中运行dask.distributed集群?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何在单个线程中运行完整的Dask.Distributed集群?我想将其用于调试或概要分析。

How can I run a complete Dask.distributed cluster in a single thread? I want to use this for debugging or profiling.

注意:这是一个常见问题。我在此处向堆栈溢出添加问题和答案,以供将来重用。

推荐答案

本地调度程序



如果您可以使用单机调度程序的API(只是计算),那么可以使用单线程调度程序

Local Scheduler

If you can get by with the single-machine scheduler's API (just compute) then you can use the single-threaded scheduler

x.compute(scheduler='single-threaded')



分布式调度程序-单机



如果要在一台计算机上运行dask.distributed群集,则可以不带任何参数启动客户端

Distributed Scheduler - Single Machine

If you want to run a dask.distributed cluster on a single machine you can start the client with no arguments

from dask.distributed import Client
client = Client()  # Starts local cluster
x.compute()

这使用很多线程,但是在一台机器上运行

This uses many threads but operates on one machine

或者,如果要在单个进程中运行所有内容,则可以使用 processes = False 关键字

Alternatively if you want to run everything in a single process then you can use the processes=False keyword

from dask.distributed import Client
client = Client(processes=False)  # Starts local cluster
x.compute()

所有通信和控制都在单个线程中进行,尽管计算在单独的线程池中进行。

All of the communication and control happen in a single thread, though computation occurs in a separate thread pool.

要在单个线程中运行控制,通信和计算,您需要创建一个Tornadocurrent.futures执行器。当心,此Tornado API可能不是公开的。

To run control, communication, and computation all in a single thread you need to create a Tornado concurrent.futures Executor. Beware, this Tornado API may not be public.

from dask.distributed import Scheduler, Worker, Client
from tornado.concurrent import DummyExecutor
from tornado.ioloop import IOLoop
import threading

loop = IOLoop()
e = DummyExecutor()
s = Scheduler(loop=loop)
s.start()
w = Worker(s.address, loop=loop, executor=e)
loop.add_callback(w._start)

async def f():
    async with Client(s.address, start=False) as c:
        future = c.submit(threading.get_ident)
        result = await future
        return result

>>> threading.get_ident() == loop.run_sync(f)
True

这篇关于如何在单个线程中运行dask.distributed集群?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆