What threads do Dask Workers have active?
Question
When running a Dask worker, I notice a few extra threads beyond what I was expecting. How many threads should I expect a Dask worker to run, and what are they doing?
Recommended answer
Dask workers have the following threads:
- A pool of threads in which to run tasks. Its size is typically somewhere between 1 and the number of logical cores on the computer.
- One administrative thread that manages the event loop, communication over (non-blocking) sockets, responses to fast queries, the allocation of tasks onto worker threads, and so on.
- A couple of threads used for optional compression and (de)serialization of messages during communication.
- One thread to monitor and profile the two items above.
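To see which threads are actually alive in a given process, one option is to ask Python directly. The following is a minimal sketch that uses only the standard library; the helper name `summarize_threads` is hypothetical, and the thread names a real Dask worker reports are internal details that are not assumed here.

```python
import threading
from collections import Counter


def summarize_threads():
    """Count the live threads in this process, grouped by name prefix."""
    counts = Counter()
    for thread in threading.enumerate():
        # Drop a trailing numeric suffix such as "-3" or "_0" so that
        # threads from the same pool are grouped under one key.
        prefix = thread.name.rstrip("0123456789-_ ") or thread.name
        counts[prefix] += 1
    return counts


# Print one line per group of threads in the current process.
for name, count in sorted(summarize_threads().items()):
    print(f"{name}: {count}")
```

Run locally, this reports only the threads of the calling process. To inspect a worker, the same function could probably be passed to `Client.run` from `dask.distributed`, which executes a function on each worker; that part is an assumption about your setup and is not shown here. Note that threads started by native libraries (for example OpenMP pools) are not Python threads and will not appear in this count.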
Additionally, by default there is a separate Nanny process that watches the worker. This process has a couple of its own threads for administration.
These are internal details as of October 2018 and may change without notice.
People who run into "too many threads" issues are often running tasks that are themselves multi-threaded, and so get an N-squared threading issue: each of the N worker threads runs a task that starts its own N threads. Often the solution is to set environment variables like OMP_NUM_THREADS=1, but this depends on the exact libraries that you're using.
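As a rough illustration of the arithmetic and of the workaround, here is a small sketch. The helper names are hypothetical. `OMP_NUM_THREADS` comes from the answer above; `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` are included on the assumption that MKL or OpenBLAS is the library in use, which may not match your stack. These variables are normally read when the native library is loaded, so they need to be set before importing libraries such as NumPy, or in the shell that launches the worker.

```python
import os

# OMP_NUM_THREADS is named in the answer; the other two are assumptions
# that apply only if MKL or OpenBLAS is the library doing the threading.
THREAD_VARIABLES = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def total_compute_threads(worker_threads, library_threads):
    """Threads in use when every worker thread runs a multi-threaded task."""
    return worker_threads * library_threads


def limit_native_threads(n=1, variables=THREAD_VARIABLES):
    """Set the thread-limit variables for this process and its children."""
    for variable in variables:
        os.environ[variable] = str(n)
    return {variable: os.environ[variable] for variable in variables}


# 8 worker threads, each calling a library that starts 8 threads of its own.
print(total_compute_threads(8, 8))

# With the library limited to one thread, only the worker threads remain.
limit_native_threads(1)
print(total_compute_threads(8, 1))
```

Setting the variables from Python only affects the current process and the processes it starts afterwards, so for an already running worker the variables have to be present in its environment at launch.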