Dask scheduler empty / graph not showing


Problem description

My setup is as follows:

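# etl.py
from dask.distributed import Client
import dask
from tasks import task1, task2, task3


def runall(**kwargs):
    # Final task: runs once every task passed in as a keyword has finished.
    print("done")


def etl():
    client = Client()  # local cluster; the dashboard is served on port 8787 by default

    # `args` is a placeholder for each task's real arguments.
    tasks = {}
    tasks['task1'] = dask.delayed(task1)(*args)
    tasks['task2'] = dask.delayed(task2)(*args)
    tasks['task3'] = dask.delayed(task3)(*args)

    out = dask.delayed(runall)(**tasks)
    out.compute()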

This logic was borrowed from luigi and works nicely with if statements to control which tasks run.
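As a minimal sketch of the if-statement pattern described above, the graph can be assembled conditionally before anything runs. The task bodies, the `enabled` set, and `build_graph` are hypothetical stand-ins rather than the real `tasks` module, and the synchronous scheduler is used only to keep the example self-contained.

```python
import dask


# Hypothetical stand-ins for the real task functions in tasks.py.
def task1(x):
    return x + 1


def task2(x):
    return x * 2


def task3(x):
    return x - 3


def runall(**kwargs):
    # Receives the computed result of every selected task as a keyword argument.
    return dict(kwargs)


def build_graph(enabled, value):
    """Build a delayed graph containing only the tasks named in `enabled`."""
    candidates = {"task1": task1, "task2": task2, "task3": task3}
    tasks = {}
    for name, func in candidates.items():
        if name in enabled:  # plain Python control flow decides what enters the graph
            tasks[name] = dask.delayed(func)(value)
    return dask.delayed(runall)(**tasks)


out = build_graph(enabled={"task1", "task3"}, value=10)
# The single-threaded scheduler avoids any cluster setup in this sketch.
result = out.compute(scheduler="synchronous")
print(result)
```

Because the `if` is evaluated while the graph is being built, a task that is switched off never reaches the scheduler at all.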

However, some of the tasks load large amounts of data from SQL and cause GIL freeze warnings (at least that is my suspicion, as it is hard to diagnose exactly which line causes the issue). Sometimes the graph/monitoring page on port 8787 shows nothing, just "scheduler empty"; I suspect this is caused by the app freezing dask. What is the best way to load large amounts of data from SQL (MSSQL and Oracle) in dask? At the moment this is done with sqlalchemy with tuned settings. Would adding async and await help?
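One possible approach, sketched here under assumptions rather than offered as a confirmed fix, is to split a large load into one delayed task per key range so that no single task pulls the whole table. In this sketch a throwaway SQLite file stands in for the real MSSQL/Oracle source, and the `events` table, the `load_range` helper, and the range size are all hypothetical.

```python
import os
import sqlite3
import tempfile

import dask

# Assumption: a temporary SQLite file stands in for the real MSSQL/Oracle database.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, value REAL)")
con.executemany(
    "INSERT INTO events (id, value) VALUES (?, ?)",
    [(i, i * 0.5) for i in range(1000)],
)
con.commit()
con.close()


def load_range(path, lo, hi):
    """Load rows with lo <= id < hi using a connection private to this task."""
    task_con = sqlite3.connect(path)
    try:
        query = "SELECT id, value FROM events WHERE id >= ? AND id < ?"
        return task_con.execute(query, (lo, hi)).fetchall()
    finally:
        task_con.close()


# One delayed task per key range; each task opens and closes its own connection.
step = 250
parts = [
    dask.delayed(load_range)(db_path, lo, lo + step)
    for lo in range(0, 1000, step)
]
chunks = dask.compute(*parts, scheduler="threads")
total_rows = sum(len(chunk) for chunk in chunks)
print(total_rows)
```

`dask.dataframe.read_sql_table` applies a similar partition-by-index idea to SQLAlchemy sources. Whether smaller tasks reduce the freeze warnings in this particular setup is untested here.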

Also, some of the tasks are a bit slow and I'd like to use things like dask.dataframe or bag internally. The docs advise against calling delayed inside delayed. Does this also hold for dataframe and bag? The entire script runs on a single 40-core machine.

Using bag.starmap I get a graph in which the upper straight lines are added/discovered only once the computation reaches that task and compute is called inside it.
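For comparison, here is a minimal sketch of building the bag at the top level instead of inside a delayed task, assuming the inner work can be expressed that way. Every task is then part of the graph before `compute` is called, rather than appearing part-way through the run. The `combine` function and the input pairs are hypothetical.

```python
import dask.bag as db


def combine(a, b):
    # Hypothetical per-item work; each tuple in the bag is unpacked into (a, b).
    return a * b


pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]

# Build the whole pipeline lazily at the top level.
bag = db.from_sequence(pairs, npartitions=2).starmap(combine)
total = bag.sum()

# The full graph already exists here, before any computation has started.
n_tasks = len(total.__dask_graph__())

result = total.compute(scheduler="synchronous")
print(n_tasks, result)
```

This does not settle whether nesting collections inside delayed is always a problem. It only shows the layout in which nothing has to be discovered at run time.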

Recommended answer

There appears to be no rhyme or reason to it, other than that the machine was (and is) very busy and struggling to show the state updates and Bokeh plots as desired.
