How to run the nodes in sequence as declared in a Kedro pipeline?


Problem description

In a Kedro pipeline, nodes (which wrap Python functions) are declared sequentially. In some cases, the input of one node is the output of the previous node. However, when kedro run is invoked from the command line, the nodes sometimes do not run in the declared order.

The Kedro documentation says that by default the nodes are run in sequence.

My run.py code:

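from pathlib import Path
from typing import Iterable, Type

from kedro.runner import AbstractRunner

# ProjectContext is assumed to be defined elsewhere in run.py (not shown in the question).


def main(
    tags: Iterable[str] = None,
    env: str = None,
    runner: Type[AbstractRunner] = None,
    node_names: Iterable[str] = None,
    from_nodes: Iterable[str] = None,
    to_nodes: Iterable[str] = None,
    from_inputs: Iterable[str] = None,
):
    # Build the project context from the current working directory.
    project_context = ProjectContext(Path.cwd(), env=env)
    # Pass the node and tag filters through to the run unchanged.
    project_context.run(
        tags=tags,
        runner=runner,
        node_names=node_names,
        from_nodes=from_nodes,
        to_nodes=to_nodes,
        from_inputs=from_inputs,
    )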

Currently, my last node sometimes runs before my first few nodes.

Recommended answer

The answer that I received from the Kedro GitHub:

Pipeline determines the node execution order exclusively based on dataset dependencies (node inputs and outputs) at the moment. So the only option to dictate that node A should run before node B is to put a dummy dataset as an output of node A and an input of node B.
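
To make the effect of a dummy dataset concrete, here is a minimal sketch that models ordering by dataset dependencies using only the Python standard library. It is not Kedro's code: the Node tuple, the execution_order helper, and every node and dataset name (load, train, report, train_done, and so on) are hypothetical and chosen for illustration only.

```python
from graphlib import TopologicalSorter
from typing import List, NamedTuple


class Node(NamedTuple):
    """Hypothetical stand-in for a pipeline node: a name plus dataset names."""
    name: str
    inputs: List[str]
    outputs: List[str]


def execution_order(nodes: List[Node]) -> List[str]:
    """Order nodes so that every dataset is produced before it is consumed."""
    # Map each dataset name to the node that produces it.
    producer = {ds: n.name for n in nodes for ds in n.outputs}
    sorter = TopologicalSorter()
    for n in nodes:
        # A node depends on whichever node produces each of its inputs.
        deps = [producer[ds] for ds in n.inputs if ds in producer]
        sorter.add(n.name, *deps)
    return list(sorter.static_order())


# "report" shares no dataset with "train", so nothing forces it to run last,
# even though it is declared last.
unordered = [
    Node("load", ["raw"], ["clean"]),
    Node("train", ["clean"], ["model"]),
    Node("report", ["params"], ["summary"]),
]

# The dummy dataset "train_done" is an output of "train" and an input of
# "report", so "report" can only run after "train".
ordered = [
    Node("load", ["raw"], ["clean"]),
    Node("train", ["clean"], ["model", "train_done"]),
    Node("report", ["params", "train_done"], ["summary"]),
]

print(execution_order(ordered))  # ['load', 'train', 'report']
```

In a real Kedro pipeline the same idea means adding the dummy name to node A's outputs and node B's inputs. The function behind node A then has to return one extra value, and the function behind node B has to accept one extra argument that it can ignore.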
