Google Dataflow non-Python dependencies - separate setup.py?


Question

We need a non-Python dependency installed into our Dataflow process (we need an ODBC driver to access an MSSQL DB)

We've written a setup.py that successfully installs those using the steps here: https://cloud.google.com/dataflow/pipelines/dependencies-python#non-python-dependencies
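
For reference, a minimal sketch along those lines, following the custom-command pattern described on that page; the apt package name (unixodbc-dev) and the project metadata (our_package) are placeholders, not our real configuration:

# setup.py -- sketch: install non-Python dependencies on Dataflow workers
# by running shell commands from custom setuptools commands.
import subprocess
from distutils.command.build import build as _build

import setuptools

# Placeholder apt packages; the real list would include whatever the
# MSSQL ODBC driver actually needs.
CUSTOM_COMMANDS = [
    ['apt-get', 'update'],
    ['apt-get', '--assume-yes', 'install', 'unixodbc-dev'],
]


class build(_build):
    # Chain the custom step into the normal build.
    sub_commands = _build.sub_commands + [('CustomCommands', None)]


class CustomCommands(setuptools.Command):
    # Run each shell command in CUSTOM_COMMANDS on the worker.
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        for command in CUSTOM_COMMANDS:
            subprocess.check_call(command)


setuptools.setup(
    name='our_package',      # placeholder name
    version='0.0.1',
    packages=setuptools.find_packages(),
    cmdclass={
        'build': build,
        'CustomCommands': CustomCommands,
    },
)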

We want to keep our original setup.py for the package (which doesn't install those extra dependencies); is there a way of using a different setup.py for Dataflow installs?

We've tried:

  • calling it setup_dataflow.py, but Dataflow raised an error stating it needed to be called setup.py.
  • following the steps here, using a setup.py in a child path of the root path. We weren't successful with that.

We could try an if statement within setup.py to identify whether it's being installed in a Dataflow environment (though I couldn't find any reliable environment variables to identify this).
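
To illustrate the idea only, a rough sketch assuming a flag we'd export ourselves (BUILD_FOR_DATAFLOW is made up here; Dataflow sets no documented equivalent):

# Sketch: gate the extra install step on a self-set flag.
import os
from distutils.command.build import build as _build

FOR_DATAFLOW = os.environ.get('BUILD_FOR_DATAFLOW') == '1'


class build(_build):
    # Chain in the 'CustomCommands' step (defined as in the sketch above)
    # only when we are explicitly building for Dataflow.
    sub_commands = _build.sub_commands + (
        [('CustomCommands', None)] if FOR_DATAFLOW else [])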

Any suggestions?

Thanks

Answer

Currently there's no convenient way to do this. You could have two different packages, something like this:

+- dataflow_pipeline
|  +- setup.py
+- original_pipeline
   +- setup.py
   +- pipeline.py

Where dataflow_pipeline/setup.py simply imports original_package, and adds the extra dependencies.
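
A minimal sketch of what that wrapper setup.py could contain (the names and the way the original package is pulled in are assumptions; the apt-get custom command classes would be the same ones from the question's sketch):

# dataflow_pipeline/setup.py -- sketch of the wrapper package.
import setuptools

setuptools.setup(
    name='dataflow_pipeline',
    version='0.0.1',
    packages=setuptools.find_packages(),
    # Depend on the unmodified original package (however it is published
    # or vendored in your environment -- this name is an assumption)...
    install_requires=['original_pipeline'],
    # ...and layer the non-Python dependencies on top via the custom build
    # commands, e.g. cmdclass={'build': build, 'CustomCommands': CustomCommands}.
)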

It's not ideal, but it should work.
