Google Dataflow non-python dependencies - separate setup.py?
Question
We need a non-Python dependency installed into our Dataflow process (we need an ODBC driver to access an MSSQL DB).
We've written a setup.py that successfully installs those using the steps here: https://cloud.google.com/dataflow/pipelines/dependencies-python#non-python-dependencies
We want to keep our original setup.py for the package (which doesn't install those extra dependencies); is there a way of using a different setup.py for Dataflow installs?
We've tried:
- calling it setup_dataflow.py, but Dataflow raised an error stating it needed to be called setup.py.
- following the steps here, and using a setup.py within a child path of the root path. We weren't successful at that.
We could try an if statement within setup.py to identify whether it's being installed in a Dataflow environment (though I couldn't find any reliable environment variables to identify this).
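Since Dataflow exposes no documented environment variable for this, one variant of the if idea is to set a flag yourself before staging the package. A minimal sketch, where `DATAFLOW_BUILD` is our own invented flag (not anything Dataflow provides) and the extra dependency is just an illustration:

```python
import os


def extra_setup_kwargs(environ):
    """Return extra setup() keyword arguments for Dataflow builds only.

    DATAFLOW_BUILD is a flag we would export ourselves in whatever script
    stages the pipeline; Dataflow itself sets no such variable.
    """
    if environ.get("DATAFLOW_BUILD") == "1":
        # Here you would also wire in a custom cmdclass that installs the
        # ODBC driver; install_requires below is only an illustration.
        return {"install_requires": ["pyodbc"]}
    return {}


# In setup.py, this would be spliced into the call:
# setuptools.setup(name="my_pipeline", **extra_setup_kwargs(os.environ))
```

Locally, setup.py then behaves exactly as before; running `DATAFLOW_BUILD=1 python setup.py sdist` enables the extra dependencies.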
Any suggestions/recommendations?
Thanks
Answer
Currently there's no convenient way to do this. You could have two different packages, something like so:
+- dataflow_pipeline
   +- setup.py
+- original_pipeline
   +- setup.py
   +- pipeline.py
Where dataflow_pipeline/setup.py simply imports original_package, and adds the extra dependencies.
It's not ideal, but it should work.