Image for Google Cloud Dataflow instances
Question
When I run a Dataflow job, it takes my small package (setup.py or requirements.txt) and uploads it to run on the Dataflow instances.

But what is actually running on the Dataflow instances? I got a stack trace recently:
```
  File "/usr/lib/python2.7/httplib.py", line 1073, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1035, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 877, in _send_output
    msg += message_body
TypeError: must be str, not unicode
 [while running 'write to datastore/Convert to Mutation']
```
But in theory, if I'm doing str += unicode, that implies I might not be running this Python patch? Can you point me to the Docker image these jobs run on, so I can know what version of Python I'm working with and make sure I'm not barking up the wrong tree here?
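For context on the TypeError above: in Python 2, httplib's `_send_output` does `msg += message_body`, and if `msg` is a byte string (`str`) while `message_body` is `unicode`, that concatenation fails exactly as in the stack trace. Python 2 isn't assumed available, so this sketch uses the Python 3 bytes/str analogue of the same mismatch; the request-line and body values are made up for illustration.

```python
# Python 3 analogue of Python 2's `str += unicode` failure: mixing a byte
# string with a text string raises TypeError on concatenation.
headers = b"POST / HTTP/1.1\r\n\r\n"  # byte string (Python 2's `str`)
body = "name=\u00e9"                  # text string (Python 2's `unicode`)

try:
    message = headers + body          # mixing bytes and text fails
except TypeError as exc:
    error = str(exc)                  # e.g. "can't concat str to bytes"

# Encoding the text side explicitly before concatenating avoids the error,
# which is essentially what the patch referenced above does inside httplib.
message = headers + body.encode("utf-8")
```

The practical workaround on the pipeline side is the same idea: encode unicode payloads to UTF-8 byte strings before they reach the HTTP layer.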
The cloud console shows me the instance template, which seems to point to dataflow-dataflow-owned-resource-20170308-rc02, but it seems I don't have permission to look at it. Is the source for it online anywhere?
Recommended answer
Haven't tested (and maybe there is an easier way), but something like this might do the trick:
- ssh into one of the Dataflow workers from the console
- run `docker ps` to get the container id
- run `docker inspect <container_id>`
- grab the image id from the `Image` field
- run `docker history --no-trunc <image>`
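The steps above can be chained into one helper, untested against a real worker. The function name `inspect_worker_image` is mine, and the sketch assumes the `docker` CLI is on the worker VM's PATH and that the first running container is the one of interest.

```shell
# Sketch: after SSH-ing into the Dataflow worker VM, resolve the first
# running container to its image and print that image's layer history.
inspect_worker_image() {
  local container_id image_id
  container_id=$(docker ps --quiet | head -n 1)              # first running container
  image_id=$(docker inspect --format '{{.Image}}' "$container_id")
  docker history --no-trunc "$image_id"                      # layer-by-layer history
}
```

The `--format '{{.Image}}'` Go template saves reading the image id out of the full `docker inspect` JSON by hand.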
Then you should find what you are after.