Running docker operator from Google Cloud Composer

Question
According to the documentation, Google Cloud Composer Airflow worker nodes are served from a dedicated Kubernetes cluster:
I have a Docker-contained ETL step that I would like to run using Airflow, preferably on the same Kubernetes cluster that hosts the workers, or on a dedicated cluster.
What would be the best practice for starting a Docker operation from the Cloud Composer Airflow environment?

A pragmatic solution is preferred ❤️
Answer
Google Cloud Composer has recently been released into General Availability, and with that you are now able to use a KubernetesPodOperator to launch pods into the same GKE cluster that the managed Airflow uses.
Make sure your Composer environment is at least version 1.0.0.
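One way to check this from the command line is `gcloud`; the environment name and location below are placeholders for your own, and the `--format` projection pulls the image version field from the environment description:

```shell
# Substitute your own environment name and location.
gcloud composer environments describe my-environment \
    --location us-central1 \
    --format="value(config.softwareConfig.imageVersion)"
# Prints an image version string such as composer-1.0.0-airflow-1.9.0;
# the leading composer-X.Y.Z part must be at least 1.0.0.
```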
An example operator:
import datetime

from airflow import models
from airflow.contrib.operators import kubernetes_pod_operator

YESTERDAY = datetime.datetime.now() - datetime.timedelta(days=1)

with models.DAG(
        dag_id='composer_sample_kubernetes_pod',
        schedule_interval=datetime.timedelta(days=1),
        start_date=YESTERDAY) as dag:
    # Only name, namespace, image, and task_id are required to create a
    # KubernetesPodOperator. In Cloud Composer, currently the operator defaults
    # to using the config file found at `/home/airflow/composer_kube_config` if
    # no `config_file` parameter is specified. By default it will contain the
    # credentials for Cloud Composer's Google Kubernetes Engine cluster that is
    # created upon environment creation.
    kubernetes_min_pod = kubernetes_pod_operator.KubernetesPodOperator(
        # The ID specified for the task.
        task_id='pod-ex-minimum',
        # Name of the task you want to run, used to generate the Pod ID.
        name='pod-ex-minimum',
        # The namespace to run within Kubernetes; the default namespace is
        # `default`. There is the potential for resource starvation of
        # Airflow workers and the scheduler within the Cloud Composer
        # environment; the recommended solution is to increase the number of
        # nodes in order to satisfy the computing requirements. Alternatively,
        # launching pods into a custom namespace will stop them fighting over
        # resources.
        namespace='default',
        # Docker image specified. Defaults to hub.docker.com, but any fully
        # qualified URL will point to a custom repository. Supports private
        # gcr.io images if the Composer environment is under the same
        # project-id as the gcr.io images.
        image='gcr.io/gcp-runtimes/ubuntu_16_0_4')
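For the Docker-contained ETL step in the question, the same operator can point at a private image in the project's Container Registry. A minimal sketch of the extra arguments follows; the parameter names (`cmds`, `arguments`, `env_vars`) come from the KubernetesPodOperator signature, while the image path, command, and values are placeholders for illustration:

```python
# Sketch of operator keyword arguments for a hypothetical private ETL image.
# The image path, command, and env values below are placeholders.
etl_pod_kwargs = dict(
    task_id='etl-step',
    name='etl-step',
    namespace='default',
    # Private gcr.io images work when the Composer environment lives in
    # the same project as the registry.
    image='gcr.io/my-project/my-etl:latest',
    # Entrypoint override and arguments passed to the container.
    cmds=['python', 'run_etl.py'],
    arguments=['--date', '{{ ds }}'],  # Jinja-templated execution date
    # Environment variables made available inside the pod.
    env_vars={'GCS_BUCKET': 'my-etl-bucket'},
)

# Inside the DAG above this would be instantiated as:
# kubernetes_pod_operator.KubernetesPodOperator(**etl_pod_kwargs)
```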
Additional resources:

- Kubernetes' KubernetesPodOperator docs
- More KubernetesPodOperator examples
- KubernetesPodOperator Airflow code