How to backup a Postgres database in Kubernetes on Google Cloud?
What is the best practice for backing up a Postgres database running on Google Cloud Container Engine?
My thought is working towards storing the backups in Google Cloud Storage, but I am unsure of how to connect the Disk/Pod to a Storage Bucket.
I am running Postgres in a Kubernetes cluster using the following configuration:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - image: postgres:9.6.2-alpine
        imagePullPolicy: IfNotPresent
        env:
        - name: PGDATA
          value: /var/lib/postgresql/data
        - name: POSTGRES_DB
          value: my-database-name
        - name: POSTGRES_PASSWORD
          value: my-password
        - name: POSTGRES_USER
          value: my-database-user
        name: postgres-container
        ports:
        - containerPort: 5432
        volumeMounts:
        - mountPath: /var/lib/postgresql
          name: my-postgres-volume
      volumes:
      - gcePersistentDisk:
          fsType: ext4
          pdName: my-postgres-disk
        name: my-postgres-volume
I have attempted to create a Job to run a backup:
apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-dump-job
spec:
  template:
    metadata:
      labels:
        app: postgres-dump
    spec:
      containers:
      - command:
        - pg_dump
        - my-database-name
        # `env` value matches `env` from previous configuration.
        image: postgres:9.6.2-alpine
        imagePullPolicy: IfNotPresent
        name: my-postgres-dump-container
        volumeMounts:
        - mountPath: /var/lib/postgresql
          name: my-postgres-volume
          readOnly: true
      restartPolicy: Never
      volumes:
      - gcePersistentDisk:
          fsType: ext4
          pdName: my-postgres-disk
        name: my-postgres-volume
(As far as I understand) this should run the pg_dump command and output the backup data to stdout (which should appear in kubectl logs).
As an aside, when I inspect the Pods (with kubectl get pods), it shows the Pod never gets out of the "Pending" state, which I gather is due to there not being enough resources to start the Job.
Is it correct to run this process as a Job? How do I connect the Job to Google Cloud Storage? Or should I be doing something completely different?
I'm guessing it would be unwise to run pg_dump in the database Container (with kubectl exec) due to a performance hit, but maybe this is OK on a dev/staging server?
As @Marco Lamina said, you can run pg_dump on the postgres pod like this:
DUMP
# pod-name        name of the postgres pod
# postgres-user   database user that is able to access the database
# database-name   name of the database
kubectl exec [pod-name] -- bash -c "pg_dump -U [postgres-user] [database-name]" > database.sql
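For example, with the user and database names from the Deployment above, a dump to a timestamped local file could look like the following (the pod name is only a placeholder; use the real name reported by kubectl get pods):

# Pod name below is a placeholder; substitute the name from `kubectl get pods`.
kubectl exec postgres-deployment-1234567890-abcde -- bash -c "pg_dump -U my-database-user my-database-name" > backup-$(date +%Y-%m-%d).sql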
RESTORE
# pod-name        name of the postgres pod
# postgres-user   database user that is able to access the database
# database-name   name of the database
cat database.sql | kubectl exec -i [pod-name] -- psql -U [postgres-user] -d [database-name]
You can have a Job pod that runs this command and exports the dump to a file storage system such as AWS S3.
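To land the dump in Google Cloud Storage specifically, one option is a CronJob that runs pg_dump over the network and streams the output to a bucket with gsutil. The manifest below is only a sketch and assumes several things that are not part of the question: a Service named postgres-service in front of the Deployment, a bucket named my-backup-bucket that the cluster's service account can write to, and a hypothetical image (my-registry/pg-backup:9.6) that bundles both the Postgres client tools and gsutil.

# Sketch only. Assumes: a Service "postgres-service" exposing the Deployment,
# a writable bucket "my-backup-bucket", and a hypothetical image that bundles
# pg_dump and gsutil. In real use, take the password from a Secret.
apiVersion: batch/v1beta1          # adjust to the CronJob API version your cluster supports
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"            # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: postgres-backup
            # Hypothetical image containing both pg_dump and gsutil.
            image: my-registry/pg-backup:9.6
            env:
            - name: PGPASSWORD
              value: my-password   # matches POSTGRES_PASSWORD above; prefer a Secret
            command:
            - /bin/sh
            - -c
            - >-
              pg_dump -h postgres-service -U my-database-user my-database-name
              | gzip
              | gsutil cp - gs://my-backup-bucket/backup-$(date +%Y-%m-%d).sql.gz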