Django backup strategy with dumpdata and migrations
Question
As in a related question, I set up a dumpdata-based backup system for my database. The setup amounts to a cron script that calls dumpdata and moves the backup to a remote server, with the aim of simply using loaddata to recover the database. However, I'm not sure this plays well with migrations. loaddata now has an --ignorenonexistent switch to deal with deleted models and fields, but it cannot handle columns that were added with one-off defaults, nor can it apply RunPython code.
The way I see it, there are two sub-problems to address:
- Tag each dumpdata output file with the current version of each app
- Splice the fixtures into the migration path
I'm stumped about how to tackle the first problem without introducing a ton of overhead. Would it be enough to save an extra file per backup that contains an {app_name: migration_number} mapping?
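One way such a mapping file could be produced is sketched below. This is only an illustration, not an established Django feature: the function names latest_migrations and write_manifest are hypothetical, and the sketch assumes migration names start with a numeric prefix (0001_, 0002_, ...), which will not hold for every project (for example, two branches that share a number before a merge migration). In a real project the (app_label, migration_name) pairs could be read from django.db.migrations.recorder.MigrationRecorder(connection).applied_migrations().

```python
import json
import re


def latest_migrations(applied):
    """Return {app_label: highest-numbered applied migration name}.

    `applied` is an iterable of (app_label, migration_name) pairs.
    Assumption: names begin with a numeric prefix such as "0002_".
    """
    latest = {}
    for app_label, name in applied:
        match = re.match(r"(\d+)", name)
        number = int(match.group(1)) if match else -1
        if app_label not in latest or number > latest[app_label][0]:
            latest[app_label] = (number, name)
    return {app: name for app, (number, name) in sorted(latest.items())}


def write_manifest(applied, path):
    """Write the mapping as JSON, e.g. next to the fixture file."""
    with open(path, "w") as handle:
        json.dump(latest_migrations(applied), handle, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the django_migrations table.
    sample = [
        ("blog", "0001_initial"),
        ("blog", "0002_add_slug"),
        ("shop", "0001_initial"),
    ]
    print(latest_migrations(sample))
```

Storing the migration name rather than just its number keeps the manifest usable as a target for manage.py migrate.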
I think the second problem is easier once the first one is solved, since the process is roughly:
- Create a new database
- Run migrations forward to the appropriate point for each app
- Call loaddata with the given fixture file
- Run the rest of the migrations
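The last three steps of that process could be scripted roughly as follows. This is a sketch under stated assumptions: restore_plan is a hypothetical helper, the manifest is assumed to be an {app_label: migration_name} dictionary saved alongside the backup, and creating the empty database is left out. It only builds the command lines; it relies on manage.py migrate accepting an app label and a migration name as its target.

```python
def restore_plan(manifest, fixture_path, manage="./manage.py"):
    """Build the manage.py invocations for a restore, in order."""
    commands = []
    # 1. Bring each app to the migration recorded at backup time.
    for app_label, migration_name in sorted(manifest.items()):
        commands.append([manage, "migrate", app_label, migration_name])
    # 2. Load the fixture into the schema it was dumped from.
    commands.append([manage, "loaddata", fixture_path])
    # 3. Apply whatever migrations were added since the backup.
    commands.append([manage, "migrate"])
    return commands


if __name__ == "__main__":
    plan = restore_plan({"blog": "0002_add_slug", "shop": "0001_initial"}, "db.json")
    for command in plan:
        print(" ".join(command))
```

Each entry could then be passed to subprocess.run(command, check=True) so that the restore stops at the first failing step.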
There's some code in this question (linked from the bug report) that I think could be adapted for this purpose.
Since these are fairly regular and large snapshots of the database, I don't want to keep them as data migrations cluttering up the migrations directory.
Recommended answer
I take the following steps to back up, restore, or transfer my PostgreSQL database between instances of my project:
The idea is to keep as few migrations as possible, as if manage.py makemigrations had been run for the first time on an empty database.
Let's assume that we have a working database in our development environment. This database is a current copy of the production database, which should not be open to any changes. We have added models, altered attributes, etc., and those actions have generated additional migrations.
Now the database is ready to be migrated to production which, as stated before, is not open to the public, so it is not altered in any way. To achieve this:
- I perform the normal procedure in the development environment.
- I copy the project to the production environment.
- I perform the normal procedure in the production environment.
We make the changes in our development environment. No changes should happen in the production database, because they will be overwritten.
Before anything else, I have a backup of the project directory (which includes a requirements.txt file) and a backup of the database, and of course git is a friend of mine.
1. I take a dumpdata backup in case I need it. However, dumpdata has some serious limitations regarding content types, permissions, or other cases where a natural foreign key should be used:
./manage.py dumpdata --exclude auth.permission --exclude contenttypes --exclude admin.LogEntry --exclude sessions --indent 2 > db.json
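For the natural-key cases mentioned above, dumpdata also offers the --natural-foreign and --natural-primary options. The sketch below shows one way the command line could be assembled with them. dumpdata_args is a hypothetical helper, and natural keys only take effect for models that define them, so whether they remove the need for the exclusions depends on the project.

```python
def dumpdata_args(excludes, output, natural=True, indent=2):
    """Build the argument list for `manage.py dumpdata`."""
    args = ["dumpdata"]
    if natural:
        # Serialize references by natural key instead of numeric primary key.
        args += ["--natural-foreign", "--natural-primary"]
    for label in excludes:
        args += ["--exclude", label]
    # --output writes the fixture to a file instead of standard output.
    args += ["--indent", str(indent), "--output", output]
    return args


if __name__ == "__main__":
    excluded = ["auth.permission", "contenttypes", "admin.LogEntry", "sessions"]
    print(" ".join(["./manage.py"] + dumpdata_args(excluded, "db.json")))
```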
2. I take a pg_dump backup to use:
pg_dump -U $user -Fc $database --exclude-table=django_migrations > path/to/backup-dir/db.dump
3. Only if I want to merge the existing migrations into one, I delete all migrations from every application.
In my case the migrations folder is a symlink, so I use the following script:
#!/bin/bash
# Empty every migrations directory; -L makes find follow symlinks.
for dir in $(find -L . -type d -name "migrations")
do
    rm -Rf "$dir"/*
done
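The same cleanup could be written in Python, which avoids problems with spaces in paths. This is a sketch only, and clear_migrations is a hypothetical name. Like the shell script, it empties every directory called migrations below the starting point, including any that live inside a virtualenv under that directory, so it should be pointed at the application tree only.

```python
import os
import shutil


def clear_migrations(root):
    """Empty every directory named 'migrations' under root; return those directories."""
    cleared = []
    # followlinks=True mirrors `find -L`, since the migrations folders may be symlinks.
    for current, dirnames, _filenames in os.walk(root, followlinks=True):
        if "migrations" in dirnames:
            target = os.path.join(current, "migrations")
            for entry in os.listdir(target):
                path = os.path.join(target, entry)
                if os.path.isdir(path) and not os.path.islink(path):
                    shutil.rmtree(path)
                else:
                    os.remove(path)
            cleared.append(target)
            dirnames.remove("migrations")  # nothing left to descend into
    return sorted(cleared)
```

Because __init__.py is removed as well, the migrations have to be recreated by naming each app explicitly when running makemigrations.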
4. I delete and recreate the database:
For example, a bash script can include the following commands:
su -l postgres -c "PGPASSWORD=$password psql -c 'drop database $database ;'"
su -l postgres -c "createdb --owner $username $database"
su -l postgres -c "PGPASSWORD=$password psql $database -U $username -c 'CREATE EXTENSION $extension ;'"
5. I restore the database from the dump:
pg_restore -Fc -U $username -d $database path/to/backup-dir/db.dump
6. If migrations were deleted in step 3, I recreate them in the following way:
./manage.py makemigrations <app1> <app2> ... <appn>
... by using the following script:
#!/bin/bash
apps=()
for app in $(find ./ -maxdepth 1 -type d ! -path "./<project-folder>" ! -path "./.*" ! -path "./")
do
apps+=(${app#??})
done
all_apps=$(printf "%s " "${apps[@]}")
./manage.py makemigrations $all_apps
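The app discovery in that script could also be sketched in Python. discover_apps is a hypothetical helper, and it makes the same assumption as the shell version: every top-level directory that is neither hidden nor the project folder is a Django app.

```python
import os


def discover_apps(base_dir, project_folder):
    """Return the top-level directory names that are treated as Django apps."""
    apps = []
    for entry in sorted(os.listdir(base_dir)):
        full_path = os.path.join(base_dir, entry)
        # Skip files, hidden directories, and the project settings folder.
        if not os.path.isdir(full_path):
            continue
        if entry.startswith(".") or entry == project_folder:
            continue
        apps.append(entry)
    return apps
```

The result could be passed on with subprocess.run(["./manage.py", "makemigrations", *discover_apps(".", "<project-folder>")], check=True), with <project-folder> replaced by the real folder name.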
7. I migrate using a fake migration:
./manage.py migrate --fake
In case something has gone completely wrong and everything is *** (this can indeed happen), I can use the backups to revert everything to its previous working state. If I want to use the db.json file from step 1, it goes like this:
I perform the following steps:
- 3 (delete the migrations)
- 4 (delete and recreate the database)
- 6 (makemigrations)
Then I apply the migrations:
./manage.py migrate
Load the data from db.json:
./manage.py loaddata path/to/db.json
Then I try to find out why my previous attempt was not successful.
When the steps complete successfully, I copy the project to the server and perform the same steps on that box.
This way, I always keep the number of migrations to a minimum, and I can use pg_dump and pg_restore on any box that shares the same project.