Django backup strategy with dumpdata and migrations

Question

As in this question, I set up a dumpdata-based backup system for my database. The setup is akin to running a cron script that calls dumpdata and moves the backup to a remote server, with the aim of simply using loaddata to recover the database. However, I'm not sure this plays well with migrations. loaddata now has an ignorenonexistent switch to deal with deleted models/fields, but it is not able to resolve cases where columns were added with one-off defaults or apply RunPython code.
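
For reference, the switch is used like this (db.json is just a placeholder fixture name); it only skips models and fields that no longer exist, nothing more:

    ./manage.py loaddata db.json --ignorenonexistent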

The way I see it, there are two sub-problems to address:

  • Tag each dumpdata output file with the current version of each app
  • Splice the fixtures into the migration path

I'm stumped about how to tackle the first problem without introducing a ton of overhead. Would it be enough to save an extra file per backup that contained an {app_name: migration_number} mapping?
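
Purely as an illustration (the file names and the use of showmigrations here are my own assumptions, not part of the existing setup), such a mapping could be captured as a sidecar file written next to each dump:

    #!/bin/bash
    # Sketch of a cron-style backup step that records the migration state with the dump.
    stamp=$(date +%Y%m%d-%H%M%S)
    ./manage.py dumpdata --indent 2 > "backup-$stamp.json"
    # showmigrations --list prints every migration per app and marks the applied ones
    # with [X], which records which version each app was at when the fixture was taken.
    ./manage.py showmigrations --list > "backup-$stamp.migrations.txt"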

The second problem, I think, is easier once the first one is solved, since the process is roughly the following (a sketch in code follows the list):

  1. Create a new database
  2. Run migrations forward to the appropriate point for each app
  3. Call loaddata with the given fixture file
  4. Run the rest of the migrations
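
As a rough sketch only (nothing here is prescribed by the question): assuming the database name mydb, a fixture backup.json, and a hypothetical sidecar file backup.migrations.txt with one "<app_label> <migration_name>" pair per line recorded at dump time, the four steps could look like this:

    #!/bin/bash
    # Step 1: create a fresh, empty database (Django settings are assumed to point at it).
    createdb mydb

    # Step 2: bring each app's schema to the migration it was at when the fixture was dumped.
    while read -r app migration; do
      ./manage.py migrate "$app" "$migration"
    done < backup.migrations.txt

    # Step 3: load the fixture into the partially migrated schema.
    ./manage.py loaddata backup.json

    # Step 4: run the remaining migrations forward to the current state.
    ./manage.py migrate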

There's some code in this question (linked from the bug report) that I think could be adapted for this purpose.

Since these are fairly regular/large snapshots of the database, I don't want to keep them as data migrations cluttering up the migrations directory.

Answer

I am taking the following steps to back up, restore or transfer my PostgreSQL database between any instances of my project:

The idea is to keep as few migrations as possible, as if manage.py makemigrations had been run for the first time on an empty database.

Let's assume that we have a working database in our development environment. This database is a current copy of the production database, which should not be open to any changes. We have added models, altered attributes etc., and those actions have generated additional migrations.

Now the database is ready to be migrated to production, which, as stated before, is not open to the public and so is not altered in any way. In order to achieve this:

  • I perform the normal procedure in the development environment.
  • I copy the project to the production environment.
  • I perform the normal procedure in the production environment

We make the changes in our development environment. No changes should happen in the production database because they will be overridden.

Before anything else, I have a backup of the project directory (which includes a requirements.txt file), a backup of the database and, of course, git is a friend of mine.

  1. I take a dumpdata backup in case I need it. However, dumpdata has some serious limitations regarding content types, permissions or other cases where a natural foreign key should be used:

    ./manage.py dumpdata --exclude auth.permission --exclude contenttypes --exclude admin.LogEntry --exclude sessions --indent 2 > db.json

  • I take a pg_dump backup to use:

    pg_dump -U $user -Fc $database --exclude-table=django_migrations > path/to/backup-dir/db.dump
    

  • Only if I want to merge the existing migrations into one do I delete all migrations from every application.

    In my case the migrations folder is a symlink, so I use the following script:

    #!/bin/bash
    # Empty every "migrations" directory (following symlinks, hence -L),
    # so that makemigrations can recreate the migrations from scratch later.
    for dir in $(find -L -name "migrations")
    do
      rm -Rf "$dir"/*
    done
    

  • I delete and recreate the database:

    For example, a bash script can include the following commands:

    su -l postgres -c "PGPASSWORD=$password psql -c 'drop database $database ;'"
    su -l postgres -c "createdb --owner $username $database"
    su -l postgres -c "PGPASSWORD=$password psql $database -U $username -c 'CREATE EXTENSION $extension ;'"
    

  • I restore the database from the dump:

    pg_restore -Fc -U $username -d $database path/to/backup-dir/db.dump
    

  • If migrations were deleted in step 3, I recreate them in the following way:

    ./manage.py makemigrations <app1> <app2> ... <appn>
    

    ... by using the following script:

    #!/bin/bash
    # Collect every top-level app directory, excluding the project folder,
    # hidden directories and the current directory itself, then pass them
    # all to makemigrations in a single call.
    apps=()
    for app in $(find ./ -maxdepth 1 -type d ! -path "./<project-folder>" ! -path "./.*" ! -path "./")
    do
      apps+=(${app#??})   # strip the leading "./"
    done
    all_apps=$(printf "%s "  "${apps[@]}")

    ./manage.py makemigrations $all_apps
    

  • I migrate using --fake, which marks the new migrations as applied without actually running their SQL (the schema already exists thanks to the pg_restore):

    ./manage.py migrate --fake
    

  • In case something has gone completely wrong (this can happen, indeed), I can use the backups to revert everything to its previous working state. If I would like to use the db.json file from step one, it goes like this:

    I perform the following steps:

    • 3 (delete the migrations)
    • 4 (drop and recreate the database)
    • 6 (makemigrations)

    Then:

    • Apply the migrations:

    ./manage.py migrate
    

  • Load the data from db.json:

    ./manage.py loaddata path/to/db.json
    

    Then I try to find out why my previous effort was not successful.

    When the steps are performed successfully, I copy the project to the server and perform the same steps on that box.

    This way, I always keep the least number of migrations and I am able to use pg_dump and pg_restore on any box that shares the same project.
