Server crash on MySQL backup using python
I have a Python script that backs up my MySQL database to an Amazon S3 bucket every hour. The script simply calls mysqldump to create the dump and then uploads it to the S3 bucket using tinys3. Note that I set lock-tables to false so that transactions from other applications are not blocked.
Here is the script for your reference:
import os
import datetime
import json

import pytz
import tinys3

# Bootstrap Django so the project settings (and DB config) are available.
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "my_project.settings")
application = get_wsgi_application()

from django.utils import timezone

timezone.activate(pytz.timezone("Asia/Kolkata"))
current_datetime = timezone.localtime(
    datetime.datetime.utcnow().replace(tzinfo=pytz.utc)
)
# Turn the timestamp into a Y/m/d/H/M/S directory structure for the S3 key.
dir_struct = '/'.join(current_datetime.strftime("%Y-%m-%d-%H-%M-%S").split('-'))

endpoint = 's3-us-west-2.amazonaws.com'
params = json.load(open('buckets.json'))
S3_ACCESS_KEY = params['S3_ACCESS_KEY']
S3_SECRET_KEY = params['S3_SECRET_KEY']
bucket = params['mysql']
db_name = params['db_name']

mysql_command = ('sudo mysqldump --defaults-file=/home/ubuntu/.my.cnf '
                 '--lock-tables=false %s > /home/ubuntu/%s.sql' % (db_name, db_name))
compress_command = 'zip -r /home/ubuntu/%s.sql.zip /home/ubuntu/%s.sql' % (db_name, db_name)
delete_command = 'sudo rm -rf /home/ubuntu/%s.sql*' % db_name

os.system(mysql_command)
os.system(compress_command)

backup_file = open('/home/ubuntu/%s.sql.zip' % db_name, 'rb')
conn = tinys3.Connection(S3_ACCESS_KEY, S3_SECRET_KEY, tls=True, endpoint=endpoint)
print(conn.upload(
    dir_struct + '%s.sql.zip' % db_name,
    backup_file,
    bucket,
    public=False
))
print(conn.get(dir_struct + '%s.sql.zip' % db_name, bucket))
backup_file.close()
os.system(delete_command)
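One thing worth noting about the script above: os.system ignores the exit status of each command, so a failed mysqldump goes unnoticed and a truncated or empty dump gets zipped and uploaded anyway. A minimal sketch of a checked alternative (the helper name run_checked is made up for illustration, assuming Python 3):

```python
import subprocess

def run_checked(cmd):
    # Run a shell command; unlike os.system, raise if it exits non-zero
    # so a failed mysqldump never produces a silently broken backup.
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError('command failed: %s\n%s' % (cmd, result.stderr))
    return result.stdout

# e.g. run_checked(mysql_command) instead of os.system(mysql_command)
```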
The problem is that when I run this script as an hourly cron job, the server crashes after a few hours (say 5 to 7 hours). I haven't found a definite cause for this behaviour yet. What is the problem here? Is there a fault in the script, or is it something related to MySQL?
It's easy to imagine what's happening here. mysqldump is slow, and restoration is even worse.
It is not intended as a fast or scalable solution for backing up substantial amounts of data. With large data sizes, even if the backup step takes a reasonable time, restoring the data can be very slow because replaying the SQL statements involves disk I/O for insertion, index creation, and so on.
Once you take the backup, you appear to zip it and then upload it to Amazon S3. My guess is that your second backup starts before the first one finishes, and the load keeps escalating until the server is overwhelmed.
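If overlapping runs are the culprit, a cheap way to confirm and prevent it is to have the script refuse to start while a previous run still holds a lock file, using a non-blocking flock. A minimal sketch (the lock file path is an assumption):

```python
import fcntl
import sys

# Assumed lock path; any writable location the cron user owns will do.
lock_file = open('/tmp/mysql_backup.lock', 'w')
try:
    # Non-blocking exclusive lock: raises immediately if another
    # run of the script is still holding it.
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit('previous backup still running, skipping this run')

# ... dump, zip and upload here; the lock is released automatically
# when the process exits and the file object is closed.
```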
Even if your server didn't crash, you still shouldn't use this approach, because within a few months you will be spending substantial amounts on storage.
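To put a rough number on that, here is a back-of-the-envelope sketch; the dump size and S3 price below are assumptions for illustration, not figures from the question:

```python
# Storage growth for hourly dumps that are never pruned.
dump_gb = 2.0                 # assumed size of one compressed dump
price_per_gb_month = 0.023    # assumed S3 Standard price per GB-month (USD)
backups_per_month = 24 * 30   # one dump per hour, ~30-day month

stored_gb = dump_gb * backups_per_month                    # GB added every month
monthly_bill_month_3 = 3 * stored_gb * price_per_gb_month  # the bill keeps climbing

print(stored_gb)                        # 1440.0 GB accumulated per month
print(round(monthly_bill_month_3, 2))
```

Because each month adds another full batch of dumps, the monthly bill grows linearly and the cumulative spend quadratically until old dumps are pruned.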
There is a much better way: MySQL replication. No cron jobs, almost immediate recovery if the master goes down, and no bulky data transfers.