Django Multiple Databases: Fallback to Master if Slave is Down


Question

I have a master-slave replication setup for the MySQL backend of a Django project. Currently I am reading from and writing to the Master DB only, but my dashboards are quite query intensive. I was searching for an option where I can define DATABASES like the following:

DATABASES = {
    'default_slave': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
}

For dashboards, reports and various other apps, what I want to do is:

Try the connection to default_slave: use default_slave if it is reachable, else use default.

That is, if the slave is up, fetch the reports from the slave database itself; if not, fetch the reports from the master database.

The catch is that the slave can be up or down, and I want the database used for fetching reports to be selected dynamically, based on reachability.

Is this possible? Can I test the connection beforehand and move forward accordingly?

With this I would write and sync_db on the Master, and always read from the Slave, as long as the slave is up.

I need some solution/hint for raw queries as well as ORM queries.

The router concept seems nice, but I don't know whether it can fall back when the slave is not reachable.

UPDATE

How to go about MULTI-DATABASE

DATABASES

DATABASES = {
    'default_slave': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'linux': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'linux_slave': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'mac': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'mac_slave': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'pc': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
    'pc_slave': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'PASSWORD': '',
        'HOST': '',
        'PORT': '3306',
    },
}

Now, I have:

  1. Static Data
  2. Dynamic Data

Static Data has to be stored in 'default', which will be replicated to 'default_slave'.

For dynamic data, the query first needs to judge where the Dynamic Data might be lying: in 'mac', 'pc' or 'linux'.

To achieve that, I added one field to the 'static' table: 'query_on', which contains one of ['mac', 'linux', 'pc'].

Now, while using the queryset, I am simply writing:

static = Static.objects.get(pk=1)
query_on = static.query_on
dynamic = Dynamic.objects.using(query_on).get(static=static)

This works well; the query is routed to the database where it needs to be executed. Here I need to judge:

  1. If the <'query_on'>_slave connection is up: use <'query_on'>_slave, OR
  2. If the <'query_on'>_slave connection is down: use <'query_on'>

How to go about that?
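
One way this selection could be made explicit at the call site is sketched below. This is a minimal, hedged example: the helper name slave_or_master() is invented for illustration, and the reachability check is the same socket-based test used in the solution further down.

import socket

from django.conf import settings


def slave_or_master(alias, timeout=2):
    """Return '<alias>_slave' if its HOST:PORT accepts a TCP connection, else fall back to '<alias>'."""
    slave = alias + '_slave'
    cfg = settings.DATABASES.get(slave)
    if cfg and cfg.get('HOST'):
        try:
            socket.create_connection((cfg['HOST'], int(cfg['PORT'])), timeout).close()
            return slave
        except OSError:
            # Covers connection refused, unreachable host and timeouts.
            pass
    return alias


# Used with the Static/Dynamic models from the question:
static = Static.objects.get(pk=1)
dynamic = Dynamic.objects.using(slave_or_master(static.query_on)).get(static=static)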

Further details for the application:

  1. There is one database: default (the configuration & analytics database), for maintaining the configuration data and the report analytics data
  2. There are 20 databases (raw databases), for example: mac, linux, rhel, windows, pc .... (example names), for collecting the raw data, which is not yet processed for analytics
  3. Each database has one or multiple slaves; the naming convention would be: default_slave_0, default_slave_1, default_slave_2, and so on for the other databases as well

Now the analytics data first needs to be queried per 5 minutes, 30 minutes, 1 hour .... and that query needs to be sent to a specific database, because not every database will be carrying the specific dataset required for analytics.

To do that, we need to

  1. get the configuration data from default or any one of its slaves (the slave part is the question)
  2. once we have the configuration, we can easily look up where the "raw" data might be
  3. query for the raw data, gather the results and analyse them --> store the result in the "default" database.

Now all 30 (raw) databases and the 1 default database would require "sync", as we maintain the same database structure throughout all the nodes.
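
Given that layout, the syncdb restriction in the solution below would presumably need to allow every master alias rather than only 'default'. A hedged sketch of such a check, assuming the '_slave' naming convention described above:

class MultiMasterRouter(object):
    """Hypothetical variant of the router below: schema sync on every master alias, never on a replica."""

    def allow_syncdb(self, db, model):
        # 'linux_slave', 'default_slave_0', etc. are read replicas and should not be synced.
        return '_slave' not in db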

Now, since we are seeing CPU spikes on all the databases, it makes sense to use the "slave" databases to query for the "raw" data.

Hence the requirement of using the slaves. I am not able to imagine how routers would be of help here?

Solution

You are on the right track with using a router. I'm assuming the fact that your two db definitions are identical is just a typo.

(FYI, I'm going to refer to the database hierarchy using the more sensitive master->follower terminology.)

In your db_for_read() function, you can check for connectivity to your follower. This might incur a bit more overhead, but that's the cost of having auto-failover for a database. An example database definition would be:

DATABASES = {
    'follower': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'follower',
        'USER': 'root',
        'HOST': '54.34.65.24',
        'PORT': '3306',
    },
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'application',
        'USER': 'root',
        'HOST': '54.34.65.23',
        'PORT': '3306',
    },
}

You can test the connection with a quick try/except like this example. A router using this that does what you need would look like:

from django.conf import settings
import socket


def test_connection_to_db(database_name):
    """Return True if a TCP connection to the alias' HOST:PORT can be opened within 5 seconds."""
    try:
        db_definition = getattr(settings, 'DATABASES')[database_name]
        s = socket.create_connection((db_definition['HOST'], db_definition['PORT']), 5)
        s.close()
        return True
    except (AttributeError, KeyError, OSError):
        # OSError also covers socket.timeout and connection-refused errors.
        return False


class FailoverRouter(object):
    """A router that defaults reads to the follower but provides a failover back to the default"""

    def db_for_read(self, model, **hints):
        if test_connection_to_db('follower'):
            return 'follower'
        return 'default'

    def db_for_write(self, model, **hints):
        "Point all writes to the default db"
        return 'default'

    def allow_syncdb(self, db, model):
        "Make sure only the default db allows syncdb"
        return db == 'default'
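
For completeness (this registration step is not in the original answer): the router only takes effect once it is listed in the DATABASE_ROUTERS setting. The module path below is an assumption, e.g. if the class lives in myapp/routers.py:

# settings.py
DATABASE_ROUTERS = ['myapp.routers.FailoverRouter']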

This will still run syncdb against the master, as you want. Also, you could make the logic for both db_for_read() and db_for_write() more complicated (like picking the follower db only for certain models that are queried for your reports).
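
As a rough illustration of that idea (the 'reports' app label and the subclass are made-up examples, not part of the original answer):

class ReportsFailoverRouter(FailoverRouter):
    """Hypothetical variant: only report models are ever read from the follower."""

    def db_for_read(self, model, **hints):
        # The 'reports' app label is an assumed example of the apps serving the dashboards.
        if model._meta.app_label == 'reports':
            return super(ReportsFailoverRouter, self).db_for_read(model, **hints)
        return 'default'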

I don't know what overhead this test_connection_to_db() will add to every read, since that will depend on the MySQL server and the timeout. Perhaps a better architecture is to cache these reports using memcached, or to just work out the issues with the slave ever going down and update your database definitions in settings first.
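
A minimal sketch of that caching idea using Django's cache framework (the cache key, the 5-minute timeout and the build_report() function are all assumptions for illustration):

from django.core.cache import cache


def dashboard_report(query_on):
    """Serve the report from cache when possible; rebuild it from the database otherwise."""
    key = 'dashboard-report-%s' % query_on
    report = cache.get(key)
    if report is None:
        report = build_report(query_on)  # hypothetical function that runs the expensive report queries
        cache.set(key, report, 300)      # cache for 5 minutes, matching the smallest query interval
    return report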
