MariaDB Connector/J Aurora fast failover implementation


Problem description

I'm trying to understand the Aurora fast failover implementation in MariaDB Connector/J. I am not a Java application engineer; my main job is as a DBA. I have little knowledge of OOP languages, and I have read the MariaDB Connector/J source code, focusing on the Aurora-related implementation. It was difficult, though, and I am not confident in my guesses.

I would really appreciate it if you could share your knowledge about it, or even just a few comments.

In the latest version, we only register the cluster endpoint of the Aurora cluster, and the driver automatically discovers every instance endpoint.

My guess at how this logic works is:

  • The driver generates every endpoint connect string from information_schema.REPLICA_HOST_STATUS, whose server_id column lists all instance identifiers. Endpoint strings follow a pattern, so once the driver has connected to any instance through the cluster endpoint, it can generate every instance endpoint.

  • After getting every instance endpoint, the driver issues the query SHOW GLOBAL STATUS LIKE 'innodb_read_only'. If the returned value is 0 (false), the instance is set as Writer; otherwise it is set as Reader.

  • The driver pushes the connect string into a "blacklist" if a health check fails (although I cannot find where the health check is implemented).

  • The driver tries to connect using connect strings that are not blacklisted; if that fails, it tries the blacklisted ones.
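The endpoint pattern guessed at in the first step can be illustrated with a minimal Java sketch. This is not the driver's code: the class AuroraEndpoints and its methods are hypothetical names, and the sketch assumes that a cluster endpoint has the form name.cluster-suffix (or name.cluster-ro-suffix for the reader endpoint) and that each instance endpoint has the form server_id.suffix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of MariaDB Connector/J. */
final class AuroraEndpoints {
    // Assumed shape: <cluster-name>.cluster-<suffix> or <cluster-name>.cluster-ro-<suffix>
    private static final Pattern CLUSTER_ENDPOINT =
            Pattern.compile("^[^.]+\\.cluster-(?:ro-)?(.+)$");

    private AuroraEndpoints() {
    }

    /** Returns the DNS suffix that instance endpoints are assumed to share. */
    static String instanceSuffix(String clusterEndpoint) {
        Matcher m = CLUSTER_ENDPOINT.matcher(clusterEndpoint);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a cluster endpoint: " + clusterEndpoint);
        }
        return m.group(1);
    }

    /** Builds one instance endpoint per server_id value read from replica_host_status. */
    static List<String> instanceEndpoints(String clusterEndpoint, List<String> serverIds) {
        String suffix = instanceSuffix(clusterEndpoint);
        List<String> endpoints = new ArrayList<>();
        for (String serverId : serverIds) {
            endpoints.add(serverId + "." + suffix);
        }
        return endpoints;
    }
}
```

In this sketch, the server_id values would come from a query against information_schema.replica_host_status run over whichever connection the cluster endpoint first provided.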

My guess is mainly based on the Java files below.

  • mariadb-connector-j/src/main/java/org/mariadb/jdbc/internal/failover/impl/AuroraListener.java
  • mariadb-connector-j/src/main/java/org/mariadb/jdbc/internal/protocol/AuroraProtocol.java

https://github.com/MariaDB/mariadb-connector-j/blob/master/documentation/failover-and-high-availability-with-mariadb-connector-j.creole

Recommended answer

Here are some hints: Aurora has many instances. One is the "writer" (master); the others are "readers" (slaves).

When the writer goes down, one slave is promoted to be the new master, and the other slaves then replicate from this new master (automatic reboot). If the old master comes up again, it becomes a slave.

Aurora has a DNS endpoint for the cluster, such as "xx.cluster-yy.zz.rds.amazonaws.com", that points to the current master. When a failover occurs, DNS is refreshed, but not immediately.

A "connection" to Aurora means two underlying connections to instances: one to the master and one to a slave. The driver uses the underlying connection to the master or to the slave according to Connection.setReadOnly().

Every time the driver connects to an instance, it confirms the instance's current role by checking the global variable "innodb_read_only" (OFF = master).
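As an illustration of that check, here is a minimal Java sketch using plain JDBC. The class AuroraRole and its methods are hypothetical, not the driver's internals; the sketch assumes the variable is read with SHOW GLOBAL VARIABLES, which returns the variable name in the first column and its value in the second, and that the value may be reported as ON/OFF or 1/0.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of MariaDB Connector/J. */
final class AuroraRole {
    enum Role { WRITER, READER }

    private AuroraRole() {
    }

    /** Maps the reported innodb_read_only value to a role (OFF = master). */
    static Role fromInnodbReadOnly(String value) {
        String v = value.trim().toUpperCase(Locale.ROOT);
        if (v.equals("OFF") || v.equals("0")) {
            return Role.WRITER;
        }
        if (v.equals("ON") || v.equals("1")) {
            return Role.READER;
        }
        throw new IllegalArgumentException("Unexpected innodb_read_only value: " + value);
    }

    /** Asks the instance behind the given connection which role it currently has. */
    static Role detect(Connection connection) throws SQLException {
        try (Statement st = connection.createStatement();
             ResultSet rs = st.executeQuery("SHOW GLOBAL VARIABLES LIKE 'innodb_read_only'")) {
            if (!rs.next()) {
                throw new SQLException("innodb_read_only was not reported by the server");
            }
            return fromInnodbReadOnly(rs.getString(2)); // column 2 holds the value
        }
    }
}
```

Keeping the string-to-role mapping separate from the query makes the mapping easy to check without a database.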

Aurora instances can be added, so on the initial connection, made through the user-supplied cluster endpoint, the current list of instances is retrieved from information_schema.replica_host_status.

To establish the two underlying connections, the driver connects to a random host. If this is the current master, then all other hosts are slaves. If not, the driver asks the slave for its current master, using information_schema.replica_host_status where session_id = 'MASTER_SESSION_ID', so that the next connection goes to the master (this is more reliable than using DNS). If a connection to an instance fails, the instance name is put in a blacklist for a certain amount of time to avoid reusing it (this blacklist is shared per JVM). The driver tries to reconnect to a random available host until no non-blacklisted host remains; it can then retry the blacklisted ones for some time (depending on parameters). If a connection succeeds, the instance is "un-blacklisted".
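The blacklist behaviour described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the driver's implementation: the class HostBlacklist, its method names, and the explicit clock parameter are all hypothetical. A single instance shared by all connections would model the "shared per JVM" aspect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical time-limited blacklist; not part of MariaDB Connector/J. */
final class HostBlacklist {
    // host -> timestamp (ms) at which its blacklist entry expires
    private final Map<String, Long> blacklistedUntil = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HostBlacklist(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a failed connection attempt. */
    void markFailed(String host, long nowMillis) {
        blacklistedUntil.put(host, nowMillis + timeoutMillis);
    }

    /** A successful connection "un-blacklists" the host. */
    void markSucceeded(String host) {
        blacklistedUntil.remove(host);
    }

    boolean isBlacklisted(String host, long nowMillis) {
        Long until = blacklistedUntil.get(host);
        return until != null && nowMillis < until;
    }

    /** Random order over non-blacklisted hosts first, then blacklisted hosts as a last resort. */
    List<String> candidateOrder(List<String> hosts, long nowMillis, Random random) {
        List<String> available = new ArrayList<>();
        List<String> blacklisted = new ArrayList<>();
        for (String host : hosts) {
            if (isBlacklisted(host, nowMillis)) {
                blacklisted.add(host);
            } else {
                available.add(host);
            }
        }
        Collections.shuffle(available, random);
        Collections.shuffle(blacklisted, random);
        available.addAll(blacklisted);
        return available;
    }
}
```

A caller would walk the list returned by candidateOrder, call markFailed for each host that refuses the connection, and call markSucceeded on the first host that accepts it. The time is passed in explicitly only to keep the sketch deterministic.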

When the underlying slave connection fails over, the master connection is used instead, and an underlying thread pool tries to reconnect to a slave instance in the background.
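The routing between the two underlying connections, including the fallback to the master, might look roughly like this. It is a simplified Java sketch, not the driver's code: the class DualConnection and its methods are hypothetical, the connection type is left generic, and the background reconnection thread is not modelled (it would call replicaReconnected when it succeeds).

```java
/** Hypothetical routing sketch; not part of MariaDB Connector/J. */
final class DualConnection<C> {
    private final C master;
    private volatile C replica;       // null while no slave connection is available
    private volatile boolean readOnly;

    DualConnection(C master, C replica) {
        this.master = master;
        this.replica = replica;
    }

    /** Mirrors the effect of Connection.setReadOnly(): selects the target connection. */
    void setReadOnly(boolean readOnly) {
        this.readOnly = readOnly;
    }

    /** Called when the slave connection is lost; reads fall back to the master. */
    void replicaFailed() {
        this.replica = null;
    }

    /** Called (for example from a background task) once a slave is reachable again. */
    void replicaReconnected(C newReplica) {
        this.replica = newReplica;
    }

    /** Returns the underlying connection that the next statement should use. */
    C current() {
        C r = replica; // read once so a concurrent failure cannot change the answer mid-check
        return (readOnly && r != null) ? r : master;
    }
}
```

With this shape, a read-only session keeps working during a slave outage because current() returns the master until a slave connection is restored.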
