MySQL 官方 Auto-Failover 功能测试


参考资料:
http://www.clusterdb.com/mysql/replication-and-auto-failover-made-easy-with-mysql-utilities

环境介绍:
master: demoenv-trial-1
slaves: demoenv-trial-2 demoenv-trial-3

1. 安装 Percona Server,在所有服务器上:
$ sudo yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ sudo yum install Percona-Server-shared-compat
$ sudo yum install Percona-Server-server-56

$ sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ sudo yum install mysql-utilities

2. 配置 /etc/my.cnf,在所有服务器上:
注意:确保 server-id 不同且 report-host 与自身主机名相同
$ sudo vim /etc/my.cnf

[mysqld]
# basic setting
datadir = /opt/mysql/data
tmpdir = /opt/mysql/tmp
socket = /opt/mysql/run/mysqld.sock
port = 3306
pid-file = /opt/mysql/run/mysqld.pid

# innodb setting
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql/data
binlog-do-db = testdb

# server id
server-id=1

# gtids setting
binlog-format = ROW
log-slave-updates = true
gtid-mode = on
enforce-gtid-consistency = true
report-host = demoenv-trial-1
report-port = 3306
master-info-repository = TABLE
relay-log-info-repository = TABLE
sync-master-info = 1

# other settings
[mysqld_safe]
log-error = /opt/mysql/log/mysqld.log
pid-file = /opt/mysql/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout

[client]
port = 3306
socket = /opt/mysql/run/mysqld.sock
default-character-set = utf8

3. 创建所需目录,在所有服务器上:
$ sudo mkdir -p /opt/mysql/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql/{data,tmp,run,binlogs,log}

4. 初始化数据库,在所有服务器上:
$ sudo -i
# su - mysql
$ mysql_install_db --user=mysql --datadir=/opt/mysql/data/
$ exit
# exit
$ sudo /etc/init.d/mysql start

5. 创建授权用户 root@'%' 以便通过 mysqlreplicate 来进行主从复制的配置,在所有服务器上:
$ mysql -uroot

mysql> grant all on *.* to root@'%' identified by 'pass' with grant option;
mysql> quit;

6. 创建复制所需的用户,在所有服务器上:
$ mysql -uroot

mysql> grant replication slave on *.* to 'rpl'@'%' identified by 'rpl';
mysql> quit;

7. 配置主从复制,可选择任意一台服务器操作:
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot

mysql> use mysql;
mysql> drop user root@'demoenv-trial-1';
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-2':3306 --rpl-user=rpl:rpl

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
# Checking for binary logging on master...
# set up replication...
# ...done.


[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-3':3306 --rpl-user=rpl:rpl

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-3: ... connected.
# Checking for binary logging on master...
# set up replication...
# ...done.

8. 验证主从复制的数据完整性
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot

mysql> create database testdb;
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-2' -e 'show databases;'

+-------------------+
| Database          |
+-------------------+
| information_schema|
| mysql             |
| performance_schema|
| test              |
| testdb            |
+-------------------+

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-3' -e 'show databases;'

+-------------------+
| Database          |
+-------------------+
| information_schema|
| mysql             |
| performance_schema|
| test              |
| testdb            |
+-------------------+

[dong.guo@demoenv-trial-1 ~]$ mysqlrplshow --master=rpl:rpl@'demoenv-trial-1':3306 --discover-slaves-login=root:pass;

# master on demoenv-trial-1: ... connected.
# Finding slaves for master: demoenv-trial-1:3306

# Replication Topology Graph
demoenv-trial-1:3306 (MASTER)
  |
   +--- demoenv-trial-2:3306 - (SLAVE)
  |
   +--- demoenv-trial-3:3306 - (SLAVE)

[dong.guo@demoenv-trial-1 ~]$ mysqlrplcheck --master=root:pass@'demoenv-trial-1' --slave=root:pass@'demoenv-trial-2'

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
Test Description                          Status
---------------------
Checking for binary logging on master     [pass]
Are there binlog exceptions?              [WARN]

+--------+--------+-----------+
| server | do_db  | ignore_db |
+--------+--------+-----------+
| master | testdb |           |
+--------+--------+-----------+

Replication user exists?                  [pass]
Checking server_id values                 [pass]
Checking server_uuid values               [pass]
Is slave connected to master?             [pass]
Check master information file             [pass]
Checking InnoDB compatibility             [pass]
Checking storage engines compatibility    [pass]
Checking lower_case_table_names setting   [pass]
Checking slave delay (seconds behind master)                         [pass]
# ...done.   

9. 关闭 master 测试 auto-failover
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover

# Discovering slaves for master at demoenv-trial-1:3306
# Discovering slave at demoenv-trial-2:3306
# Found slave: demoenv-trial-2:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Discovering slaves for master at demoenv-trial-1:3306
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 06:58:52 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  299       testdb                          

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs

然后,执行命令的终端就挂起了,需要另外开启一个终端来关闭master:
[dong.guo@demoenv-trial-1 ~]$ mysqladmin -uroot -ppass shutdown

接着,在刚刚挂起的终端上,可以看到:

Failover starting in 'auto' mode...
# Candidate slave demoenv-trial-2:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
# Discovering slaves for master at demoenv-trial-2:3306

Failover console will restart in 5 seconds.

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:01:25 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  299       testdb                          

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs

服务器 'demoenv-trial-2' 变成了新的master。

10. 插入数据测试主从复制
[dong.guo@demoenv-trial-2 ~]$ mysql -uroot

mysql> use testdb;
Database changed
mysql> CREATE TABLE `hostgroup` (
    ->   `hostgroup_id` tinyint(4) NOT NULL AUTO_INCREMENT,
    ->   `hostgroup_name` char(20) DEFAULT NULL,
    ->   `hostgroup_next` tinyint(4) NOT NULL,
    ->   `colo_name` char(4) NOT NULL,
    ->   PRIMARY KEY (`hostgroup_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Query OK, 0 rows affected (0.10 sec)

mysql> quit;

[dong.guo@demoenv-trial-2 binlogs]$ mysql -uroot -ppass -h'demoenv-trial-3' testdb -e 'show tables;'

+-----------------+
| Tables_in_testdb|
+-----------------+
| hostgroup       |
+-----------------+

11. 尝试将旧的master恢复
[dong.guo@demoenv-trial-1 ~]$ sudo /etc/init.d/mysql start
[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-2':3306 --slave=root:pass@'demoenv-trial-1':3306

# master on demoenv-trial-2: ... connected.
# slave on demoenv-trial-1: ... connected.
# Checking for binary logging on master...
# setting up replication...
# ...done.

在刚刚挂起的终端上,可以看到:

...
Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-1 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs

将旧的master恢复
[dong.guo@demoenv-trial-1 ~]$ mysqlrpladmin --master=root:pass@'demoenv-trial-2':3306 --new-master=root:pass@'demoenv-trial-1':3306 --demote-master --discover-slaves-login=root:pass switchover

# Discovering slaves for master at demoenv-trial-2:3306
# Discovering slave at demoenv-trial-1:3306
# Found slave: demoenv-trial-1:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Performing switchover from master at demoenv-trial-2:3306 to slave at demoenv-trial-1:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+
# ...done.

在刚刚挂起的终端上,可以看到:

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:30:07 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  710       testdb                          

GTID Executed Set
8a58172b-1efd-11e3-8cf1-000c2950fe0c:1 [...]

Replication Health Status
+-----------------+------+--------+-------+-----------+----------------------------------+
| host            | port | role   | state | gtid_mode | health                           |
+-----------------+------+--------+-------+-----------+----------------------------------+
| demoenv-trial-2 | 3306 | MASTER | UP    | ON        | OK                               |
| demoenv-trial-1 | 3306 | SLAVE  | UP    | WARN      | Slave is not connected to master.|
| demoenv-trial-3 | 3306 | SLAVE  | UP    | WARN      | Slave is not connected to master.|
+-----------------+------+--------+-------+-----------+----------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs

可以看到failover在手动恢复了旧的master之后已经停止了工作。

按下Q,然后重启auto-failover:
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover

...

Replication Health Status
+-----------------+------+--------+-------+-----------+---------------+
| host            | port | role   | state | gtid_mode | health        |
+-----------------+------+--------+-------+-----------+---------------+
| demoenv-trial-1 | 3306 | MASTER | UP    | ON        | OK            |
| demoenv-trial-2 | 3306 | SLAVE  | UP    | ON        | OK            |
| demoenv-trial-3 | 3306 | SLAVE  | UP    | ON        | OK            |
+-----------------+------+--------+-------+-----------+---------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs

它又可以工作了。

12. 对于 auto-failover 的简短总结:
它包含在mysql-utilities这个软件包中;
它只能工作在MySQL 5.6 版本上,因为需要GITDs的支持;
它可以自动选取一个slave作为新的master,当现有的master死掉以后;
但是它不能自动将旧的master恢复,并且在手动恢复了旧的master之后也会停止工作。

, ,

  1. No comments yet.
(will not be published)
*