Replication and auto-failover made easy with MySQL Utilities


Reference:
http://www.clusterdb.com/mysql/replication-and-auto-failover-made-easy-with-mysql-utilities

Environment:
master: demoenv-trial-1
slaves: demoenv-trial-2 demoenv-trial-3

1. Install Percona Server, on all servers:
$ sudo yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ sudo yum install Percona-Server-shared-compat
$ sudo yum install Percona-Server-server-56

$ sudo yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ sudo yum install mysql-utilities

2. Configure /etc/my.cnf, on all servers:
Ensure the "server-id" is different and the "report-host" same as own hostname
$ sudo vim /etc/my.cnf

[mysqld]
# basic setting
datadir = /opt/mysql/data
tmpdir = /opt/mysql/tmp
socket = /opt/mysql/run/mysqld.sock
port = 3306
pid-file = /opt/mysql/run/mysqld.pid

# innodb setting
default-storage-engine = INNODB
innodb_file_per_table = 1
log-bin = /opt/mysql/binlogs/bin-log-mysqld
log-bin-index = /opt/mysql/binlogs/bin-log-mysqld.index
innodb_data_home_dir = /opt/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /opt/mysql/data
binlog-do-db = testdb

# server id
server-id=1

# gtids setting
binlog-format = ROW
log-slave-updates = true
gtid-mode = on
enforce-gtid-consistency = true
report-host = demoenv-trial-1
report-port = 3306
master-info-repository = TABLE
relay-log-info-repository = TABLE
sync-master-info = 1

# other settings
max_allowed_packet = 64M
max_connections = 1024
expire_logs_days = 14
character-set-server = utf8

[mysqld_safe]
log-error = /opt/mysql/log/mysqld.log
pid-file = /opt/mysql/run/mysqld.pid
open-files-limit = 8192

[mysqlhotcopy]
interactive-timeout

[client]
port = 3306
socket = /opt/mysql/run/mysqld.sock
default-character-set = utf8

3. Create directories, on all servers:
$ sudo mkdir -p /opt/mysql/{data,tmp,run,binlogs,log}
$ sudo chown mysql:mysql /opt/mysql/{data,tmp,run,binlogs,log}

4. Initialize the database, on all servers:
$ sudo -i
# su - mysql
$ mysql_install_db --user=mysql --datadir=/opt/mysql/data/
$ exit
# exit
$ sudo /etc/init.d/mysql start

5. Create remote access to root@'%' for "mysqlreplicate" to create replication setting, on all servers:
$ mysql -uroot
mysql> grant all on *.* to root@'%' identified by 'pass' with grant option;
mysql> quit;

6. Create replication user, on all servers:
$ mysql -uroot
mysql> grant replication slave on *.* to 'rpl'@'%' identified by 'rpl';
mysql> quit;

7. Set up replication, on any one server:
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot
mysql> use mysql;
mysql> drop user root@'demoenv-trial-1';
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-2':3306 --rpl-user=rpl:rpl

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-1':3306 --slave=root:pass@'demoenv-trial-3':3306 --rpl-user=rpl:rpl

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-3: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

8. Verify replication to slaves
[dong.guo@demoenv-trial-1 ~]$ mysql -uroot
mysql> create database testdb;
mysql> quit;

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-2' -e 'show databases;'

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
| testdb             |
+--------------------+

[dong.guo@demoenv-trial-1 ~]$ mysql -uroot -ppass -h'demoenv-trial-3' -e 'show databases;'

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
| testdb             |
+--------------------+

[dong.guo@demoenv-trial-1 ~]$ mysqlrplshow --master=rpl:rpl@'demoenv-trial-1':3306 --discover-slaves-login=root:pass;

# master on demoenv-trial-1: ... connected.
# Finding slaves for master: demoenv-trial-1:3306

# Replication Topology Graph
demoenv-trial-1:3306 (MASTER)
   |
   +--- demoenv-trial-2:3306 - (SLAVE)
   |
   +--- demoenv-trial-3:3306 - (SLAVE)

[dong.guo@demoenv-trial-1 ~]$ mysqlrplcheck --master=root:pass@'demoenv-trial-1' --slave=root:pass@'demoenv-trial-2'

# master on demoenv-trial-1: ... connected.
# slave on demoenv-trial-2: ... connected.
Test Description                                                     Status
---------------------------------------------------------------------------
Checking for binary logging on master                                [pass]
Are there binlog exceptions?                                         [WARN]

+---------+---------+------------+
| server  | do_db   | ignore_db  |
+---------+---------+------------+
| master  | testdb  |            |
+---------+---------+------------+

Replication user exists?                                             [pass]
Checking server_id values                                            [pass]
Checking server_uuid values                                          [pass]
Is slave connected to master?                                        [pass]
Check master information file                                        [pass]
Checking InnoDB compatibility                                        [pass]
Checking storage engines compatibility                               [pass]
Checking lower_case_table_names setting                              [pass]
Checking slave delay (seconds behind master)                         [pass]
# ...done.   

9. Test master shutdown and verify the failover functionality
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover

------------------------------------------------------------------
# Discovering slaves for master at demoenv-trial-1:3306
# Discovering slave at demoenv-trial-2:3306
# Found slave: demoenv-trial-2:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Discovering slaves for master at demoenv-trial-1:3306
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 06:58:52 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  299       testdb                          

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+------------------+-------+---------+--------+------------+-------------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                                    |
+------------------+-------+---------+--------+------------+-------------------------------------------+
| demoenv-trial-1  | 3306  | MASTER  | UP     | ON         | OK                                        |
| demoenv-trial-2  | 3306  | SLAVE   | UP     | ON         | OK                                        |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | ON         | OK                                        |
+------------------+-------+---------+--------+------------+-------------------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
------------------------------------------------------------------

Then the terminal hung up, so open another terminal to shutdown the master:

Then on the hung up terminal, we could see:

------------------------------------------------------------------
Failover starting in 'auto' mode...
# Candidate slave demoenv-trial-2:3306 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
# Discovering slaves for master at demoenv-trial-2:3306

Failover console will restart in 5 seconds.

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:01:25 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  299       testdb                          

GTID Executed Set
92df196b-3906-11e3-b6b6-000c290d14d7:1

Replication Health Status
+------------------+-------+---------+--------+------------+-------------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                                    |
+------------------+-------+---------+--------+------------+-------------------------------------------+
| demoenv-trial-2  | 3306  | MASTER  | UP     | ON         | OK                                        |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | ON         | OK                                        |
+------------------+-------+---------+--------+------------+-------------------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
------------------------------------------------------------------

The 'demoenv-trial-2' became the new master.

10. Inject data into new master for replication
[dong.guo@demoenv-trial-2 ~]$ mysql -uroot
mysql> use testdb;
Database changed
mysql> CREATE TABLE `hostgroup` (
-> `hostgroup_id` tinyint(4) NOT NULL AUTO_INCREMENT,
-> `hostgroup_name` char(20) DEFAULT NULL,
-> `hostgroup_next` tinyint(4) NOT NULL,
-> `colo_name` char(4) NOT NULL,
-> PRIMARY KEY (`hostgroup_id`)
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Query OK, 0 rows affected (0.10 sec)

mysql> quit;

11. Verify replication to slaves
[dong.guo@demoenv-trial-2 binlogs]$ mysql -uroot -ppass -h'demoenv-trial-3' testdb -e 'show tables;'

+------------------+
| Tables_in_testdb |
+------------------+
| hostgroup        |
+------------------+

12. Test old-master promotions and verify
[dong.guo@demoenv-trial-1 ~]$ sudo /etc/init.d/mysql start
[dong.guo@demoenv-trial-1 ~]$ mysqlreplicate --master=root:pass@'demoenv-trial-2':3306 --slave=root:pass@'demoenv-trial-1':3306

# master on demoenv-trial-2: ... connected.
# slave on demoenv-trial-1: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

Then on the hung up terminal, we could see:

------------------------------------------------------------------
...
Replication Health Status
+------------------+-------+---------+--------+------------+-------------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                                    |
+------------------+-------+---------+--------+------------+-------------------------------------------+
| demoenv-trial-2  | 3306  | MASTER  | UP     | ON         | OK                                        |
| demoenv-trial-1  | 3306  | SLAVE   | UP     | ON         | OK                                        |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | ON         | OK                                        |
+------------------+-------+---------+--------+------------+-------------------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
------------------------------------------------------------------

Make recovered old-master be restored as the master:
[dong.guo@demoenv-trial-1 ~]$ mysqlrpladmin --master=root:pass@'demoenv-trial-2':3306 --new-master=root:pass@'demoenv-trial-1':3306 --demote-master --discover-slaves-login=root:pass switchover

# Discovering slaves for master at demoenv-trial-2:3306
# Discovering slave at demoenv-trial-1:3306
# Found slave: demoenv-trial-1:3306
# Discovering slave at demoenv-trial-3:3306
# Found slave: demoenv-trial-3:3306
# Checking privileges.
# Performing switchover from master at demoenv-trial-2:3306 to slave at demoenv-trial-1:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+------------------+-------+---------+--------+------------+-------------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                                    |
+------------------+-------+---------+--------+------------+-------------------------------------------+
| demoenv-trial-1  | 3306  | MASTER  | UP     | ON         | OK                                        |
| demoenv-trial-2  | 3306  | SLAVE   | UP     | ON         | OK                                        |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | ON         | OK                                        |
+------------------+-------+---------+--------+------------+-------------------------------------------+
# ...done.

Then on the hung up terminal, we could see:

------------------------------------------------------------------
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Sun Oct 20 07:30:07 2013

Master Information
------------------
Binary Log File       Position  Binlog_Do_DB  Binlog_Ignore_DB  
bin-log-mysqld.00000  710       testdb                          

GTID Executed Set
8a58172b-1efd-11e3-8cf1-000c2950fe0c:1 [...]

Replication Health Status
+------------------+-------+---------+--------+------------+------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                             |
+------------------+-------+---------+--------+------------+------------------------------------+
| demoenv-trial-2  | 3306  | MASTER  | UP     | ON         | OK                                 |
| demoenv-trial-1  | 3306  | SLAVE   | UP     | WARN       | Slave is not connected to master.  |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | WARN       | Slave is not connected to master.  |
+------------------+-------+---------+--------+------------+------------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
------------------------------------------------------------------

Looks like the failover cannot work after the old-master restored.

If we input Q to stop it, then run the failover again:
[dong.guo@demoenv-trial-1 ~]$ mysqlfailover --master=root:pass@'demoenv-trial-1':3306 --discover-slaves-login=root:pass --rediscover

------------------------------------------------------------------
...

Replication Health Status
+------------------+-------+---------+--------+------------+-------------------------------------------+
| host             | port  | role    | state  | gtid_mode  | health                                    |
+------------------+-------+---------+--------+------------+-------------------------------------------+
| demoenv-trial-1  | 3306  | MASTER  | UP     | ON         | OK                                        |
| demoenv-trial-2  | 3306  | SLAVE   | UP     | ON         | OK                                        |
| demoenv-trial-3  | 3306  | SLAVE   | UP     | ON         | OK                                        |
+------------------+-------+---------+--------+------------+-------------------------------------------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
------------------------------------------------------------------

It worked again.

13. So, here is a simple summary of the "auto-failover":
It is from a package "mysql-utilities";
It only works with MySQL5.6+ (GITDs);
It could automatically make a slave become new master if the current master dead;
But must manually make the old-master restored, and then it stopped working until restarted.

, ,

  1. #1 by Tadeu on November 28, 2013 - 2:29 am

    Hello,

    Thank you for the great post. I have a question: How can I connect mysql utilities to the master and slaves via SSL? I must use encryption and I setup the replication user with the "REQUIRE SSL" parameter and so I can only connect if its via SSL.

    Thank you in advance!

    • #2 by mcsrainbow on November 28, 2013 - 8:42 am

      Sorry, I didn't try the SSL.
      Maybe you can use XtraDB Cluster instead, on my blog "How to setup Percona XtraDB Cluster", it seems better than Utilities.

  2. #3 by Ravi on November 24, 2014 - 5:39 pm

    Hello ... in Ubuntu mysql 5.6.

    1. Can I have master on HOST1 and 2 slaves on HOST2 and perform the above steps..

    2. Does this work only on clusters or can be set up on VMs too?
    . WIP... but trying to see if it will work if I know ahead of time :)

    Thanks

  3. #4 by Ravi on November 24, 2014 - 5:40 pm

    Also.. will this worked with bin log MIXED format?

  4. #5 by Aneesha on April 24, 2017 - 10:28 pm

    Hi All,

    I have tried to implement automatic failover and recovery in replication using mysqlfailover command.
    When I execute the replication status by mysqlrplshow and mysqlrplcheck, slave and master are listed correctly.

    But when I try to execute mysqlfailover command, failover mode=auto, but under replication health status "0 rows found" .

    Am I missing anything? Could you please help me to figure out the issue

(will not be published)

*


Fork me on GitHub