AWS RDS + EC2 Instance Master-Slave Replication

October 26, 2018 • Views: 3604 • AWS, MySQL backup

2018-10-26 01:07:32
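Recently the business needed MySQL migrated from RDS to EC2. The master must not be disrupted in any way, so while anything is still uncertain we leave the master's base parameters untouched and build a new slave on EC2 instead.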

1. Create a test instance with Multi-AZ enabled: giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com

2. Grant privileges to a replication account and a test account (the test account is used by sysbench)

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'117.160.%' IDENTIFIED BY 'repl'; 
grant select,create,index,insert,update,delete,drop  on *.* to gktest@'117.160.%' identified by 'giikin';

# Make the grants take effect immediately
flush privileges;

3. Get the current binlog position on the master

>show master status\G
*************************** 1. row ***************************
             File: mysql-bin-changelog.000023
         Position: 5768524
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.01 sec)
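
The File and Position values above are the ones that go into MASTER_LOG_FILE and MASTER_LOG_POS in the next step, and copying them by hand is easy to get wrong. The following is a minimal sketch of generating the statement from the vertical output. The function names `parse_vertical` and `change_master_sql` are hypothetical helpers introduced here, not part of MySQL, and the sketch assumes the host, user and password contain no single quotes.

```python
MASTER_STATUS = """
*************************** 1. row ***************************
             File: mysql-bin-changelog.000023
         Position: 5768524
     Binlog_Do_DB:
"""


def parse_vertical(text):
    """Parse vertical (one "name: value" per line) client output into a dict."""
    row = {}
    for line in text.splitlines():
        # Skip the row separator and anything that is not a "name: value" pair.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        key, _, value = line.partition(":")
        row[key.strip()] = value.strip()
    return row


def change_master_sql(master_status, host, user, password, port=3306):
    """Build a CHANGE MASTER TO statement from SHOW MASTER STATUS output."""
    row = parse_vertical(master_status)
    return (
        "CHANGE MASTER TO MASTER_HOST='{h}', MASTER_PORT={p}, "
        "MASTER_LOG_FILE='{f}', MASTER_LOG_POS={pos}, "
        "MASTER_USER='{u}', MASTER_PASSWORD='{pw}';"
    ).format(h=host, p=port, f=row["File"], pos=int(row["Position"]),
             u=user, pw=password)


if __name__ == "__main__":
    # "example-host" and "secret" are placeholders, not values from the post.
    print(change_master_sql(MASTER_STATUS, "example-host", "repl", "secret"))
```

The position is converted with `int()` so that a mangled copy of the output fails loudly instead of producing a statement with a bad offset.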

4. Prepare a slave on EC2 and point it at the master

>CHANGE MASTER TO MASTER_HOST='giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin-changelog.000023', MASTER_LOG_POS=5768524, MASTER_USER='repl', MASTER_PASSWORD='repl';

Query OK, 0 rows affected, 2 warnings (0.02 sec)
>start slave;
Query OK, 0 rows affected (0.00 sec)
>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin-changelog.000023
          Read_Master_Log_Pos: 5768524
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 293
        Relay_Master_Log_File: mysql-bin-changelog.000023
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: mysql.%
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 5768524
              Relay_Log_Space: 460
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 459205842
                  Master_UUID: 46da27c8-007c-11e7-9b5b-06051bcfdaae
             Master_Info_File: /data/dbdata/mysqldata/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

5. While sysbench is writing data to RDS, reboot the instance to simulate how a master failover affects the slave.

/usr/local/sysbench/bin/sysbench --mysql-host=giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com --mysql-port=3306 --mysql-user=gktest --mysql-password=giikin   --mysql-db=failovertest --oltp-tables-count=5 --oltp-table-size=6000000  --max-requests=100000000 --test=/usr/local/sysbench/tests/db/select.lua prepare

6. Connecting to MySQL now fails, and the management console shows the instance as being repaired

$ mysql -uroot -p -hgiikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com 
ERROR 2003 (HY000): Can't connect to MySQL server on 'giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com' (110)

7. Check whether replication on the slave is still healthy.

>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event ## waiting to sync
                  Master_Host: giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin-changelog.000024
          Read_Master_Log_Pos: 24121968
               Relay_Log_File: relay-log.000004
                Relay_Log_Pos: 24122141
        Relay_Master_Log_File: mysql-bin-changelog.000024
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: mysql.%
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 24121968
              Relay_Log_Space: 24122365
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 459205842
                  Master_UUID: 46da27c8-007c-11e7-9b5b-06051bcfdaae
             Master_Info_File: /data/dbdata/mysqldata/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

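Running show slave status\G several times always reports a healthy replication state, but the position stays stuck at mysql-bin-changelog.000024 24121968, even though sysbench keeps writing data to the RDS master. So although the replication status looks normal, the connection is in fact dead.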
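
The check described above (both threads say Yes, the position never moves, the master is ahead) can be written down as a small predicate. This is only a sketch: `replication_stalled` is a hypothetical helper, and it assumes the status rows have already been read into dicts keyed by the SHOW SLAVE STATUS column names and that the master's current file and position come from SHOW MASTER STATUS. MySQL's `slave_net_timeout` setting governs how long the I/O thread waits before treating a silent connection as broken, which may be why the status stays "healthy" for so long; the post did not test that.

```python
def replication_stalled(before, after, master_file, master_pos):
    """Return True when the slave claims to be healthy but is not advancing.

    before / after: two SHOW SLAVE STATUS samples taken some time apart.
    master_file / master_pos: current SHOW MASTER STATUS on the master.
    """
    threads_ok = (after["Slave_IO_Running"] == "Yes"
                  and after["Slave_SQL_Running"] == "Yes")
    slave_pos = (after["Master_Log_File"], int(after["Read_Master_Log_Pos"]))
    unchanged = slave_pos == (before["Master_Log_File"],
                              int(before["Read_Master_Log_Pos"]))
    behind = slave_pos != (master_file, int(master_pos))
    return threads_ok and unchanged and behind


if __name__ == "__main__":
    # Values taken from the status output and the binary log list in this post.
    sample = {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "Yes",
        "Master_Log_File": "mysql-bin-changelog.000024",
        "Read_Master_Log_Pos": "24121968",
    }
    print(replication_stalled(sample, sample,
                              "mysql-bin-changelog.000028", 114316703))
```

A quiet master does not trigger the check, because the slave position then equals the master position and `behind` is false.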

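8. Listing the binary logs on the RDS master shows that the slave has already executed every event in mysql-bin-changelog.000024: that file's size, 24121968, equals the slave's Read_Master_Log_Pos and Exec_Master_Log_Pos. The offsets also show that mysql-bin-changelog.000024 is the last log written before the failover and mysql-bin-changelog.000025 is the first log written after it.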

>show binary logs;
+----------------------------+-----------+
| Log_name                   | File_size |
+----------------------------+-----------+
| mysql-bin-changelog.000020 | 134243795 |
| mysql-bin-changelog.000021 | 111694820 |
| mysql-bin-changelog.000022 | 134243505 |
| mysql-bin-changelog.000023 | 134244377 |
| mysql-bin-changelog.000024 |  24121968 |
| mysql-bin-changelog.000025 | 134244377 |
| mysql-bin-changelog.000026 | 134243504 |
| mysql-bin-changelog.000027 | 134243503 |
| mysql-bin-changelog.000028 | 114316703 |
+----------------------------+-----------+
9 rows in set (0.01 sec)
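
To put a number on how far the slave has fallen behind, the table above can be compared with the slave's read position. The sketch below uses two hypothetical helpers, `parse_binary_logs` and `unread_bytes`, and assumes that binlog file names sort in write order, which holds for zero-padded suffixes like the ones shown here.

```python
BINARY_LOGS = """
+----------------------------+-----------+
| Log_name                   | File_size |
+----------------------------+-----------+
| mysql-bin-changelog.000023 | 134244377 |
| mysql-bin-changelog.000024 |  24121968 |
| mysql-bin-changelog.000025 | 134244377 |
| mysql-bin-changelog.000026 | 134243504 |
| mysql-bin-changelog.000027 | 134243503 |
| mysql-bin-changelog.000028 | 114316703 |
+----------------------------+-----------+
"""


def parse_binary_logs(table_text):
    """Turn SHOW BINARY LOGS table output into a list of (name, size) tuples."""
    logs = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Border lines and the header row do not have a numeric second cell.
        if len(cells) == 2 and cells[1].isdigit():
            logs.append((cells[0], int(cells[1])))
    return logs


def unread_bytes(logs, slave_file, slave_pos):
    """Bytes of binlog the slave has not fetched yet."""
    total = 0
    for name, size in logs:
        if name == slave_file:
            total += max(size - slave_pos, 0)
        elif name > slave_file:  # assumes names sort in write order
            total += size
    return total


if __name__ == "__main__":
    logs = parse_binary_logs(BINARY_LOGS)
    print(unread_bytes(logs, "mysql-bin-changelog.000024", 24121968))
```

A result of zero for the slave's own file together with a large total for the later files is the pattern seen here: everything before the failover was applied, nothing after it was fetched.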

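9. I scratched my head for a long time over how to keep pulling logs from the RDS master. Googling eventually suggested a likely cause: the EC2 slave connects to RDS by domain name, the domain points to a different IP after the failover, and the existing replication connection is never released, so it is still attached to the old IP.
That sounded reasonable, so I ran stop slave; start slave; on the slave. Checking the replication status again showed that this was not the whole problem: the I/O thread was running and Master_Log_File had moved on, but the SQL thread had stopped with an error.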

>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin-changelog.000039
          Read_Master_Log_Pos: 660
               Relay_Log_File: looaon-relay-bin.000004
                Relay_Log_Pos: 706
        Relay_Master_Log_File: mysql-bin-changelog.000025 ############### out of sync here
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1146
                   Last_Error: Error 'Table 'mysql.rds_heartbeat2' doesn't exist' on query. Default database: 'mysql'. Query: 'INSERT INTO mysql.rds_heartbeat2(id, value) values (1,1540479644062) ON DUPLICATE KEY UPDATE value = 1540479644062'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 405
              Relay_Log_Space: 23301
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1146
               Last_SQL_Error: Error 'Table 'mysql.rds_heartbeat2' doesn't exist' on query. Default database: 'mysql'. Query: 'INSERT INTO mysql.rds_heartbeat2(id, value) values (1,1540479644062) ON DUPLICATE KEY UPDATE value = 1540479644062'
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1670357905
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
                   Using_Gtid: No
                  Gtid_IO_Pos: 
1 row in set (0.00 sec)

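Test result:

Master_Log_File is being fetched, so keeping the master on RDS and the slave on EC2 is feasible in principle.

Unresolved issue:

Relay_Master_Log_File is out of sync (it lags behind Master_Log_File).

Unresolved error: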

Last_Error: Error 'Table 'mysql.rds_heartbeat2' doesn't exist' on query. Default database: 'mysql'. Query: 'INSERT INTO mysql.rds_heartbeat2(id, value) values (1,1540479644062) ON DUPLICATE KEY UPDATE value = 1540479644062'
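
One difference between the two slaves is visible in the outputs themselves: the status in step 4 shows Replicate_Ignore_DB: mysql and Replicate_Wild_Ignore_Table: mysql.%, while the failing status shows both fields empty. A possible explanation, not confirmed by the test above, is that the failing slave applies RDS-internal writes to the mysql schema because nothing filters them out. The sketch below encodes that check; `touches_unfiltered_mysql_table` is a hypothetical helper and expects a dict keyed by SHOW SLAVE STATUS column names.

```python
import re


def touches_unfiltered_mysql_table(status):
    """True if the SQL thread failed on a mysql.* table that no filter excludes."""
    error = status.get("Last_SQL_Error", "")
    missing_table = re.search(r"Table 'mysql\.\w+' doesn't exist", error)
    failed_on_mysql = (status.get("Last_SQL_Errno") == "1146"
                       and missing_table is not None)
    ignore_dbs = [d.strip()
                  for d in status.get("Replicate_Ignore_DB", "").split(",")]
    wild = [t.strip()
            for t in status.get("Replicate_Wild_Ignore_Table", "").split(",")]
    filtered = "mysql" in ignore_dbs or "mysql.%" in wild
    return failed_on_mysql and not filtered


if __name__ == "__main__":
    failing = {
        "Last_SQL_Errno": "1146",
        "Last_SQL_Error": "Error 'Table 'mysql.rds_heartbeat2' doesn't exist' "
                          "on query. Default database: 'mysql'.",
        "Replicate_Ignore_DB": "",
        "Replicate_Wild_Ignore_Table": "",
    }
    print(touches_unfiltered_mysql_table(failing))
```

If the check returns True, comparing the replication filter settings of the two slaves is a reasonable next thing to look at.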

Key configuration:

Point the slave at a master
change master to
master_host='giikin_test',
master_user='repl',
master_password='repl',
master_log_file='mysql-bin.000001',
master_log_pos=660;

Commands used

CHANGE MASTER TO MASTER_HOST='giikin-test.civxfbosrcnl.ap-southeast-1.rds.amazonaws.com', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin-changelog.000024', MASTER_LOG_POS=660, MASTER_USER='root', MASTER_PASSWORD='looaon.com';

Check the replication status

show slave status\G

List the databases

show databases;

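Reset the slave

start slave;  # start replication so the configuration takes effect
stop slave;   # stop replication
reset slave;  # clear the relay logs and the saved replication position (run only while replication is stopped)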

Make the EC2 slave read-only

show global variables like "%read_only%";
flush tables with read lock;
set global read_only=1;
show global variables like "%read_only%";
set global read_only=1;    # 1 = read-only, 0 = read-write

Possible issue:

When the master fails over, run stop slave; start slave; on the slave right away, so that the slave does not fall too far behind the master.


Last edited: December 6, 2018