On MySQL's async replication for production
How async replication works in MySQL
- Async replication is the default behavior. Semi-sync replication was introduced as a plugin in 5.5, with lossless semi-sync added in 5.7
- The master writes changes to a local binlog file. Note that SELECT or SHOW statements are not written to the binlog since they do not affect slave state
- Note that this binlog is different from InnoDB's undo and redo logs. Its main purpose is replication and recovery after restoring from a backup
- On AWS RDS, the binlog is not turned on by default, and binlog retention can be configured up to a maximum of 7 days (see the sketch after this list)
- This additional binlog write introduces a performance penalty. On Aurora, we observe at least a 1/3 reduction in throughput if we turn on the binlog
- The binlog write goes through a buffer controlled by `binlog_cache_size`; a temp file is opened when the buffer is not enough (see the sketch after this list)
- Since the binlog is just a file, it is subject to the same flush/fsync failures as any normal file. The default behavior is that the server shuts down upon detecting such issues
- The slave pulls from the master; the master does NOT push to the slave
- By default, the slave does not write its own binlog
- Any detected corruption or data loss in the binlog means that replication has to be restarted from scratch, and the current slave's data should be discarded
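A minimal sketch of inspecting the binlog settings mentioned above. The variable names are standard MySQL 5.7 ones and the procedures are the documented RDS ones; the specific values are illustrative only.

```sql
-- Is the binlog enabled, and where is it written?
SHOW VARIABLES LIKE 'log_bin%';

-- Per-session binlog buffer; transactions larger than this spill to a temp file
SHOW VARIABLES LIKE 'binlog_cache_size';
-- Binlog_cache_disk_use counts how often that temp-file spill has happened
SHOW GLOBAL STATUS LIKE 'Binlog_cache%';

-- ABORT_SERVER (the 5.7 default) shuts the server down on binlog flush/fsync errors
SHOW VARIABLES LIKE 'binlog_error_action';

-- RDS/Aurora only: binlog retention is set via a stored procedure, here 7 days (168 hours)
CALL mysql.rds_show_configuration;
CALL mysql.rds_set_configuration('binlog retention hours', 168);
```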
Complications from production setup
In production, MySQL normally runs as an active-standby pair with semi-sync or sync replication between the two nodes. This introduces additional complications for async replication. We call the current active X, the standby Y, the X-Y cluster the source, and the async replica the sink.
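On self-managed MySQL, the semi-sync replication between X and Y comes from a plugin; a minimal sketch is below (managed services such as RDS/Aurora expose this differently, if at all).

```sql
-- On the active (X): load and enable the semi-sync master plugin
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;

-- On the standby (Y): load and enable the semi-sync slave plugin
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
```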
From the source's point of view:
- To ensure async replication keeps working after Y is promoted to master because X failed for some reason, at minimum GTID mode has to be turned on (see the sketch after this list), but this causes write stalls on the master DB, especially on Aurora
- Note that even with GTID mode on, we cannot guarantee that no binlog data already persisted on X is lost, although the chance is greatly reduced
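A minimal sketch of enabling GTID mode and pointing the sink at the source by GTID auto-position. The host name and credentials are placeholders, and on managed services these variables are normally changed through parameter groups rather than `SET GLOBAL`.

```sql
-- On the source (X and Y); on a live 5.7 server gtid_mode has to be stepped
-- through OFF_PERMISSIVE and ON_PERMISSIVE, or set in my.cnf before a restart
SET GLOBAL enforce_gtid_consistency = ON;
SET GLOBAL gtid_mode = ON;

-- On the sink: replicate by GTID auto-position instead of a fixed binlog
-- file/offset, so the stream can follow a promotion from X to Y
CHANGE MASTER TO
  MASTER_HOST = 'source-cluster-endpoint',  -- placeholder
  MASTER_USER = 'repl',                     -- placeholder
  MASTER_PASSWORD = '...',
  MASTER_AUTO_POSITION = 1;
START SLAVE;
```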
From the sink's point of view:
- Suppose a failover between X and Y happens: the pulling connection to X may still be active and alive while the new active is Y. This means that when the sink pulls the binlog through a connection pool (most likely inside an intermediate network appliance), it may be pulling from both X and Y, i.e., a split-brain problem
- The root cause is that during failover, by CAP, async replication should be cut off and made unavailable to avoid split brain, but the implementation does not enforce this
- In practice, the split-brain problem is unlikely to cause damage if `log_slave_updates` is turned on, but split brain is a critical situation that should conceptually be avoided
- When such split-brain replication happens, either fix it through domain knowledge, or discard the current sink and restart the replication from scratch (see the sketch after this list)
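A sketch of how one might confirm which server the sink is actually pulling from, and how to discard the sink-side replication state when restarting from scratch is the only safe option (the data itself still has to be re-seeded, e.g. from a fresh backup).

```sql
-- On the sink: Master_Host, Master_UUID and Retrieved_Gtid_Set show which
-- server the IO thread has been reading binlogs from
SHOW SLAVE STATUS\G

-- Compare against the server_uuid of the node believed to be the current active
SELECT @@server_uuid;

-- If binlogs from both X and Y have been ingested, throw away the sink's
-- replication state before re-seeding it from a backup
STOP SLAVE;
RESET SLAVE ALL;
```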
- This means async replication is a good fit when some slack is allowed in data consistency and availability, e.g.,
- Cross-region failover instances, where we accept the loss/corruption of a couple of rows so that we can recover in case the main region loses data
- Read slaves for OLAP workloads/pipelines, so that we have a mostly up-to-date data source with little additional load on the master
- Conversely, async replication is not a good fit for mission-critical systems; those have to run on the master or on a sync-replicated replica
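For the OLAP read-slave case, "mostly up-to-date" is something to monitor rather than assume; a sketch of the usual lag checks on the sink:

```sql
-- Seconds_Behind_Master gives a rough lag estimate; NULL means replication is broken
SHOW SLAVE STATUS\G

-- MySQL 5.7+ also exposes applier state per channel in performance_schema
SELECT * FROM performance_schema.replication_applier_status;
```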