VillageSQL is a drop-in replacement for MySQL with extensions.
All examples in this guide work on VillageSQL. Install Now →
How Replication Works
- The source commits a transaction and writes it to the binary log.
- The replica’s I/O thread connects to the source, reads new binlog events, and writes them to a local relay log.
- The replica’s SQL thread reads from the relay log and executes each event.
Setting Up a Replica
On the source — enable binary logging and set a unique server ID inmy.cnf:
--source-data:
SOURCE_AUTO_POSITION = 1 enables GTID-based replication. If you’re not using GTIDs, specify SOURCE_LOG_FILE and SOURCE_LOG_POS instead (values from the snapshot file header).
GTID Replication
Global Transaction Identifiers (GTIDs) assign a unique ID to every committed transaction. They make failover and replica setup simpler — instead of tracking binlog filenames and positions, MySQL tracks which transactions each server has applied. Enable GTIDs on both source and replica:SOURCE_AUTO_POSITION = 1 is all you need — MySQL figures out which transactions the replica is missing and replays them automatically.
Monitoring Replication
| Field | What it means |
|---|---|
Replica_IO_Running | Yes = I/O thread connected and reading from source |
Replica_SQL_Running | Yes = SQL thread applying events |
Seconds_Behind_Source | Replication lag in seconds (0 = caught up) |
Last_SQL_Error | Last error that stopped the SQL thread |
Executed_Gtid_Set | GTIDs this replica has applied (GTID mode) |
Replica_IO_Running and Replica_SQL_Running must be Yes for replication to be working.
Replication Lag
Replication lag happens when the replica’s SQL thread can’t keep up with the source’s write rate. Common causes:- Single-threaded SQL thread: by default, replicas apply events serially. Enable parallel replication with
replica_parallel_workers:
- Long-running queries on the replica: queries that lock rows block the SQL thread.
- Network latency: slow I/O thread causes the SQL thread to run out of work.
- Disk I/O on replica: syncing relay log or data files is the bottleneck.
Replication Architectures
The default setup — one source, one or more async replicas — isn’t the only option. MySQL offers several replication architectures with different consistency and availability trade-offs.| Architecture | Consistency guarantee | Failover | Write overhead | Complexity |
|---|---|---|---|---|
| Async replication (default) | None — replica may lag arbitrarily | Manual; some transactions may be lost | Minimal — source doesn’t wait | Low |
| Semi-sync replication | At least one replica has received the event before the source commits | Manual; at most one transaction lost | ~1 RTT added to each write | Low–Medium |
| Group Replication | All members agree before commit (multi-primary or single-primary) | Automatic within the group | Higher — consensus round per transaction | High |
| InnoDB Cluster | Same as Group Replication (built on it) | Automatic with MySQL Router for client failover | Same as Group Replication | High |
- Async replication — the right default for read scaling and disaster-recovery replicas where some lag is acceptable.
- Semi-sync — when you need durability insurance against source failure but can’t tolerate the operational complexity of Group Replication. Adds ~1 network round trip per write commit.
- Group Replication / InnoDB Cluster — when you need automatic failover and can afford the write latency increase. InnoDB Cluster wraps Group Replication with MySQL Router and MySQL Shell for easier management.
group_replication plugin; configuration is beyond the scope of this guide.
Read Scaling with Replicas
Direct read-heavy queries to replicas to reduce load on the source:Replication vs. Alternatives
Replication solves specific problems well and is the wrong tool for others. Before reaching for it, check whether it actually fits.| Scenario | Replication | Sharding | Read replicas only | Distributed DB (e.g., Vitess, TiDB) |
|---|---|---|---|---|
| Read-heavy workload, writes fit on one server | Good fit | Overkill | Simpler — just add replicas | Overkill |
| Write volume exceeds one server | Won’t help — all replicas replay every write | Good fit | Won’t help | Good fit |
| HA / automatic failover | Needs Group Replication or InnoDB Cluster | Doesn’t help directly | Doesn’t help directly | Built in |
| Geo-distribution / low-latency reads in multiple regions | Works well with delayed or regional replicas | Possible but complex | Works well | Built in for some |
| Point-in-time recovery | Good — binlog enables PITR | Complicates PITR | Good | Varies |
| Schema is highly partitionable by tenant/key | Works but wastes replica write bandwidth | Better fit | Works but same waste | Good fit |
| Tight read-your-writes consistency required | Use source for those reads | N/A | Use source for those reads | Depends on product |
Stopping and Starting Replication
Frequently Asked Questions
Does replication work across MySQL major versions?
MySQL supports replicating from an older source to a newer replica, but not the reverse. A MySQL 8.4 replica can follow a MySQL 8.0 source. A MySQL 8.0 replica cannot follow a MySQL 8.4 source. Always upgrade replicas before upgrading the source.What’s the difference between semi-synchronous and asynchronous replication?
In asynchronous replication (the default), the source commits without waiting for any replica to acknowledge receipt. In semi-synchronous replication, the source waits for at least one replica to confirm it received the event before returning to the client. Semi-sync reduces the risk of data loss on source failure, at the cost of slightly higher write latency.Troubleshooting
| Problem | Solution |
|---|---|
Replica_IO_Running: No | Check Last_IO_Error — typically a connection or auth failure; verify SOURCE_HOST, user, password, and REPLICATION SLAVE grant |
Replica_SQL_Running: No | Check Last_SQL_Error — usually a data conflict; decide whether to skip the event or fix the root cause |
| Replication lag growing | Enable replica_parallel_workers; check for long-running queries on the replica blocking the SQL thread |
ERROR 1062: Duplicate entry on replica | Replica has a row the source doesn’t — indicates manual writes to the replica; use replica_skip_errors cautiously or resync |
| GTID gap after failover | New source is missing GTIDs the old source had applied — use SET GTID_PURGED carefully during failover |

