How MySQL replication works: source and replica setup, binary log-based replication, GTID replication, replication lag, monitoring with SHOW REPLICA STATUS, and common replication problems.
Use this file to discover all available pages before exploring further.
VillageSQL is a drop-in replacement for MySQL with extensions.
All examples in this guide work on VillageSQL. Install Now →
MySQL replication copies changes from one server (the source) to one or more servers (replicas). The source writes every change to the binary log; each replica reads that log and replays the changes. Replication is the foundation for read scaling, high availability, and geographic distribution.
SOURCE_AUTO_POSITION = 1 enables GTID-based replication. If you’re not using GTIDs, specify SOURCE_LOG_FILE and SOURCE_LOG_POS instead (values from the snapshot file header).
Global Transaction Identifiers (GTIDs) assign a unique ID to every committed transaction. They make failover and replica setup simpler — instead of tracking binlog filenames and positions, MySQL tracks which transactions each server has applied.Enable GTIDs on both source and replica:
[mysqld]gtid_mode = ONenforce_gtid_consistency = ON
With GTIDs enabled, SOURCE_AUTO_POSITION = 1 is all you need — MySQL figures out which transactions the replica is missing and replays them automatically.
The default setup — one source, one or more async replicas — isn’t the only option. MySQL offers several replication architectures with different consistency and availability trade-offs.
Architecture
Consistency guarantee
Failover
Write overhead
Complexity
Async replication (default)
None — replica may lag arbitrarily
Manual; some transactions may be lost
Minimal — source doesn’t wait
Low
Semi-sync replication
At least one replica has received the event before the source commits
Manual; at most one transaction lost
~1 RTT added to each write
Low–Medium
Group Replication
All members agree before commit (multi-primary or single-primary)
Automatic within the group
Higher — consensus round per transaction
High
InnoDB Cluster
Same as Group Replication (built on it)
Automatic with MySQL Router for client failover
Same as Group Replication
High
When to use each:
Async replication — the right default for read scaling and disaster-recovery replicas where some lag is acceptable.
Semi-sync — when you need durability insurance against source failure but can’t tolerate the operational complexity of Group Replication. Adds ~1 network round trip per write commit.
Group Replication / InnoDB Cluster — when you need automatic failover and can afford the write latency increase. InnoDB Cluster wraps Group Replication with MySQL Router and MySQL Shell for easier management.
Note: Group Replication requires all tables to use InnoDB and have a primary key. Enable it with the group_replication plugin; configuration is beyond the scope of this guide.
Direct read-heavy queries to replicas to reduce load on the source:
-- Application connects to replica for readsSELECT * FROM orders WHERE customer_id = 42;-- Connect to source for writesINSERT INTO orders (customer_id, total) VALUES (42, 99.99);
The application must tolerate replication lag. A replica may not have the row inserted a few milliseconds ago on the source. For reads that must reflect the latest write, connect to the source.
Replication solves specific problems well and is the wrong tool for others. Before reaching for it, check whether it actually fits.
Scenario
Replication
Sharding
Read replicas only
Distributed DB (e.g., Vitess, TiDB)
Read-heavy workload, writes fit on one server
Good fit
Overkill
Simpler — just add replicas
Overkill
Write volume exceeds one server
Won’t help — all replicas replay every write
Good fit
Won’t help
Good fit
HA / automatic failover
Needs Group Replication or InnoDB Cluster
Doesn’t help directly
Doesn’t help directly
Built in
Geo-distribution / low-latency reads in multiple regions
Works well with delayed or regional replicas
Possible but complex
Works well
Built in for some
Point-in-time recovery
Good — binlog enables PITR
Complicates PITR
Good
Varies
Schema is highly partitionable by tenant/key
Works but wastes replica write bandwidth
Better fit
Works but same waste
Good fit
Tight read-your-writes consistency required
Use source for those reads
N/A
Use source for those reads
Depends on product
The short version: replication is the right call when one server handles all your writes and you need read scale, HA, or a warm standby. If your write throughput is the bottleneck, replication copies the problem to every replica — sharding or a distributed database addresses it at the source.
STOP REPLICA; -- stop both threadsSTOP REPLICA IO_THREAD; -- stop only the I/O threadSTOP REPLICA SQL_THREAD; -- stop only the SQL threadSTART REPLICA;
To skip a single failing event (use carefully — skipping events causes the replica to diverge):
-- GTID mode: inject an empty transaction for the failed GTIDSET GTID_NEXT = 'source_uuid:N';BEGIN; COMMIT;SET GTID_NEXT = 'AUTOMATIC';START REPLICA;
Does replication work across MySQL major versions?
MySQL supports replicating from an older source to a newer replica, but not the reverse. A MySQL 8.4 replica can follow a MySQL 8.0 source. A MySQL 8.0 replica cannot follow a MySQL 8.4 source. Always upgrade replicas before upgrading the source.
What’s the difference between semi-synchronous and asynchronous replication?
In asynchronous replication (the default), the source commits without waiting for any replica to acknowledge receipt. In semi-synchronous replication, the source waits for at least one replica to confirm it received the event before returning to the client. Semi-sync reduces the risk of data loss on source failure, at the cost of slightly higher write latency.