To handle a large number of reads, sites use replication to create copies of the master and then let the slaves handle all read requests while the master handles the write requests. This replication is considered asynchronous because the master does not wait for the slaves to apply the changes, but instead just dispatches each change request to the slaves and assumes they will catch up eventually and replicate all the changes. This technique for improving performance is usually a good idea when you are scaling out.
In contrast, synchronous replication keeps the master and slaves in sync and does not allow a transaction to be committed on the master unless the slave agrees to commit it as well. That is, synchronous replication makes the master wait for all the slaves to keep up with the writes.
Asynchronous replication is a lot faster than synchronous replication, for reasons our description should make obvious. Compared to asynchronous replication, synchronous replication requires extra synchronizations to guarantee consistency. It is usually implemented through a protocol called two-phase commit, which guarantees consistency between the master and slaves, but requires extra messages to ping-pong between them. Typically, it works like this:
- When a commit statement is executed, the transaction is sent to the slaves and the slave is asked to prepare for a commit.
- Each slave prepares the transaction so that it can be committed, and then sends an OK (or ABORT) message to the master, indicating that the transaction is prepared (or that it could not be prepared).
- The master waits for all slaves to send either an OK or an ABORT message.
- If the master receives an OK message from all slaves, it sends a commit message to all slaves asking them to commit the transaction.
- If the master receives an ABORT message from any of the slaves, it sends an abort message to all slaves asking them to abort the transaction.
- If the master receives an OK message from all slaves, it sends a commit message to all slaves asking them to commit the transaction.
- Each slave is then waiting for either an OK or an ABORT message from the master.
- If the slaves receive the commit request, they commit the transaction and send an acknowledgment to the master that the transaction is committed.
- If the slaves receive an abort request, they abort the transaction by undoing any changes and releasing any resources they held, then send an acknowledgment to the master that the transaction was aborted.
- If the slaves receive the commit request, they commit the transaction and send an acknowledgment to the master that the transaction is committed.
- When the master has received acknowledgments from all slaves, it reports the transaction as committed (or aborted) and continues with processing the next transaction.
What makes this protocol slow is that it requires a total of four messages, including the messages with the transaction and the prepare request. The major problem is not the amount of network traffic required to handle the synchronization, but the latency introduced by the network and by processing the commit on the slave, together with the fact that the commit is blocked on the master until all the slaves have acknowledged the transaction. In contrast, asynchronous replication requires only a single message to be sent with the transaction. As a bonus, the master does not have to wait for the slave, but can report the transaction as committed immediately, which improves performance significantly.
So why is it a problem that synchronous replication blocks each commit while the slaves process it? If the slaves are close to the master on the network, the extra messages needed by synchronous replication make little difference, but if the slaves are not nearby—maybe in another town or even on another continent—it makes a big difference.
The table below shows some examples for a server that can commit 10,000 transactions per second. This translates to a commit time of 0.1 ms (but note that some implementations, such as MySQL Cluster, are able to process several commits in parallel if they are independent). If the network latency is 0.01 ms (a number we’ve chosen as a baseline by pinging one of our own computers), the transaction commit time increases to 0.14 ms, which translates to approximately 7000 transactions per second. If the network latency is 10 ms (which we found by pinging a server in a nearby city), the transaction commit time increases to 40.1 ms, which translates to about 25 transactions per second! In contrast, asynchronous replication introduces no delay at all, because the transactions are reported as committed immediately, so the transaction commit time stays at the original 10,000 per second, just as if there were no slaves.
The performance of asynchronous replication comes at the price of consistency. Recall that in asynchronous replication the transaction is reported as committed immediately, without waiting for any acknowledgment from the slave. This means the master may consider the transaction committed when the slave does not. As a matter of fact, it might not even have left the master, but is still waiting to be sent to the slave.
There are two problems with this that you need to be aware of:
- In the event of crashes on the master, transactions can “disappear.”
- A query executed on the slaves might return old data.
Server bottlenecks and failures are a fact of life in any database deployment, but they don't have to bring everything to a halt. MySQL provides several features that can protect you from outages, whether you're running directly on the hardware, on virtual machines, or in the cloud. This book shows you how to use these features effectively, and helps you determine which combination of features will give you the most reliable system for a price you can afford.




Help





