
How to Scale a Relational Database with Amazon EC2

Posted by chco on Apr 16 2011 12:59 AM

It’s amazing what the AWS team has achieved with RDS in the little over a year since it was introduced. The team added the high-availability feature it calls multi-AZ, short for multi-availability zone, and after that, read replicas. With these new features, in addition to the existing scalability features, RDS is growing into a serious RDBMS service. You can easily scale up while minimizing downtime, and also scale out without too much hassle. The excerpt below from the O'Reilly publication Programming Amazon EC2 shows you how.


If your app gets to the point that you need to start scaling either up or out, it is a good idea to switch to multi-AZ if you don’t run it already. If you have a simple RDS instance, you will degrade your service significantly while scaling, as you can expect to lose the ability to write and/or read. With multi-AZ RDS instances, your service is almost uninterrupted.

Scaling Up (or Down)

Scaling up is so easy it is almost ridiculous. The only drawback is that you have some downtime during the operation. If you don’t have multi-AZ enabled, the downtime of your RDS instance could be several minutes, as you have to wait until a new instance is launched and fully functional. For multi-AZ RDS instances, the downtime is limited to the failover that is initiated after the standby has been scaled up (or down). This failover usually takes no more than a minute.

If you initiate a scaling activity via the Console, make sure you enable Apply Immediately if you are in a hurry. If you don’t, scaling will take place during the scheduled maintenance period (Figure 3-4).

Figure 3-4. Modify the RDS instance (scaling up)



Scaling using the command-line tools is a two-step process. First scale, and then reboot:

$ rds-modify-db-instance production \
        --db-instance-class db.m1.xlarge --apply-immediately
$ rds-reboot-db-instance production



DB instance classes

Of course, every service in AWS uses a slightly different naming convention. The RDS equivalent of the EC2 Instance Type is called the DB Instance Class. Luckily, the classes themselves are more or less similar to the types in EC2. The smallest possible RDS instance, for example, is comparable to a small EC2 instance, though in our experience its performance is a bit more consistent. Here are all the instance classes with their descriptions as AWS advertises them:


Small DB Instance

1.7 GB memory, 1 EC2 Compute Unit (1 virtual core with 1 ECU), 64-bit platform, moderate I/O capacity


Large DB Instance

7.5 GB memory, 4 ECUs (2 virtual cores with 2 ECUs each), 64-bit platform, high I/O capacity


Extra Large DB Instance

15 GB memory, 8 ECUs (4 virtual cores with 2 ECUs each), 64-bit platform, high I/O capacity


High-Memory Extra Large DB Instance

17.1 GB memory, 6.5 ECUs (2 virtual cores with 3.25 ECUs each), 64-bit platform, high I/O capacity


High-Memory Double Extra Large DB Instance

34 GB memory, 13 ECUs (4 virtual cores with 3.25 ECUs each), 64-bit platform, high I/O capacity


High-Memory Quadruple Extra Large DB Instance

68 GB memory, 26 ECUs (8 virtual cores with 3.25 ECUs each), 64-bit platform, high I/O capacity
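
As a rough illustration of how the list above could be used when deciding what to scale to, here is a minimal Python sketch that picks the smallest class meeting a memory and ECU requirement. The memory and ECU figures are copied from the list. The API identifiers (db.m1.small and so on) are our assumption about how each class is named when passed to --db-instance-class, and smallest_class_for is a hypothetical helper, not part of any AWS tool.

```python
from typing import NamedTuple, Optional


class DBInstanceClass(NamedTuple):
    api_name: str     # assumed identifier for --db-instance-class
    memory_gb: float  # memory as advertised in the list above
    ecus: float       # total EC2 Compute Units as advertised above


# Figures come from the list above; the api_name values are an assumption.
DB_INSTANCE_CLASSES = [
    DBInstanceClass("db.m1.small", 1.7, 1),
    DBInstanceClass("db.m1.large", 7.5, 4),
    DBInstanceClass("db.m1.xlarge", 15, 8),
    DBInstanceClass("db.m2.xlarge", 17.1, 6.5),
    DBInstanceClass("db.m2.2xlarge", 34, 13),
    DBInstanceClass("db.m2.4xlarge", 68, 26),
]


def smallest_class_for(min_memory_gb: float,
                       min_ecus: float = 0) -> Optional[DBInstanceClass]:
    """Return the class with the least memory that satisfies both minimums."""
    candidates = [
        c for c in DB_INSTANCE_CLASSES
        if c.memory_gb >= min_memory_gb and c.ecus >= min_ecus
    ]
    # default=None covers the case where no class is big enough.
    return min(candidates, key=lambda c: (c.memory_gb, c.ecus), default=None)
```

Note that the high-memory classes trade compute for memory: asking for 16 GB of memory lands on the High-Memory Extra Large class, which has fewer ECUs than the Extra Large class.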


Scaling Out

You can scale out a relational database in two different ways:

  • Using read-only slaves (read replicas in AWS)

  • Sharding or partitioning


There are still some hard problems to solve, as sharding/partitioning has not been addressed yet with RDS. Master-slave type scaling is available, though. A slave, or read replica, is easily created from the Console (Figure 3-5). The only requirement is that automated backups are enabled on the master RDS instance, that is, its backup retention period is not set to 0. Currently, you can have up to five read replicas, and you have to launch them one by one. Amazon is working on the ability to launch multiple replicas at once, but that is not yet available.
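
Since RDS does not handle sharding for you, the routing has to live in your application. The following is only a sketch of one common approach, hash-based shard selection, and not something RDS provides. The endpoint names are hypothetical placeholders and shard_for is a helper we made up for illustration.

```python
import hashlib

# Hypothetical endpoints, one independent RDS instance per shard.
SHARD_ENDPOINTS = [
    "shard0.example.internal",
    "shard1.example.internal",
    "shard2.example.internal",
]


def shard_for(key: str, shards=SHARD_ENDPOINTS) -> str:
    """Map a sharding key (for example a user ID) to one shard endpoint.

    A cryptographic hash is used only because it is stable across
    processes and machines, unlike Python's built-in hash().
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

The same key always maps to the same shard as long as the list of shards does not change. With plain modulo hashing like this, adding or removing a shard moves most keys, which is one of the hard problems mentioned above.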

On a multi-AZ RDS instance, launching a read replica goes unnoticed. A snapshot is taken from the standby, the replica is launched, and when it is ready, it starts to catch up with the master. For a normal RDS instance, there is a brief I/O suspension on the order of one minute. AWS advises using the same instance class for the replica as for the master, as differing classes may incur replica lag. With read replicas, you basically introduce eventual consistency in your database (cluster).

Note: The read replica mechanism uses MySQL’s native, asynchronous replication. This means replicas might be lagging behind the master as they try to catch up with writes. The interesting thing about this is that multi-AZ RDS instances apparently use another, proprietary type of synchronous replication.
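
Because replicas can lag, the application has to decide which queries may go to a replica. Here is a minimal sketch of one way to do that, assuming you route in application code: writes go to the master, reads are spread round-robin over the replicas, and reads shortly after a write are pinned to the master. ReplicaRouter, its parameters, and the endpoint names are all hypothetical, and the pin window is a heuristic that does not guarantee a replica has caught up.

```python
import itertools
import time


class ReplicaRouter:
    """Choose an endpoint per statement: master for writes, replicas for reads."""

    def __init__(self, master, replicas, pin_seconds=2.0, clock=time.monotonic):
        self.master = master
        # Round-robin over the replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None
        self.pin_seconds = pin_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_write = None      # time of the most recent write, if any

    def endpoint_for(self, sql):
        # Naive classification: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        now = self._clock()
        if not is_read:
            self._last_write = now
            return self.master
        recently_wrote = (
            self._last_write is not None
            and now - self._last_write < self.pin_seconds
        )
        # Read from the master right after a write, or when there is no replica.
        if recently_wrote or self._replicas is None:
            return self.master
        return next(self._replicas)
```

A typical use would be to create one router per application process, for example ReplicaRouter("master.example.internal", ["replica1.example.internal", "replica2.example.internal"]), and ask it for an endpoint before opening or borrowing a connection.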

Figure 3-5. Create a read replica



Storage engine

The default storage engine with RDS is InnoDB, but you are free to choose another, like the popular MyISAM. It is important to realize that creating read replicas on nontransactional storage engines (like MyISAM) requires you to freeze your databases, as consistency cannot be guaranteed while the snapshot is taken. But if you use InnoDB, you are safe, and the only thing you have to do is fire up a new read replica.

Tips and Tricks

RDS is MySQL, but you don’t have as much control as you would with MySQL. There are certain peculiarities in RDS’ MySQL implementation that can cause a lot of frustration.

Disk is slow

A disk is always slower than memory. If you run your own MySQL using local disks, that’s slow as well. But using disk-based operations in RDS is just horrible. Minimizing disk usage means implementing proper indexes, something you always want to do. Views also require fast disk access. This comes as a huge surprise to those experienced with Oracle, but views in MySQL are not that mature yet.

Slow log

We always enable slow query logging. Sometimes the slow query log grows too large. You can access the slow log through the slow_log table in the mysql database. But you can’t just truncate it; you have to use the procedure rds_rotate_slow_log by executing the following command in your MySQL client:

> CALL mysql.rds_rotate_slow_log;



Storage

RDS storage is independent of RDS instance classes. Every class can have from 5 GB to 1 TB of storage associated. Scaling up the storage is easy, and you can do it using the Console. It does require a reboot. On the other hand, scaling down the storage is impossible.
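
If you script storage changes, it can help to check a request against these rules before sending it. The sketch below does only that check, assuming the limits stated above and taking 1 TB to mean 1,024 GB. validate_storage_change is a hypothetical helper and does not call any AWS API.

```python
MIN_STORAGE_GB = 5
MAX_STORAGE_GB = 1024  # 1 TB, assuming 1 TB = 1,024 GB


def validate_storage_change(current_gb: int, requested_gb: int) -> int:
    """Return the number of GB that would be added, or raise ValueError."""
    if not MIN_STORAGE_GB <= requested_gb <= MAX_STORAGE_GB:
        raise ValueError(
            f"storage must be between {MIN_STORAGE_GB} and {MAX_STORAGE_GB} GB"
        )
    if requested_gb < current_gb:
        # Scaling storage down is not possible, so refuse early.
        raise ValueError("storage cannot be scaled down")
    return requested_gb - current_gb
```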

Programming Amazon EC2

Learn more about this topic from Programming Amazon EC2.

If you plan to build applications to run on Amazon's Web Services, the end-to-end approach in this book will save you needless trial and error. You'll find practical guidelines for designing and building applications with Amazon Elastic Compute Cloud (EC2) and a host of supporting AWS tools, with a focus on critical issues such as load balancing, monitoring, and automation.
