Jump to content

MySQL cluster VS split database load to multiple computers

cjapes's Photo
Posted Jan 24 2011 05:46 PM
3213 Views

MySQL's cluster software has some serious drawbacks.
Some of the main disadvantages include:

  • There is no referential integrity when using a cluster. Foreign Key constraints are ignored.
  • There are limits to the names of databases objects, and to the sizes of certain objects.
  • You cannot use full text indexes.
  • Changes to table structures are not automatically sent to all nodes on the cluster.


Much more disadvantages can be seen in Known Limitations of MySQL Cluster.

If we have an existing application running with MySQL database it is probable that we will have lots of work to work around the limitations of MySQL Cluster…

Taking this is consideration is it better to have a MySQL cluster or to split database load to multiple computers?

1 Reply

+ 1
  BillyFish's Photo
Posted Jan 28 2011 03:32 AM

Before answering the question - one correction in your list of limitations, schema changes that you make on one MySQL server will be visible on the others.

The answer (probably not a surprise) is that it depends on your application behaviour and requirements.

First some background... The Cluster architecture consists of a pool of MySQL Servers that are accessed by the application(s); these MySQL Servers don't actually store the Cluster data, the data is partitioned over the pool of data nodes below. Every MySQL Server has access to the data in all of the data nodes. If one MySQL server changes a piece of data then it is instantly visible to all of the other MySQL Servers.

Obviously, this architecture makes it extremely easy to scale out the database. Unlike sharding, the application doesn't need to know where the data is held - it can just load balance across all available MySQL Servers. Unlike scaling out with MySQL replication Cluster allows you to scale writes just as well as reads. New data nodes or MySQL servers can be added to an existing Cluster with no loss of service to the application.

MySQL Cluster's shared-nothing architecture means that it can deliver extremely high availability (99.999%+). Every time you change data, it is synchronously replicated to a second data node; if one data node fails then the applications read & write requests are automatically handled by the backup data node.

You are correct that MySQL Cluster doesn't support Foreign Keys - there are workarounds using stored procedures but if they're essential to you application then InnoDB is the only MySQL storage engine that supports them.

You are also correct that there are some limits on the schema (e.g. total number of tables, number of columns in a table (512) and size of a row (8K - excluding BLOBs)). If you can't live with or workaround these limits then Cluster shouldn't be used for those tables.

As with InnoDB, full text searches are not supported - people often use Sphinx to complement MySQL if they need this as well as ACID transactions (MyISAM is the only storage engine to support full text search natively but it is not transactional).

Due to the distributed nature of MySQL Cluster, some operations can be slower (for example JOINs that have thousands of interim results - though there is a prototype solution available which addresses this) but others can be very fast and can scale extremely well (e.g. primary key reads and writes). You have the option of storing tables (or even columns) in memory or on disk and by choosing the memory option (with changes checkpointed to disk in the backgoround) transactions can be *very* quick.

MySQL Cluster can be more complex to set up than a single MySQL server but it can prevent you having to implement sharding or read/write splitting in your application. Swings and roundabouts.

To get the best performance and scalability out of MySQL Cluster you need may need to tweak your application (see Cluster performance tuning white paper). If you own the application this isn't normally a big deal but if you're using someone else's application that you can't modify then it could be a problem.

A final note is that it doesn't need to be all or nothing - you can choose to store some of your tables in Cluster and some using other storage engines, this is a per-table option. Also you can replicate between Cluster and other storage engines (for example, use Cluster for your run-time database and then replicate to InnoDB to generate complex reports).

Another useful white paper to learn more about MySQL Cluster: architecture and new features white paper