How to Stop Playing “Hop and Seek”: MySQL Cluster and TokuDB
As a TokuDB storage engine developer, I have been struck many times by the similarities between MySQL Cluster and TokuDB. Often, when I find myself thinking “TokuDB would benefit from this feature”, I end up thinking “MySQL Cluster would benefit from this feature” as well.
At first glance, one may wonder why. TokuDB is a storage engine designed to work well on big data, providing compression, agility, and performance, while MySQL Cluster is a distributed database solution (http://www.mysql.com/products/cluster/) that provides (among many other things) auto-sharding and 99.999% availability. TokuDB’s core innovation is Fractal Trees® indexes, which are designed to drastically reduce the number of disk seeks performed, but TokuDB still operates on a hard disk. MySQL Cluster operates over a network. How can we be two peas in a pod?
But when I look at the two products from the perspective of an engineer whose job is to develop TokuDB, the similarities are endless, and here’s why. When I think a potential MySQL feature would help TokuDB (or possibly InnoDB) reduce its number of disk seeks, I notice that the same feature would probably help MySQL Cluster reduce its number of network hops.
One successful example so far is Hot Column Addition and Deletion (HCAD). The storage engine API in MySQL 5.1 does not provide the storage engine with a mechanism for adding or dropping a column without going through a table rebuild. The API in MySQL Cluster, however, did, because adding a column remotely on a server beats rebuilding a table over a network. To implement HCAD, we ported those API changes over to MySQL 5.1. Thanks to this work done by MySQL Cluster, we were able to release a great feature that our users love. Rich provides more details in http://www.tokutek.com/2012/06/addressing-hot-schema-changes-in-mysql/.
Over the next couple of blog posts, I will present more examples of possible features from which both TokuDB and MySQL Cluster could benefit greatly.