Recovery Time for TokuDB
Last week Tokutek released version 3.0.0 of TokuDB, adding ACID transactions to its list of features. This post discusses an experiment we ran to measure recovery time following a system crash.
In summary, while actively inserting records into a MySQL database using iiBench, we compared the time to recover from a power-cord pull for both InnoDB and TokuDB.
| Storage Engine | Recovery Time |
|---|---|
| TokuDB | 501s (8 min, 21 sec) |
| InnoDB | 18505s (5 hours, 8 min, 25 sec) |
This is by no means an exhaustive look at recovery performance, but it does illustrate the benefits of Tokutek’s approach.
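As a quick check on the figures in the table, the following Python sketch converts the measured seconds into hours, minutes, and seconds and computes the ratio between the two recovery times. The helper name `format_duration` is ours, chosen for illustration only.

```python
def format_duration(total_seconds):
    """Render a whole number of seconds in the same style as the table."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours} hours")
    parts.append(f"{minutes} min")
    parts.append(f"{seconds} sec")
    return ", ".join(parts)


# Recovery times measured in the experiment, in seconds.
recovery_seconds = {"TokuDB": 501, "InnoDB": 18505}

for engine, secs in recovery_seconds.items():
    print(f"{engine}: {secs}s ({format_duration(secs)})")

ratio = recovery_seconds["InnoDB"] / recovery_seconds["TokuDB"]
print(f"InnoDB took about {ratio:.1f}x as long as TokuDB")
```

In this single run, InnoDB recovery took roughly 37 times as long as TokuDB recovery.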
We ran iiBench on the following server:
- Sunfire x4150
- Dual socket quad core Xeon 3.1GHz
- 16GiB memory
- 2 SAS 146GB
- 6 SAS 146GB hardware RAID 0 256KiB stripe
- CentOS 5.1
- Ext3 file system
The disks and RAID controller were configured with their write-caches OFF using arcconf. (Note: an experiment with the RAID controller write-cache ON resulted in a non-recoverable database, as you should expect.)
We ran an insert-only iiBench test. After inserting about 250M rows and while actively inserting more, we manually pulled the plugs (2 on a Sunfire x4150). We then powered up the server and timed how long it took to start mysqld.
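One way to time a startup like this is to poll the server until it responds and record the elapsed time. The Python sketch below is an illustration, not the procedure we used: it assumes that mysqld accepting TCP connections is an acceptable proxy for "startup finished", and the host, port (3306 is the MySQL default), and function names are our own choices.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def time_until_ready(probe, poll_interval=1.0, max_wait=86400.0):
    """Call probe() repeatedly until it returns True.

    Returns the elapsed time in seconds, measured from the moment this
    function is called, so it should be started together with mysqld.
    """
    start = time.monotonic()
    while True:
        if probe():
            return time.monotonic() - start
        if time.monotonic() - start > max_wait:
            raise TimeoutError("server did not become ready in time")
        time.sleep(poll_interval)
```

A call such as `time_until_ready(lambda: port_is_open("127.0.0.1", 3306))`, issued at the same moment mysqld is launched, would then return the startup time in seconds, accurate to about one poll interval.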
The results for InnoDB are not surprising – they are in line with results reported elsewhere. During recovery, InnoDB reported 1 transaction to be rolled back or cleaned up, and 68 row operations to undo.
The results for TokuDB are encouraging, in that they indicate recovery can be performed in minutes instead of hours. In this experiment, rebooting the server took almost as long as recovering the TokuDB database.
The primary reason for TokuDB’s quick recovery time is the use of regular, automatic checkpoints. This feature (introduced in version 2.0.0) ensures that work is completed and written to disk (including the necessary fsync) on a regular basis. Recovery is limited to processing uncommitted transactions and work done since the last checkpoint. When not pushing performance to the absolute max, we find that recovery typically takes less than a minute.
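To make the effect of checkpointing concrete, here is a toy Python model of a store with a write log and periodic checkpoints. It is a sketch of the general idea only, not TokuDB's implementation: the class `ToyStore` and its methods are hypothetical, checkpoints are triggered by a write count rather than on a schedule, transactions are left out, and "durable" state is simulated with in-memory fields.

```python
class ToyStore:
    """In-memory model of a key-value store with a log and checkpoints."""

    def __init__(self, checkpoint_every):
        self.checkpoint_every = checkpoint_every
        self.memory = {}    # volatile state, lost in a crash
        self.snapshot = {}  # stands in for durable state at the last checkpoint
        self.log = []       # stands in for the durable log since that checkpoint

    def put(self, key, value):
        self.log.append((key, value))  # record the write in the log first
        self.memory[key] = value
        if len(self.log) >= self.checkpoint_every:
            self.checkpoint()

    def checkpoint(self):
        """Persist the current state and discard the log entries it covers."""
        self.snapshot = dict(self.memory)
        self.log.clear()

    def crash_and_recover(self):
        """Drop volatile state, reload the snapshot, and replay the log.

        Returns the number of log entries that had to be replayed.
        """
        self.memory = dict(self.snapshot)
        replayed = 0
        for key, value in self.log:
            self.memory[key] = value
            replayed += 1
        return replayed


frequent = ToyStore(checkpoint_every=1000)
rare = ToyStore(checkpoint_every=10**9)
for i in range(2500):
    frequent.put(i, i * 2)
    rare.put(i, i * 2)

print(frequent.crash_and_recover())  # entries written after the last checkpoint
print(rare.crash_and_recover())      # every entry ever written
```

Both stores end up with the same contents after recovery, but the one that checkpoints regularly replays only the writes made since its last checkpoint, so its recovery work stays bounded no matter how much data has been inserted.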