<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tokutek &#187; Martin Farach-Colton</title>
	<atom:link href="http://www.tokutek.com/author/martin/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tokutek.com</link>
	<description></description>
	<lastBuildDate>Thu, 02 Feb 2012 15:28:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Announcing TokuDB v5.2: Improved Multi-Client Scaling and Faster Queries</title>
		<link>http://www.tokutek.com/2012/01/announcing-tokudb-v5-2-improved-multi-client-scaling-and-faster-queries/</link>
		<comments>http://www.tokutek.com/2012/01/announcing-tokudb-v5-2-improved-multi-client-scaling-and-faster-queries/#comments</comments>
		<pubDate>Thu, 19 Jan 2012 16:26:00 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[Announcement]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[Fractal Tree indexes]]></category>
		<category><![CDATA[hot schema changes]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[NewSQL]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[TokuDB]]></category>
		<category><![CDATA[Tokutek]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3675</guid>
		<description><![CDATA[TokuDB® v5.2, the latest version of Tokutek&#8217;s flagship storage engine for MySQL and MariaDB, is now available.
This version offers performance enhancements over previous releases, especially for multi-client scale up and point queries, and extends the cases where ALTER TABLE&#8230;]]></description>
			<content:encoded><![CDATA[<p>TokuDB<sup>®</sup> v5.2, the latest version of Tokutek&#8217;s flagship storage engine for MySQL and MariaDB, is now available.</p>
<p>This version offers performance enhancements over previous releases, especially for multi-client scale up and point queries, and extends the cases where ALTER TABLE is non-blocking, in particular adding Hot Column Rename.</p>
<p>TokuDB v5.2 maintains all our established advantages: fast trickle load, fast bulk load, fast range queries through clustering indexes, hot schema changes, great compression, no fragmentation, and full MySQL compatibility for ease of installation. See our <a href="/resources/benchmarks">benchmark</a> page for details.</p>
<h2><strong>Multi-client workloads</strong></h2>
<p>In TokuDB v5.2, we have reworked our locking scheme to better support multi-client workloads, and as always, we have focused on large databases. How did we do?  Let&#8217;s check out some benchmark numbers.  </p>
<h3>SysBench</h3>
<p>This is a <a href="http://www.mysqlperformanceblog.com/2006/08/18/sysbench-benchmark-tool/" target="_blank">SysBench</a> comparison of InnoDB 1.1.8 and TokuDB v5.2. Prior to the run we started the database from a cold back-up (the cache is empty at the beginning of the 1 client thread run) and ran for 1 hour at each number of client threads. The following graph shows a significant performance improvement (10%-60%) at all measured levels of concurrency. The values shown are the average transactions per second for the final 15 minutes of the benchmark.<br />
<a href="/wp-content/uploads/2012/01/SysBench.png" rel="shadowbox[sbpost-3675];player=img;" target="_blank"><img title="Sysbench" src="/wp-content/uploads/2012/01/SysBench.png" alt="" width="600" /></a><br />
Additional details on the software settings for Sysbench can also be found in the <a title="Appendix" href="#Appendix">Appendix</a> at the end of this page.</p>
<h3>TPCC</h3>
<p>This is a TPCC-like comparison of InnoDB and TokuDB v5.2 on a 5000 warehouse database. The horizontal axis is the number of clients, the vertical axis shows throughput (New Order Transactions/10 seconds). Our multi-client work brings us to parity with InnoDB for this test.<br />
<a href="/wp-content/uploads/2012/01/TPCC-5000W.png" rel="shadowbox[sbpost-3675];player=img;" target="_blank"><img title="TPCC5000W" src="/wp-content/uploads/2012/01/TPCC-5000W.png" alt="" width="600" /></a></p>
<h2><strong>Other key improvements</strong></h2>
<p>Both the Sysbench and the TPCC-like benchmarks have strong point-query components.  Our improved performance over InnoDB, even with one client, shows that we are now outperforming InnoDB for point queries, at least in these tests.  We&#8217;ll be blogging more specifically about point query performance so stay tuned.  One way we achieve better point query performance is to have a different read-block size and write-block size.  I&#8217;ll explain what that means in later blog posts, but one consequence is that read-intensive loads on RAIDed disks now perform many fewer I/Os.</p>
<p>In other news, we previously released <a href="http://www.tokutek.com/2011/04/hot-indexing-part-i-new-feature/" target="_blank">Hot Indexing</a> (HI)  and <a href="http://www.tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/" target="_blank">Hot Column Addition and Deletion</a> (HCAD).  In both cases, the downtime of these Alter Table operations goes from hours to seconds.</p>
<p>In v5.2, we have added Hot Column Rename to the suite of online operations we support.  You&#8217;ll be able to change the name of a column in a matter of seconds, just as you can now add or delete columns.  We have also made Optimize Table hot, but it&#8217;s important to note that in TokuDB, Optimize Table only flushes background work, such as that produced by a column addition or deletion.  It does not rebuild indexes, nor does it need to, because TokuDB indexes <a href="http://www.tokutek.com/2010/11/avoiding-fragmentation-with-fractal-trees/">don&#8217;t fragment</a>.</p>
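<p>As a rough sketch of what these operations look like from the client, assume a hypothetical table <strong>orders</strong> with a BIGINT column <strong>cust_id</strong> (neither name comes from our benchmarks). A rename uses MySQL&#8217;s ordinary ALTER TABLE ... CHANGE syntax, repeating the existing column definition so that only the name changes:</p>
<pre>-- Hypothetical table and column names, for illustration only.
-- Rename cust_id to customer_id; the column definition is repeated unchanged.
ALTER TABLE orders CHANGE cust_id customer_id BIGINT NOT NULL;

-- Flush any background work left over from earlier column additions or deletions.
OPTIMIZE TABLE orders;</pre>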
<h2><strong>Summary</strong></h2>
<p>TokuDB v5.2 offers great scaling with increasing client thread count, improved point query performance, and Hot Column Rename.  In the next couple of weeks, we&#8217;ll be posting more performance information, so stay tuned.  TokuDB v5.2 is available for <a href="/products/downloads/">download</a>.</p>
<hr />
<h3><a name="Appendix"></a><strong>Appendix &#8211; Configuration Details  </strong></h3>
<p><strong>Hardware</strong></p>
<pre>Centos 5.7; 2x Xeon L5520; 72GB RAM; 8x 300GB 10k SAS in RAID10.
TokuDB (running MySQL 5.1.52) is configured to use 36GB cache,
and InnoDB (running MySQL 5.5.16) with 52GB cache.</pre>
<p>The cache sizes differ because InnoDB uses direct I/O, whereas TokuDB reserves space for the OS cache.</p>
<p><strong>TokuDB MySQL Config File (TokuDB v5.2 on MySQL 5.1.52)</strong></p>
<pre>[mysqld]
max_connections=400
table_open_cache=2048</pre>
<p><strong>InnoDB MySQL Config File (InnoDB v1.1.8 on MySQL 5.5.16)</strong></p>
<pre>[mysqld]
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=0
innodb_log_file_size=1900M
innodb_log_files_in_group=2
innodb_file_per_table=true
innodb_log_buffer_size=16M
innodb_file_format=barracuda
innodb_buffer_pool_size=52G
innodb_flush_log_at_trx_commit=1
max_connections=400
table_open_cache=2048</pre>
<p><strong>TPCC</strong><br />
All TPCC-like benchmarks were run with the following command line:</p>
<pre>tpcc-mysql/tpcc_start localhost tpcc root 5000 \
         ${num_threads} 10 3600</pre>
<p><strong>Sysbench</strong><br />
All sysbench benchmarks were run with the following command line:</p>
<pre>sysbench --test sysbench-0.5/sysbench/tests/db/oltp.lua
--oltp_tables_count 16  --oltp-table-size 50000000 --rand-init on
--num-threads ${num_threads} --oltp-read-only off
--report-interval 10 --rand-type uniform --mysql-socket
/tmp/mysql.sock --mysql-table-engine tokudb --max-time 3600
--mysql-user root --mysql-password --mysql-db sbtest
--max-requests 0 --percentile 99 run</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2012/01/announcing-tokudb-v5-2-improved-multi-client-scaling-and-faster-queries/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Query Planner Gotchas</title>
		<link>http://www.tokutek.com/2011/06/query-planner-gotchas/</link>
		<comments>http://www.tokutek.com/2011/06/query-planner-gotchas/#comments</comments>
		<pubDate>Wed, 29 Jun 2011 05:34:33 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=2859</guid>
		<description><![CDATA[Indexes can reduce the amount of data your query touches by orders of magnitude.  This results in a proportional query speedup.  So what happens when you define a nice set of indexes and you don’t get the performance pop you&#8230;]]></description>
			<content:encoded><![CDATA[<p>Indexes can reduce the amount of data your query touches by orders of magnitude.  This results in a proportional query speedup.  So what happens when you define a nice set of indexes and you don’t get the performance pop you were expecting?  Consider the following example:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">mysql<span style="color: #66cc66;">&gt;</span> <span style="color: #993333; font-weight: bold;">SHOW</span> <span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> t;
<span style="color: #66cc66;">|</span> t     <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`t`</span> <span style="color: #66cc66;">&#40;</span>
  <span style="color: #ff0000;">`a`</span> <span style="color: #993333; font-weight: bold;">VARCHAR</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">255</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">,</span>
  <span style="color: #ff0000;">`b`</span> <span style="color: #993333; font-weight: bold;">BIGINT</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">20</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #ff0000;">'0'</span><span style="color: #66cc66;">,</span>
  <span style="color: #ff0000;">`c`</span> <span style="color: #993333; font-weight: bold;">BIGINT</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">20</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #ff0000;">'0'</span><span style="color: #66cc66;">,</span>
  <span style="color: #ff0000;">`d`</span> <span style="color: #993333; font-weight: bold;">BIGINT</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">20</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">,</span>
  <span style="color: #ff0000;">`e`</span> <span style="color: #993333; font-weight: bold;">CHAR</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">255</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`b`</span><span style="color: #66cc66;">,</span><span style="color: #ff0000;">`c`</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #ff0000;">`a`</span> <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`a`</span><span style="color: #66cc66;">,</span><span style="color: #ff0000;">`b`</span><span style="color: #66cc66;">,</span><span style="color: #ff0000;">`d`</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> ENGINE<span style="color: #66cc66;">=</span>InnoDB <span style="color: #993333; font-weight: bold;">DEFAULT</span> CHARSET<span style="color: #66cc66;">=</span>latin1</pre></div></div>

<p>Now we’d like to perform the following query:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span> sql_no_cache <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> t <span style="color: #993333; font-weight: bold;">WHERE</span> a <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">'this is a test'</span> <span style="color: #993333; font-weight: bold;">AND</span> b <span style="color: #993333; font-weight: bold;">BETWEEN</span> <span style="color: #cc66cc;">8000000</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #cc66cc;">8100000</span>;</pre></div></div>

<p>Great!  We have index <strong>a</strong>, which covers this query.  Using <strong>a</strong> should be really fast.  You’d expect MySQL to use the index to jump to the beginning of the ‘this is a test’ values for <strong>a</strong> and from there continue using the index to the 8000000 section of the <strong>b</strong> values.  From there it’s just a range query.</p>
<p>We added some test data (see <a href="https://s3.amazonaws.com/tokutek-pub/mysql.query.plan.gendata.py">this script</a>) to see what would happen.  We added 10M rows and set the value of <strong>a</strong> to ‘this is a test’ in every row.  We got:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">mysql<span style="color: #66cc66;">&gt;</span> <span style="color: #993333; font-weight: bold;">SELECT</span> sql_no_cache <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> t <span style="color: #993333; font-weight: bold;">WHERE</span> a <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">'this is a test'</span> <span style="color: #993333; font-weight: bold;">AND</span> b <span style="color: #993333; font-weight: bold;">BETWEEN</span> <span style="color: #cc66cc;">8000000</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #cc66cc;">8100000</span>;
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #66cc66;">|</span>   <span style="color: #cc66cc;">100001</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #cc66cc;">1</span> <span style="color: #993333; font-weight: bold;">ROW</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">3.53</span> sec<span style="color: #66cc66;">&#41;</span></pre></div></div>

<p>So the question is, is this any good?  Checking the <strong>explain</strong>, we get:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">mysql<span style="color: #66cc66;">&gt;</span> <span style="color: #993333; font-weight: bold;">EXPLAIN</span> <span style="color: #993333; font-weight: bold;">SELECT</span> sql_no_cache <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> t <span style="color: #993333; font-weight: bold;">WHERE</span> a <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">'this is a test'</span> <span style="color: #993333; font-weight: bold;">AND</span> b <span style="color: #993333; font-weight: bold;">BETWEEN</span> <span style="color: #cc66cc;">8000000</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #cc66cc;">8100000</span>;
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+------+---------------+------+---------+-------+--------+--------------------------+</span>
<span style="color: #66cc66;">|</span> id <span style="color: #66cc66;">|</span> select_type <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">TYPE</span> <span style="color: #66cc66;">|</span> possible_keys <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">KEY</span>  <span style="color: #66cc66;">|</span> key_len <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">REF</span>   <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">ROWS</span>   <span style="color: #66cc66;">|</span> Extra                    <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+------+---------------+------+---------+-------+--------+--------------------------+</span>
<span style="color: #66cc66;">|</span>  <span style="color: #cc66cc;">1</span> <span style="color: #66cc66;">|</span> SIMPLE      <span style="color: #66cc66;">|</span> t     <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">REF</span>  <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">PRIMARY</span><span style="color: #66cc66;">,</span>a     <span style="color: #66cc66;">|</span> a    <span style="color: #66cc66;">|</span> <span style="color: #cc66cc;">258</span>     <span style="color: #66cc66;">|</span> const <span style="color: #66cc66;">|</span> <span style="color: #cc66cc;">201438</span> <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">USING</span> <span style="color: #993333; font-weight: bold;">WHERE</span>; <span style="color: #993333; font-weight: bold;">USING</span> <span style="color: #993333; font-weight: bold;">INDEX</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+------+---------------+------+---------+-------+--------+--------------------------+</span>
<span style="color: #cc66cc;">1</span> <span style="color: #993333; font-weight: bold;">ROW</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">0.00</span> sec<span style="color: #66cc66;">&#41;</span></pre></div></div>

<p>Yow!  Index <strong>a</strong> is used, but it’s not being used as expected. The <strong>ref type</strong> in the <strong>explain</strong> means that MySQL is performing a scan on all rows where <strong>a</strong> is ‘this is a test’ and not filtering on the values of <strong>b</strong>.  This translates into looking at 10M rows instead of 100K, so this query plan is looking at 100x too much data.</p>
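
<p>To make the gap concrete, here is a small sketch that compares the rows a scan on <strong>a</strong> alone has to visit with the rows the query actually needs.  It assumes the same table <strong>t</strong> and test data as above; the column aliases are just illustrative names:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">-- Rows visited when only a is used: every row that matches on a alone.
SELECT sql_no_cache COUNT(*) AS rows_matching_a FROM t WHERE a = 'this is a test';

-- Rows the query actually needs: rows that match on both a and b.
SELECT sql_no_cache COUNT(*) AS rows_matching_a_and_b FROM t WHERE a = 'this is a test' AND b BETWEEN 8000000 AND 8100000;</pre></div></div>

<p>Because every row in the test data shares the same value of <strong>a</strong>, the first count covers the whole table, while the second covers only the requested slice of <strong>b</strong>.</p>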
<p>This is a MySQL query planner bug, and it has been reported <a href="http://bugs.mysql.com/bug.php?id=61631">here.</a> However,  there is a workaround (in this case):</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">mysql<span style="color: #66cc66;">&gt;</span> <span style="color: #993333; font-weight: bold;">SELECT</span> sql_no_cache <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> t <span style="color: #993333; font-weight: bold;">USE</span> <span style="color: #993333; font-weight: bold;">INDEX</span><span style="color: #66cc66;">&#40;</span>a<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">WHERE</span> a <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">'this is a test'</span> <span style="color: #993333; font-weight: bold;">AND</span> b <span style="color: #993333; font-weight: bold;">BETWEEN</span> <span style="color: #cc66cc;">8000000</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #cc66cc;">8100000</span>;
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #66cc66;">|</span>   <span style="color: #cc66cc;">100001</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----------+</span>
<span style="color: #cc66cc;">1</span> <span style="color: #993333; font-weight: bold;">ROW</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">0.04</span> sec<span style="color: #66cc66;">&#41;</span></pre></div></div>

<p>By adding a <strong>use index(a)</strong> — which shouldn’t do anything because we’re already using index <strong>a</strong>! — we get an 88x speedup. This is in line with looking at 1% of the data.  Consulting the <strong>explain</strong>, we see:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">mysql<span style="color: #66cc66;">&gt;</span> <span style="color: #993333; font-weight: bold;">EXPLAIN</span> <span style="color: #993333; font-weight: bold;">SELECT</span> sql_no_cache <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span>d<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> t <span style="color: #993333; font-weight: bold;">USE</span> <span style="color: #993333; font-weight: bold;">INDEX</span><span style="color: #66cc66;">&#40;</span>a<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">WHERE</span> a <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">'this is a test'</span> <span style="color: #993333; font-weight: bold;">AND</span> b <span style="color: #993333; font-weight: bold;">BETWEEN</span> <span style="color: #cc66cc;">8000000</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #cc66cc;">8100000</span>;
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+-------+---------------+------+---------+------+--------+--------------------------+</span>
<span style="color: #66cc66;">|</span> id <span style="color: #66cc66;">|</span> select_type <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">TYPE</span>  <span style="color: #66cc66;">|</span> possible_keys <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">KEY</span>  <span style="color: #66cc66;">|</span> key_len <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">REF</span>  <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">ROWS</span>   <span style="color: #66cc66;">|</span> Extra                    <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+-------+---------------+------+---------+------+--------+--------------------------+</span>
<span style="color: #66cc66;">|</span>  <span style="color: #cc66cc;">1</span> <span style="color: #66cc66;">|</span> SIMPLE      <span style="color: #66cc66;">|</span> t     <span style="color: #66cc66;">|</span> range <span style="color: #66cc66;">|</span> a             <span style="color: #66cc66;">|</span> a    <span style="color: #66cc66;">|</span> <span style="color: #cc66cc;">266</span>     <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #66cc66;">|</span> <span style="color: #cc66cc;">201438</span> <span style="color: #66cc66;">|</span> <span style="color: #993333; font-weight: bold;">USING</span> <span style="color: #993333; font-weight: bold;">WHERE</span>; <span style="color: #993333; font-weight: bold;">USING</span> <span style="color: #993333; font-weight: bold;">INDEX</span> <span style="color: #66cc66;">|</span>
<span style="color: #66cc66;">+</span><span style="color: #808080; font-style: italic;">----+-------------+-------+-------+---------------+------+---------+------+--------+--------------------------+</span>
<span style="color: #cc66cc;">1</span> <span style="color: #993333; font-weight: bold;">ROW</span> <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">0.00</span> sec<span style="color: #66cc66;">&#41;</span></pre></div></div>

<p>We’re still using index <strong>a</strong>, but the <strong>type</strong> of the query plan has changed to <strong>range</strong> (which is what we would have expected all along).  Now we  are filtering on <strong>a</strong> and <strong>b</strong>.</p>
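
<p>The <strong>key_len</strong> column tells the same story.  As a sketch, assuming MySQL’s usual accounting for key parts (one byte per latin1 character, two length bytes for a VARCHAR, one extra byte for a nullable column, and eight bytes for a BIGINT), the two values can be reproduced by hand:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;">-- a is VARCHAR(255), latin1, nullable: 255 + 2 + 1 bytes.
-- b is BIGINT NOT NULL: 8 bytes.
SELECT 255 + 2 + 1 AS key_len_a_only, 255 + 2 + 1 + 8 AS key_len_a_and_b;</pre></div></div>

<p>These come to 258 and 266, matching the first and second <strong>explain</strong> outputs: the ref plan uses only the <strong>a</strong> part of the key, while the range plan uses both <strong>a</strong> and <strong>b</strong>.</p>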
<h2>Take home messages</h2>
<ul>
<li>Query planners are complicated things, and the MySQL query planner has some bugs.</li>
<li>It’s possible, in many cases, to deal with those bugs by checking the <strong>explain</strong> of a query that doesn’t behave as expected.</li>
<li>Although adding a <strong>use index</strong> isn’t the best software engineering practice, if you need to get fast query speeds and you’re getting a faulty query plan, it’s a handy tool to know about.</li>
</ul>
<p>Of course, Tokutek customers care a lot about indexes.  <a href="http://tokutek.com/products/downloads/">TokuDB for MySQL and MariaDB</a> lets you define lots of indexes, because indexing is fast.  You can define indexes with random keys, indexes that won’t fit in memory, clustering indexes that cover all queries, ….  This insertion speed translates directly into query speed, so TokuDB can address many performance complaints, ranging from write- to read-intensive workloads.</p>
<p>Remember that if you take care of your indexes, they’ll take care of you!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/06/query-planner-gotchas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Percona Live, NYC</title>
		<link>http://www.tokutek.com/2011/05/percona-live-nyc/</link>
		<comments>http://www.tokutek.com/2011/05/percona-live-nyc/#comments</comments>
		<pubDate>Fri, 27 May 2011 21:52:34 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[Percona Live]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2606</guid>
		<description><![CDATA[Yesterday, Percona held <a href="http://www.percona.com/live/nyc-2011/">Percona Live NYC</a>, which they describe as an &#8220;intensive one-day MySQL summit.&#8221;  They meant it.  It was like drinking from a firehose.  There was too much for me to give a complete report, so I&#8217;d like&#8230;]]></description>
			<content:encoded><![CDATA[<p>Yesterday, Percona held <a href="http://www.percona.com/live/nyc-2011/">Percona Live NYC</a>, which they describe as an &#8220;intensive one-day MySQL summit.&#8221;  They meant it.  It was like drinking from a firehose.  There was too much for me to give a complete report, so I&#8217;d like to highlight two sessions that stuck out for me.</p>
<h3>Why SQL Wins</h3>
<p>Sergei Tsarev (Clustrix) gave a great overview of the last 50 years of database development.  He talked about the early days, in which what we now think of as database functionality had to be implemented in each application.  Programmer productivity was therefore low.</p>
<p>
As modern SQL databases emerged, productivity shot up since databases bundled up common functionality with an easy-to-code interface.  This now seems like a golden age of databases, in which transactional semantics were hashed out.</p>
<p>
Fast forward to today. Database performance has failed to keep up with database demands. Sergei didn&#8217;t talk much about why this is.  Personally, I have my own take on what caused this problem.  The demand side is due to the success of SQL semantics.  Relational databases capture what many (most?) people want to do with a database.  So they keep pounding harder on the database.  On the performance side, the data structures that drive databases (B-trees) were invented in the early &#8217;70s, when computers had a very different balance of resources.  As the decades went by, B-trees grew more and more out of balance with each new generation of hardware, creating many of today&#8217;s performance bottlenecks.  And now, back to Sergei&#8217;s talk.</p>
<p>
Sergei talked about how NoSQL was a reaction to these performance issues.  He pointed out that what NoSQL does is kill off chunks of database functionality.  That doesn&#8217;t make those requirements go away.  It just makes the database appear to go faster &#8212; but, it&#8217;s important to note, not the system.  It means that the &#8220;slow code&#8221; gets moved to the application layer.  So the performance problem isn&#8217;t solved <b>and</b> programmer productivity goes down.</p>
<p>
He predicts a move back to SQL solutions, as these performance problems get solved. (And since our TokuDB storage engine solves a bunch of them, I agree!)</p>
<h3>Migrating From MyISAM to InnoDB</h3>
<p>Matt Yonkovit (Percona) gave a very detailed talk on moving MyISAM tables to InnoDB.  The audience was really into it, and the questions were coming fast and furious.  As someone interested in transactional databases, I don&#8217;t think that much about MyISAM, but it was clearly on a lot of people&#8217;s minds.</p>
<p>The most striking thing about this talk was the contrast with Sergei&#8217;s talk.  Matt did a great job of detailing all the parameters you need to worry about when running InnoDB &#8212; parameters you don&#8217;t need for MyISAM.  I expected this to be the source of some disappointment. After all, databases, in Sergei&#8217;s conception, are supposed to abstract away common problems so that the application programmer can work on their problem-specific code.</p>
<p>
Instead, InnoDB comes with a large number of tuning parameters, some of which are set by default to values that have to be changed to make InnoDB usable at all.  The odd part for me is that there was a consensus in the room that the large number of parameters was a strength of InnoDB, and the lack of tunability was a weakness of MyISAM.  Call me old-fashioned, but this kind of resetting of defaults is computer work, not work for humans. </p>
<p>I came away from this talk prouder than ever that TokuDB has almost no tuning parameters.  Expect this to continue.  We even have the math for why this no-tuning approach works.</p>
<h3>Loads of other talks</h3>
<p>There were loads of other great talks &#8212; on solid state drives, on replication, on many of the topics that are critical to the care and feeding of a MySQL database.  Maybe I can wrangle a trip to Percona Live London so I can report on those.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/05/percona-live-nyc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Never Settle for a &#8220;B&#8221;</title>
		<link>http://www.tokutek.com/2011/05/never-settle-for-a-b/</link>
		<comments>http://www.tokutek.com/2011/05/never-settle-for-a-b/#comments</comments>
		<pubDate>Tue, 24 May 2011 15:50:38 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[hot column addition]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[partitioning]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2580</guid>
		<description><![CDATA[<a href="http://www.readwriteweb.com/cloud/2011/04/the-newsql-movement.php">OldSQL</a> DBs based on B-trees have some well-known problems and workarounds.  TokuDB is a NewSQL storage engine based on Fractal Tree indexing, so the natural question is how InnoDB practice translates into TokuDB.   This post gives a quick overview.&#8230;]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.readwriteweb.com/cloud/2011/04/the-newsql-movement.php">OldSQL</a> DBs based on B-trees have some well-known problems and workarounds.  TokuDB is a NewSQL storage engine based on Fractal Tree indexing, so the natural question is how InnoDB practice translates into TokuDB.   This post gives a quick overview.  Enjoy!</p>
<h2>FAQ</h2>
<p><strong>Q: How do I tune TokuDB?</strong><br />
<strong>A: You don&#8217;t!</strong></p>
<p>TokuDB has almost no parameters you can set, and many of them are related to diagnostics &#8212; things like how often the information about insertion or query progress should be updated.  We recommend that parameters be left unchanged.</p>
<p><strong>Q: What do I do when indexes are bigger than memory?</strong><br />
<strong>A: Nothing!</strong></p>
<p>TokuDB is based on Fractal Trees, which are designed to index fast, even when indexes are bigger than memory.</p>
<p><strong>Q: What&#8217;s the best way to partition in TokuDB?</strong><br />
<strong>A: Don&#8217;t!</strong></p>
<p><a href="http://tokutek.com/2011/03/mysql-partitioning-a-flow-chart/">Partitions are a poor replacement for covering indexes</a>.  Define the indexes that actually solve your problem, and TokuDB will keep up with the insertion rate.</p>
<p><strong>Q: How often should I run OPTIMIZE TABLE?</strong><br />
<strong>A: Never!</strong></p>
<p>TokuDB tables <a href="http://tokutek.com/2010/11/avoiding-fragmentation-with-fractal-trees/">don&#8217;t fragment</a>, so there is nothing for OPTIMIZE TABLE to defragment.</p>
<p><strong>Q: How often should I dump and reload a table?</strong><br />
<strong>A: Never!</strong></p>
<p>Same as OPTIMIZE TABLE.</p>
<p><strong>Q: How much downtime should I schedule for adding an index?</strong><br />
<strong>Q: How much downtime should I schedule for column addition/deletion?</strong><br />
<strong>A: None!</strong></p>
<p>Well, maybe a few seconds.  In <a href="http://tokutek.com/2011/03/announcing-tokudb-v5-0-making-big-data-agile/">TokuDB 5.0</a> we announced <a href="http://tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/">Hot Column Addition and Deletion</a> and <a href="http://tokutek.com/2011/04/hot-indexing-part-i-new-feature/">Hot Indexing</a>, so that these ALTER TABLEs incur minimal downtime, on the order of a few seconds.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/05/never-settle-for-a-b/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Covering Indexes: How many indexes do you need?</title>
		<link>http://www.tokutek.com/2011/05/covering-indexes-how-many-indexes-do-you-need/</link>
		<comments>http://www.tokutek.com/2011/05/covering-indexes-how-many-indexes-do-you-need/#comments</comments>
		<pubDate>Thu, 12 May 2011 12:48:09 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[clustering indexes]]></category>
		<category><![CDATA[covering indexes]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2544</guid>
		<description><![CDATA[I&#8217;ve recently been <a href="http://tokutek.com/2011/03/mysql-partitioning-a-flow-chart/">blogging</a> about how partitioning is a poor man&#8217;s answer to covering indexes.  I got the following comment from Jaimie Sirovich:
&#8220;There are many environments where you could end up creating N! indices to cover queries for&#8230;]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently been <a href="http://tokutek.com/2011/03/mysql-partitioning-a-flow-chart/">blogging</a> about how partitioning is a poor man&#8217;s answer to covering indexes.  I got the following comment from Jaimie Sirovich:</p>
<blockquote><p>&#8220;There are many environments where you could end up creating N! indices to cover queries for queries against lots of dimensions.&#8221;</p></blockquote>
<p>[Just a note: this is only one of several points he made.  I just wanted to dig into this one in some detail.  Here goes...]</p>
<p>Although it is, in theory, possible to generate a workload that would take N! indexes, this is not a realistic (or useful) bound (leaving aside that this workload would kill partitioning!).  For one thing, it would take N! queries to exercise all those indexes.  And the queries would have to include every field in the where clause &#8212; as we&#8217;ll get into below.</p>
<p>So what is a reasonable bound on the number of covering indexes that would be useful?  Not surprisingly, there&#8217;s no one answer, so let&#8217;s look at a couple of cases.</p>
<p>Before we get started, let&#8217;s simplify the discussion by making all indexes <a href="http://tokutek.com/2009/05/introducing_multiple_clustering_indexes/">CLUSTERING</a>, which means they include all fields, and therefore, we only need to look at the WHERE clause of our queries to figure out what needs covering.  More on that later.</p>
<h3>How many indexes does it take to beat partitioning?</h3>
<p>When partitions are used to avoid table scans, it&#8217;s because the secondary indexes do not cover the queries.  All it takes to make them cover whatever query they are used on is to declare them CLUSTERING. Once you do that, they will be faster than using the same indexes with partitions.</p>
<p>So not only do you not need N! indexes, what you need is <strong> the same number of indexes!</strong> (Here the first &#8216;!&#8217; is factorial, and the second is emphasis!)</p>
<p>A little detail here: suppose you have partitioned on field T and you have defined an index on (A,B,C).  Now get rid of partitioning and define CLUSTERING index (T,A,B,C).  Lather, rinse, repeat.</p>
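<p>As a rough sketch of that swap in SQL, assume a table named foo that is partitioned on T and has a secondary index named idx_abc on (A,B,C).  The table and index names are placeholders, and the CLUSTERING keyword is assumed to be available in your TokuDB build:</p>
<pre>
-- Hypothetical names: foo, idx_abc, idx_tabc.
-- Drop the partitioning that was keyed on T.
ALTER TABLE foo REMOVE PARTITIONING;
-- Remove the plain secondary index on (A,B,C)...
ALTER TABLE foo DROP INDEX idx_abc;
-- ...and add a CLUSTERING index that leads with the old partitioning field.
ALTER TABLE foo ADD CLUSTERING INDEX idx_tabc (T, A, B, C);
</pre>
<p>Putting T first in the index does the job the partitions used to do: rows with the same T sit together, and within each T they are ordered by (A,B,C).</p>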
<h3>How many indexes does it take in general to cover your queries?</h3>
<p>As noted above, you only need to look at the WHERE clause because making your indexes CLUSTERING takes care of the rest of the fields you might need in your query.</p>
<p>Next we need to consider what rows get examined in a query.  Suppose you have a query like:</p>
<blockquote><p>SELECT * from foo WHERE A = 5 and B = 10 and C &lt; 20 and D = 30;</p></blockquote>
<p>What do you need to cover this query?  It turns out that once you hit an inequality, MySQL stops searching in the index and starts a scan of the index.  So that means that a CLUSTERING index on (A,B,C) is all you need.  There&#8217;s no need for D in the index definition.  If you also had:</p>
<blockquote><p>SELECT * from foo WHERE A = 5 and B &gt; 10 and D = 30;</p></blockquote>
<p>the CLUSTERING index on (A,B,C) would also work for this query.</p>
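<p>Here is a minimal sketch of the index behind these two queries, using the table foo from the examples and a made-up index name, idx_abc:</p>
<pre>
-- idx_abc is a placeholder name.
CREATE CLUSTERING INDEX idx_abc ON foo (A, B, C);

-- Equality on A and B, range on C: the index narrows the search on all three.
SELECT * FROM foo WHERE A = 5 AND B = 10 AND C &lt; 20 AND D = 30;

-- Equality on A, range on B: only the (A,B) prefix narrows the search.
SELECT * FROM foo WHERE A = 5 AND B &gt; 10 AND D = 30;
</pre>
<p>In both cases the test on D is applied to the rows read from the index.  Because the index is CLUSTERING, those rows already carry D, so no lookup in the primary table is needed.  Running EXPLAIN on each query is a quick way to check which index MySQL actually picks.</p>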
<p>So how many indexes do you need?  Certainly not N!.  Take a look at your most important queries.  Consider their WHERE clauses and throw away all the fields after the first inequality test.  Throw away any set of fields that&#8217;s a prefix of another set.  So if you end up with:</p>
<blockquote><p>(A,B,C)</p>
<p>(A,B)</p>
<p>(C,D,A)</p></blockquote>
<p>you don&#8217;t need the (A,B).  The remaining sets of fields are your candidates.  Each CLUSTERING index is as big as the primary table &#8212; that&#8217;s what makes them so good for covering lots of different queries!  While this may sound like it would take up a lot of space, TokuDB uses very aggressive compression, typically getting 10x-15x compression, so it&#8217;s possible to add quite a few such indexes.</p>
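<p>To make the pruning concrete, here is a small worked example with three made-up queries against foo.  The queries and the index names are illustrative assumptions only:</p>
<pre>
-- Query 1: equality on A and B, then a range on C.  D follows the range and is dropped.
-- Candidate: (A,B,C)
SELECT * FROM foo WHERE A = 1 AND B = 2 AND C &lt; 3 AND D = 4;

-- Query 2: equality on A, then a range on B.
-- Candidate: (A,B)
SELECT * FROM foo WHERE A = 1 AND B &gt; 2;

-- Query 3: equality on C and D, then a range on A.
-- Candidate: (C,D,A)
SELECT * FROM foo WHERE C = 5 AND D = 6 AND A &lt; 7;

-- (A,B) is a prefix of (A,B,C), so two indexes are enough for all three queries.
CREATE CLUSTERING INDEX idx_abc ON foo (A, B, C);
CREATE CLUSTERING INDEX idx_cda ON foo (C, D, A);
</pre>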
<h3>So how many indexes do I need?</h3>
<p>How many mission-critical queries do you have?  How many of them can be clumped together using CLUSTERING indexes?  When talking to customers, the largest number of useful indexes I&#8217;ve seen is 20.  That sounds like a lot for two reasons: size and update time.  But the total size should be no more than about twice the original data size, assuming at least 10x compression, since each of the 20 indexes shrinks to a tenth of the table size or less.  And TokuDB is all about updating indexes fast, even when they don&#8217;t fit in memory, so this hasn&#8217;t been a complaint either. I&#8217;m going to go out on a limb and say that a typical upper bound is 20.</p>
<h3>What if I don&#8217;t know what queries I&#8217;ll be performing?</h3>
<p>In this case, partitioning isn&#8217;t the answer.  We already beat the performance of partitioning by switching over the indexes to CLUSTERING.  What you need in this case is flexibility.  When you get a new query type, what you need is TokuDB&#8217;s <a href="http://tokutek.com/2011/04/hot-indexing-part-i-new-feature/">hot indexing</a>, which allows you to add an index with basically no downtime (3 seconds downtime is typical).</p>
<p>But what happens if I always get queries that are unrelated to previous queries and they are mission critical and they are never repeated?  Such a hypothetical leads us down the N!-index path.  It doesn&#8217;t seem that realistic.</p>
<h3>Good but not perfect covering is OK too</h3>
<p>Finally, it&#8217;s not necessary to tailor your indexes perfectly to your queries.  The CLUSTERING index (A,B) covers the query</p>
<blockquote><p>SELECT * from foo where A = 5 and B = 10 and C = 20 and D = 30;</p></blockquote>
<p>even though C and D are not mentioned.  CLUSTERING indexes cover all queries.  It&#8217;s all a matter of getting the table into the best possible order for your queries.  You can still get benefit from an index that isn&#8217;t the perfect order, and that index will cover more queries.  It&#8217;s a tradeoff.</p>
<p>If you are interested in evaluating TokuDB for deployment and want to know how to proceed, try the following:</p>
<ol>
<li><a href="http://tokutek.com/products/downloads/">Download TokuDB</a> (free for trials and evaluations).</li>
<li>Review the <a href="http://tokutek.com/2009/10/tokudb-for-mysql-evaluation-guide/">evaluation guide</a>.</li>
<li>Convert an existing InnoDB table to TokuDB.  Make sure that the table is big enough to be realistic.  You&#8217;ll really notice the difference when the table is bigger than memory.</li>
<li>Convert your indexes to CLUSTERING indexes.  Maybe they don&#8217;t all need to be CLUSTERING, but the easiest eval is to make them all CLUSTERING and cut back if needed.</li>
<li>Add more indexes using the steps outlined above.</li>
<li>Feel free to contact support@tokutek.com.  We&#8217;re here to help.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/05/covering-indexes-how-many-indexes-do-you-need/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Elephants on a Trapeze: Keeping Big Data Agile</title>
		<link>http://www.tokutek.com/2011/05/elephants-on-a-trapeze-keeping-big-data-agile/</link>
		<comments>http://www.tokutek.com/2011/05/elephants-on-a-trapeze-keeping-big-data-agile/#comments</comments>
		<pubDate>Fri, 06 May 2011 19:03:13 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2491</guid>
		<description><![CDATA[On April 1st, the <a href="http://www.cs.rutgers.edu">Department of Computer Science</a> at Rutgers University, where I am a professor, held an open house.  I gave a talk called &#8220;Elephants on a Trapeze: Keeping Big Data Agile&#8221;.
The talk is an introduction to&#8230;]]></description>
			<content:encoded><![CDATA[<p>On April 1st, the <a href="http://www.cs.rutgers.edu">Department of Computer Science</a> at Rutgers University, where I am a professor, held an open house.  I gave a talk called &#8220;Elephants on a Trapeze: Keeping Big Data Agile&#8221;.</p>
<p>The talk is an introduction to performance issues related to big data without getting too technical.  You&#8217;ll have to decide if I succeeded with the &#8220;not too technical&#8221; part.  My take is on how to keep big data indexed &#8212; not surprising since the work in this talk is the basis for <a href="http://tokutek.com/products">TokuDB®</a>, Tokutek&#8217;s MySQL storage engine for keeping large data indexed.  A video of my talk can be found <a href="http://vimeo.com/23367484">here</a>.</p>
<p><iframe src="http://player.vimeo.com/video/23367484?title=0&amp;byline=0&amp;portrait=0" width="400" height="300" frameborder="0"></iframe></p>
<p><a href="http://vimeo.com/23367484">Elephants on a Trapeze: Keeping Big Data Agile</a> from <a href="http://vimeo.com/user6997348">Tokutek</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/05/elephants-on-a-trapeze-keeping-big-data-agile/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OldSQL Tricks or NewSQL Treats</title>
		<link>http://www.tokutek.com/2011/04/oldsql-tricks-or-newsql-treats/</link>
		<comments>http://www.tokutek.com/2011/04/oldsql-tricks-or-newsql-treats/#comments</comments>
		<pubDate>Fri, 08 Apr 2011 17:16:30 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[disk seek]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2325</guid>
		<description><![CDATA[Why do B-trees need &#8220;Tricks&#8221; to work?
Marko Mäkelä recently posted a couple of &#8220;<a href="http://blogs.innodb.com/wp/2011/04/tips-and-tricks-for-faster-ddl/">tips and tricks</a>&#8221; you can use to improve InnoDB performance. Tips and tricks.  A general purpose relational database like MySQL shouldn&#8217;t need &#8220;tips and tricks&#8221;&#8230;]]></description>
			<content:encoded><![CDATA[<h3>Why do B-trees need &#8220;Tricks&#8221; to work?</h3>
<p>Marko Mäkelä recently posted a couple of &#8220;<a href="http://blogs.innodb.com/wp/2011/04/tips-and-tricks-for-faster-ddl/">tips and tricks</a>&#8221; you can use to improve InnoDB performance. Tips and tricks.  A general purpose relational database like MySQL shouldn&#8217;t need &#8220;tips and tricks&#8221; to perform well, and I lay the blame on design choices that were made in the early &#8217;70s: the B-tree data structure underlying all OldSQL databases.  B-trees were designed for machines that had very different performance characteristics than the machines of today.  Hardware has changed, but B-trees are the same.  Tips and Tricks are an attempt to make up the difference.</p>
<p>So B-tree implementers &#8212; InnoDB, Oracle, MS SQL Server &#8212; are fighting an uphill battle; they&#8217;re fighting the future.  B-trees just aren&#8217;t meant to cope with high-bandwidth, slow-seek-time storage systems, because they perform unnecessary disk seeks.  In fact, they aren&#8217;t meant to cope with lower-bandwidth, fast-seek-time storage systems (like SSDs), because they waste bandwidth.  Hardware trends move computers further and further from B-trees&#8217; sweet spot.  In 1972, they were a great idea, but a lot has changed.</p>
<p>Don&#8217;t get me wrong.  InnoDB is a great piece of software.  In my own testing, its B-tree implementation outperforms the big enterprise implementations, though it lacks some features like <a href="http://tokutek.com/2011/04/hot-indexing-part-i-new-feature/">hot indexing</a> and <a href="http://tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/">hot column addition and deletion</a>.</p>
<p>So the problem is OldSQL B-trees, which have two big shortcomings. These are both illustrated very well in the <a href="http://blogs.innodb.com/wp/2011/04/tips-and-tricks-for-faster-ddl/">tips and tricks</a> posting.  They are slow to index, and they fragment. Let&#8217;s look at Marko&#8217;s points one by one:</p>
<ul>
<li style="margin-bottom: 1em;">Marko&#8217;s first point is that &#8220;Data Dictionary Language (DDL) operations have traditionally been slow in MySQL&#8221;.  The slow part of DDL is mostly ALTER TABLE commands.
<ul>
<li>Hot Column Addition and Deletion and Hot Indexing, recently released in <a href="http://tokutek.com/2011/03/announcing-tokudb-v5-0-making-big-data-agile/">TokuDB 5.0</a>, bring the downtime of these schema changes from hours to seconds.</li>
</ul>
</li>
<li style="margin-bottom: 1em;">Marko&#8217;s second point is that insertions, deletions and updates &#8220;can be slow [with InnoDB] even despite the change buffering in MySQL 5.5.&#8221;
<ul>
<li><a href="http://tokutek.com/customers/a-social-networking-case-study/">Fast insertions</a>, deletions and updates have been a hallmark of TokuDB&#8217;s Fractal Tree Indexes since they were introduced.</li>
</ul>
</li>
<li style="margin-bottom: 1em;"> Marko&#8217;s next point is that if you populate an InnoDB index out of order, it will fragment.  His proposed solution involves dropping and adding indexes, which means table downtime.
<ul>
<li>The best way to deal with fragmentation is to not fragment your tables in the first place.  So what&#8217;s the solution?  You guessed it: <a href="http://tokutek.com/2010/11/avoiding-fragmentation-with-fractal-trees/">TokuDB tables don&#8217;t fragment</a> <em>by design</em>, so this particular chunk of maintenance is not needed.</li>
</ul>
</li>
<li style="margin-bottom: 1em;">Finally, Marko discusses fill factors in B-trees, and the impact of packed versus unpacked nodes on future modifications.
<ul>
<li>TokuDB is based on cache-oblivious analysis, which means that you don&#8217;t need to worry about fill factors, and it means that there are very few parameters to set.  In fact, <a href="http://technocation.org/content/oursql-episode-39%3A-tokudb-5.0-part-1">we recommend</a> that you leave the parameters set to the defaults.</li>
</ul>
</li>
</ul>
<h3>Treats, not tricks</h3>
<p>In short, TokuDB&#8217;s Fractal Tree indexes are fast to update (20x-80x faster than InnoDB) and never fragment.  Hot schema changes, fast insertions, no fragmentation, and no tuning.  That&#8217;s what happens when your solution matches your hardware.</p>
<p>We have <a href="http://goo.gl/9gZkl">sessions</a> coming up at both O&#8217;Reilly and Collaborate 11 conferences next week, as well as a booth at O&#8217;Reilly. Please come by and tell us your &#8220;tips and tricks&#8221; (and band-aids and duct tape) for working around old database structures, and we&#8217;ll work to get you on a better path with Tokutek&#8217;s <a href="http://www.readwriteweb.com/cloud/2011/04/the-newsql-movement.php">NewSQL</a> alternative.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/04/oldsql-tricks-or-newsql-treats/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hot Column Addition and Deletion Part II: How it works</title>
		<link>http://www.tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/</link>
		<comments>http://www.tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/#comments</comments>
		<pubDate>Thu, 07 Apr 2011 16:31:24 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[alter table]]></category>
		<category><![CDATA[hot column addition]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2279</guid>
		<description><![CDATA[Hot Column Addition and Deletion (HCAD)
In <a href="http://tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/">the previous HCAD post</a>, I described HCAD and showed that it can reduce the downtime of column addition (or deletion) from 18 hours to 3 seconds.  In fact, the downtime of InnoDB&#8230;]]></description>
			<content:encoded><![CDATA[<h3>Hot Column Addition and Deletion (HCAD)</h3>
<p>In <a href="http://tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/">the previous HCAD post</a>, I described HCAD and showed that it can reduce the downtime of column addition (or deletion) from 18 hours to 3 seconds.  In fact, the downtime of InnoDB is proportional to the size of the database, whereas the downtime for TokuDB 5.0 depends on the time it takes for MySQL to close and reopen a table &#8212; a time  that&#8217;s independent of database size.  Go ahead and build bigger tables.  The HCAD downtime for TokuDB won&#8217;t increase.</p>
<p>You may be wondering how we do HCAD.  Here goes:</p>
<h3>Under the hood</h3>
<p>TokuDB is based on Fractal Tree indexes, one of the coolest features of which is that they <a href="http://en.wikipedia.org/wiki/TokuDB">replace random I/O with sequential I/O</a>.  The way this happens has an impact on how HCAD works, so here&#8217;s the 20,000-foot view.</p>
<p>You can think of everything in a TokuDB table (or index) as a message.  There can be messages to <a href="http://tokutek.com/2010/06/disk-seeks-are-evil-so-let’s-avoid-them-pt-3-deletions/">insert</a> a row, or <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">update</a> a row, or <a href="http://tokutek.com/2010/06/disk-seeks-are-evil-so-let’s-avoid-them-pt-3-deletions/">delete</a> a row.  Rather than delivering these messages immediately, the messages are bundled up by common destination, and they progress towards the leaves of the Fractal Tree indexes when there are enough of them to make the disk-head movement worthwhile.  They are applied in the right order, obviously, in order to keep the semantics of the SQL commands correct.</p>
<p>Even a query can be thought of as a message, except in the case of a query, it has to be delivered on the spot, even if that means moving the disk head.  The query &#8220;sees&#8221; all the messages ahead of it, so it gets all the right answers, once again, according to the SQL semantics. </p>
<p>An HCAD command generates yet another type of message: it&#8217;s a broadcast message that needs to be applied to every row.  Sticking the message into the Fractal Tree indexes is fast.  In fact, the downtime associated with HCAD has nothing to do with the TokuDB work.  Instead, MySQL closes and reopens a table on an alter table command.  This causes dirty pages associated with that table to be flushed.  The downtime is on the order of seconds, though we have seen extreme cases with many dirty pages and very large RAM where this can take a couple of minutes.</p>
<p>And since this is TokuDB, you also get to define <a href="http://tokutek.com/2009/05/clustering_indexes_vs_covering_indexes/">clustering indexes</a>, which reference all columns and therefore speed up lots of queries.  The HCAD message gets injected into the primary table and into each clustering index.</p>
<p>The work of changing the rows does not happen when the HCAD message is injected.  Rather, the broadcast message makes its way down to the leaves as other messages push it along.  When an HCAD message reaches a row &#8212; either because other messages push it there or because a query touches the row &#8212; the row gets rewritten to include the added column or exclude the deleted column, as the case may be.  Once a row has been rewritten, the HCAD work is done for that row.  The user can choose to have this work done immediately &#8212; say by a query that touches all rows &#8212; or lazily &#8212; as part of the normal operation of the database.  Neither case involves downtime.</p>
<p>You can schedule the background work of updating the rows as follows.  Recall that any query that touches a row also applies pending HCAD messages to that row.  A query of the form</p>
<pre>
SELECT COUNT(*) FROM example_tbl FORCE INDEX (X);
</pre>
<p>where example_tbl stands for your table and X is PRIMARY or the name of one of the clustering indexes, will touch each row and finish all the HCAD work for that index. I should emphasize that you don&#8217;t need to do this.  Everything works fine without it.  This is an option if you want to get the work done all at once.</p>
<h3>Trying it out</h3>
<p>This is a 3 step process:</p>
<ol>
<li>Download <a href="http://tokutek.com/products/">TokuDB</a> (free for evaluation purposes or in production up to 50GB of user data).</li>
<li>Read section 3.4 of the User&#8217;s Guide.</li>
<li>Load your tables and go!</li>
</ol>
<h3>Learning more</h3>
<p>I had the privilege of sitting down for the <a href="http://dev.mysql.com/podcasts/">MySQL Community Podcast</a> with Sheeri Cabral and Sarah Novotny where we spoke about the new TokuDB 5.0 features in depth. See Episodes <a href="http://technocation.org/content/oursql-episode-39%3A-tokudb-5.0-part-1">39</a> and <a href="http://technocation.org/content/oursql-episode-40%3A-tokudb-5.0-part-2">40</a>.</p>
<p>
Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Hot Indexing Part I: New Feature</title>
		<link>http://www.tokutek.com/2011/04/hot-indexing-part-i-new-feature/</link>
		<comments>http://www.tokutek.com/2011/04/hot-indexing-part-i-new-feature/#comments</comments>
		<pubDate>Tue, 05 Apr 2011 19:06:38 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[alter table]]></category>
		<category><![CDATA[Hot Indexing]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2227</guid>
		<description><![CDATA[From 31 minutes to 2 seconds
Hot Indexing Overview
<a href="http://tokutek.com/2011/03/announcing-tokudb-v5-0-making-big-data-agile/">TokuDB v5.0</a> introduces several features that are new to the MySQL world.  Recently, we posted on <a href="http://tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/">HCAD: Hot Column addition and Deletion</a>.  In this post, we talk about Hot&#8230;]]></description>
			<content:encoded><![CDATA[<h2><em>From 31 minutes to 2 seconds</em></h2>
<h3>Hot Indexing Overview</h3>
<p><a href="http://tokutek.com/2011/03/announcing-tokudb-v5-0-making-big-data-agile/">TokuDB v5.0</a> introduces several features that are new to the MySQL world.  Recently, we posted on <a href="http://tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/">HCAD: Hot Column Addition and Deletion</a>.  In this post, we talk about Hot Indexing.</p>
<p>What happens when you try to add a new index, as follows?</p>
<pre>mysql&gt; create index example_idx on example_tbl (example_field);
</pre>
<p>In standard MySQL 5.1 InnoDB, the table example_tbl gets locked while all indexes, including the primary key, get rebuilt.  In the InnoDB plugin for 5.1, as well as in previous releases of TokuDB, things are improved in that the table is only locked while the one index is built.  However, this can still easily cause hours of downtime.</p>
<p>TokuDB v5.0 introduces Hot Indexing.  You can add an index to an existing table with minimal downtime.  The total downtime is seconds to a few minutes, because when the index is finished being built, MySQL closes and reopens the table.  This means that (unlike HCAD) the downtime is not at the time of the command, but later on.  Still, it is quite minimal, as the following experiment shows.  The details of the table are <a href="http://tokutek.com/air-traffic-data-hcad/">here</a>.</p>
<p>Adding index (Year, Month, DayofWeek) took 31 minutes, 34 seconds for the InnoDB 5.1 plugin, during which the table was locked for insertions/deletions/updates.</p>
<p>TokuDB 5.0 took 9 minutes, 30 seconds to add the same index.  At the end of this time, the table was locked for under 2 seconds (we polled the database at 1 second intervals, and it was only locked at one of these test points).</p>
<p>The v5.0 release is fun for me.  I get to blog about great features that bring great value to people trying to run big data in MySQL. Stay tuned for the details of how we do hot indexing.</p>
<h3>Learning More</h3>
<ol>
<li> I had the privilege of sitting down for the <a href="http://dev.mysql.com/podcasts/">MySQL Community Podcast</a> with Sheeri Cabral and Sarah Novotny where we spoke about the new TokuDB 5.0 features in depth. See Episodes <a href="http://technocation.org/content/oursql-episode-39%3A-tokudb-5.0-part-1">39</a> and <a href="http://technocation.org/content/oursql-episode-40%3A-tokudb-5.0-part-2">40</a>.</li>
<li> Try <a href="http://tokutek.com/products/">TokuDB v5.0</a> for yourself.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/04/hot-indexing-part-i-new-feature/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Hot Column Addition and Deletion Part I &#8211; Performance</title>
		<link>http://www.tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/</link>
		<comments>http://www.tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/#comments</comments>
		<pubDate>Wed, 30 Mar 2011 13:49:43 +0000</pubDate>
		<dc:creator>Martin Farach-Colton</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[alter table]]></category>
		<category><![CDATA[hot column addition]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2150</guid>
		<description><![CDATA[From 18 hours to 3 seconds!
Hot Column Addition and Deletion (HCAD) Overview
<a href="http://tokutek.com/products/tokudb-for-mysql-v4/">TokuDB v5.0</a> introduces several features that are new to the MySQL world. In this series of posts, we&#8217;re going to present some information on these features:&#8230;]]></description>
			<content:encoded><![CDATA[<h3><em>From 18 hours to 3 seconds!</em></h3>
<h3>Hot Column Addition and Deletion (HCAD) Overview</h3>
<p><a href="http://tokutek.com/products/tokudb-for-mysql-v4/">TokuDB v5.0</a> introduces several features that are new to the MySQL world. In this series of posts, we&#8217;re going to present some information on these features: what&#8217;s the feature, how does it work under the hood, and how do you get the most out of this feature in your MySQL setup.</p>
<p>Today we start with HCAD: Hot Column Addition and Deletion. Many users have had the experience of loading a bunch of data into a table and associated indexes, only to find that adding some columns or removing them would be useful. An</p>
<pre>alter table X add column Y int default 0;</pre>
<p>or the like takes a long time &#8212; hours or more &#8212; during which time the table is write locked, meaning no insertions/deletions/updates and no queries on the new column until the alter table is done.</p>
<p>Mark Callaghan <a href="https://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932">points out</a> that changing the row format in InnoDB is a &#8220;significant project&#8221;, so it looked like slow alter tables were going to be a challenge for MySQL for the foreseeable future. Slow alter tables are one reason MySQL has trouble scaling to large tables.</p>
<p>TokuDB v5.0 changes all that with the introduction of HCAD. You can add or delete columns from an existing table with minimal downtime &#8212; just the time for MySQL itself to close and reopen the table. The total downtime is seconds to minutes.</p>
<p>Here we present an example of HCAD in action. See <a href="http://tokutek.com/air-traffic-data-hcad/">this page</a> for details of the experiment. Drum roll&#8230;</p>
<p>TokuDB:</p>
<pre>mysql&gt; alter table ontime add column totalTime int default 0;
Query OK, 0 rows affected (3.33 sec)</pre>
<p>InnoDB:</p>
<pre>mysql&gt; alter table ontime add column totalTime int default 0;
Query OK, 122225386 rows affected (17 hours 44 min 40.85 sec)</pre>
<p>That&#8217;s 19,000x faster! Goodbye long downtimes.</p>
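<p>As a rough check of that figure, the two timings above can be converted to seconds and divided. This is only an illustrative Python sketch; the variable names are ours and are not part of the experiment.</p>
<pre># Timings copied from the two alter table runs above.
innodb_seconds = 17 * 3600 + 44 * 60 + 40.85  # 17 hours 44 min 40.85 sec
tokudb_seconds = 3.33                         # 3.33 sec

# Ratio of InnoDB time to TokuDB time for the same statement.
speedup = innodb_seconds / tokudb_seconds
print(round(speedup))</pre>
<p>The result lands a little above 19,000, which is where the rounded number comes from.</p>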
<p>As a note, the &#8220;0 rows affected&#8221; for TokuDB means that the column addition work happens in the background. All queries on the table, however, will see the new column as soon as the alter table returns, in this case after 3.33 sec.</p>
<p>We&#8217;re psyched about being able to provide this feature to our users. In the <a href="http://www.tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/">next post</a>, we&#8217;ll take a look at how TokuDB is able to achieve HCAD. Finally, we&#8217;ll present a how-to on getting the most out of HCAD.</p>
<p>Click here for <a href="http://www.tokutek.com/2011/04/hot-column-addition-and-deletion-part-ii-how-it-works/">Part II</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/03/hot-column-addition-and-deletion-part-i-performance/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

