<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tokutek &#187; zardosht</title>
	<atom:link href="http://www.tokutek.com/author/zardosht/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tokutek.com</link>
	<description></description>
	<lastBuildDate>Thu, 02 Feb 2012 15:28:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>A Case for Write Optimizations in MySQL</title>
		<link>http://www.tokutek.com/2011/11/a-case-for-write-optimizations-in-mysql/</link>
		<comments>http://www.tokutek.com/2011/11/a-case-for-write-optimizations-in-mysql/#comments</comments>
		<pubDate>Mon, 21 Nov 2011 15:22:33 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[Fractal Tree indexes]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[storage engine]]></category>
		<category><![CDATA[TokuDB]]></category>
		<category><![CDATA[Tokutek]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3531</guid>
		<description><![CDATA[As a storage engine developer, I am excited for MySQL 5.6. Looking at <a href="http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html" target="_blank">http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html</a>, there has been plenty of work done to improve the performance of reads in MySQL for all storage engines (provided they take advantage of&#8230;]]></description>
			<content:encoded><![CDATA[<p>As a storage engine developer, I am excited for MySQL 5.6. Looking at <a href="http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html" target="_blank">http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html</a>, there has been plenty of work done to improve the performance of reads in MySQL for all storage engines (provided they take advantage of the new APIs).</p>
<p>What would be great to add are API improvements that increase the performance of writes, and more specifically, updates. For many applications that perform updates, such as applications that do click counting or impression counting, there are significant opportunities for improving write performance.</p>
<p>Take the following example of click counting (or impression counting). You have a website and want to save the number of times links on your website have been clicked. Your table may look something like:</p>
<p><code><br />
create table num_clicks (link_id int primary key, num_clicks int);<br />
</code></p>
<p>To update the number of clicks, you do something like:</p>
<p><code><br />
insert into num_clicks values (LINK_ID, 1) on duplicate key update num_clicks=num_clicks+1;<br />
</code></p>
<p>With MySQL as it currently works, this is slower than it needs to be, as I explained <a href="http://www.tokutek.com/2010/07/why-insert-on-duplicate-key-update-may-be-slow-by-incurring-disk-seeks/" target="_blank">here</a>. At a high level, the reason is that MySQL forces the storage engine to check in the table if a value exists for LINK_ID. If a row is returned, MySQL performs the increment away from the storage engine, and passes a new row to the storage engine for an update. The check incurs a disk seek, which is very costly in terms of latency. Disks can do only hundreds of seeks per second. Furthermore, NoSQL solutions based on B-trees are similarly limited and can&#8217;t be significantly accelerated because updates incur disk I/O.</p>
<p>However, with some changes to MySQL, a storage engine can take advantage of this knowledge to improve its algorithms. All that&#8217;s needed is for the storage engine to know that the user wants to perform an insert or to perform this particular update, as opposed to getting individual handler calls of write_row, index_read, and update_row (which is the current design). Hence, what&#8217;s needed is a way for the storage engine layer to be able to apply updates on its own.</p>
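<p>To make the idea concrete, here is a minimal Python sketch of what such an interface could allow. It is a toy model, not MySQL or TokuDB code: the class <code>ToyEngine</code> and its methods <code>upsert_increment</code> and <code>read</code> are hypothetical names invented for this illustration. The point is only that once the engine is handed the whole intent ("insert 1, or add 1 if the row exists"), it can record that intent without first reading the existing row.</p>

```python
class ToyEngine:
    """Hypothetical engine that accepts the whole upsert as one call."""

    def __init__(self):
        self.rows = {}      # link_id to num_clicks, standing in for on-disk data
        self.pending = {}   # link_id to buffered click increments

    def upsert_increment(self, link_id, amount=1):
        # No read of self.rows here: the intent is recorded and applied later.
        self.pending[link_id] = self.pending.get(link_id, 0) + amount

    def read(self, link_id):
        # Buffered increments are folded in when the row is actually needed.
        delta = self.pending.pop(link_id, 0)
        if delta:
            self.rows[link_id] = self.rows.get(link_id, 0) + delta
        return self.rows.get(link_id)


engine = ToyEngine()
for _ in range(5):
    engine.upsert_increment(42)   # five clicks on link 42, no lookups performed
print(engine.read(42))
```

<p>In this sketch the five clicks touch only the in-memory buffer; the stored row is read once, when a query asks for it. How a real engine would buffer and apply such operations is a separate design question.</p>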
<p>This change can help all storage engines. Although I am not an expert in MySQL Cluster, I imagine reducing these individual handler calls also helps MySQL Cluster avoid network hops to retrieve information. For in-memory databases, performance may increase due to reducing the number of calls made by the handler. InnoDB can potentially use its insertion buffer to store the &#8220;insert &#8230; on duplicate key update&#8221; operation, thereby giving the operation the same boost insertions into secondary keys get. For TokuDB, we estimate that these types of updates aided by this additional information could run much faster. In future posts, I will expand on how we think TokuDB can do this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/11/a-case-for-write-optimizations-in-mysql/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Understanding Indexing – NY Effective MySQL Meetup</title>
		<link>http://www.tokutek.com/2011/10/understanding-indexing-%e2%80%93-ny-effective-mysql-meetup/</link>
		<comments>http://www.tokutek.com/2011/10/understanding-indexing-%e2%80%93-ny-effective-mysql-meetup/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 15:05:41 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[indexes]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3357</guid>
		<description><![CDATA[At next week’s <a href="http://ny.effectivemysql.com/events/34148552/" target="_blank">NY Effective MySQL Meetup</a>, I will give a talk: “Understanding Indexing: Three rules on making indexes around queries to provide good performance.” The meetup is 7 pm Tuesday, October 11th, and will be held at&#8230;]]></description>
			<content:encoded><![CDATA[<p>At next week’s <a href="http://ny.effectivemysql.com/events/34148552/" target="_blank">NY Effective MySQL Meetup</a>, I will give a talk: “Understanding Indexing: Three rules on making indexes around queries to provide good performance.” The meetup is <strong>7 pm Tuesday</strong>, <strong>October 11th</strong>, and will be held at <strong>Hive at 55</strong> (<a href="http://maps.google.com/maps?q=55+Broad+Street%2C+New+York%2C+NY" target="_blank">55 Broad Street, New York, NY</a>). Thanks to host <a href="http://ronaldbradford.com/" target="_blank">Ronald Bradford</a> for the invitation.</p>
<p>Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct indexes around queries that result in good performance.</p>
<p>This is a general discussion applicable to all databases using indexes and is not specific to any particular MySQL<sup>®</sup> storage engine (e.g., InnoDB, TokuDB<sup>®</sup>, etc.). The rules are explained using a simple model that does NOT rely on understanding B-trees, Fractal Tree<sup>®</sup> indexing, or any other data structure used to store the data on disk.</p>
<p>The rules are derived from these simple properties:</p>
<ul>
<li>Point queries are slow</li>
<li>Range queries are fast</li>
</ul>
<p>I hope to see you there!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/10/understanding-indexing-%e2%80%93-ny-effective-mysql-meetup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Indexing: The Director’s Cut</title>
		<link>http://www.tokutek.com/2011/07/indexing-the-director%e2%80%99s-cut/</link>
		<comments>http://www.tokutek.com/2011/07/indexing-the-director%e2%80%99s-cut/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 16:01:26 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[fragmentation]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[storage engine]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3014</guid>
		<description><![CDATA[Thanks again to Erin O’Neill and Mike Tougeron for having me at the <a href="http://www.sfmysql.org/events/19550211/">SF MySQL Meetup</a> last month for the talk on “Understanding Indexing.” The crowd was very interactive, and I appreciated that over 100 people signed up for&#8230;]]></description>
			<content:encoded><![CDATA[<p>Thanks again to Erin O’Neill and Mike Tougeron for having me at the <a href="http://www.sfmysql.org/events/19550211/">SF MySQL Meetup</a> last month for the talk on “Understanding Indexing.” The crowd was very interactive, and I appreciated that over 100 people signed up for the event and left some very positive <a href="http://www.sfmysql.org/events/19550211/?eventId=19550211&amp;action=detail">comments and reviews</a>.</p>
<p style="text-align: left;">Thanks to Mike, a <a href="http://vimeo.com/26454091">video</a> of the talk is now available:</p>
<div style="text-align: center;">
<p><iframe src="http://player.vimeo.com/video/26454091?title=0&amp;byline=0&amp;portrait=0" width="400" height="300" frameborder="0"></iframe></p>
</div>
<p>As a brief overview –  Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct indexes around queries that result in good performance.</p>
<p>&#8212;</p>
<p>“Director’s Cut” Bonus Feature: During the talk, a number of people wanted the lowdown on Fractal Tree™ Indexing. After the first hour on general indexing, I was able to address this and give a <a href="http://vimeo.com/26471692">whiteboard overview</a> of Fractal Tree indexes:</p>
<div style="text-align: center;">
<p><iframe src="http://player.vimeo.com/video/26471692?title=0&amp;byline=0&amp;portrait=0" width="400" height="300" frameborder="0"></iframe></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/07/indexing-the-director%e2%80%99s-cut/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Understanding Indexing – SF MySQL Meetup</title>
		<link>http://www.tokutek.com/2011/06/understanding-indexing-%e2%80%93-sf-mysql-meetup/</link>
		<comments>http://www.tokutek.com/2011/06/understanding-indexing-%e2%80%93-sf-mysql-meetup/#comments</comments>
		<pubDate>Tue, 21 Jun 2011 05:14:12 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=2854</guid>
		<description><![CDATA[At this week’s <a href="http://www.sfmysql.org/events/19550211/">SF MySQL Meetup</a>, I will give a talk: “Understanding Indexing: Three rules on making indexes around queries to provide good performance.” The meetup is 7 pm tomorrow (Wednesday, 6/22), and will be held at CBS Interactive&#8230;]]></description>
			<content:encoded><![CDATA[<p>At this week’s <a href="http://www.sfmysql.org/events/19550211/">SF MySQL Meetup</a>, I will give a talk: “Understanding Indexing: Three rules on making indexes around queries to provide good performance.” The meetup is 7 pm tomorrow (Wednesday, 6/22), and will be held at CBS Interactive (235 2nd St., San Francisco). Thanks to hosts Erin O’Neill and Mike Tougeron for the invitation and location.</p>
<p>Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct indexes around queries that result in good performance.</p>
<p>This is a general discussion applicable to all databases using indexes and is not specific to any particular MySQL storage engine (e.g., InnoDB, TokuDB, etc.). The rules are explained using a simple model that does NOT rely on understanding B-trees, Fractal Tree™ indexing, or any other data structure used to store the data on disk.</p>
<p>The rules are derived from these simple properties:</p>
<ul>
<li>Point queries are slow</li>
<li>Range queries are fast</li>
</ul>
<p>I hope to see you there!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/06/understanding-indexing-%e2%80%93-sf-mysql-meetup/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Effective MySQL, a New York City Meetup</title>
		<link>http://www.tokutek.com/2011/04/effective-mysql-a-new-york-city-meetup/</link>
		<comments>http://www.tokutek.com/2011/04/effective-mysql-a-new-york-city-meetup/#comments</comments>
		<pubDate>Tue, 26 Apr 2011 14:39:05 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[Fractal Tree™ indexes]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>
		<category><![CDATA[Tokutek]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=2465</guid>
		<description><![CDATA[Kudos to Ronald Bradford for creating a new MySQL meetup group in New York city and giving MySQL related talks. The next one is tonight, titled &#8220;MySQL Idiosyncrasies That Bite&#8221;. Information on it can be found at <a href="http://ny.effectivemysql.com/events/16884850/">http://ny.effectivemysql.com/events/16884850/</a>.
We&#8217;ll&#8230;]]></description>
			<content:encoded><![CDATA[<p>Kudos to Ronald Bradford for creating a new MySQL meetup group in New York city and giving MySQL related talks. The next one is tonight, titled &#8220;MySQL Idiosyncrasies That Bite&#8221;. Information on it can be found at <a href="http://ny.effectivemysql.com/events/16884850/">http://ny.effectivemysql.com/events/16884850/</a>.</p>
<p>We&#8217;ll have a contingent from our New York office there this evening. We went to the last one on indexing (a favorite topic of ours) in March and it was excellent.</p>
<p>We look forward to seeing folks there as well as at upcoming NY events, including Percona Live (May 26th) and future Effective MySQL meetups.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/04/effective-mysql-a-new-york-city-meetup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Indexing &#8211; MySQL Meetup</title>
		<link>http://www.tokutek.com/2011/03/understanding-indexing-mysql-meetup/</link>
		<comments>http://www.tokutek.com/2011/03/understanding-indexing-mysql-meetup/#comments</comments>
		<pubDate>Wed, 16 Mar 2011 22:01:21 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1989</guid>
		<description><![CDATA[Yesterday, at the <a href="http://www.meetup.com/mysqlbos/events/16457333/">Boston MySQL Meetup</a>, I gave a talk on indexing. It is posted <a href="http://www.tokutek.com/wp-content/uploads/2011/03/Tokutek_Understanding_Indexes.pdf">here</a> (also <a href="http://goo.gl/S2LBe">goo.gl/S2LBe</a>).
In short, indexes are used to improve query performance. As a result, good indexes are designed around queries that&#8230;]]></description>
			<content:encoded><![CDATA[<p>Yesterday, at the <a href="http://www.meetup.com/mysqlbos/events/16457333/">Boston MySQL Meetup</a>, I gave a talk on indexing. It is posted <a href="http://www.tokutek.com/wp-content/uploads/2011/03/Tokutek_Understanding_Indexes.pdf">here</a> (also <a href="http://goo.gl/S2LBe">goo.gl/S2LBe</a>).</p>
<p>In short, indexes are used to improve query performance. As a result, good indexes are designed around queries that users find important in their application. The presentation covers <em><strong>three</strong></em> simple and effective rules on how to construct indexes around queries that result in good performance.</p>
<p>The rules are explained using a simple model that does NOT rely on understanding B-trees, Fractal Tree™ indexing, or any other data structure used to store the data on disk. They are derived from these simple properties:</p>
<ul>
<li>Point queries are slow</li>
<li>Range queries are fast</li>
</ul>
<p>As always, comments, questions and thoughts are welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/03/understanding-indexing-mysql-meetup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scenarios where TokuDB&#8217;s Loader is Used</title>
		<link>http://www.tokutek.com/2010/09/scenarios-where-tokudbs-loader-is-used/</link>
		<comments>http://www.tokutek.com/2010/09/scenarios-where-tokudbs-loader-is-used/#comments</comments>
		<pubDate>Thu, 16 Sep 2010 17:06:21 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[mysql tokudb loader]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1700</guid>
		<description><![CDATA[TokuDB&#8217;s loader uses the available multicore computing resources of the machine to presort and insert the data. In the last couple of posts (<a href="http://tokutek.com/2010/08/loading-air-traffic-control-data-with-tokudb-4-1-1/">here</a> and <a href="http://tokutek.com/2010/09/loading-tables-with-tokudb-4-0/">here</a>), Rich and Dave presented performance results of TokuDB&#8217;s loader.&#8230;]]></description>
			<content:encoded><![CDATA[<p>
TokuDB&#8217;s loader uses the available multicore computing resources of the machine to presort and insert the data. In the last couple of posts (<a href="http://tokutek.com/2010/08/loading-air-traffic-control-data-with-tokudb-4-1-1/">here</a> and <a href="http://tokutek.com/2010/09/loading-tables-with-tokudb-4-0/">here</a>), Rich and Dave presented performance results of TokuDB&#8217;s loader. Comparing load times with TokuDB 2.1.0, Rich found a 2.1x speedup on a 2-core machine, and a 4.2x speedup on an 8-core machine. Comparing load times with TokuDB 3.1, Dave found an 8.2x speedup on an Amazon Web Services c1.large node with 8 cores while loading a table with 256-byte rows.</p>
<p>
This leads to these natural questions: how does one use the TokuDB loader? Under what scenarios is it used?</p>
<p>
The loader has two purposes:</p>
<ul>
<li>to ease migration of data from other sources (e.g. other storage engines, data files) to TokuDB.</li>
<li>to build newly defined indexes (that TokuDB can maintain in real time) very fast.</li>
</ul>
<p>So, the loader is designed to operate on empty tables.</p>
<p>
The loader is integrated into the TokuDB storage engine. Using it does not require any external binaries. Users can use the loader by inserting data into an empty TokuDB table with any of the following commands:</p>
<ul>
<li>insert into</li>
<li>load data infile</li>
<li>alter table &#8230; engine=TokuDB</li>
<li>alter table &#8230; add index&#8230;</li>
</ul>
<p>
Scenarios that do not yet work are:</p>
<ul>
<li>replace into</li>
<li>insert ignore</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2010/09/scenarios-where-tokudbs-loader-is-used/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On &#8220;Replace Into&#8221;, &#8220;Insert Ignore&#8221;, Triggers, and Row Based Replication</title>
		<link>http://www.tokutek.com/2010/08/on-replace-into-insert-ignore-triggers-and-row-based-replication/</link>
		<comments>http://www.tokutek.com/2010/08/on-replace-into-insert-ignore-triggers-and-row-based-replication/#comments</comments>
		<pubDate>Wed, 11 Aug 2010 04:14:02 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[disk seek]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[insert ignore]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[replace into]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[TokuDB]]></category>
		<category><![CDATA[triggers]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1602</guid>
		<description><![CDATA[In posts on <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">June 30</a> and <a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/">July 6</a>, I explained how implementing the commands “replace into” and “insert ignore” with <a href="http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/">TokuDB’s fractal trees data structures</a> can be two orders of magnitude faster than&#8230;]]></description>
			<content:encoded><![CDATA[<p>
In posts on <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">June 30</a> and <a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/">July 6</a>, I explained how implementing the commands “replace into” and “insert ignore” with <a href="http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/">TokuDB’s fractal trees data structures</a> can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted that there are some caveats that complicate the story a little. On <a href="http://tokutek.com/2010/07/on-replace-into-insert-ignore-and-secondary-keys/">July 21st</a> I explained one caveat, secondary keys, and on <a href="http://tokutek.com/2010/08/1589/">August 3rd</a>, Rich explained another caveat. In this post, I explain the other two caveats: triggers and replication.</p>
<p>
First, let&#8217;s look at triggers. The key to &#8220;replace into&#8221; and &#8220;insert ignore&#8221; being fast is that the uniqueness check is not done when the command is executed. The uniqueness check is deferred to a more opportune time. Triggers do not allow the uniqueness check to be deferred. For triggers to work properly, we must know exactly which rows resulted in which modifications. For &#8220;replace into&#8221;, we must know which rows resulted in an insertion and which rows resulted in a deletion and insertion. For &#8220;insert ignore&#8221;, we must know which rows resulted in an insertion and which rows performed no operation. This information is not available unless disk seeks are performed.</p>
<p>
Now, let&#8217;s look at row based replication. The problem here lies within MySQL (as I describe in <a href="http://bugs.mysql.com/bug.php?id=53561">bug 53561</a>; that bug only addresses &#8220;replace into&#8221;, but similar issues exist with &#8220;insert ignore&#8221;). Row based replication is not designed to handle anything other than normal insertions, which return an error if a duplicate key is found. So, a successful execution of &#8220;replace into&#8221; or &#8220;insert ignore&#8221; on the master maps to a normal insertion on the slave. Because normal insertions have different semantics than &#8220;replace into&#8221; and &#8220;insert ignore&#8221;, row based replication does not work if you use the altered semantics.</p>
<p>
The problems with triggers are semantic. Their requirements incur disk seeks, which slows performance. The problem with row based replication is a design issue which will hopefully be fixed one day. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2010/08/on-replace-into-insert-ignore-triggers-and-row-based-replication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On &#8220;Replace Into&#8221;, &#8220;Insert Ignore&#8221;, and Secondary Keys</title>
		<link>http://www.tokutek.com/2010/07/on-replace-into-insert-ignore-and-secondary-keys/</link>
		<comments>http://www.tokutek.com/2010/07/on-replace-into-insert-ignore-and-secondary-keys/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 20:51:55 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[disk seek]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[insert ignore]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[replace into]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1584</guid>
		<description><![CDATA[In posts on <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">June 30</a> and <a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/">July 6</a>, I explained how implementing the commands &#8220;replace into&#8221; and &#8220;insert ignore&#8221; with <a href="http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/">TokuDB&#8217;s fractal trees data structures</a> can be two orders of magnitude faster than&#8230;]]></description>
			<content:encoded><![CDATA[<p>
In posts on <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">June 30</a> and <a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/">July 6</a>, I explained how implementing the commands &#8220;replace into&#8221; and &#8220;insert ignore&#8221; with <a href="http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/">TokuDB&#8217;s fractal trees data structures</a> can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted that there are some caveats that complicate the story a little. In this post, I explain one of the complications: secondary indexes.</p>
<p>
Secondary indexes act the same way in TokuDB as they do in InnoDB. They store the defined secondary key, and the primary key as a pointer to the rest of the row. So, say the table foo has the following schema:</p>
<pre>
create table foo (a int, b int, c int, primary key (a), key(b));
</pre>
<p>And we did:</p>
<pre>
insert into foo values (1,10,100),(2,20,200);
</pre>
<p>
Logically, there is one dictionary that stores all the data (this is the clustered primary key). Let us call it the main dictionary:</p>
<pre>
key  value
1    10,100
2    20,200
</pre>
<p>And there is another dictionary for the secondary key that stores the column &#8216;b&#8217; and the primary key, &#8216;a&#8217;:</p>
<pre>
key  value
10   1
20   2
</pre>
<p>
For secondary indexes to work properly, there must be a one to one correspondence between elements in the secondary index and in the primary index. If this correspondence is broken, then the table is corrupt.</p>
<p>
Now suppose we were to execute:</p>
<pre>
replace into foo values (1,1000,1000);
</pre>
<p>
This does:</p>
<ul>
<li>in the main dictionary, overwrite key &#8216;1&#8217; with value &#8216;10,100&#8217; by key &#8216;1&#8217; with value &#8216;1000,1000&#8217;.</li>
<li>in the secondary dictionary, remove the key &#8216;10&#8217; with value &#8216;1&#8217;.</li>
<li>in the secondary dictionary, insert the key &#8216;1000&#8217; with value &#8216;1&#8217;.</li>
</ul>
<p>
Notice that we cannot perform the second step unless we know the content of the existing row that is being replaced. Learning the content of the existing row requires a lookup in the main dictionary, which incurs a disk seek.</p>
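<p>The dependency on the old row can be seen in a small Python sketch of the two dictionaries. This is only an illustration of the logic described above, using plain in-memory structures; the names <code>main</code>, <code>secondary</code>, and <code>replace_into</code> are made up for the example and do not correspond to any TokuDB or InnoDB function.</p>

```python
main = {1: (10, 100), 2: (20, 200)}   # a maps to (b, c)
secondary = {(10, 1), (20, 2)}        # set of (b, a) pairs


def replace_into(a, b, c):
    """Replace row a, keeping the secondary dictionary in step with main."""
    old = main.get(a)                  # the lookup that costs a disk seek
    if old is not None:
        old_b = old[0]
        secondary.discard((old_b, a))  # cannot be done without knowing old_b
    main[a] = (b, c)
    secondary.add((b, a))


replace_into(1, 1000, 1000)
print(main)
print(sorted(secondary))
```

<p>Without the <code>main.get(a)</code> call there is no way to find the stale entry for b = 10, and the one to one correspondence between the two dictionaries would be broken.</p>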
<p>
So, when executing &#8220;replace into&#8221; or &#8220;insert ignore&#8221; on tables with secondary keys, all engines must still incur a disk seek on the primary dictionary to learn where associated elements are in a secondary index, whereas if no secondary keys exist, then TokuDB&#8217;s fractal trees can avoid this disk seek.</p>
<p>
Even with secondary indexes, fractal tree indexes are preferred. B-trees still incur additional disk seeks on insertions into secondary indexes that fractal trees do not. However, with no secondary indexes, fractal trees can do away with the mandatory disk seek whereas B-trees do not.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2010/07/on-replace-into-insert-ignore-and-secondary-keys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why &#8220;insert &#8230; on duplicate key update&#8221; May Be Slow, by Incurring Disk Seeks</title>
		<link>http://www.tokutek.com/2010/07/why-insert-on-duplicate-key-update-may-be-slow-by-incurring-disk-seeks/</link>
		<comments>http://www.tokutek.com/2010/07/why-insert-on-duplicate-key-update-may-be-slow-by-incurring-disk-seeks/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 19:17:49 +0000</pubDate>
		<dc:creator>zardosht</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[disk seek]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[insert]]></category>
		<category><![CDATA[insert ignore]]></category>
		<category><![CDATA[insert on duplicate key update]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[replace into]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1580</guid>
		<description><![CDATA[In my post on June 18th, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. I previously explained why it would be better <a&#8230;]]></description>
			<content:encoded><![CDATA[<p>
In my post on June 18th, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. I previously explained why it would be better <a href="http://tokutek.com/2010/06/making-replace-into-fast-by-avoiding-disk-seeks/">to use &#8220;replace into&#8221;</a> or <a href="http://tokutek.com/2010/07/making-insert-ignore-fast-by-avoiding-disk-seeks/">to use &#8220;insert ignore&#8221;</a> over normal inserts. In this post, I explain why another alternative to normal inserts, &#8220;insert &#8230; on duplicate key update&#8221;, is no better in MySQL, because the command incurs disk seeks.</p>
<p>
The reason &#8220;insert ignore&#8221; and &#8220;replace into&#8221; can be made fast with <a href="http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/">TokuDB&#8217;s fractal trees</a> is that the semantics of what to do in case a duplicate key is found are simple. In one case, you ignore, and in the other, you overwrite. With specific tombstone messages defined for these simple semantics, we defer the uniqueness check to a more opportune time.</p>
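<p>As a rough illustration of deferring the check, the Python sketch below buffers each write as a message and resolves duplicates only when a row is read. It is a toy model under stated assumptions, not TokuDB&#8217;s implementation: the class <code>MessageBufferedTable</code> and its methods are hypothetical, and the message kinds are simply labeled &#8216;i&#8217; (overwrite) and &#8216;ii&#8217; (ignore if present).</p>

```python
class MessageBufferedTable:
    """Toy model: writes append messages; no lookup happens at write time."""

    def __init__(self):
        self.stored = {}    # rows already resolved, standing in for on-disk data
        self.pending = []   # buffered (kind, key, row) messages, oldest first

    def replace_into(self, key, row):
        self.pending.append(("i", key, row))    # overwrite semantics

    def insert_ignore(self, key, row):
        self.pending.append(("ii", key, row))   # keep-existing semantics

    def flush(self):
        """Resolve all buffered messages in arrival order."""
        for kind, key, row in self.pending:
            if kind == "i" or key not in self.stored:
                self.stored[key] = row
        self.pending.clear()

    def get(self, key):
        self.flush()
        return self.stored.get(key)


table = MessageBufferedTable()
table.insert_ignore(1, "first")
table.insert_ignore(1, "second")   # ignored once resolved
table.replace_into(2, "old")
table.replace_into(2, "new")       # overwrites once resolved
print(table.stored)                # still empty: nothing was checked yet
print(table.get(1), table.get(2))
```

<p>Because each message carries everything needed to decide the outcome later, the write path never has to ask whether the key already exists.</p>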
<p>
The semantics of &#8220;insert &#8230; on duplicate key update&#8221; are not simple:</p>
<ul>
<li>if the primary (or unique) key does not exist, insert the new row</li>
<li>if the primary (or unique) key does exist, perform some update as defined in the SQL statement</li>
</ul>
<p>
The problem is we do not have a way of encoding the SQL update function into a message, the way we are able to encode &#8220;replace into&#8221; as an &#8216;i&#8217; and &#8220;insert ignore&#8221; as an &#8216;ii&#8217;. If we did, we could similarly make &#8220;insert &#8230; on duplicate key update&#8221; fast.</p>
<p>
I am not claiming that this is not theoretically possible, just that the storage engine API in MySQL does not allow for the encoding of updates as messages. Instead, what MySQL does is the following:</p>
<ul>
<li>call handler::write_row to attempt an insertion; if it succeeds, we are done</li>
<li>if handler::write_row returns an error indicating a duplicate key, apply the necessary update to the row outside of the handler</li>
<li>call handler::update_row to apply the update</li>
</ul>
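<p>The sequence above can be mimicked in a few lines of Python. This is a simplified sketch, not the MySQL handler interface: <code>ToyHandler</code>, <code>DuplicateKey</code>, and <code>insert_on_duplicate_key_update</code> are illustrative names, and the <code>lookups</code> counter is an assumption standing in for potential disk seeks.</p>

```python
class DuplicateKey(Exception):
    """Raised by the toy engine when write_row hits an existing key."""


class ToyHandler:
    """Hypothetical stand-in for a storage engine handler; not the MySQL API."""

    def __init__(self):
        self.rows = {}     # primary key to row
        self.lookups = 0   # each lookup models a potential disk seek

    def write_row(self, key, row):
        self.lookups += 1  # the uniqueness check must read the index
        if key in self.rows:
            raise DuplicateKey(key)
        self.rows[key] = row

    def update_row(self, key, new_row):
        self.rows[key] = new_row


def insert_on_duplicate_key_update(handler, key, row, update):
    """Server-side flow: the update function never reaches the engine."""
    try:
        handler.write_row(key, row)
    except DuplicateKey:
        old_row = handler.rows[key]               # existing row seen by the server
        handler.update_row(key, update(old_row))  # engine only sees the result


handler = ToyHandler()
for _ in range(3):
    insert_on_duplicate_key_update(
        handler, 7, {"num_clicks": 1},
        lambda old: {"num_clicks": old["num_clicks"] + 1},
    )
print(handler.rows[7], handler.lookups)
```

<p>Every statement pays for one lookup, and the <code>update</code> function is applied entirely outside the handler, so the engine only ever receives finished rows.</p>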
<p>
The storage engine API does not have any access to the function that applies an update to the existing row. This is why the storage engine has no way of encoding any SQL update function (even some simple ones, such as &#8220;increment column a&#8221;).</p>
<p>
So, in the meantime, to implement these semantics, B-trees and Fractal Tree data structures both:</p>
<ul>
<li>look up the primary (or unique) key to verify existence</li>
<li>take the appropriate action based on whether the primary (or unique) key exists</li>
</ul>
<p>
The first step incurs a disk seek on large data sets with an ad-hoc primary (or unique) key. And that is why it is slow.</p>
<p>
So, the moral of the story is this. In MySQL, &#8220;insert &#8230; on duplicate key update&#8221; is slower than &#8220;replace into&#8221;. The semantics are slightly different in the case where the primary key is found (the former is defined as an update, whereas the latter is defined as a delete followed by an insert). Where either statement fits the application, the simpler semantics of &#8220;replace into&#8221; allow it to be faster than &#8220;insert &#8230; on duplicate key update&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2010/07/why-insert-on-duplicate-key-update-may-be-slow-by-incurring-disk-seeks/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

