<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tokutek &#187; Michael Bender</title>
	<atom:link href="http://www.tokutek.com/author/mbender/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tokutek.com</link>
	<description></description>
	<lastBuildDate>Thu, 02 Feb 2012 15:28:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>It Actually is Easy Being Green</title>
		<link>http://www.tokutek.com/2011/08/it-actually-is-easy-being-green/</link>
		<comments>http://www.tokutek.com/2011/08/it-actually-is-easy-being-green/#comments</comments>
		<pubDate>Thu, 11 Aug 2011 14:54:45 +0000</pubDate>
		<dc:creator>Michael Bender</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[Flash Drives]]></category>
		<category><![CDATA[Fractal Tree™ indexes]]></category>
		<category><![CDATA[Green]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>
		<category><![CDATA[Tokutek]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3094</guid>
		<description><![CDATA[Fractal Tree™ indexes are green. They have the potential to be greener still. Here&#8217;s why: 
Remarkably, <a href="http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf">data centers consume 1-3 percent of all the US electricity</a>. A majority of this power is used to drive servers and storage systems.&#8230;]]></description>
			<content:encoded><![CDATA[<div id="attachment_3098" class="wp-caption aligncenter" style="width: 160px"><a href="http://www.youtube.com/watch?v=hpiIWMWWVco" rel="shadowbox[sbpost-3094];player=swf;width=640;height=385;"><img class="size-full wp-image-3098" title="(Fractal) Tree Frog" src="http://www.tokutek.com/wp-content/uploads/2011/08/frog-150x150.png" alt="" width="150" height="150" /></a><p class="wp-caption-text">(Fractal) Tree Frog</p></div>
<p><em>Fractal Tree™ indexes are green. They have the potential to be greener still. Here&#8217;s why: </em></p>
<p>Remarkably, <a href="http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf">data centers consume 1-3 percent of all the US electricity</a>. A majority of this power is used to drive servers and storage systems. Significant energy savings remain on the table.</p>
<p>Here&#8217;s why Fractal Tree indexing enables more energy-efficient storage: Data centers typically use many small-capacity disks rather than a few large-capacity disks. Why? One reason is to harness more spindles to obtain more I/Os per second. In some high-performance applications, users go so far as to employ techniques such as “<a href="http://en.wikipedia.org/wiki/Disk_drive_performance_characteristics#Short_stroking">short stroking</a>” to get more performance (and less storage) out of drives. But Fractal Tree indexes are so I/O-efficient that they don&#8217;t need as many I/Os.</p>
<p>Consider the power consumption of disks. An enterprise 80 to 160 GB disk runs at something like 4W (idle power), while an enterprise 1-2 TB disk runs at something like 8W (idle power). If you replace many small-capacity disks with a small number of large-capacity disks, you can maintain the same capacity but reduce your storage power consumption per GB by close to an order of magnitude. So Fractal Tree indexes enable energy-efficient hardware when the metric is Watts per GB.</p>
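<p>As a rough check on that arithmetic, here is a minimal sketch that uses only the idle-power and capacity figures quoted above. Treating 1 TB as 1000 GB and the helper names are my own assumptions for illustration.</p>

```python
# Idle power (Watts) quoted above for each class of enterprise disk.
SMALL_IDLE_WATTS = 4.0
LARGE_IDLE_WATTS = 8.0

# Capacity ranges quoted above, in GB (assumes 1 TB = 1000 GB).
SMALL_CAPACITIES_GB = (80, 160)
LARGE_CAPACITIES_GB = (1000, 2000)


def watts_per_gb(idle_watts, capacity_gb):
    """Idle power divided by capacity."""
    return idle_watts / capacity_gb


def improvement(small_gb, large_gb):
    """How many times fewer Watts per GB the large disk draws."""
    small = watts_per_gb(SMALL_IDLE_WATTS, small_gb)
    large = watts_per_gb(LARGE_IDLE_WATTS, large_gb)
    return small / large


for small_gb in SMALL_CAPACITIES_GB:
    for large_gb in LARGE_CAPACITIES_GB:
        factor = improvement(small_gb, large_gb)
        print(f"{small_gb} GB vs {large_gb} GB: {factor:.2f}x fewer Watts per GB")
```

<p>Depending on which ends of the two ranges you compare, the factor lands between roughly 3x and 12.5x, which is where "close to an order of magnitude" comes from.</p>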
<p class="size-thumbnail wp-image-3134 " title="Fractal Tree Frog">For databases, however, joules per DB operation may be a better metric. Fractal Tree indexes are so I/O-efficient that they are terrific when measured in joules per operation.</p>
<p>What about power consumed by servers? A lot of our customers see an increase in server activity due to the increase in throughput. Fractal Tree indexes are so <a href="http://www.tokutek.com/downloads/tokudb-performance-brief.pdf">I/O-efficient</a> that they drive CPUs harder, consequently using more power. But, assuming that a user is trying to keep the same overall target number of inserts/deletes, Fractal Tree indexes are still more efficient in terms of joules per database insert/delete.</p>
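<p>To make the joules-per-operation reasoning concrete, here is a small sketch. The wattage and throughput numbers are hypothetical, chosen only to illustrate the argument; they are not measurements from any customer or benchmark.</p>

```python
def joules_per_operation(watts, operations_per_second):
    """Energy per operation: power (joules/second) divided by throughput."""
    if operations_per_second <= 0:
        raise ValueError("throughput must be positive")
    return watts / operations_per_second


# Hypothetical servers: the second draws more power because its CPUs are
# busier, but it also completes many more inserts per second.
baseline = joules_per_operation(watts=200.0, operations_per_second=1_000)
busier = joules_per_operation(watts=300.0, operations_per_second=10_000)

print(f"baseline server: {baseline:.3f} joules per insert")
print(f"busier server:   {busier:.3f} joules per insert")
```

<p>In this made-up case the busier server draws 50% more power but spends less energy on each insert, because throughput grows faster than power draw.</p>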
<p>Given how important these topics are, Bradley and I recently attended the <a href="http://www.nsf.gov/">National Science Foundation</a> workshop &#8220;<a href="http://gold.cs.pitt.edu/seedm/">Energy-Efficient Data Management</a>&#8221; in Arlington, VA. This was a two-day planning meeting, where researchers from industry and academia convened to discuss open problems in energy-efficient data management. We discussed how to devise and deploy new data-management methods and new data-intensive applications that are more energy efficient.</p>
<p>I spoke about how better data structures have the potential to deliver energy savings. For details, see the slides themselves: &#8220;<a href="http://www.tokutek.com/wp-content/uploads/2011/08/green-indexing.pdf">How Fast Indexing Makes Databases Greener</a>.&#8221;</p>
<p>The main purpose of the talk was to discuss open areas for research. Here are three open problems I covered in my talk. For more details see the slides.</p>
<ul>
<li>Area 1: Develop a massively multithreaded Fractal Tree variant that could run on future-generation machines consisting of thousands of very slow cores.</li>
<li>Area 2: Develop an Energy-Efficient SSD/Rotational Disk Hybrid.</li>
<li>Area 3: The proof is in the pudding.</li>
</ul>
<p>Thanks again to the NSF for supporting Tokutek through SBIR grants on topics like these.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/08/it-actually-is-easy-being-green/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Don’t Thrash: How to Cache your Hash on Flash</title>
		<link>http://www.tokutek.com/2011/07/dont-thrash-how-to-cache-your-hash-on-flash/</link>
		<comments>http://www.tokutek.com/2011/07/dont-thrash-how-to-cache-your-hash-on-flash/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 11:02:21 +0000</pubDate>
		<dc:creator>Michael Bender</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[TokuDB]]></category>

		<guid isPermaLink="false">http://www.tokutek.com/?p=3005</guid>
		<description><![CDATA[Last week I gave a talk entitled “Don’t Thrash: How to Cache your Hash.” The talk took place at the <a href="http://www.dis.uniroma1.it/%7Edemetres/events/ads11/">Workshop on Algorithms and Data Structures (ADS)</a> in a medieval castle turned conference center in <a href="http://www.centrocongressibertinoro.it/index_en.cfm">Bertinoro, Italy</a>.&#8230;]]></description>
			<content:encoded><![CDATA[<div>
<p>Last week I gave a talk entitled “Don’t Thrash: How to Cache your Hash.” The talk took place at the <a href="http://www.dis.uniroma1.it/%7Edemetres/events/ads11/">Workshop on Algorithms and Data Structures (ADS)</a> in a medieval castle turned conference center in <a href="http://www.centrocongressibertinoro.it/index_en.cfm">Bertinoro, Italy</a>. An earlier version of this work (with the same title) appeared at the <a href="http://www.usenix.org/event/hotstorage11/">HotStorage</a> conference in Portland, OR. Tokutek co-founders Bradley, Martin, and I are coauthors on the work, along with students and other faculty at <a href="http://www.cs.sunysb.edu/">Stony Brook University</a>.</p>
<p>The talk title is colorful and doggerel-y. Here’s what the title means. “Cache your hash” refers to Bloom-filter-type data structures. A Bloom filter acts like a negative cache, letting you know that a particular element is <em>not</em> stored in a database. The problem is that Bloom filters don’t scale outside of RAM (they “thrash”). In the paper that was the basis for my talk, we wrote about an alternative data structure that scales beyond main memory as data goes to SSDs.</p>
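<p>For readers who haven’t met one, here is a minimal sketch of a textbook in-memory Bloom filter used as a negative cache. This is the standard structure, not the alternative data structure from the paper, and the class name, sizes, and hashing scheme are assumptions made for illustration.</p>

```python
import hashlib


class BloomFilter:
    """A plain in-memory Bloom filter (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means only "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


database = {"alice": 1, "bob": 2, "carol": 3}
negative_cache = BloomFilter()
for key in database:
    negative_cache.add(key)


def lookup(key):
    """Consult the filter first and skip the lookup when the key is absent."""
    if not negative_cache.might_contain(key):
        return None
    return database.get(key)


print(lookup("alice"))
```

<p>Each insert and query touches a few scattered bit positions. That is cheap in RAM, but once the bit array no longer fits in memory each touch can turn into a random I/O, which is one way to see the thrashing.</p>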
<p>The attendees at HotStorage comprised researchers/practitioners in storage technology for both MySQL and other databases. The attendees of ADS comprised mathematicians and theoretical computer scientists, studying algorithms and data structures. Since storage technologists, mathematicians and computer scientists often speak different languages, it was great to see that this time around they all responded to the material.  The theoretical computer scientists and mathematicians commented on the elegance—but so did the storage practitioners. The storage practitioners praised the real horsepower and the application of SSDs—but so did the algorithmicists.  From my perspective, it was gratifying to get this type of reaction. After all, it’s exactly this type of theoretical/practical conflux that got me excited about starting Tokutek six years ago – applying abstract algorithms to alleviate the all-too-real database bottlenecks our customers face every day.</p>
<p>To learn more about how this proposed alternative data structure supports over half a million insertions/deletions per second and over 500 point queries per second on a commodity flash-based SSD, please <a href="http://www.usenix.org/events/hotstorage11/tech/">click here</a> for the technical paper, slides, and mp3.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/07/dont-thrash-how-to-cache-your-hash-on-flash/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>My-Shhhh!-QL</title>
		<link>http://www.tokutek.com/2011/02/my-shhhh-ql/</link>
		<comments>http://www.tokutek.com/2011/02/my-shhhh-ql/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 20:58:29 +0000</pubDate>
		<dc:creator>Michael Bender</dc:creator>
				<category><![CDATA[TokuView]]></category>
		<category><![CDATA[B-tree]]></category>
		<category><![CDATA[Fractal Trees]]></category>
		<category><![CDATA[indexes]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://tokutek.com/?p=1816</guid>
		<description><![CDATA[I gave a talk entitled &#8220;<a href="http://tokutek.com/?attachment_id=1821">How to Index Massive Data Sets Quickly</a>&#8221; at the <a href="http://www.lift.org/morrellyhomelandsecuritycenter.php">Morrelly Homeland Security Center</a>. The event was hosted at the <a href="http://www.lift.org/">Long Island Forum for Technology (LIFT)</a> and was jointly supported by <a href="http://www.nystar.state.ny.us/">NYSTAR</a>&#8230;]]></description>
			<content:encoded><![CDATA[<p>I gave a talk entitled &#8220;<a href="http://tokutek.com/?attachment_id=1821">How to Index Massive Data Sets Quickly</a>&#8221; at the <a href="http://www.lift.org/morrellyhomelandsecuritycenter.php">Morrelly Homeland Security Center</a>. The event was hosted at the <a href="http://www.lift.org/">Long Island Forum for Technology (LIFT)</a> and was jointly supported by <a href="http://www.nystar.state.ny.us/">NYSTAR</a> and the <a href="http://www.sensorcat.sunysb.edu/">Stony Brook Sensor CAT</a>. The audience was composed primarily of technologists in the area. While many of the people in the audience were not necessarily MySQL users, the problem of working with massive data sets is one that many MySQL users face.</p>
<p>One topic we discussed in detail was the data-ingest problem common to three-letter agencies: sensors generate thousands or millions of data items per second. The sensors could be recording mouse-click data, taking temperature or chemical readings, or measuring network performance. The data must be stored so that analytical queries can be made on live data in real time.</p>
<p>From the talk, we know that people are interested in indexing and querying live data. <a href="http://en.wikipedia.org/wiki/TokuDB#Fractal_Tree_Indexes">Fractal Tree indexes</a>, with insertions that run up to two orders of magnitude faster than B-tree insertions, are an optimal technology for solving the data-ingest problem on large data sets. (Here are more details about <a href="http://tokutek.com/2009/10/performance-brief/">Fractal Tree algorithm performance</a>.)  Through this blog and ongoing discussions, we look forward to hearing more about these applications and welcome comments and feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2011/02/my-shhhh-ql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fractal Trees May Be Useful for Making Energy-Efficient Databases</title>
		<link>http://www.tokutek.com/2009/07/fractal_trees_may_be_useful_for_making_energy_efficient_databases/</link>
		<comments>http://www.tokutek.com/2009/07/fractal_trees_may_be_useful_for_making_energy_efficient_databases/#comments</comments>
		<pubDate>Mon, 06 Jul 2009 18:59:00 +0000</pubDate>
		<dc:creator>Michael Bender</dc:creator>
				<category><![CDATA[TokuView]]></category>

		<guid isPermaLink="false">http://fractal_trees_may_be_useful_for_making_energy_efficient_databases</guid>
		<description><![CDATA[On April 9-10 the National Science Foundation hosted the <a href="http://scipm.cs.vt.edu/">Workshop on the Science of Power Management (SciPM 2009)</a>, where I gave an invited talk. Here I give a brief summary of my talk along with a <a href="http://scipm.cs.vt.edu/Slides/5.MichaelBender.pdf">pointer</a> to&#8230;]]></description>
			<content:encoded><![CDATA[<p>On April 9-10 the National Science Foundation hosted the <a href="http://scipm.cs.vt.edu/">Workshop on the Science of Power Management (SciPM 2009)</a>, where I gave an invited talk. Here I give a brief summary of my talk along with a <a href="http://scipm.cs.vt.edu/Slides/5.MichaelBender.pdf">pointer</a> to the slides.</p>
<p>The talk describes how MySQL with TokuDB can provide a path to more energy-efficient database implementations. It&#8217;s a theoretical talk. That is, rather than presenting results from an existing implementation, it provides food for thought about future possibilities.</p>
<p>Here&#8217;s an executive summary of the talk. </p>
<p>Disks use a substantial fraction of the power consumed in a typical database application. Although different workloads and configurations can give very different values, somewhere around 1/3 to 2/3 of the total energy consumed by the computing unit seems like a good ballpark.</p>
<p>Computation is only one part of the power equation for a data center. However, many other components (such as cooling) consume power roughly proportionally to (or at least correlated with) computation. Thus, my analysis of power consumed by computation can give insight into how other parts of the system are affected.</p>
<p>Typically, B-tree-based databases are configured with many small-capacity disks rather than a small number of large-capacity disks. A 120GB disk consumes roughly half of the power of a 2TB disk, even though it is about 17 times smaller. Using larger-capacity disks has the potential to reduce the power consumed by disks by as much as an order of magnitude in Watts per GB.</p>
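<p>The arithmetic behind that claim can be sketched as follows, using only the figures in the paragraph above. Treating 2TB as 2000GB and measuring power in units of one large disk are assumptions made for illustration.</p>

```python
import math

SMALL_GB = 120
LARGE_GB = 2000                 # assumes 1 TB = 1000 GB
SMALL_POWER_VS_LARGE = 0.5      # a small disk draws roughly half the power


def small_disks_needed(large_gb, small_gb):
    """Number of small disks required to match one large disk's capacity."""
    return math.ceil(large_gb / small_gb)


def relative_power(large_gb, small_gb, power_ratio):
    """Power of the small-disk array, in units of one large disk's power."""
    return small_disks_needed(large_gb, small_gb) * power_ratio


disks = small_disks_needed(LARGE_GB, SMALL_GB)
power = relative_power(LARGE_GB, SMALL_GB, SMALL_POWER_VS_LARGE)
print(f"{disks} small disks draw about {power:.1f}x the power of one large disk")
```

<p>Matching one 2TB disk takes 17 of the 120GB disks, which together draw about 8.5 times the power for the same capacity.</p>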
<p>So why not use large-capacity disks? Well, B-tree-based storage engines scale with seek time. Increasing the number of spindles drives up the number of random seeks per second in the system.</p>
<p>Fractal-tree-based storage engines (such as TokuDB) scale with bandwidth rather than with disk seeks. (Bandwidth scales as the square root of disk capacity.) Consequently, a well-balanced system based on fractal trees can use fewer larger-capacity disks.</p>
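<p>Here is a back-of-the-envelope sketch of why the two scaling behaviors lead to different disk counts. Every number in it (seeks per second, bandwidth, row size, write amplification, target rate) is a hypothetical placeholder, and the model deliberately ignores caching and many other effects; it is not a description of how TokuDB is implemented.</p>

```python
import math

# Hypothetical per-disk characteristics, for illustration only.
SEEKS_PER_SECOND = 100                 # random I/Os one spindle can do
BANDWIDTH_BYTES_PER_SECOND = 100_000_000
ROW_BYTES = 100
WRITE_AMPLIFICATION = 10               # assumed times each row is rewritten


def seek_bound_inserts_per_disk():
    """Assume each insert costs about one random I/O."""
    return SEEKS_PER_SECOND


def bandwidth_bound_inserts_per_disk():
    """Assume inserts are limited by sequential bytes written per second."""
    return BANDWIDTH_BYTES_PER_SECOND // (ROW_BYTES * WRITE_AMPLIFICATION)


def disks_needed(target_inserts_per_second, inserts_per_disk):
    return math.ceil(target_inserts_per_second / inserts_per_disk)


TARGET = 20_000  # hypothetical required inserts per second
seek_bound = disks_needed(TARGET, seek_bound_inserts_per_disk())
bandwidth_bound = disks_needed(TARGET, bandwidth_bound_inserts_per_disk())
print(f"seek-bound engine:      {seek_bound} disks")
print(f"bandwidth-bound engine: {bandwidth_bound} disk(s)")
```

<p>Under these assumptions, the seek-bound engine needs spindles purely to supply seeks, while the bandwidth-bound engine can size its disks by capacity instead.</p>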
<p>Cutting the power consumption of disks (there are a couple of ways to measure power consumption, as explored in the talk itself) can have a big impact on the overall power consumption of the system.</p>
<p>I&#8217;ve left out most of the details. That&#8217;s what the <a href="http://scipm.cs.vt.edu/Slides/5.MichaelBender.pdf">talk</a> is for.</p>
<p>Comments welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tokutek.com/2009/07/fractal_trees_may_be_useful_for_making_energy_efficient_databases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

