Long Index Keys

Posted On June 1, 2009 | By Bradley C. Kuszmaul | 12 comments

In this post we’ll describe a query that accrued significant performance advantages from using a relatively long index key. (This posting is by Zardosht and Bradley.)

We ran across this query recently when interacting with a customer (who gave us permission to post this sanitized version of the story):

SELECT name,
       Count(e2) AS CountOfe2
 FROM (SELECT distinct name, e2
        FROM (SELECT atable.NAME AS name,
                     pd1.NAME AS e2
               FROM atable INNER JOIN atable AS pd1
               ON  (atable.id = pd1.id)
               AND (atable.off = pd1.off)
               AND (atable.len = pd1.len)) ent
 WHERE ((ent.name<>ent.e2))) outside
 GROUP BY outside.name order by CountOfe2 desc;

With a table defined as

CREATE TABLE `atable` (
   `id` varchar(25) DEFAULT NULL,
   `off` bigint(20) DEFAULT NULL,
   `len` bigint(20) DEFAULT NULL,
   `name` varchar(1000) DEFAULT NULL,
   `x` bigint(20) DEFAULT NULL,
   `y` bigint(20) DEFAULT NULL,
   `pid` varchar(25) DEFAULT NULL,
   `typ` varchar(25) DEFAULT NULL,
   KEY `nameIDX` (`name`),
   KEY `typIDX` (`typ`),
   KEY `idIDX` (`id`),
   KEY `lenIDX` (`len`),
   KEY `offIDX` (`off`),
   KEY `pidIDX` (`pid`)
  ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
  SET character_set_client = @saved_cs_client;

The actual table has about 30,000 rows, so it fits in main memory.

We concluded that with an index better suited for this query, we can speed up this query using either MyISAM and TokuDB’s storage engine for MySQL. However, TokuDB’s indexing flexibility provides about a 3x performance advantage over MyISAM.

All experiments were run on a Core 2 duo 2.4Ghz machine with 3GB RAM running Linux.

Here is what we did.

First, we ran the query as-is, to see the difference between TokuDB and MyISAM:

  • MyISAM ran the query in 98s.
  • TokuDB ran the query in 23.7s.

Next, we added an index to atable to make the inner join run faster. For TokuDB, we added a covering index comprising (id, off, len, name). For MyISAM, we added an index comprising (id, off, len), which covers the predicate, but not the whole query. We could not add name to MyISAM’s index, because MyISAM’s key size limit would not allow it. The customer had defined the length of the name field, so shortening it didn’t seem feasible. The new query times were:

  • MyISAM: 17.149s
  • TokuDB: 17.4s

which helped MyISAM a lot, and helped TokuDB a little.

To get more speedup, we observed that this query generates two temporary tables for the following parts:

  • select distinct name, e2 ...
  • select name, count(e2) as CountOfe2 ...

These temporary tables do not have a good index of (name, e2). If the temporary tables had this index, we could speed up the query even further. To do this, we transformed the query into:

create temporary table tmp_prox (name varchar(1000), e2 varchar(1000), key (name, e2))engine=tokudb;
create temporary table tmp_prox2 (name varchar(1000), e2 varchar(1000), key (name, e2))engine=tokudb;
insert into tmp_prox ( SELECT atable.NAME AS name,
                              pd1.NAME AS e2
                         FROM atable INNER JOIN atable AS pd1
                         ON  (atable.id = pd1.id)
                         AND (atable.off = pd1.off)
                         AND (atable.len = pd1.len) );
insert into tmp_prox2 ( SELECT distinct name, e2 from tmp_prox where ((name<>e2)) );
SELECT name, Count(e2) AS CountOfe2 FROM tmp_prox2 GROUP BY name order by CountOfe2 desc;

We could not do this with MyISAM, because the index (name, e2) exceeded MyISAM’s maximum index size. Running these statements with TokuDB took 4.9 seconds.

Thus, a covering index speeds up MyISAM by a factor of 5.7, and TokuDB’s indexing flexibility provided an additional 3x speedup. This transformation isn’t really specific to TokuDB. TokuDB really differentiates itself from other storage engines when maintaining indexes that don’t fit in main memory. This table fits in main memory, so transformation would probably work on some other storage engines that accept bigger index key lengths than does MyISAM. When the table does get bigger, TokuDB’s Fractal Tree® indexes can create and maintain those various indexes on cheap hardware even at high data arrival rates.

12 thoughts

  1. Hi!

    well, I don’t want to argue against the merits of either TokuDB or long covering indexes, but it seems to me the strategy for optimization is wrong here.

    I personally think it should have started by schema optimization and query rewriting. Here’s my reasoning:

    1) The table `atable` does not have a PRIMARY KEY. Almost by definition, this is a flaw that must be fixed first

    2) The original query can be rewritten to one that does not have any subqueries. This will lose both temporary tables in the execution plan. In addition, it reduces multiple passes over the data to only one (required for the GROUP BY).

    Here’s how:

    step 1: WHERE ((ent.name<>ent.e2))
    Despite the superfluous parenthesis, the WHERE can be removed al together and moved into the JOIN condition of inmost subquery. The inmost subquery can thus be rewritten to:

    (
    SELECT atable.NAME AS name
    , pd1.NAME AS e2
    FROM atable
    INNER JOIN atable AS pd1
    ON atable.id = pd1.id
    AND atable.off = pd1.off
    AND atable.len = pd1.len
    AND atable.name != pd1.name
    )

    (btw – the structure of the join tempts me to assume that the PRIMARY KEY of atable should have been {id, off, len, name})

    step 2: the DISTINCT in the outmost subquery creates unique pairs of names, and the outer query uses GROUP BY to count the number of 2nd names for each individual first name. This is a duplication of effort. The DISTINCT can be moved to inside the COUNT. This leaves the outmost subquery as plain selecting all rows from the inmost subquery, which means we can remove it. The final step is to apply the GROUP BY directly to the subquery, which means we can remove that too.

    This is the final result:

    SELECT name,
    Count(DISTINCT e2) AS CountOfe2
    FROM atable
    INNER JOIN atable AS pd1
    ON atable.id = pd1.id
    AND atable.off = pd1.off
    AND atable.len = pd1.len
    AND atable.name != pd1.name
    GROUP BY outside.name
    ORDER BY CountOfe2 desc;

  2. Arjen Lentz says:

    Might be possible to rewrite the query, it’s not that pretty regardless of the engine.

    Secondly, a composite index on the joined columns should do well for the MyISAM engine also. If you want to optimise and compare, do it on “both sides” of what you’re comparing. Otherwise you can’t tell anything from it…

  3. Björn says:

    Hm, that should be equal to:

    SELECT
    atable.name,
    COUNT(DISTINCT pd1.name) CountOfe2
    FROM
    atable
    INNER JOIN
    atable pd1 USING (id, off, len)
    WHERE
    atable.name <> pd1.name
    GROUP BY
    atable.name
    ORDER BY
    CountOfe2 DESC;

    I wonder, how does that compare to your solution for the customer’s dataset?

    With an index on (id, len, off) and some (probably bad) artificial data (22k rows), I get:

    The original query: 17404 rows in set (34.01 sec)
    The simple version: 17404 rows in set (1.40 sec)

    So it’s faster by a factor of 24.3 for me.

    Using a bigger dataset (32k rows) I get:
    The original query: 27460 rows in set (1 min 23.77 sec)
    The simple query: 27460 rows in set (2.62 sec)

    So we got a factor of about 32 now. That the gains increase with a growing dataset is probably also due to the original query getting pretty huge temporary tables (the test with 32k rows used almost 3GB worth of temp. tables here), while the simple query is pretty modest using well below 30MB for its temporary table (And I’m on a slow RAID-1). At least for the random dataset I experimented with.

    Interestingly, with my dataset, the original query doesn’t see any performance improvements at all when adding the the index on (id, off, len) compared to the table created by the DDL statement you gave. So I guess my testing dataset is pretty different from the actual data you’re working with (who would have guessed ;-)).

    Any chance to get some real number for the simple query? Would be interesting to see how that compares to your solution.

    Of course, this is all assuming that the query you’ve shown is close enough to the real thing to make the simple version applicable.

  4. PaulM says:

    Got to love blogs and comments eh. You have solid comments. Please retry if you have time and update the post.
    The other issue with your rewrite, is you have split 1 SQL into 5 pieces of SQL… what happens if the data changes.

    The original SQL with its nested scalar queries looks to have written by someone with a background in a database which has a optimizer which can handle this better.

    When I see this type of join the table on itself stuff it is screaming poor design. You shouldn’t need to de-dupe atable.name in this manner.

    Can I guess, this is reporting unique names and counts from a coordinate/tree/matrix table?

    Have Fun

  5. Steven says:

    Swapping out MySql engines is a powerful concept — but horribly misapplied when being used to mask bad design and/or lack of SQL knowledge. I can think of several instances where I’ve wanted to index a 1k key — but none that I could justify.

    The design problems start in the base table, but sometimes those issues are harder to fix in the short term. Here a quick fix is to think about what the query actually has to do in order to produce the required result set. Most importantly we don’t need to join each and every combination – only the unique combinations. The following query works well with or without an additional index. Slightly better with.

    SELECT a.name,
    Count(pd1.name) AS CountOfe2
    FROM
    atable a
    inner join (select
    name,id,offset,len
    from
    atable
    group by
    name,id,offset,len
    ) pd1
    ON (a.id = pd1.id)
    AND (a.offset = pd1.offset)
    AND (a.len = pd1.len)
    WHERE ((a.name<>pd1.name))
    group by a.name

    NOTES:
    1) I had to fabricate a dataset, different datasets produce different times. I had to come up with a moderately poor dataset since several sets caused MySql to take in excess of 5 minutes.
    2) The original query took 174 seconds on my windows laptop (dual core 2ghz w/2gb memory) running mysql 5.1.
    3) The modified query took .984 seconds without a composite index and .928 seconds with it.
    4) SQL Server managed 34 seconds for the original query without a composite index and less than a second for all other combinations.
    5) Both the original query and the modified query return the same resultset.

  6. Hey guys, are you still hoarding my comment? When I posted it I was told it was awaiting moderation because it was labelled as spam.

  7. Hey Roland, sorry for the quarantine – you’re comment is much appreciated and most definitely *not* SPAM. Not sure why it got flagged, but it’s up there now as the first comment.

    Many thanks to everyone else for the great comments! We’re running tests on the actual data with the various suggestions and we’ll post the results soon.

  8. I tried the suggested query rewrites. For each, I tried three experiments: a non-covering key of (id, off, len) for MyISAM, the same non-covering key for TokuDB, and a covering key of (id, off, len, name) for TokuDB. The results are summarized in the following table:

    Original Bjorn/Roland Steven
    MyISAM non-covering 17.149s 3.38s 9.02s
    TokuDB non-covering 3.63s 9.16s
    TokuDB covering 4.9s 1.36s 7.02s

    A couple of notes: (1) Bjorn’s and Roland’s suggestions are almost identical, and the gave the same times. (2) Steven reports that he gets the same running time with MyISAM both with and without the non-covering index defined about. My results are very different: MyISAM with no index takes 84s. I don’t know how to explain the discrepancy.

    Conclusion: (1) Query optimization makes a big difference in the run time. (2) Covering indexes also can make a big difference.

  9. Ok, so the table got all screwy. Here’s the data in a more readable form:

    O is for the original posting.
    BR is for Bjorn/Roland.
    S is for Steven.

    Original Bjorn/Roland Steven
    MyISAM non-covering
    O: 17.149s BR: 3.38s S: 9.02s

    TokuDB non-covering
    BR: 3.63s S: 9.16s

    TokuDB covering
    O: 4.9s BR: 1.36s S: 7.02s

    Not the easiest to read, and if I figure out to post tables I will.

  10. I agree with Steven that having a 1K “key” is somewhat dubious.  But that’s the way covering indexes are implemented in MySQL.  

    A richer syntax, in which some fields would be part of the key definition in a secondary index (in this case only (id, off, len) are needed to define the sort order) and other fields (name, in this case) are used to cover the query, would be cleaner, and you’d end up with a smaller key.  

    There are other databases that allow for such secondary index definitions.  I discussed this somewhat in a blog posting on clustering indexes.  

    So I think that in the MySQL case, long keys are the price one pays for covering indexes.

  11. Zardosht,

    thank you so much for the update. I’m glad to see the query suggestions did a lot to improve performance.

    I am still wondering about the primary key for the atable table. In particular, I am wondering whether my assumptions that is was {id, off, len, name} is correct. (I mean conceptual pk – I didn’t realize at first but you guys were correct to point out that name can’t fit into a MyISAM index)

  12. Roland,

    This is a problem a customer presented us. I do not know other scenarios under which this table is used, so we cannot comment on what the primary key should be.

    Regardless, I agree with your point that tables should have primary keys defined

Leave a Reply

Your email address will not be published. Required fields are marked *