A Second Rule of Thumb for Choosing Column Order in Indexes
A couple of weeks ago, Baron Schwartz wrote an interesting post describing a rule of thumb he sometimes uses to choose the order of columns in an index. In a nutshell, he recommends putting highly selective columns first. This is a very good rule of thumb.
I would like to add another rule of thumb: columns that are used for equality comparisons are better than columns that are used for range queries.
If one has a query “select count(*) from foo where a BETWEEN 10 and 20 and b=5″, an index of (b,a) will always be as good, and likely better, than an index of (a,b), regardless of the selectivity of “a” or “b”.
This is because of how MySQL performs range queries. For an index of (a,b), a range query will be done from (10,5) to (20,5). This will require many rows to be processed where 10 < a < 20, but b != 5. On the other hand, for an index of (b,a), a range query will be done from (5,10) to (5,20). In this case, only rows that are going to be counted are processed. No rows that do not fall in the bounds of the query are processed.
There’s a general rule of thumb that more selective fields should go first. And intuitively, an equality comparison is likely to yield a more selective result. But my point is that it doesn’t matter in this case which is more selective. It is always better to put an equality field before a range field, or at the very least, no worse.