eCommerce Personalization


Issues addressed:

  • Providing strong performance for both planned and ad hoc analysis of live production data to identify business opportunities.
  • Delivering a best-in-class user experience with accurate, up-to-the-minute query results.
  • Enabling the creation of additional services.
  • Performing efficient, cost-effective clickstream analysis on live data.

High-performance indexing speeds the creation of new revenue opportunities

The Company: KAYAK, the world’s largest travel search website, is in the business of helping people find and compare the best travel choices quickly and easily. Founded in 2004, the company provides powerful flight, hotel, car rental and vacation search results, as well as travel search applications for mobile devices.

The Challenge: KAYAK’s success is built on its ability to monitor vast amounts of data from multiple external data sources and to quickly present relevant, actionable search results to its clients. “Speed and accuracy are core to the service KAYAK offers the customer. We need to respond quickly to consumer requests for flights, accommodations, and travel packages,” said Paul Schwenk, SVP of Engineering at KAYAK.

Like most website data management systems, KAYAK’s faces four challenges:

  • Knowing what portion of visitor data should be stored
  • Handling the sheer volume of information
  • Querying the information quickly enough to personalize a visitor’s live experience
  • Enabling analysts to run ad hoc queries for deep data mining

The Solution: KAYAK uses TokuDB to improve the relevance and speed of its travel offers and to run ad hoc queries from internal analysts, who use the results to keep improving KAYAK’s user experience.

Visitor activity is stored in TokuDB, which supports two kinds of queries: operational queries that drive website functionality (e.g., the “Deals” tab) and ad hoc analytic queries (e.g., which flights are picked most often on winter trips to Dubai) that uncover new patterns that might evolve into future website features.
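The two query types can be sketched in a few lines of Python. This is an illustration only, not KAYAK's schema or code: the table flight_picks, its columns, the sample rows, and both function names are hypothetical; an in-memory SQLite database stands in for MySQL/TokuDB so the sketch runs anywhere; and "winter" is assumed to mean December through February.

```python
import sqlite3

# Hypothetical schema: the case study does not describe KAYAK's tables.
# In-memory SQLite stands in for MySQL/TokuDB so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flight_picks (
        visitor_id  INTEGER,
        flight_no   TEXT,
        destination TEXT,
        travel_date TEXT   -- ISO date, e.g. '2010-01-15'
    )
    """
)
# Index on the column the ad hoc query filters by first.
conn.execute(
    "CREATE INDEX idx_dest_date ON flight_picks (destination, travel_date)"
)

# Made-up sample rows for illustration.
rows = [
    (1, "EK202", "DXB", "2010-01-15"),
    (2, "EK202", "DXB", "2010-02-03"),
    (3, "BA107", "DXB", "2010-12-20"),
    (4, "EK202", "DXB", "2010-07-04"),  # summer trip, excluded below
    (5, "AA100", "LHR", "2010-01-09"),  # different destination
]
conn.executemany("INSERT INTO flight_picks VALUES (?, ?, ?, ?)", rows)


def top_winter_flights(conn, destination, limit=3):
    """Ad hoc analytic query: most-picked flights to a destination in winter.

    Winter is assumed here to be December, January, and February.
    """
    return conn.execute(
        """
        SELECT flight_no, COUNT(*) AS picks
        FROM flight_picks
        WHERE destination = ?
          AND strftime('%m', travel_date) IN ('12', '01', '02')
        GROUP BY flight_no
        ORDER BY picks DESC, flight_no
        LIMIT ?
        """,
        (destination, limit),
    ).fetchall()


def recent_picks(conn, visitor_id):
    """Operational query: one visitor's picks, as a live page might need."""
    return conn.execute(
        "SELECT flight_no FROM flight_picks "
        "WHERE visitor_id = ? ORDER BY travel_date",
        (visitor_id,),
    ).fetchall()
```

With the sample rows above, top_winter_flights(conn, "DXB") returns [("EK202", 2), ("BA107", 1)]. The analytic query aggregates across all visitors, while the operational query touches a single visitor's rows; both run against the same live table.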

KAYAK was already using MySQL, so installing TokuDB was straightforward and required no “rip and replace” migration to a new RDBMS.

The Benefits: “Tokutek’s high-performance indexing allows us to execute our workhorse queries across much larger data and thereby provide a better user experience,” said Schwenk. “In addition, TokuDB’s compatibility with MySQL made integration easy.”

Scaling: TokuDB has been deployed at KAYAK since April 2009, and a critical factor in choosing it was how it would handle future capacity. TokuDB’s performance does not degrade when the database’s size exceeds main memory. It is designed for stable, predictable performance even with the largest databases, and Tokutek’s testing verifies this behavior for databases of over 10 billion rows.

Compression: In addition to fast insertion rates, TokuDB provides data compression, so handling those higher data rates doesn’t exhaust disk capacity. With TokuDB, KAYAK saw a 4x-5x compression advantage over InnoDB. “Disk compression allows me to run our systems on fewer disks at a lower operational cost today and it means we can trim our projected disk budget looking into the future,” said Schwenk.
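To make the 4x-5x figure concrete, here is a small arithmetic sketch. It reads the ratio as "the TokuDB footprint is one quarter to one fifth of the InnoDB footprint". The 10 TB starting size is a made-up example, not a KAYAK number, and the function name is hypothetical.

```python
def compressed_size_tb(innodb_size_tb, ratio):
    """Estimate the on-disk size after compression.

    innodb_size_tb: footprint of the same data in InnoDB, in terabytes
                    (hypothetical input, not a figure from the case study).
    ratio:          compression advantage, e.g. 4 for "4x".
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return innodb_size_tb / ratio


# Hypothetical 10 TB InnoDB footprint at the two ends of the quoted range.
low_end = compressed_size_tb(10, 4)   # 2.5 TB at 4x
high_end = compressed_size_tb(10, 5)  # 2.0 TB at 5x
```

Under these assumptions, data that needs 10 TB of disk in InnoDB would need roughly 2.0 to 2.5 TB in TokuDB, which is the kind of saving behind the "fewer disks" remark.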

No Aging: TokuDB’s immunity to database aging reduces operational costs further. Conventional databases eventually require dump/reloads to remove the fragmentation that accumulates over time and severely reduces query performance. TokuDB’s proprietary Fractal Tree indexing technology eliminates the need for database “defragmentation”. “Like all websites we run 24×7. What’s great about TokuDB is it requires no dump and reload downtime. It just works.”

“TokuDB’s advantages over other MySQL storage engines and even non-MySQL solutions mean I can focus on how KAYAK can deliver the best possible customer experience and know that my database infrastructure will be up to the task,” added Schwenk.