Indexing Big Data – NSF Workshop on Research Directions in the Principles of Parallel Computation
I attended the NSF Workshop on Research Directions in the Principles of Parallel Computation in Pittsburgh on June 27, 2012. The workshop brought together researchers from academia and industry to explore visions for the future of parallel computing. I was one of 17 invited speakers. We were asked to address the question: “What are three big research challenges in the principles of parallel computing?”
My talk made the point that parallel computing is about high performance, and to get high performance, we need high-performance I/O. I explained how write-optimized techniques, such as Fractal-Tree indexes, help maintain microdata (i.e., metadata in a file system or small rows in a database). John Esmet’s talk at HotStorage ’12 provides more background.
The organizers (Guy Blelloch at CMU and Phil Gibbons at Intel Labs) did a great job organizing and coordinating the workshop. Imagine how difficult it must be to keep the schedule from slipping when there are 17 ten-minute talks in a single day. Guy and Phil handled this by using a second projector to display the time remaining. They instructed the audience to applaud when the timer reached zero. Speakers dropped like flies. I ended my talk on time, but with less than two seconds to spare.
Here’s a link to my presentation.