CMU-CS-07-120
Computer Science Department
School of Computer Science, Carnegie Mellon University



Just-In-Time Indexing for Interactive Data Exploration

Phillip B. Gibbons*, Lily Mummert*, Rahul Sukthankar*
M. Satyanarayanan, Larry Huston**

April 2007

CMU-CS-07-120.pdf


Keywords: Indexed search, interactive search, discard-based search, Diamond, searchlets, filters, non-indexed data

Interactive search of complex data poses significant challenges for traditional indexing methods because it is infeasible to determine an effective set of indices a priori. This paper proposes just-in-time indexing, a new strategy that mitigates these challenges by exploiting a key characteristic of interactive data exploration: iterative query refinement. During the refinement process, just-in-time indexing takes advantage of user think time to create indices on the fly for query terms likely to be relevant to the current user. Moreover, because a user typically refines a query after observing only a subset of the results, just-in-time indexing indexes only subsets of the data at a time. We present strategies for selecting which query terms to index at any point in time, balancing the needs of the current user (immediate workload) against the projected needs of future users (long-term workload). We have implemented just-in-time indexing in the Diamond architecture and validated its effectiveness for exploring image databases.
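
As a rough illustration of the idea summarized in the abstract, the following Python sketch caches per-term filter outcomes for a subset of objects during idle "think time" and reuses them when the query is refined. It is not the Diamond implementation: the class `JustInTimeIndex`, the function `rank_terms`, the `budget` parameter, and the weighting factor `alpha` are all hypothetical names chosen for this example, and the linear blend of immediate and long-term scores is only one assumed way to balance the two workloads.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical filter type: maps an object id to pass/fail for one query term.
Filter = Callable[[int], bool]


class JustInTimeIndex:
    """Caches per-term filter outcomes for the objects examined so far."""

    def __init__(self, filters: Dict[str, Filter]) -> None:
        self.filters = filters
        self.cache: Dict[Tuple[str, int], bool] = {}
        self.evaluations = 0  # number of (assumed expensive) filter runs

    def _evaluate(self, term: str, obj: int) -> bool:
        # Run the filter only if no cached outcome exists for this pair.
        key = (term, obj)
        if key not in self.cache:
            self.evaluations += 1
            self.cache[key] = self.filters[term](obj)
        return self.cache[key]

    def index_during_think_time(
        self, terms: Iterable[str], objects: Iterable[int], budget: int
    ) -> int:
        """Precompute at most `budget` outcomes; return how many were computed."""
        objects = list(objects)
        done = 0
        for term in terms:  # terms are assumed to arrive in priority order
            for obj in objects:
                if done >= budget:
                    return done
                if (term, obj) not in self.cache:
                    self._evaluate(term, obj)
                    done += 1
        return done

    def search(self, terms: Iterable[str], objects: Iterable[int]) -> List[int]:
        """Return objects passing every term, reusing cached outcomes."""
        terms = list(terms)
        return [o for o in objects if all(self._evaluate(t, o) for t in terms)]


def rank_terms(
    current: Dict[str, float], long_term: Dict[str, float], alpha: float
) -> List[str]:
    """Order terms by an assumed blend of immediate and long-term usefulness."""
    def score(term: str) -> float:
        return alpha * current.get(term, 0.0) + (1 - alpha) * long_term.get(term, 0.0)

    return sorted(set(current) | set(long_term), key=lambda t: (-score(t), t))
```

In this sketch, a caller would rank candidate terms with `rank_terms`, pass the ranked list and the currently visible subset of objects to `index_during_think_time` while the user inspects results, and then call `search` for the refined query. Only pairs missed by the precomputation trigger new filter runs.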

25 pages

*Intel Research Pittsburgh
**Arbor Networks

