BlinkDB is a database system that runs on top of Hadoop (MapReduce): it accepts SQL queries and translates them into MapReduce jobs. The key idea is that rather than running a query over the entire data set, it runs the query on a precomputed random sample of the data and uses sampling theory to estimate the true query answer.
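
To make the sampling idea concrete, here is a minimal Python sketch of answering an AVG or SUM query from a sample instead of the full data. It is not BlinkDB's implementation: the function names `estimate_mean` and `estimate_sum` are hypothetical, and it assumes a simple random sample, a normal approximation for the error bound, and a sample much smaller than the population (no finite-population correction).

```python
import math
import random
import statistics


def estimate_mean(sample, z=1.96):
    """Estimate a population mean from a simple random sample.

    Returns (estimate, margin), where estimate +/- margin is an approximate
    confidence interval (z=1.96 corresponds to roughly 95% under a normal
    approximation). Hypothetical helper, not part of any BlinkDB API.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two sampled rows")
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    stderr = statistics.stdev(sample) / math.sqrt(n)
    return mean, z * stderr


def estimate_sum(sample, population_size, z=1.96):
    """Scale the mean estimate up to a SUM over population_size rows."""
    mean, margin = estimate_mean(sample, z)
    return population_size * mean, population_size * margin


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    population = [rng.uniform(0, 100) for _ in range(100_000)]
    # Stand-in for a sample that was drawn and stored ahead of time.
    sample = rng.sample(population, 1_000)

    est, margin = estimate_mean(sample)
    print(f"estimated AVG = {est:.2f} +/- {margin:.2f}")
    print(f"exact AVG     = {statistics.fmean(population):.2f}")
```

The query touches 1,000 rows rather than 100,000, and the margin shrinks in proportion to the square root of the sample size, which is the trade-off between speed and accuracy that a sampling-based system exposes.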

More Information:

Benjamin Letham, Katherine A. Heller