Via a co-worker, here’s a nice overview of the technologies that Google created to deal with the mounds of data it houses:

Google Scalability Conference Trip Report: MapReduce, BigTable, and Other Distributed System Abstractions for Handling Large Datasets