The Parable of Google Flu: Traps in Big Data Analysis

D. Lazer, R. Kennedy, G. King, A. Vespignani
Science Vol. 343
Issue 6176, pp. 1203-1205 (2014)
March 14, 2014

Abstract

Large errors in flu prediction were largely avoidable, which offers lessons for the use of big data. In February 2013, Google Flu Trends (GFT) made headlines but not for a reason that Google executives or the creators of the flu tracking system would have hoped. Nature reported that GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) than the Centers for Disease Control and Prevention (CDC), which bases its estimates on surveillance reports from laboratories across the United States (1, 2). This happened despite the fact that GFT was built to predict CDC reports. Given that GFT is often held up as an exemplary use of big data (3, 4), what lessons can we draw from this error?
