Big Data Backlash

Jun 5, 2013, 4:01 PM, Posted by

Brian Quinn, assistant vice president, Research and Evaluation

The big data hype cycle is playing out in predictable ways. Perhaps it’s inevitable that, after all the talk about how big data is going to save the world, we’re starting to see a similar rash of stories about how the promise of big data has been oversold. Microsoft Research’s Kate Crawford has been particularly outspoken of late, with Quentin Hardy recounting her “six myths of big data” in The New York Times last weekend and Kate’s own Foreign Policy piece in May, which pointed out that big data puts our privacy at risk and is susceptible to bias, misunderstanding, limitations and discriminatory outcomes.

I’m all for a little healthy skepticism. In fact, Pioneer seeks out those who are asking questions that others are not. But the potential of big data to take on some of health and health care’s most intractable problems is something we’re excited about here at RWJF. Too many Americans are unhealthy, our health care system isn’t working, and I’m confident that effective analysis and use of big data have at least a small role to play in turning things around. I don’t want this backlash to stifle explorations into what that role could be.

Big data’s detractors leave me wondering, “What’s the alternative?”

No data? It’s pretty obvious that’s a bad idea. From baseball to politics, we see that decision makers who use data to guide their actions outperform those who don’t.

So, assuming that data is better than no data, are we really meant to continue to rely on the sources of data we’ve counted on for years? In health and health care, that would mean continuing to rely on survey and administrative data, which are replete with their own shortcomings. Survey data is suffering from declining response rates and questions about nonresponse bias. And as an example of just how inadequate administrative data can be, a recent study from researchers at Columbia found that death certificates—a significant source of epidemiologic data—can be incorrect as much as one-third of the time.

A few of my RWJF colleagues were at Health Datapalooza in D.C. this week, engaging in a dialogue about how government can “liberate” and take better advantage of health data. Similarly, my colleagues on the Pioneer team have begun a new initiative—the Health Data Exploration Project—designed in part to begin to sort through the methodological issues surrounding the analysis of big data from health devices. These are the conversations I want to have: those that focus on how we maximize the effectiveness of all the data we have at our disposal.


This commentary originally appeared on the RWJF Pioneering Ideas blog.