Big Data and Predictive Analytics in Program Evaluation
March 10, 2015 | Chris Scharenbroch, Senior Research Associate
At NCCD, we work with clients who want insights into their programs and practices. Some come to us with specific questions they want to answer, and some bring a simple desire to learn more about their work. Traditional evaluations often take a directed approach to answer a specific question: does this program have the desired effect, yes or no? But what happens when the question is not clear, or when there is no evaluation plan in place? At NCCD, we’ve learned that we can gain valuable insights into program operation and impact by combining “big data” (loosely defined as a collection of large, interrelated datasets) with the emerging field of predictive analytics.
Predictive analytics is about exploration. We go into the data-mining process without a preconceived expectation or hypothesis to investigate, aiming to discover what data are available in the system and what relationships or patterns may exist. We examine a multitude of variables and many different units of analysis. This exploration often leads to the discovery of unforeseen relationships that can help agencies fine-tune a program or better target the population they want to serve.
This type of analysis has real-world value and is part of NCCD’s everyday work. We use predictive analytics to help our clients understand the population they serve. This puts agencies in a better position to implement or refine programs, expand or scale back services, and develop adequate reporting and outcome requirements. Agencies need actionable insights that often fall outside the scope of traditional evaluations, which are usually limited to a specific study question. By contrast, big data and predictive analytics draw on all the information available.
So, rather than evaluating a specific program, we can use predictive analytics to better understand a specific intervention opportunity. For example, an agency might wonder: can we use data from child welfare and juvenile probation to identify which youth currently involved in child welfare might later become involved in juvenile justice? The answer is yes, we can. It requires some initial work to establish data sharing, data integration, and data cleaning. But once a file is ready for analysis, the predictive modeling can proceed. Using a set of classification algorithms, we can identify many combinations of information that together distinguish children in the child welfare system who are more likely to later become involved in the juvenile justice system from those who are less likely. Once we have those results, we can check them with other statistical approaches, test them with other data, and ensure that they simply make sense.
Of course, the results of the predictive analytics are just half the story. The other half is how to make meaningful programmatic decisions with the information they provide. The question then becomes: now that we can say which children are most likely to later become involved in juvenile justice, what can we do to prevent that?
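To make the child welfare example a little more concrete, here is a minimal Python sketch of the two steps described above: linking records across systems, then looking for combinations of information associated with later juvenile justice involvement. Everything in it is an assumption made for illustration: the records are synthetic, and the field names (`youth_id`, the three flags, `later_jj`) and function names are hypothetical. It is also not a real classification algorithm, and it is not NCCD's method. It simply tabulates every pair of flags, which is the kind of grouping a tree-based classifier searches over.

```python
from itertools import combinations

# Synthetic child welfare records, invented purely for illustration.
# The field names (youth_id and the three flags) are hypothetical.
CHILD_WELFARE = [
    {"youth_id": 1, "multiple_placements": True, "prior_maltreatment": True, "school_instability": False},
    {"youth_id": 2, "multiple_placements": False, "prior_maltreatment": True, "school_instability": False},
    {"youth_id": 3, "multiple_placements": True, "prior_maltreatment": False, "school_instability": True},
    {"youth_id": 4, "multiple_placements": True, "prior_maltreatment": True, "school_instability": True},
    {"youth_id": 5, "multiple_placements": False, "prior_maltreatment": False, "school_instability": False},
    {"youth_id": 6, "multiple_placements": False, "prior_maltreatment": False, "school_instability": True},
    {"youth_id": 7, "multiple_placements": True, "prior_maltreatment": True, "school_instability": False},
    {"youth_id": 8, "multiple_placements": False, "prior_maltreatment": True, "school_instability": True},
]

# Hypothetical IDs of youth who later appear in juvenile probation data.
JUVENILE_PROBATION_IDS = {1, 4, 7}

FLAGS = ["multiple_placements", "prior_maltreatment", "school_instability"]


def link_records(cw_rows, jj_ids):
    """Data integration step: attach an outcome flag by matching on youth_id."""
    return [{**row, "later_jj": row["youth_id"] in jj_ids} for row in cw_rows]


def rank_flag_pairs(rows, flags):
    """For each pair of flags, compute the later-involvement rate among
    youth for whom both flags are true, then sort by that rate."""
    results = []
    for pair in combinations(flags, 2):
        group = [r for r in rows if all(r[f] for f in pair)]
        if group:  # skip pairs that no youth matches
            rate = sum(r["later_jj"] for r in group) / len(group)
            results.append((pair, len(group), rate))
    # Highest later-involvement rate first.
    return sorted(results, key=lambda item: item[2], reverse=True)


linked = link_records(CHILD_WELFARE, JUVENILE_PROBATION_IDS)
ranked = rank_flag_pairs(linked, FLAGS)

for pair, n, rate in ranked:
    print(f"{' + '.join(pair)}: n={n}, later involvement rate={rate:.2f}")
```

With real data, the groups would be far larger, an established classifier would likely replace the hand-rolled tabulation, and any pattern found this way would still need to be tested on other data before it informs a decision.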
The technology behind big data and predictive analytics is increasingly user-friendly and cost-effective. As the time and effort required decrease, it makes sense for evaluators to tap into the potential of these tools. Supplementing traditional techniques with complex big data sets and predictive analytics gives evaluators more comprehensive information for making decisions about important programs, like those serving youth in foster care and the juvenile justice system.