• By Dartington SRU
  • Posted on Monday 28th February, 2011

To speak the same language is the desired effect

To show that their interventions are effective, developers need to demonstrate a significant effect size – putting clear water between the outcomes achieved by the average participant in an intervention trial and those in the comparison group. There are many ways to express this distance: a school-readiness program with an effect size of 0.8, for example, can demonstrate that the average child receiving the intervention will be better prepared for school than 79% of children who do not. This would be considered a large effect.

If only it were so simple. The forum into which these decimal findings are thrown is rather like the ground floor of the Tower of Babel: researchers and policy makers arrive with a similar goal – to make fine judgments about the evidence of effectiveness – only to be confounded by one another’s language. Policy makers are concerned with cost-effectiveness: is intervention A cheaper to deliver than intervention B, and how much will it save in the long term? They speak in dollars and cents, not standard deviation units.

In the latest edition of Educational Evaluation and Policy Analysis, University of Wisconsin-Madison economist Douglas Harris presents a method for resolving some of the conflict between professional languages: an approach that permits effect sizes to be assessed against cost-based standards. He argues that “a policy should be adopted only if there is no other way to create the same effect at a lower cost”. The way forward, then, is to calculate comparative cost-effectiveness ratios (CER) across a range of interventions – simply by dividing the effect size by the total cost of the intervention. Harris focuses on educational interventions, but he suggests that the method applies to other types of services or technologies, too.
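The two calculations above – reading an effect size as a percentile of the comparison group, and dividing effect size by cost to get a CER – can be sketched in a few lines of Python. The dollar figures and the second effect size are hypothetical, chosen purely to illustrate how the comparison works:

```python
from statistics import NormalDist

def percentile_outperformed(effect_size: float) -> float:
    """Share of the comparison group that the average treated
    participant outperforms, assuming normally distributed outcomes."""
    return NormalDist().cdf(effect_size)

def cost_effectiveness_ratio(effect_size: float, total_cost: float) -> float:
    """Effect size gained per dollar of total intervention cost."""
    return effect_size / total_cost

# An effect size of 0.8 puts the average participant ahead of ~79%
# of the comparison group – the figure quoted in the text.
print(round(percentile_outperformed(0.8) * 100))  # 79

# Hypothetical comparison: intervention A (d = 0.8 at $2,000 per child)
# versus intervention B (d = 0.5 at $800 per child).
cer_a = cost_effectiveness_ratio(0.8, 2000)  # 0.00040 per dollar
cer_b = cost_effectiveness_ratio(0.5, 800)   # 0.000625 per dollar
# B delivers more effect per dollar despite its smaller effect size,
# so on Harris's criterion it would be the better buy.
```

The point of the ratio is exactly this kind of reversal: a larger effect size can lose to a smaller one once cost is in the denominator.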
“By comparing the ratio of effects to costs for alternative interventions,” he writes, “it is possible to draw general conclusions and create benchmarks that aid in interpreting results about individual interventions.” These days, most efforts to improve the social interventions evidence base by setting standards of effectiveness recommend cost-benefit analysis, but it tends to be associated with relatively advanced levels of proof. So there is an assumption that it will be undertaken only when programs have satisfied other, more routine evaluation criteria. [For more about benchmarking, read our Standards of Evidence special issue.]

Policy makers and researchers tend to have different time frames when it comes to understanding long-term effectiveness, Harris explains. Studies might measure effects up to a year or even two after the intervention; in policy terms, the effects of many social interventions become significant only when a child enters adulthood with expectations to vote, enter the workforce and have children.

Harris points out, secondly, that effects may decay or accumulate over time. Some may be significant in the short term but gradually wane; other skills learned may beget new skills, compounding the positive impact. Findings on long-term effects – which he defines as lasting five years or more – are preferable because they are more likely to indicate permanent intervention impact.

A third timing issue concerns the fact that society attaches greater value to immediate effects – “the dollar tomorrow is less valuable than one today” – so it is necessary to discount the value of future costs and effects.
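The discounting step can be made concrete with the standard present-value formula. The benefit amount, time horizon and 3% discount rate below are illustrative assumptions, not figures from Harris’s paper:

```python
def present_value(future_value: float, discount_rate: float, years: int) -> float:
    """Discount a cost or benefit realized `years` from now
    back into today's terms at a constant annual rate."""
    return future_value / (1 + discount_rate) ** years

# Hypothetical: a $10,000 benefit realized when a child reaches
# adulthood 15 years from now, discounted at 3% per year.
pv = present_value(10_000, 0.03, 15)
print(round(pv))  # 6419
```

This is why, as the next paragraph notes, cost-benefit analysis tilts toward interventions whose value to society is realized sooner: the same benefit is worth markedly less the further away it sits.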
Coupled with the political requirement to consider effects at entry into adulthood, cost-benefit analysis favors longer-lasting interventions, or those aimed at older children, because their value to society is more quickly realized.

Harris goes on to outline a number of assumptions that must be made about the decision-making or policy context in order for cost-effectiveness benchmarks to be useful. Firstly, he writes, policy makers should focus on a single outcome indicator as a basis for comparison. For example, student achievement might be taken as the central measure of educational objectives even though there may be a number of other relevant indicators. Secondly, because so few studies involve testing interventions across a number of different settings, policy makers have little choice but to treat them all equally in relation to the ease with which they might be taken to scale: far too little is known about how well the vast majority of interventions will work in different settings, and multiple-setting evaluations remain too scarce to inform decision making. Finally, it must be assumed that policy makers understand the characteristics of the populations they serve and can target resources appropriately. In the case of educational programs, he suggests, “there is often political pressure to provide all programs to all students, which means that the ultimate decisions made are less cost-effective and therefore result in lower average outcomes”.

But the real challenge remains the sheer lack of cost-effectiveness analysis undertaken in evaluations of social interventions. Delivery organizations and evaluators must be encouraged to do much more to chart the costs of interventions in terms of both human and economic resources.
