• By Kevin Mount
  • Posted on Friday 28th March, 2008

If you don’t know what broke, how can you fix it?

Blueprints conference participants generally want to know what a program needs to do to make the grade. As we reported at the beginning of the week, two more have lately entered the Promising Programs list, and there’s a natural tendency to focus on the recipe for that kind of success. [See: Promising to help kids turn their backs on the bottle.]

But improving children's health and development is not simply a matter of implementing what works. It’s also about learning from faults and failures. Similarly, the successes need continuous scrutiny. They may pass the Blueprints test at a certain critical moment, but new trials will follow, and the more widespread and thorough they are, the more variable the effects are likely to be.

Two instructive examples of the consequences of discovering what does not work were reported at the Denver conference.

The first concerned Quantum Opportunities, a mentoring program that aims to establish meaningful long-term relationships between struggling high-school children and a mentor or case manager. It is also designed to develop community commitment and involvement with the child's school.

Initial experimental evaluations showed substantial positive effects, and in due course Quantum Opportunities came to be counted among the Blueprints Model Programs. Armed with the results from the first series of evaluations, a more extensive ten-site randomized controlled trial was undertaken. Expectations were high, although it generally happens that when programs move up a gear from their development phase to full-scale implementation, effect sizes are considerably reduced. The change is usually taken to reflect the absence of any marked placebo effect, lower levels of motivation among staff and inevitable deviation from the model.

In the case of Quantum Opportunities, however, the initial positive effects disappeared entirely. Across the ten sites the picture was mixed. Some children benefited, but there was no appreciable improvement that could be attributed to the program.

Such results are naturally deeply disappointing to a program provider, but they are valuable nonetheless, and they deserve to be viewed with interest by scientists, policy makers and practitioners.

Exactly why did effectiveness deteriorate so much between the earlier and later evaluations? Because so much attention is focused on the threshold between successful and unsuccessful programs, we do not know. We can only speculate. One possibility is that fidelity to the original design suffered as the program was scaled up; such drift is the most common explanation for deteriorating results. But the evaluations did not measure fidelity.

The Blueprints panel have encouraged Quantum Opportunities to explore some of the reasons. Del Elliott recommended deviant case analysis – investigating the sites where the results were poorest, or examining components of the program that appeared to work less well in the later evaluations. Such analysis has not so far taken place.
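For readers who want a feel for what Elliott's suggestion involves, here is a minimal, hypothetical sketch in Python. The site names and numbers are invented purely for illustration (they are not Quantum Opportunities data); the sketch simply compares each site's effect estimate with a precision-weighted pooled effect and flags the poorest-performing sites as candidates for closer scrutiny.

```python
# A hypothetical sketch of the "deviant case" check Elliott describes: compare
# each site's effect estimate with the pooled effect across all sites and flag
# the sites that fall furthest below it. All figures are invented; they are
# not Quantum Opportunities data.

# (site, mean outcome change in the program group, mean change among controls,
#  standard error of the difference)
sites = [
    ("Site A", 0.45, 0.20, 0.10),
    ("Site B", 0.05, 0.24, 0.09),   # program group did worse than controls
    ("Site C", 0.40, 0.18, 0.11),
    ("Site D", 0.02, 0.21, 0.08),   # another apparently poor-performing site
]

# Per-site effect = change in the program group minus change among controls
effects = [(name, treat - control, se) for name, treat, control, se in sites]

# Simple inverse-variance (precision-weighted) pooled effect across sites
weights = [1 / se ** 2 for _, _, se in effects]
pooled = sum(w * e for (_, e, _), w in zip(effects, weights)) / sum(weights)

print(f"Pooled effect across sites: {pooled:+.3f}")
for name, effect, se in effects:
    z = (effect - pooled) / se  # crude standardized deviation from the pool
    flag = "  <-- candidate for deviant case analysis" if z < -1.5 else ""
    print(f"{name}: effect {effect:+.3f} (z vs pooled {z:+.2f}){flag}")
```

The point of the exercise is not the arithmetic but the habit it encourages: looking inside a mixed result rather than stopping at the overall verdict.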
The second example shows how the same scrutiny can have a contrasting outcome. In 2005 Julia Littell, Professor of Social Work at Bryn Mawr College near Philadelphia, prepared a systematic review for the "gold standard" Cochrane Collaboration that found Multisystemic Therapy (MST), another Blueprints Model Program, wanting. [See: Can a systematic review be any better than the work it’s based upon? and Norwegian researchers find flaws in 'gold standard' program review.]

But in this superficially similar case Blueprints did not remove MST from the list. Why not? To get on to the Blueprints lists, an established threshold must be crossed; to reverse the process, the same threshold is applied. Blueprints commissioned Steve Aos, Assistant Director of the Washington State Institute for Public Policy and a keynote speaker at the conference, to repeat the analysis undertaken by Littell. He found that the positive effects originally observed stood up, and that Littell's contrary findings could be explained by a “deviant case”: a single Canadian study that had a disproportionate impact on the meta-analysis. (A rough sketch of that kind of sensitivity check appears at the foot of this article.)

These tales of woe and doubt carry important messages for prevention science and its application to policy and practice.

Plainly, negative results are as important as positive ones, but scientists and policy makers will sometimes collude to conceal them. They are less likely to be published, and nobody is particularly keen to talk about them.

Too many evaluations are set up with the purpose of demonstrating that something works. Good evaluations not only establish why something works but also, by the same reckoning, expose what is going on when results are negative.

A lot of the discussion at Denver was about the standard of evidence that will propel a program on to one or other list of effective programs. A forthcoming Prevention Action special will consider that issue in more detail.

But in all such discussions there is less sharp focus on the corollary: the standard for inclusion is the same as the standard for exclusion. Experimental evaluation provides the necessary channel in both directions. Most of the Blueprints conference, and much of the Prevention Action coverage that went with it, was about celebrating success. In advancing the cause of prevention, we must also give failure its due.
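A footnote for the methodologically minded: the sensitivity check behind Aos's finding can be illustrated with a rough leave-one-out sketch, in which a pooled effect is recomputed with each study dropped in turn. The figures below are invented for the purpose; they are not the MST trials reviewed by Littell or re-analysed by Aos.

```python
# A rough sketch of a leave-one-out sensitivity check: recompute a pooled
# effect with each study dropped in turn, to see whether a single influential
# study is driving the overall result. The effect sizes and standard errors
# below are invented; they are not the MST studies reviewed by Littell or Aos.

studies = [
    ("Trial 1", 0.30, 0.10),
    ("Trial 2", 0.25, 0.12),
    ("Trial 3", 0.35, 0.11),
    ("Outlying trial", -0.40, 0.09),   # one large, contrary study
]

def pooled_effect(items):
    """Inverse-variance (fixed-effect) pooled estimate of the effect size."""
    weights = [1 / se ** 2 for _, _, se in items]
    return sum(w * d for (_, d, _), w in zip(items, weights)) / sum(weights)

print(f"All studies pooled: {pooled_effect(studies):+.3f}")
for i, (name, _, _) in enumerate(studies):
    rest = studies[:i] + studies[i + 1:]
    print(f"Without {name}: {pooled_effect(rest):+.3f}")
```

In this invented example, dropping the single contrary trial moves the pooled effect from close to zero to a clearly positive value, which is the kind of pattern Aos reported finding in the MST literature.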
