• By Dartington SRU
  • Posted on Monday 12th March, 2012

Don’t despair - investigate

Recent years have seen impressive advances in the design and development of effective social, behavioral and educational interventions. But the science of detecting the impact of programs continues to develop, as does understanding of the extent to which results found in one setting transfer elsewhere. Mark Fraser and a team from the University of North Carolina, USA, have provided an overview of emerging issues and challenges.

Randomized controlled trials (RCTs) remain the gold standard for evaluating the effects of programs on outcomes, but they are not always possible. Statistical advances now provide new methods for modelling outcomes in such cases. RCTs are the best way of making valid causal inferences in outcome research because they balance treatment and control groups on both observed and unobserved variables. Non-randomized studies suffer because who gets the intervention depends on pre-treatment characteristics that may also affect outcomes – young people with more or less serious problems, for example, or with greater or lesser motivation to change.

This is where methods that make it possible to balance treatment and comparison groups come in. Unlike RCTs, such methods can only balance the characteristics that have been measured, but they are far better than nothing and their use is increasing rapidly (a simple sketch of the idea appears below).

Fraser and his team say that another problem in most outcome evaluations is that analyses focus only on average findings. This can mask treatment effects for sub-groups, especially those that fit high-risk profiles. Again, the team point to the increasing use of methods to address this, known collectively as Person-Centered Estimation (also sketched below).

Then there is the issue of so-called “rater effects.” Studies often use outcome data collected from caseworkers, nurses, teachers and other practitioners. The underlying assumption is that different raters make similar judgements – that they are consistent. This may not be true, however. Some teachers might apply rating rules more stringently than colleagues, for instance, or rate one student more leniently owing to favoritism, or be inconsistent in their ratings from one day to the next. Such rater effects need to be disentangled from true change because they can distort estimates of outcomes. Again, statistical methods to control for such effects are increasingly used (also sketched below).

The issues identified so far all concern how far we can trust the evidence of effects shown in studies. But it is also necessary to consider the challenges of generalizing about the effects of proven programs. The adaptation of programs to context and culture is increasingly recognized as a central element of replication, and there is also a growing focus on adapting settings to programs.

Fraser and his team point out that until now there has been an expectation that program effects identified in a given study will be applicable to other settings and populations; indeed, this is a premise of evidence-based practice. But some evidence-based programs developed in the USA have produced mixed results in international replications. Positive findings in the USA for Multi-systemic Therapy, for example, were replicated in a randomized controlled trial in Norway but not in trials in Sweden and Canada. Could this mean that the research designs were flawed, that implementation fidelity was low, or that the programs were a poor fit with a different population?
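Propensity score weighting is one widely used way of doing the kind of balancing described above; whether or not it is among the specific methods the team review, the sketch below illustrates the general idea. It is purely illustrative: the variable names (risk_score, motivation, treated, outcome) and all numbers are invented, and the code simply shows how weighting cases by their estimated probability of receiving a program can correct a naive comparison that is biased by selection on measured characteristics.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000

# Pre-treatment characteristics that also drive selection into the program.
risk_score = rng.normal(size=n)
motivation = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-(0.8 * risk_score - 0.5 * motivation)))
treated = rng.binomial(1, p_treat)

# Outcome depends on the same characteristics plus a true program effect of 0.5.
outcome = 0.5 * treated - 0.7 * risk_score + 0.4 * motivation + rng.normal(size=n)
df = pd.DataFrame({"risk_score": risk_score, "motivation": motivation,
                   "treated": treated, "outcome": outcome})

# Step 1: model the probability of treatment from observed covariates.
X = sm.add_constant(df[["risk_score", "motivation"]])
ps = sm.Logit(df["treated"], X).fit(disp=0).predict(X)

# Step 2: weight each case by the inverse probability of the group it ended up in.
weights = np.where(df["treated"] == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Step 3: compare the naive difference in means with the weighted estimate.
naive = df.loc[df.treated == 1, "outcome"].mean() - df.loc[df.treated == 0, "outcome"].mean()
weighted = sm.WLS(df["outcome"], sm.add_constant(df["treated"]), weights=weights).fit()
print(f"naive difference: {naive:.2f}")
print(f"weighted estimate: {weighted.params['treated']:.2f} (true effect 0.5)")
```

In this invented example the naive comparison understates the benefit because young people with more serious problems were more likely to receive the program; the weighted estimate recovers something close to the true effect.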
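Person-centered methods group people into profiles rather than modelling only average relationships across the whole sample. As a rough, invented illustration of that idea (real person-centered analyses, such as latent class or growth mixture models, are more sophisticated), the sketch below recovers baseline risk profiles from simulated data and then estimates the program effect within each profile, showing how an overall average can hide a much larger effect for a high-risk sub-group. All names and numbers are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
n = 1500

# Two latent baseline profiles: a smaller high-risk group and a larger low-risk group.
high_risk = rng.binomial(1, 0.3, size=n)
conduct = rng.normal(2.0 * high_risk, 1.0)      # behaviour problems, higher if high risk
attendance = rng.normal(-1.5 * high_risk, 1.0)  # school attendance, lower if high risk
treated = rng.binomial(1, 0.5, size=n)          # randomized, for simplicity

# The program helps the high-risk profile a lot and the low-risk profile very little.
true_effect = np.where(high_risk == 1, 1.0, 0.1)
outcome = true_effect * treated + 0.3 * conduct + rng.normal(size=n)

df = pd.DataFrame({"conduct": conduct, "attendance": attendance,
                   "treated": treated, "outcome": outcome})

# Recover baseline profiles from observed characteristics only.
df["profile"] = GaussianMixture(n_components=2, random_state=0).fit_predict(
    df[["conduct", "attendance"]])

# The overall average masks the difference between profiles.
overall = df.groupby("treated")["outcome"].mean()
print(f"average effect: {overall[1] - overall[0]:.2f}")
for label, group in df.groupby("profile"):
    by_arm = group.groupby("treated")["outcome"].mean()
    print(f"profile {label}: effect {by_arm[1] - by_arm[0]:.2f} (n={len(group)})")
```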
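Multilevel models with a random effect for each rater are one common way of controlling for rater effects, though not necessarily the only approach the team have in mind. The hypothetical sketch below simulates teachers who differ in how generously they rate and fits such a model so that between-rater variation is separated from the estimated program effect; the names (teacher, treated, rating) and numbers are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_teachers, per_teacher = 30, 20

teacher = np.repeat(np.arange(n_teachers), per_teacher)
leniency = rng.normal(0.0, 0.8, size=n_teachers)   # some teachers rate more generously
treated = rng.binomial(1, 0.5, size=teacher.size)

# Observed rating = true program effect (0.4) + the rater's leniency + noise.
rating = 0.4 * treated + leniency[teacher] + rng.normal(size=teacher.size)

df = pd.DataFrame({"teacher": teacher, "treated": treated, "rating": rating})

# A random intercept for each teacher absorbs between-rater differences,
# leaving a cleaner estimate of the 'treated' coefficient.
model = smf.mixedlm("rating ~ treated", df, groups=df["teacher"]).fit()
print(model.summary())
```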
Any of these explanations is possible, but according to the North Carolina team: “In outcome research focused on replication, ensuring the cultural and contextual congruence of programs may be an underestimated challenge.” They argue that the task of pre-empting this problem includes reviewing core risk mechanisms, scanning for additional culturally specific risk mechanisms, adapting program content to fit the local culture and the organizational and policy contexts, and developing new program content as required.

But it is not just about adapting programs. In many outcome studies, effects are assessed shortly after programs are implemented for the first time. This is problematic because new programs often represent a radical departure from services as usual and demand adjustments to long-standing organizational processes. Practitioners might be required to work more flexibly, for example, to learn new skills or to undergo new training, yet often there is very little opportunity for such change, with far-reaching consequences: “Research of this sort is vulnerable to implementation failure and to attributing effects to programs that were incompletely implemented,” explain Fraser and his colleagues.

In other words, concluding that a program does not work may be misleading if the way in which the program was introduced to an existing service system was unhelpful. Clearly there are important lessons here for policy makers and service providers. But there is also something here for researchers to ponder.

“Intent-to-treat” analysis is regarded as the gold standard in randomized controlled trials because it is assumed to represent the pattern of implementation that would be observed in routine application. Yet, according to the North Carolina team, this is rarely the case. In routine application a new intervention would be implemented over several years, and organizations would make incremental accommodations to the demands of the new program. The system wouldn’t change overnight, but it would evolve to fit the program.

Given this, it may be appropriate to analyze those who received the intervention as intended, in addition to intent-to-treat analyses. Testing whether stronger or better doses of the program yield a stronger effect could, the authors argue, help avoid false attributions of program failure (a final sketch of this kind of complementary analysis follows).
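As a hypothetical illustration of that last point, the sketch below simulates a trial in which a newly introduced program is only partly delivered and contrasts the intent-to-treat estimate with a simple regression of outcomes on the number of sessions received; the variable names and numbers are invented, and the authors' own analytic recommendations may differ. Because dose is not randomized, an analysis like this complements rather than replaces the intent-to-treat estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 800

assigned = rng.binomial(1, 0.5, size=n)

# First-time implementation: assigned participants receive anywhere from 0 to 12
# sessions; the comparison group receives none. Outcomes improve with each session.
sessions = assigned * rng.integers(0, 13, size=n)
outcome = 0.08 * sessions + rng.normal(size=n)
df = pd.DataFrame({"assigned": assigned, "sessions": sessions, "outcome": outcome})

# Intent-to-treat: everyone analysed as assigned, so partial delivery dilutes the effect.
itt = smf.ols("outcome ~ assigned", df).fit()
# Dose-response: does receiving more of the program go with more change?
dose = smf.ols("outcome ~ sessions", df).fit()

print(f"intent-to-treat estimate: {itt.params['assigned']:.2f}")
print(f"per-session estimate: {dose.params['sessions']:.2f}; "
      f"implied effect of a full 12-session dose: {12 * dose.params['sessions']:.2f}")
```

Where a stronger dose is associated with a stronger effect, concluding from the diluted intent-to-treat estimate alone that the program does not work may be premature.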

Reference: Fraser, M. W., Guo, S., Ellis, A. R., Thompson, A. M., Wike, T. L. & Li, J. (2011). Outcome studies of social, behavioural, and educational interventions: emerging issues and challenges. Research on Social Work Practice, 21(6), 619-635.