Tuesday, September 9, 2008

Policy On Trial

First published in Policy Magazine, Spring 2008.

Policymakers claim to develop programs that will benefit citizens. They claim, either implicitly or explicitly, to have certain knowledge of the causal relationship between the actions they plan to take and the outcomes they wish to achieve. This claim is emphasised when, as Prime Minister Kevin Rudd does so often, they express their wish to develop ‘evidence-based’ policy. It is well known in scientific circles that there is one gold standard technique for discovering such a causal relationship—the randomised trial. If policymakers want to be able to claim that their policies will work, they should subject them to randomised trials beforehand. Randomised trials present the policymaker who genuinely wants to know how to make a difference with a powerful and irrefutable tool to put his or her theories to the test and to draw fact-based conclusions from them. 

 

Randomised trials are the least random and most scientific method known for testing a hypothesis. They are the epitome of rational inquiry. In a randomised trial, the burden of proof is placed on the facts themselves and ideology, beliefs and vested interests are put to one side. In a randomised trial, the truth, as indicated by the data and as revealed by the experimental design, is laid bare for all to see and the facts are allowed to speak for themselves. Randomised trials, preferably double blinded and placebo controlled, have been the benchmark for scientific inquiry since R. A. Fisher’s ground-breaking work in the 1920s. Clinical trials are mandatory for every drug approved by the Therapeutic Goods Administration. In short, except for trivial and self-evident cases, the randomised trial is the one and only means of establishing a cause-and-effect relationship between one phenomenon and another.

 

What are randomised trials and what can they do?

A randomised trial starts with a hypothesis: a specific, testable statement that the trial is designed to confirm or refute. For example, one randomised trial in Kenya tested the hypothesis that the provision of textbooks would raise students’ test scores.[1] (It didn’t.) Another in the Philippines tested whether or not regular visits from a bank representative would increase household savings.[2] (They did.) Stating a hypothesis can itself be problematic for policymakers, because it requires them to move from vague statements of intent to a specific measurable outcome they wish to achieve.

 

The second step in a randomised trial is to test the hypothesis by randomly assigning participants to two groups. One group receives the treatment (the textbooks or the visit from the bank representative) and one group does not. The appeal of the randomised trial lies in the fact that, given a large enough sample, random assignment makes the two groups alike on average in every respect (geographical location, gender, age, socioeconomic status, education, and so on) except whether or not they receive the treatment. Thus if, after the treatment has been applied, a statistically significant difference between the two groups is found, that difference can be attributed to the treatment and to the treatment alone. The researcher can conclude that the treatment caused the difference. This is a much stronger conclusion than discovering that the response and the treatment are merely correlated. A causal effect has been established, and therein lies the power of the randomised trial.
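
For readers who want to see the mechanics, the logic of random assignment and comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in any of the trials cited here: the function names (assign_groups, difference_in_means, permutation_p_value) are hypothetical, participants are split into two groups of near-equal size, and a simple permutation test stands in for whatever significance test a real trial would specify in advance.

```python
import random
import statistics


def assign_groups(units, seed=0):
    """Randomly split units into a treatment group and a control group.

    Shuffling with a seeded generator makes the assignment reproducible
    and independent of any characteristic of the units themselves.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treated_outcomes, control_outcomes):
    """Estimated treatment effect: mean treated outcome minus mean control outcome."""
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)


def permutation_p_value(treated_outcomes, control_outcomes,
                        n_permutations=10_000, seed=0):
    """Share of random relabellings whose absolute difference in means is
    at least as large as the one observed (a two-sided permutation test).

    Assumption for illustration: if the treatment did nothing, the group
    labels are arbitrary, so reshuffling them shows how large a difference
    chance alone tends to produce.
    """
    rng = random.Random(seed)
    observed = abs(difference_in_means(treated_outcomes, control_outcomes))
    pooled = list(treated_outcomes) + list(control_outcomes)
    n_treated = len(treated_outcomes)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(difference_in_means(pooled[:n_treated], pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations
```

In use, one would call assign_groups on the list of participants before the treatment is delivered, record an outcome for each participant afterwards, and pass the two lists of outcomes to difference_in_means and permutation_p_value. A small p-value suggests the observed difference is unlikely to be due to chance alone.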

 

In the past, social experiments, such as the negative income tax experiment in the United States in the 1960s, have been conducted on a grand scale with high ideals and enormous budgets. In contrast, the current trend is for randomised trials to address very specific questions and to be conducted on a small budget with minimal sample sizes. This makes the randomised trial a potent tool for economists working in developing countries.

 

What randomised trials can’t do

Randomised trials are not applicable in all situations. There are two main areas in which randomised trials are not able to test a proposed social policy. The first is in trying to assess the effectiveness of very long-term policies. Claims that an intervention will increase the life expectancy of certain groups of people, benefit future generations, or affect global warming are not testable by randomised trial. The second is policies that are not repeatable. The benefits or otherwise of going to war, holding the Olympics in a certain city, or signing an international treaty are not repeatable and therefore not testable by randomised trial. But this still leaves a vast array of policies that could easily be subjected to randomised testing.

 

Are there alternatives?

Some claim that there are attractive alternatives to randomised testing, the main candidates being observational studies, pilot programs and surveys. A pilot program in which the intended intervention or treatment is applied to a small sub-population to test its efficacy has one major drawback. Having applied the treatment and seen an improvement in the desired outcome, the researcher usually goes on to assert that the treatment caused the change in response. In doing so, however, the researcher is implying that he or she knows how the targeted population would have fared in the absence of the treatment. In fact, there is no way of knowing this and therefore pilot programs are not able to establish a causal relationship between the treatment and the effect.

 

Observational studies are also proposed as valid alternatives to randomised trials. Yet not only are observational studies unable to establish a causal relationship between two phenomena (they can show only that the two are correlated), they are also subject to bias. With an observational study, there is scope for researchers to look for, discover, and report findings that fit with their preconceived views. They may choose to overlook or not report findings that do not agree with their previous publications, and they may choose to include in their regression analyses only those covariates that corroborate the conclusion they wish to find. I am not commenting on the prevalence of such biased research methods but merely indicating that observational studies contain within them scope for such bias.

 

In contrast, randomised trials, if rigorously conducted, are not open to such abuse. In a well-conducted randomised trial, the hypothesis should be stated and publicised beforehand. A finding of no effect is important information in its own right, because it is evidence against a causal link, so results tend to be published whether or not the treatment proves to have a statistically significant effect.

 

A survey is also a poor alternative to a randomised trial. Surveys are notoriously unreliable at predicting the outcome of planned interventions. Asking people how they think they would react if a certain change were to be made in some aspect of social policy is one thing. It’s quite another to apply the treatment and observe how people actually react. Life is full of unexpected consequences and the only reliable way to discover the reactions to social interventions is to trial the intervention first. Surveys are also subject to selection bias. Only the views of those who choose to respond to the survey are recorded and analysed, but these people do not always comprise a sample representative of the entire target population.

 

Are there limitations to randomised trials?

Randomised trials are an effective means to answer micro-economic questions. They will tell you about the efficacy of a single planned intervention in a particular setting. They will not tell you much about macro-economic strategies, nor will they be able to predict the interactive effect of a large number of policies. Randomised trials are not the one and only sound way to develop good policy. They should be viewed as one (very powerful) tool in the economist’s toolkit. Having said that, the scope of randomised trials can be very wide. If the experiment is well designed, the outcome of the trial will answer the question you are seeking to address. The results of randomised trials have been criticised as being too narrow and not generalisable.[3] If randomised trials are promoted as the silver bullet for poverty alleviation then this is a fair criticism. If they are viewed as an additional weapon in the economist’s arsenal, it does not hold water. Particularly in developed countries such as Australia, in which macro-economic questions such as long-term growth and interest rates are well addressed by other means, randomised trials that examine micro-economic issues have particular relevance.

 

How have randomised trials been used elsewhere?

One of the most outstanding examples of randomised trials in social reform is the Progresa (later known as Oportunidades) program in Mexico. The aim of the program was to ‘close the gap’ between rich and poor in Mexico in terms of nutrition and education. The program was planned as a randomised trial from the outset because the incumbent president knew that without hard evidence the program would not survive a change of government.[4] A secondary consideration that led to Progresa being implemented as a randomised trial was that budgetary constraints meant that the program could not be delivered to all families that might have benefited from it.[5] What could have been seen as a deficiency was turned into a positive attribute through randomisation.

 

Funding was made available to poor rural families for education and improved nutrition. However, funding was conditional on attendance at both school and a government-funded infant health clinic. Independent consultants from the International Food Policy Research Institute were engaged to evaluate the trials and compare the families that were offered the incentives with those that were not. The results have been encouraging and the program has been expanded into urban areas and extended to target youth up to the age of 22.[6]

 

This program has a number of striking features. Firstly, it worked!  The families who received the conditional funding benefited significantly from the intervention. This might sound obvious but there has been plenty of funding of programs that have not made people better off. Secondly, we know they benefited because of the program. The improved outcomes cannot be attributed to another cause because the control group, who were like the treatment group in every other respect, did not benefit. Thirdly, the evidence was so overwhelming that the program survived a change of government. Objective evidence proved to be more persuasive than ideology.

 

The Progresa/Oportunidades program is just one example of randomised trials being used to test social policy. Randomised trials have been used to test policies as diverse as the effectiveness of driver education programs,[7] the effect of class size,[8] and phonics versus whole-language reading tuition.[9]

 

Randomised trials are becoming increasingly well established in social policy assessments. There is now a think tank solely dedicated to such trials, the Abdul Latif Jameel Poverty Action Lab (J-PAL) at MIT.[10] J-PAL has run randomised trials to test many social programs, mostly in developing countries. Issues examined include the effect of remedial education programs on school quality and test scores; the effect of micro-credit in Hyderabad slums; and a comparison of electronic surveillance, documented teacher attendance, and incentive pay as means to improve student performance.

 

For-profit micro credit institutions are also turning to randomised trials to test the best ways to serve their markets. The Centre for Micro Finance in India has coordinated a number of randomised trials on micro credit financial projects. Projects include a trial of smokeless cooking stoves as an alternative to traditional cooking methods that lead to serious respiratory infections in many young children; a trial comparing the effect of weekly and monthly repayment schedules on loan default rates; and a trial measuring the impact of micro-health insurance products on clients and their families.[11]

 

How have randomised trials been used in Australia?          

Despite some recent interest, randomised trials are yet to be used extensively in Australia.[12] However, policymakers here have experimented with randomised trials a number of times. Between 1999 and 2001 the Department of Family and Community Services conducted two randomised trials on the Job Network, examining the effect of interviews and follow-up contact from professional staff on workforce participation by the long-term unemployed.[13],[14] The trials found that the intervention led to a reduction in the number of hours worked but an increase in the number of hours spent studying or training.

 

In 2002 the effectiveness of the NSW Drug Court in reducing recidivism was tested in a randomised trial.[15] A total of 514 offenders who met certain criteria were randomly assigned to either the standard court system or the NSW Drug Court, which took them through a detoxification program. The trial showed that not only did the Drug Court reduce recidivism, it was also more cost effective when measured in cost per offence averted.

 

Is there scope for further randomised trials in Australia?

In theory, the time is ripe for randomised trials in Australian politics. Kevin Rudd speaks often about his preference for ‘evidence-based’ policy.[16] The government’s responses to the 2020 summit are to be built on ‘a strong evidence base’.[17] A raft of new policies is being introduced by an enthusiastic newly elected government. Many of these policies are candidates for testing by randomised trials. Let’s examine two proposed policies which lend themselves to objective testing.

 

Behind the introduction of the national welfare card lies the following hypothesis: ‘Making welfare payments available to delinquent parents through a national welfare card will benefit the children of these parents.’ Some agree with this policy while others doubt it will work.[18] The hypothesis would need to be more clearly defined to be tested by randomised trials and the exact benefit that is supposed to accrue to the children would need to be specified. Once this has been done there is no reason why the hypothesis could not be tested. As child protection authorities identify delinquent parents, each family could be randomly assigned either to a control group with no curb on their welfare spending, or to a treatment group that receives welfare payments via the card. The hypothesised good that is supposed to accrue to children could be measured before and after the trial and the efficacy or otherwise of the welfare card could be determined.
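
The trial design just described, in which families are assigned at random as they are identified and the outcome is measured before and after, can be sketched as follows. This is a hedged illustration only: the function names (assign_on_arrival, mean_change, estimated_effect) are hypothetical, the outcome is assumed to be a single numerical score per family, and comparing the average change in the two groups is one simple way to use the before-and-after measurements rather than a prescribed method.

```python
import random
import statistics


def assign_on_arrival(family_ids, seed=0):
    """Assign each family to 'treatment' or 'control' by a fair coin flip,
    in the order in which families are identified.

    Assumption for illustration: a simple coin flip per family. Group sizes
    may therefore be unequal; a real trial might balance them in blocks.
    """
    rng = random.Random(seed)
    return {fid: rng.choice(["treatment", "control"]) for fid in family_ids}


def mean_change(before, after):
    """Average change in the measured outcome for one group.

    The two lists must hold the same families in the same order.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    return statistics.mean([a - b for b, a in zip(before, after)])


def estimated_effect(treat_before, treat_after, control_before, control_after):
    """Average change in the treatment group minus the average change in the
    control group. The control group's change captures what would have
    happened anyway, so the remainder is attributed to the treatment.
    """
    return (mean_change(treat_before, treat_after)
            - mean_change(control_before, control_after))
```

Subtracting the control group's change matters because outcomes may improve or worsen over the trial period for reasons unrelated to the card. Without the control group, that background change would be mistaken for an effect of the policy.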

 

A similar analysis could be applied to the provision of high speed internet access to schools, another Rudd government initiative.[19] The hypothesis behind this initiative is that it will benefit students; that is, it will improve their grades. By randomly assigning high speed internet access to one group of schools and leaving another group as it is we could discover if such technology makes any difference to student achievement.

 

Such a proposal will no doubt raise objections. On what grounds could the government possibly deny schools high speed internet access? Wouldn’t that be inequitable? This objection assumes that high speed internet access is beneficial to students, which is the very question the trial is designed to test. Temporarily denying a group of people a service which may or may not benefit them is a reasonable price to pay to discover if it is actually beneficial.

 

Clearly, such randomised trials would be one of the most effective uses of public funds. Instead of rolling out untested programs which have not been proved to deliver benefits but which draw heavily on taxpayers’ funds, the government would be judiciously screening proposed new programs before they are introduced on a wider scale.

 

Why are randomised trials not being used?

In the financial year 2005-2006, the Federal government spent an estimated $90.2 billion of taxpayers’ hard-earned cash on programs that purported to benefit Australians in one way or another.[20] None of these programs was tested by randomised trial. A number of factors make randomised trials unattractive to politicians.

 

Making a real difference is hard, and randomised trials often show that the treatment made no difference. There are two ways of looking at this. One is to celebrate the fact that the treatment is now known to be ineffectual and that it can be discarded as a possible solution to the problem. One could also acknowledge that without the trial many taxpayer dollars could have been wasted on a ‘solution’ that was no solution at all. Alternatively, one could take the view that the experiment was a ‘failure’, the researcher’s hypothesis was ‘wrong’, and that funds that could have been better used elsewhere had been squandered on a frivolous investigation which bore no fruit. The former interpretation of the outcome is based in the scientific method. Unfortunately, the media, with its propensity for a bad news story, often favours the latter.

 

Suppose, however, that the randomised trial shows that the treatment gives significant benefit to the participants. Suppose the national welfare card really does benefit children or that broadband internet access really does improve student grades. Surely that must be a coup for the government. Not necessarily so. They may find themselves open to accusations of withholding a beneficial treatment from the control group. In retrospect this is true, but at the time the randomised trial was conducted it wasn’t known whether or not the treatment was beneficial. Unfortunately such subtleties are often lost on the popular press, and consequently it is understandable that randomised trials are not seen in a favourable light by many politicians.

 

These are not the only reasons randomised trials are unpopular with politicians. Our elected representatives like to be seen as decisive, energetic and positive, especially when there is a crisis. They like to be seen as doing something about problems and demonstrating leadership where others will not. Randomised trials take time to implement, cost money, show no immediate results, and are based on the premise that no one actually knows what will work. The fact that they may lead to certain knowledge about real solutions is often not enough to recommend them to many politicians.

 

Since governments spend far more on implementing social policy than any other body in Australia, it would be preferable if they were the primary champions of randomised trials. But because of the political and ideological factors mentioned above this is unlikely to happen in the short term. It is more likely that NGOs or charities will be open to the possibility of testing their interventions through randomised trials. NGOs are less subject to popular opinion and are under no obligation to be seen to be benefiting the entire population. Therefore, small-scale randomised trials may fit within their charters. They may also find randomised trials an attractive means of providing hard evidence for the efficacy of their programs and thereby attracting additional funding.

 

Conclusion

There is little doubt that randomised trials are the best way of establishing a causal effect between one phenomenon and another. Because of their inherent sophistication, there are serious challenges that need to be overcome before an elected body in Australia will take up randomised trials as a means to test the efficacy of proposed social policy. However, if elected officials really want to make a difference, and not just be seen to be making a difference, this is exactly what they need to do.

 

Ross Farrelly is….. The author thanks Andrew Leigh for listening to, and discussing, some of the ideas contained in this article.

 

Endnotes



[1] Glewwe, P., Kremer, M. and Moulin, S., Many Children Left Behind? Textbooks and Test Scores in Kenya, 2007, http://www.povertyactionlab.com/papers/Textbooks%20and%20Test%20Scores%20Kenya.pdf .

[2] Ashraf, N., Karlan, D. and Yin, W., ‘Deposit Collectors’, Advances in Economic Analysis & Policy, Volume 6, Issue 2, Article 5, 2006, http://ipa.phpwebhosting.com/images_ipa/DepositCollectors.AshrafEtAl.2006_1.pdf

[3] ‘Control Freaks’, The Economist, Jun 12th 2008, http://www.economist.com/finance/displaystory.cfm?story_id=11535592

[4] Ayres, I., Super Crunchers, 2007, p 76.

[5] Duflo, E., Glennerster, R. and Kremer, M., Using Randomization in Development Economics Research: A Toolkit, 2006, p 20, www.povertyactionlab.com/papers/Using%20Randomization%20in%20Development%20Economics.pdf

[8] ibid.

[9] ibid.

[11] Brochure of the Centre for Micro Finance, http://ifmr.ac.in/cmf/CMF_Brochure.zip

[12] A recent conference, New Techniques in Development Economics, at the ANU included a session entitled, ‘The Economics, Ethics and Politics of Randomised Policy Trials’, http://econrsss.anu.edu.au/developmenteconconf.htm.

[13] Barrett, G. and D. Cobb-Clark (2001), ‘The Labour Market Plans of Parenting Payment Recipients: Information from a Randomized Social Experiment’, Australian Journal of Labour Economics 4(3):192-205.

[14] Breunig, R., D. Cobb-Clark, Y. Dunlop and M. Terill (2003), ‘Assisting the Long-Term Unemployed: Results from a Randomised Trial’, Economic Record 79(244):84-102.

[16] As of 20/05/2008, the phrase “evidence based” occurs six times on the official prime ministerial website. http://www.google.com/search?sourceid=navclient&ie=UTF-8&rlz=1T4SKPB_enAU217AU217&q=%22evidence+based%22+site:pm%2egov%2eau

[17] Australia 2020 Summit Initial Summit Report, http://www.australia2020.gov.au/docs/2020_Summit_initial_report.doc, p 38

[18] Karvelas, P., ‘Welfare curbs on parents’, The Australian, May 09, 2008, http://www.theaustralian.news.com.au/story/0,25197,23668892-601,00.html .

[19] First 100 Days, Achievements of the Rudd Government, February 2008,   http://www.pm.gov.au/docs/first_100_days.doc p 5

[20] Hargreaves, J., Welfare Services Resources: Financial and Human, 6th December 2007, p 5, http://www.aihw.gov.au/eventsdiary/aw07/presentations/jenny_hargreaves_welfare_services_resources.pdf
