There has apparently been considerable concern that the three influential studies of the 1960s and 1970s were compromised by small sample sizes and by what the experimenters call "contamination", where the experimental protocol is not followed exactly. That is what happened in FAVL's study of the effects of the 2008 summer reading camps: a small number of assignments departed from the random assignment, because when someone who had been assigned was not in the village, they were replaced not with a random pick but with someone the team knew would be around. So there was some bias, and it showed up in some of the pre-program test scores, which were higher for the campers than for those invited to the discussion groups and those who received free books to read.
After re-analysis, Anderson concludes:
The results demonstrate that preschool intervention has significant effects on later life outcomes for females, including academic achievement, economic outcomes, criminal behavior, drug use, and marriage. The effect on total years of education is particularly strong. However, while treatment effects are sizable for females, they are minimal or nonexistent for males - a fact relevant to the design of optimal human capital policy.
So there is a strong gender effect, which is very interesting and important, though not explained.
But the most interesting part for me is the method to control for contamination.
A thorough analysis of threats to validity, conducted in Appendix A, concludes that the main results are unaffected by reasonable assumptions regarding attrition, violation of random assignment, and clustering.
What Anderson does is assign values to the outcomes (in our case the test scores) for those who are missing. Typically he assigns outcomes in ways that favor the null: missing members of the treatment group are assigned the 25th percentile score, while missing members of the control group are assigned the 75th percentile. Then he recomputes the differences in means and their statistical significance. And why not? So tomorrow I hope to run this and see how it affects our results. We have more missing observations than he does, but larger sample sizes.
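For my own reference, here is a rough Python sketch of how I understand that imputation. It is not Anderson's code, and several things in it are my assumptions: that each percentile is taken from that group's own observed scores, that missing scores are coded as None, and that a Welch t statistic is an acceptable stand-in for whatever test he actually uses. The function names and the scores at the bottom are made up purely for illustration.

```python
import math
import statistics


def percentile(values, pct):
    """Return the pct-th percentile (1 to 99) of the observed values."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[pct - 1]


def impute_against_treatment(treated, control):
    """Fill missing scores (None) in the way least favorable to the program.

    Assumption: missing campers get the 25th percentile of the observed
    camper scores; missing controls get the 75th percentile of the
    observed control scores.
    """
    t_obs = [x for x in treated if x is not None]
    c_obs = [x for x in control if x is not None]
    t_fill = percentile(t_obs, 25)
    c_fill = percentile(c_obs, 75)
    t_full = [t_fill if x is None else x for x in treated]
    c_full = [c_fill if x is None else x for x in control]
    return t_full, c_full


def diff_and_welch_t(a, b):
    """Return (difference in means, Welch t statistic) for two samples."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return diff, diff / se


if __name__ == "__main__":
    # Made-up scores, for illustration only; None marks a missing test.
    campers = [12, 15, 9, 14, None, 11, 13, None]
    others = [10, 8, 11, None, 9, 12, 7, 10]

    observed = diff_and_welch_t([x for x in campers if x is not None],
                                [x for x in others if x is not None])
    imputed = diff_and_welch_t(*impute_against_treatment(campers, others))
    print("observed only:", observed)
    print("after imputation:", imputed)
```

If the gap between campers and the others is still clearly positive after filling in the missing scores this way, then attrition alone is unlikely to explain the result.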