Last Friday, an analysis of the “value-added” contributions of public school teachers, authored by a trio of economists (Harvard’s Raj Chetty and John Friedman and Columbia’s Jonah Rockoff), was unveiled with great fanfare, receiving prominent coverage from both the New York Times and the PBS NewsHour. By all accounts, this study (hereafter, CFR) is both substantial, employing a database unprecedented in such work in its sheer size, and innovative in a number of new design elements and statistical measures. It appears that the large urban school district that is the subject of the CFR analysis is New York City, both because of the sheer size of the database (New York City is more than twice the size of the next largest school district, Los Angeles) and because one of the co-authors, Rockoff, has a long history of working closely with the New York City Department of Education on the development and implementation of its now discredited and abandoned version of value-added measures, the Teacher Data Reports.
Given the importance of such a study, the method and timing of its release to the media are of particular note. To date, the Times reports, the authors have only presented their analysis in seminars. They have yet to submit CFR to a peer-reviewed journal for publication, as is the scholarly norm for the public distribution of such research; it is just now available as a working paper on the Harvard and NBER web sites. Sherman Dorn is certainly correct that there is little reason to think that a study of this magnitude is not already very close to what would be needed for publication in a peer-reviewed journal, and that those critics who have hung their hat on that hook are bound to be disappointed. But it should be noted that scholarly norms of peer-reviewed publication are designed not just to assure the quality of the work, the aspect that Dorn is invoking; no less important is the way in which these norms regulate the use and misuse of expert research in public policy debates, with the objective of ensuring that those debates are full and robust, characterized by public access to all of the essential information. This is particularly important with a study of this complexity, one which introduces a significant number of new elements into its field.
When CFR was released to the media in the fashion we witnessed Friday, before it had undergone peer review, a full public discussion was deliberately preempted: whatever conclusions the authors wanted to draw in the media were presented without the slightest fear of challenge. Note that the first thoughtful takes on the study in the blogosphere — those by Bruce Baker at the School Finance 101 blog, Matt Di Carlo at the Shanker Institute blog and Sherman Dorn himself, public intellectuals with the requisite skills in quantitative analysis to assess the study on its own terms — are appearing a day or two after the fact: it has taken that long to prepare what each of them readily concedes is a preliminary assessment of CFR. Yet when full critiques are produced down the road, they will receive a small fraction of the media attention given to CFR, if that.
This criticism of the abandonment of scholarly norms is of particular salience with respect to the release of CFR, because the authors are using the authority of their study to advocate public policy prescriptions that are simply not supported by their underlying analysis, even if one were to grant its validity. Take the three points that appear on the blackboard during the NewsHour story as a summary of what CFR has demonstrated:
- Testing of students is a good tool for evaluating teachers’ success.
- Replacing a bad teacher can boost students’ lifetime savings.
- Could amount to hundreds of thousands of dollars more over their careers.
In fact, all value-added analysis of test scores, including that of CFR, assumes that standardized exams are accurate, reliable and robust measures of actual student learning — a necessary assumption if one is to use them as a measure of teacher performance. It is tautological to claim that an analysis proves what it assumes, especially when that assumption is precisely what is contested in the public debate over standardized tests and value-added measures.
Further, even if one were to grant for purposes of argument that the value-added analysis of CFR correctly identified the lowest performing teachers, it is far from clear that the way to improve the quality of teaching is to fire and replace those teachers. Here the CFR trio have taken up the policy prescription of fellow economist Rick Hanushek: that it is possible not only to use value-added analysis to identify and fire the lowest performing teachers, but also a simple exercise to find considerably higher performing teachers to replace them. Quite frankly, this is a mistaken policy prescription that emanates from a lack of understanding of the real world of teaching and education, not uncommon among economists who limit themselves to econometric studies that view our world from an extraordinary distance. At best, the value-added analysis of CFR is applicable to a minority of teachers — teachers of ELA and math in grades 4 through 8; there are extraordinarily difficult, possibly insurmountable problems in extending it much beyond them. More importantly, since it takes a number of years for a teacher to master the fundamental skills of the craft — a reality that Chetty himself, in his NewsHour interview, concedes is supported by the CFR analysis — the likelihood is that this method will disproportionately target novice teachers working in low performing schools which provide them little support, and then replace them with another set of novice teachers sent into the very same settings. In sum, this policy would simply increase the teacher churn and turnover which already plagues high need, low performing schools. An educationally grounded policy proposal would be to provide the professional development, the supports and resources, that would improve the performance of the great preponderance of the teachers in place.
Finally, as Dorn, Baker and Di Carlo all note, what CFR’s analysis identifies as a “small but noticeable improvement in some young-adult quality-of-life measures” (Dorn’s characterization) is being extrapolated into such hyper-inflated claims as differences of “hundreds of thousands of dollars” over a lifetime. At the very least, this is highly speculative; logically, it is, as Dorn wryly notes, a trip through a “fallacy of composition fantasyland.”
One last word on the timing of Friday’s release of CFR. The choice of this particular moment is telling: the NYC DoE has walked out of negotiations with the UFT over the teacher evaluation system for the 33 PLA schools, State Education Commissioner King has disallowed the teacher evaluation plans of all ten major urban school districts, and Governor Cuomo has announced in his State of the State address that he would take up the matter of teacher evaluation. Combined with the assertion by the CFR authors that their study supports the mass firing of teachers identified as low-performing by value-added measures, this moment points to the political nature of the authors’ abandonment of the scholarly norms of peer-reviewed publication. Their decision serves their partisan political purpose well; what suffers is the quality of debate and decision-making around important public policy choices.