Setting The Record Straight On Teacher Evaluations:
Scoring and the Role of Standardized Exams

(This is the first of two posts on the new teacher evaluations, focusing on the overall scoring of the evaluations and the role of standardized exams. The second post will take up the question of appeals.)

The 2010 law that established a new framework for the evaluation of New York educators was a complex piece of legislation, and last week’s agreement to clarify and refine that law with additional legislation added another layer to that complexity. The complexity is unavoidable. It is important to have evaluations based on multiple measures of teacher effectiveness, just as it is important to evaluate students based on multiple measures of their learning: more measures and more forms of evidence produce more robust, more accurate and fairer evaluations. Further, multiple measures allowed New York to avoid placing inordinate weight on standardized exams and value-added algorithms, as other states have done with very negative consequences. And it was essential that the bulk of the evaluations be established locally through collective bargaining, with the law only providing a general framework. These objectives necessarily led to a high level of complexity.

But with that complexity, New York is on the road to teacher evaluations that will engage educators in meaningful professional dialogue, provide them with essential supports, and give them the tools to hone their craft. With evaluations based on multiple measures, evaluations will be more comprehensive, more accurate and fairer, and in sharp contrast to other states such as Florida and Tennessee, the role of standardized testing in the evaluation will be minimized. With collective bargaining playing a key role in the shaping of “on the ground” evaluations, teacher unions have the input that will allow us to protect the educational integrity and fairness of the evaluation process.

Unfortunately, complexity has provided a fertile ground for commentaries on the New York teacher evaluation framework that reach alarmist conclusions, with arguments built on a foundation of misinformation and groundless speculation. A widely circulated piece by Long Island Principal Carol Corbett Burris, published on the Washington Post’s Answer Sheet blog, is in the thrall of this alarmist alchemy. Burris decries the law and last week’s agreement as allowing “test scores… to trump all.” Under its scoring, a teacher could be “effective” in all components of the evaluation and yet still receive an overall rating of “ineffective.” The law, Burris concludes, is creating an evaluation system in which schools and students will “lose great teachers.” At the Bridging Differences blog, Diane Ravitch has now taken up Burris’ argument, repeating her main points as gospel.

To comprehend why Burris’ conclusions are problematic, it is necessary to understand the general framework of the teacher evaluation as laid out in the 2010 law and last week’s agreement. Teacher evaluations are based on a 100 point scale, with 60 points of the evaluation based on measures of teacher performance and 40 points based on measures of student learning. These two general categories are further divided into different components.

The measures of teacher performance must include supervisory observations of lessons that utilize a research-based framework of teaching such as the Danielson framework, and these observations must account for at least 31 of the 60 points. But they can also include a variety of other measures which may account for as many as 29 of the 60 points, including peer observations that use the same research-based framework for teaching and portfolios of artifacts of teacher performance such as lesson plans and student work. Districts such as Rochester have already negotiated the inclusion of peer review in their measures of teacher performance.

The measures of student learning are divided into two: measures developed from state assessments, worth 20 points, and measures developed from local assessments, worth 20 points.[1] For those teachers who teach English Language Arts and Mathematics in grades 4 through 8, the state assessment measures will be a value-added growth measure derived from the state’s standardized exams in those subjects. These are a minority of all teachers. A different set of measures called student learning outcomes will be used as the state measure for all other teachers, based either on existing standardized exams or on local assessments aligned with the common core standards. The local assessment measures may be based on entirely different assessments of student learning, such as performance assessments, provided that they are “rigorous and comparable across classrooms.” Or, as a result of last week’s agreement, they may be a new measure, different from the state’s value-added growth measure, but still based on the results of the state’s standardized exams.

Here then is a schematic of the general framework of teacher evaluations in New York:


Measures of Teacher Performance (60 of 100 points)

    Supervisory Observations: minimum of 31 points
    Other Measures, such as Peer Observations and Portfolios of Artifacts of Teacher Performance: up to 29 points

Measures of Student Learning (40 of 100 points)

    State Measures: 20 points
        For Teachers of ELA and Math, Grades 4 through 8: Value-Added Growth from State Standardized Exams
        For All Other Teachers: Growth Measures on “Student Learning Outcomes”

    Local Measures: 20 points
        For All Teachers: Growth on Local Assessments, such as Performance Assessments
        For Some Teachers: Different Measures of Growth from State Standardized Exams

In all of the complexity of these multiple measures, there is one essential point to remember: 80% of the total evaluation – the measures of teacher performance and the measures of student learning based on local assessments – is set through collective bargaining at the district level. This provides teacher union locals with an essential and necessary input into teacher evaluations, allowing us to ensure that they have educational integrity and are fair to teachers.
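The 80% figure follows directly from the point allocation: 60 points for teacher performance plus 20 points for the local measures of student learning. A minimal sketch of that tally is below. The dictionary keys and the `locally_bargained` flag are illustrative names, not terms from the law, and the 31/29 split assumes supervisory observations are set at their minimum.

```python
# Illustrative tally of the 100-point framework described in the post.
# Key names and the "locally_bargained" flag are hypothetical labels;
# the point values come from the text above.
COMPONENTS = {
    # Supervisory observations: at least 31 of the 60 performance points.
    "supervisory_observations": {"points": 31, "locally_bargained": True},
    # Other measures (peer observations, portfolios): up to 29 points.
    "other_performance_measures": {"points": 29, "locally_bargained": True},
    # State measure of student learning: 20 points.
    "state_growth_measure": {"points": 20, "locally_bargained": False},
    # Local measure of student learning: 20 points.
    "local_assessment_measure": {"points": 20, "locally_bargained": True},
}

total = sum(c["points"] for c in COMPONENTS.values())
bargained = sum(
    c["points"] for c in COMPONENTS.values() if c["locally_bargained"]
)

print(total)                     # 100
print(bargained * 100 // total)  # 80
```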

We are now in a position to see the full dimensions of the misinformation and groundless speculation in Burris’ argument. Three central issues present themselves.

First, Burris incorrectly assumes that the entire 40 points in the measures of student learning will be derived from standardized state exams. But the use of value-added growth measures from state standardized exams need not take up more than 20% of the total teacher evaluation – and then only for a minority of teachers, those teaching English Language Arts and Mathematics, grades 4 through 8. Standardized state exams can only be used as the basis for the local measures of student learning if the union local agrees to their use in collective bargaining. I know of no significant New York district where the local union has agreed to the use of standardized state exams as the basis for the local measures of student learning. In New York City, the UFT has taken the position that under no circumstances would we agree to the use of standardized state exams for the local measures of student learning; at the point that the negotiations for the evaluation system for the 33 Transformation and Restart schools broke down over the appeals system last December, we were developing high school and middle school performance assessments for the local measures.

Indeed, insofar as the state’s Student Learning Outcomes could be operationalized with local performance assessments aligned with the common core standards, standardized exams could well play NO role in the evaluation of the majority of teachers. The reality of teacher evaluations under the New York law is thus significantly better than what is found in other state evaluation systems – and dramatically at odds with Burris’ vision of standardized state exam scores “trumping” everything else.

This reality provides important context for the vexing issue of scoring bands. At the behest of Governor Cuomo, the New York State Education Department set overall scoring bands for the teaching evaluation system which are quite stringent: very low scores in both the state and local components of measures of student learning (0, 1 or 2 out of a possible 20 in both components) will lead to an overall ineffective rating, regardless of how a teacher scored on the measures of teacher performance. If both components were based solely on standardized test scores, using unreliable value-added models with high margins of error, as Burris incorrectly claims, these scoring bands would have the potential of producing unfair ratings among outlier cases. But with at least one of these two components being a local assessment that, as it is collectively bargained, should be an authentic assessment of student learning, this objection does not hold. Teachers and their unions have always said that we wanted to be responsible for student learning – our objection was to the idea that standardized exams provided a true measure of that learning. With the inclusion of authentic assessments of student learning, student achievement must be a vital part of our evaluation.
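The scoring-band rule described above can be stated as a simple check. This is only a sketch of the rule as summarized in this post: the function and constant names are hypothetical, and the full set of scoring bands is not reproduced here.

```python
# Sketch of the scoring-band rule as summarized in the post. Names are
# hypothetical; only the 0-2 threshold and the 20-point scale come from
# the text.
VERY_LOW_MAX = 2  # scores of 0, 1 or 2 out of 20 count as "very low"


def forces_ineffective(state_score: int, local_score: int) -> bool:
    """Return True when BOTH student-learning components are very low.

    In that case the overall rating is ineffective regardless of the
    score on the 60 points for measures of teacher performance.
    """
    for score in (state_score, local_score):
        if not 0 <= score <= 20:
            raise ValueError("each component is scored from 0 to 20")
    return state_score <= VERY_LOW_MAX and local_score <= VERY_LOW_MAX


print(forces_ineffective(2, 1))  # True: both components are very low
print(forces_ineffective(2, 9))  # False: only one component is very low
```

The rule requires both components to fall in the lowest band, so a very low score on the state measure alone does not trigger it.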

A compelling approach to the issue of using value-added scores in teacher evaluations is found in the Hechinger Report blog post of Columbia University sociologist Aaron Pallas. Pallas sensibly suggests that where value-added models of standardized test scores are included in a teacher evaluation, the scoring needs to take into account the margin of error in a teacher’s score.

Second, Burris’ commentary ignores the ways in which the New York teacher evaluation law turns over the scoring of different components of the evaluation to local collective bargaining. On the measures of teacher performance, worth 60 points, the selection both of the research-based teaching framework for observations and of the HEDI (Highly Effective, Effective, Developing and Ineffective) scoring cut points for that framework are the subject of local collective bargaining, as are the selection, weighting and scoring of measures other than supervisory observations. On the measures of student learning, both the selection and the scoring of the local assessment are the subject of collective bargaining. The law thus gives local unions the means to prevent the very sort of scenario Burris plays out in her piece, where a teacher is effective on all the measures of teacher performance and all the measures of student learning, yet still receives an overall rating of ineffective. But Burris simply ignores the collective bargaining requirements and speculates that a scoring range for the measures of teacher performance will be established that, conveniently, produces the results that make her scenario work. Is it really necessary to note that teacher union leaders with substantial experience in collective bargaining know how to do simple math, and would not agree in collective bargaining to scoring bands for teacher performance that would produce such an incongruous and unfair result?

Third, in her descriptions of different hypothetical teachers who would be harmed by the new teacher evaluation framework, Burris suggests that teachers who teach students with learning challenges, such as English Language Learners and students with special needs, would be harmed by the new teacher evaluation. That would be true if the measures of student learning were based on the extraordinarily naïve notion that all students can meet the same academic standards in the same time frame, regardless of their learning challenges and their prior learning. But there is no evidence that such educational naiveté informs the New York teacher evaluation framework. When the UFT was working on developing performance assessments as the local assessments for the 33 Transformation and Restart schools, one of our agreements with the NYC DoE was the development of a system of weighting that would account for the academic challenges of a teacher’s students. And certainly one of the very first ‘validity’ tests of a value-added model of growth would be its ability to account for those academic challenges.

While a change of the complexity required by the new teacher evaluation system is daunting, it should not lead us to romanticize a failed evaluation status quo. As it now stands, evaluations are based on a single measure, the principal’s subjective rating of the teacher. In arriving at this rating, the principal may employ any framework or standards s/he finds fitting to observe and rate the teacher. A teacher who is fortunate to have an educator with integrity as his/her principal will have little to fear from this evaluation process. Yet even in these cases, the teacher rarely receives the feedback and support that will allow him/her to grow as a professional educator. This is especially the case for novice teachers and teachers experiencing difficulties in their classrooms, as they seldom receive the support they need to develop their craft and become skilled teachers. And in an era when the powers that be no longer deem educational experience and accomplishment to be the requisite qualities of a principal, far too many teachers find themselves with an unqualified principal, and are victimized by a politicized evaluation process that has precious little to do with education. The evaluation status quo is failing New York students and teachers, as the next post in this series – on the appeals process – will make clear. Change is necessary.


[1] For now, these two components are both 20 points. The law envisions that once the State Education Department has developed a valid value-added model for measuring growth in student learning, which it has yet to do, the state component can grow to 25%, while the local component would shrink to 15%.



  • 1 Bronx Teacher
    · Feb 22, 2012 at 5:11 pm

    How do we evaluate the gym teacher? The art teacher? The music teachers? A school’s IEP teacher? The Literacy coach? The math coach? How about those teachers that do home schooling? How about the ATR’s? What about those teachers that work at CFN’s? Tweed?

  • 2 Leo Casey
    · Feb 22, 2012 at 5:19 pm

    The teacher evaluations are now focused on classroom teachers, so teachers assigned, such as someone working on a CFN, will come later. But your point about physical education, art and music teachers is important, and it goes straight to the fallacy of standardized test scores determining everything, as there clearly are no standardized test scores for these subjects. What we were working on was performance assessments, that would look at the sort of exercises students should learn to do in these classes.

  • 3 BronxTeacher
    · Feb 22, 2012 at 5:40 pm

    Exercises? What, like in art, how well a child draws? Isn’t art a clearly subjective discipline? Will a phys ed teacher be judged on a student’s athleticism?

    Why would anyone want to be a classroom teacher now?

    Leo, I would like to discuss this further and more in depth with you.

  • 4 Neil Friedman
    · Feb 22, 2012 at 5:55 pm

    Thank you, Leo. I think the Union would do its members a service if it sent teams into schools to explain this to our members. There is a lot of confusion, anxiety, worry and fear in the ranks due to all the misrepresentation and speculation going around about the Evaluation & Appeals process. If, as Leo says, the proposed evaluation system will empower teachers, then teachers need to understand how and why it does.


  • 5 Remainders: Advocates call attention to city’s Asian students | GothamSchools
    · Feb 22, 2012 at 6:50 pm

    […] UFT VP Leo Casey analyzes the state’s evals deal and argues that some criticism is off-base. (Edwize) […]

  • 6 Tony
    · Feb 22, 2012 at 6:55 pm

    So what about elementary spec. ed. teachers who teach ELA & Math to spec. ed. students? Aren’t they at a distinct disadvantage in this new evaluation system?

  • 7 mike schirtzer
    · Feb 22, 2012 at 7:06 pm

    Wow, this is great. Mr. Casey does a better job of making a case for our mayor and governor than they can do themselves. You sound as if you are so proud of a system that is going to measure my effectiveness by testing that you have actually come to its defense. If you’re supposed to be protecting teachers’ best interests, then we have no chance. Hey, since you’re in the explaining mood, care to elaborate on why only 13% will have a right to appeal an ineffective rating to an independent arbitrator? And while you’re at it, why is it up to the accused to prove innocence? In America the burden of proof should be on the accuser.
    Listen, you all sound like really nice people, but I’m not sure nice people are what we need in our fight to save schools against ruthless politicians. It showed at PEP, when Occupy put on a stronger-willed, more contested fight than you did. Your techniques seem old and aged. Maybe it’s time for some new voices and fresh ideas: people who aren’t toasting with the enemy of our teachers and students over lunch at a Hilton.
    Our children are staging walkouts, our teachers are organizing communities, and you are all busy defending a flawed evaluation system. Oh, I’m sorry, I forgot: you are also sending buttons and blue ribbons to closing schools. Now that’s some radical action.
    Maybe instead of fighting for this, why don’t you show up at some parents’ meetings and tell them the truth: this entire thing means more testing for our students. Again, it’s our kids who will suffer more than anyone. So while you write nice essays and send out blue ribbons, people like me will continue to organize at the grassroots because we believe in our future.

  • 8 subterranean
    · Feb 22, 2012 at 7:32 pm

    “Teachers rated ineffective on student performance based on objective assessments (tests) must be rated ineffective overall.”

    Please explain.

  • 9 Carol burris
    · Feb 22, 2012 at 7:38 pm

    If only you were right, and I were but an alarmist
    Take a look at bullet 4.
    The judge who upheld your lawsuit agreed with me on this issue.

  • 10 Carol burris
    · Feb 22, 2012 at 7:48 pm

    Bronx teacher,
    Beginning next year your district will have to create assessments that you give in the fall and June to show student growth. They are called SLOs. You will not be allowed to grade them and the district will set the points. There will be SLOs for gym teachers, art teachers, etc. SED suggested that art teachers could be assessed on student ELA results. It is a plane being built in the air (their description) and you and your students are on board.

  • 11 Leo Casey
    · Feb 22, 2012 at 8:28 pm

    Bronx Teacher:
    I think the challenges in measuring student learning in subjects like physical education, art and music are not fundamentally different than the challenges in other subjects, and can be successfully managed, if done with care. To do it right is a complicated, but feasible undertaking. A teacher should not be responsible for a student’s artistic talent, but shouldn’t s/he be responsible for a student developing an appreciation for art, and knowing how to use the tools in a particular artistic medium? Much to discuss here, and the comments section is a limited medium to do that, but I would be happy to dialogue in other settings.

    We should distinguish between different levels of disability for students with special needs. For students with more serious disabilities, such as students in self-contained classrooms, student growth should be measured against their IEP. For students with special needs who are mainstreamed in ICT/CTT settings, where they can and should take the same assessments as their classmates, their growth on those assessments should be weighted to take into account their disability. So long as we take into account the challenges of teaching students with special needs, we can measure student growth fairly.

    I see you launched right into insult mode without having taken much care to read what I actually wrote.

    Subterranean and Carol:
    I addressed the point you both made directly in what I wrote, where I talked of the “vexing issue of scoring bands.” The meaning of that passage in the NYSED press release is that if a teacher scores very low on both the state measure of learning and the local measure of learning — 0, 1 or 2 out of a possible 20 in both components — they will not be able to make the cut score out of the ineffective rating, no matter their score on the measures of teacher performance. Given that there are two different measures, and at least one of them can be an authentic assessment of learning rather than a standardized exam, it is hard to make a convincing argument that a teacher who scores so poorly on both measures is effective in the classroom. I think Sherman Dorn is on point in this regard.

  • 12 Carol Burris
    · Feb 22, 2012 at 8:41 pm

    That still does not address the issue of a teacher who scores effective (by the negotiated agreement), with a 9 and a 9, and then gets 46 out of 60 in the final part. The observation rubrics are demanding, and to expect that principals will just give away points in the final 60 is not realistic. That is the place where teachers grow. Proportionally 46 should be in the effective range. In addition, we will need to give the score prior to seeing test scores. That is the case of effective, effective, effective being ineffective overall. I have nothing to gain (or lose) by any of this, but the teachers you represent do. My motivation as I near retirement is to defend public schooling and the educators who I believe deserve a better and fairer system than this. The judge who upheld your lawsuit saw the same flaws. 0-64 is way too many points to give true balance to the system. I am very surprised that NYSUT did not hold its ground on this very critical issue.

  • 13 Ken Achiron
    · Feb 22, 2012 at 8:42 pm

    “What we were working on was performance assessments, that would look at the sort of exercises students should learn to do in these classes.”
    That’s a real laugh! We have students who come to us and can pitch a fastball at varsity level, while others can’t even throw off the correct foot. We have students who can shoot a three-point shot while others can’t dribble the ball at all. And the reality is there is no “acceptable standard” of measurement of athletic skill that would be fair to determine student skill level as passing, let alone whether or not the teacher of the class was the reason the student could now shoot a free throw, or came to us with that skill already learned. Reality: some of us will be varsity level, some even Olympian, and some will never reach the level of any meaningful test in athletics. On top of that, measuring teachers on the outcome of student performance assessment is questionable, especially in a subject that is not even averaged in at most schools and where, by decree of the State Ed Dept (circa the 1970s), the Commissioner of the time rightfully said that you may not fail a student on skill level, as opposed to lack of participation and lack of work. Worse, as evidenced by results in Fitnessgram, students can come in motivated and test well the day of a pre-test, but come in the day of the test unwilling to work and score well below what they did earlier. To base the evaluation of a teacher on whether the student chooses to work to capacity or “dog it,” where there is no penalty for such behavior, leaves open to question the validity of the use of any performance assessment of athletic skill.

  • 14 Leo Casey “Sets the Record Straight” on the New Teacher Evaluations | assailedteacher
    · Feb 22, 2012 at 8:44 pm

    […] at Edwize today Leo Casey, Vice President of the United Federation of Teachers, addresses the criticisms of Diane Ravitch and […]

  • 15 Steven
    · Feb 22, 2012 at 8:59 pm

    Leo, you must understand that much of the anger and, to paraphrase yourself, misunderstanding on the part of the rank and file as well as others comes from NOT hearing from the Union to which we pay dues and on which we depend to represent us. We were left clueless for close to a week. I ask this sincerely: why was our UNION so late in hearing something like this? That is unacceptable to me, who has literally paid over 20 thousand dollars in dues over the course of my career. Your post here does clarify some things that I have been worried and confused about, I must admit. But you too must understand that there are literally hundreds of scenarios that the teachers in the trenches face day in and day out that might worry them. Here is one of many I have, and it is REALITY BASED. 1. What part does a student’s attendance play in all this? What if a teacher fails a student due to excessive absences, yet he or she still shows up for the test or whatever it is the union collectively bargains? Wouldn’t his attendance and the failing grade that the teacher gave back up the assessment he will most likely show little growth in? Will the teacher be punished for that? Thank you.

  • 16 Gregory McCrea
    · Feb 22, 2012 at 9:17 pm

    This conversation harkens strains of Monty Python’s The Holy Grail: “It’s only a flesh wound,” and “Not quite dead yet,” come to mind.

    Yes, the deal was all about maintaining our right to bargain collectively at the local level, something that has been essential as my team has been negotiating APPR since August and otherwise would not have a seat at the table. But please, let’s not pretend that the Gov getting his way (and then some) will not drastically change the face of teaching and learning in New York, and in the long run negatively impact kids.

    We are already seeing SED and some local administrators (thankfully not ours) suggesting that some form of benchmark multiple choice exam be applied to music and art programs; a ridiculous idea and anything but authentic. This is amplified by unrealistic SED requirements that teachers, and anyone close to the data, are not allowed to score their own students’ tests. Who, in a small district, will score my grade 4 students’ abilities to match pitch, sing expressively, and play instruments?

    The entire evaluation deal, and reporting of it in the press, has served as a smokescreen to the dangers of the Gov’s over-reliance on high-stakes tests. Union leaders have an obligation to shine the light on this. You can be for a viable evaluation system and still speak out against high-stakes tests.

    This deal was struck to prevent the Gov and SED from doing more harm to local schools and students. Teachers across NY will be forced to “take one for the team” in order to maintain some semblance of local control and sanity in their classrooms. The Gov, if allowed to continue to bully from his pulpit, would have instituted a draconian system that seized total control from local school boards and taxpayers. The unions stepped in front of a bullet aimed right at the heart of New York’s education system. It’s time to take action before the Gov can reload and take aim again.

  • 17 Learn more about NY’s new teacher evaluation system | Outside the Cave
    · Feb 22, 2012 at 9:23 pm

    […] evaluation system currently being negotiated in NYC, but in the meantime, UFT VP Leo Casey has a most informative and thorough explanation of what the system is and isn’t up on Edwize.  It answered questions I didn’t even know that I had. Share […]

  • 18 Leo Casey
    · Feb 22, 2012 at 10:01 pm

    These comments are coming at a pretty fast clip, but I will do my best to get them up and respond to them as quickly as possible. Please remember that we welcome criticism, but not name calling, which sends comments straight to the recycle bin.

    You seem to have missed the point I was making in the main post. Your scenario for a teacher who was effective in all components coming up as ineffective in the overall rating is based on your speculation on what might be the scoring range for the 60 points in measures of teacher performance. The reason why your speculation is unconvincing is that no teacher union negotiator worth his or her salt would ever agree to that scoring range for the 60 points precisely because of its incongruous results.

    Are you saying that teachers have no responsibility for our students’ learning? If that is the case, we disagree on a fundamental point. I am of the view, and I think most teachers would agree, that our objection has been to tests that are poor measures of student learning and to being held responsible for factors completely beyond our control. It is possible to create performance assessments that are good measures of student learning. Not easy, not simple, but certainly possible.

    I understand completely that evaluations are a subject which causes great anxiety and fear among teachers, given the tenor of the dark times in which we live, with the ferocious attacks on public education, teachers and unions. I am also quite sympathetic to your desire to know as much as possible about where the UFT is on this subject, and what we have accomplished. I would simply say that to the extent that we have fallen short on the communication imperative, it has not been for want of trying, but because we have been consumed with fighting the battle around appeals and because of the extraordinary complexity of the matter. I will tell you, for example, that we had considered the issue of student attendance in our negotiations over the 33 Transformation and Restart schools and had reached an agreement in principle on including in a teacher’s score only those students who had met a minimum attendance threshold.

    Your points are important.

  • 19 Brobxactivist
    · Feb 22, 2012 at 10:55 pm

    You would not want to upset your friends in Albany. Why upset Commander Bloomberg and General Cuomo? Let’s give them what they want. So give them all the teachers’ jobs on a silver platter. Hey, all union activists, watch out, because if they go after all the union non-sellouts, who will be left to stand up to these principals? Who will fight the city?? If principals give ineffectives and write up every teacher in their school, hey, only 13 percent will be able to appeal. Then let’s say half win their cases; that means that 6.5 percent of ineffective teachers’ jobs could be saved that one year, so the following year the principal knows you could only get an appeal so many times. Now teachers will give up tenure and be fired in 2 years, so that is a gain?? Remember, they set it up like the military system: you’re guilty until proven innocent. Who is the union working for?? They cave in each time. They supported Occupy Wall Street half-heartedly so these radical protesters would not turn on the UFT. In bed with the enemy?? Appeasing the enemy?? Giving them the pie and letting them eat it too?? So now principals will not just submit generic reports and evaluations?? Will they not use the willing formula over and over?? Who would stop them from destroying teachers’ careers? Now in essence is everyone a probationary teacher?? At-will employees??

  • 20 MK
    · Feb 22, 2012 at 11:13 pm

    I am a teacher, and to a certain extent, I understand that the political climate of today made it difficult for the union to fight against the use of value-added growth measures in teacher evaluation. However, I find it disconcerting that a) part of your argument is that value-added scores will only affect a minority of teachers (implying that the majority of us should not be concerned) and b) you are criticizing the people who are pointing out the unreliability of value-added growth measures. That standardized test scores will only count for 20% of English and math teachers’ overall scores does not give me comfort. NYSUT wasn’t able to deliver an evaluation deal that teachers approve of (I don’t know a single teacher who supports the deal), but please, please don’t discourage people from speaking out against the unreliability of value-added scores. In fact, I would like to publicly thank Carol Burris and Diane Ravitch for writing about what this new evaluation system might mean for teachers.

  • 21 TeacherLady
    · Feb 22, 2012 at 11:45 pm

    Wow, the UFT fails us again. So since I have the worst performing group of 7th grade ELA in the school’s history (they have done miserably on the state test every year since the beginning), I will more than likely be rated ineffective this year. The UFT and Bloomberg sure are doing a great job of getting rid of higher paid teachers. The UFT is nothing but a tool in Bloomberg’s pocket. How do you all look yourselves in the mirror every day?

  • 22 TeacherLady
    · Feb 22, 2012 at 11:47 pm

    This article hits the nail on the head.

  • 23 Gary Rubinstein
    · Feb 22, 2012 at 11:49 pm

    Could you clarify something: If someone were to get a 3 on each of the two achievement measures, which would be 6 points total, and then a 60 on principal evaluation, are you saying that this person would not be labeled ‘ineffective’? Surely a score of 6 out of 40 would be ‘ineffective’ in that component, so according to the passage quoted by Carol, it seems that they would be labeled ineffective despite receiving enough points to be ‘developing.’ Or is it that they will define ‘ineffective’ as just getting 4 or lower as a total score on that component?

    The other issue I want to bring up is that the union made a big mistake in agreeing to the 0 to 64 points being ‘ineffective.’ What this does is inflate the 40 point ‘achievement’ category. It is like if the pass score was a 95 then even something that is only worth 10% would suddenly be very valuable.

    They should never agree to what the ineffective cutoff should be without knowing how the scores are going to be calculated. What is going to be an ‘average’ score for that 40 point category? Will it be something where most people get around a 25 and there are a few 35s and a few 15s, but very few scores in the single digits? Same with the 60 point component. As Carol points out, something like a 45 seems reasonable as everyone has things they can improve upon. If 45 is an average score for the 60 point score and 25 is average for the 40 point, there will be a lot of ‘ineffective’ teachers.

    You say that the union can do the simple math and won’t agree to something that will make it so hard to get the 65 points, but if they were really that good, they would have never agreed to making the cutoff score 65 before even establishing how difficult it will be to get these points. Setting the cutoff is actually the very last thing that should be done in developing a complicated rubric like this.

    The way I see it, by making the cutoff score so high, the city has managed to inflate the 40% component in a way that runs against the ‘spirit of the law’ limiting that component to just 40%.

    And let’s not forget that there’s a clause that the commissioner can ‘veto’ any plan that he feels is trying to minimize the role of the value-added metrics.
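The cutoff argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the 100-point composite described in this thread (40 points for the two measures of student learning, 60 for the principal evaluation) and the 0 to 64 ‘ineffective’ band. The function name and the comparison cutoff of 50 are hypothetical.

```python
# Illustrative sketch; the point totals are the ones discussed in this thread.
ACHIEVEMENT_MAX = 40   # state (20) + local (20) measures of student learning
PERFORMANCE_MAX = 60   # measures of teaching performance


def achievement_points_needed(performance_score, cutoff=65):
    """Fewest achievement points that lift the composite to the cutoff.

    A result above ACHIEVEMENT_MAX means the cutoff cannot be reached.
    """
    if not 0 <= performance_score <= PERFORMANCE_MAX:
        raise ValueError("performance score must be between 0 and 60")
    return max(0, cutoff - performance_score)


for performance in (60, 50, 45):
    at_65 = achievement_points_needed(performance)             # cutoff as agreed
    at_50 = achievement_points_needed(performance, cutoff=50)  # hypothetical lower cutoff
    print(f"performance {performance}/60: need {at_65}/40 at cutoff 65, "
          f"{at_50}/40 at cutoff 50")
```

Under these assumptions, a teacher with 45 of 60 on the principal evaluation needs 20 of the 40 achievement points to escape ‘ineffective’ at a cutoff of 65, but only 5 at a cutoff of 50, which is the sense in which a high cutoff raises the practical weight of the 40-point component.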

  • 24 Rob A. Bouie
    · Feb 23, 2012 at 2:00 am

    The more correct statement is that NYSUT made a deal late at night without its council of presidents’ knowledge. That deal was for 20%. The state prostituted itself for RT3 and changed it to 40%. The real race is to raise it to 100% and continue to take the plane apart as we fly it.

  • 25 Carol Burris
    · Feb 23, 2012 at 5:15 am

    So what you are saying is that if a solid teacher earns 46 out of 64 points you want that teacher to be less than effective? Why? Do you honestly believe that the scale for the final 60 should not be proportional? What should ineffective be in the 60? 0-35? You know and I know the only points that matter are the composite and in the first category they are a bell curve. Look at the Race to the Top application. Even the revised app predicts over 10% of teachers in high needs schools as ineffective.

  • 26 Carol Burris
    · Feb 23, 2012 at 5:24 am

    If you do not think that my suggestion of effective effective effective is viable because you do not believe someone with well over half the points in the 60 is an effective teacher (though that boggles my mind), then press the legislature to lower ineffective overall to 0-50 as suggested by the judge. Frankly, the whole system will become unworkable and will provide fodder for columnists until it falls, but at least fewer good teachers will be humiliated and hurt. And for heaven’s sake, don’t attack someone who is working to support teachers, students and public schools, which are under attack, as you well know. The governor knew full well what he was doing: he made unions own the problem. When the problems arise, it will be your agreement. This is a very sad time for public schools.

  • 27 NY’s Teacher Evaluations: The Mystery 20% | assailedteacher
    · Feb 23, 2012 at 6:08 am

    […] Leo Casey was gracious enough to respond to my critique of his defense of the new teacher evaluations here in New […]

  • 28 Carol Burris
    · Feb 23, 2012 at 9:31 am

    6 on the first 2 with a 60 in the final equals 66, which is developing. You would need a TIP. 2 and 2 (ineffective) plus 60/60 (perfect) would land you in ineffective.
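The two cases above can be checked with a short sketch. It assumes only what this thread states: components worth 20, 20 and 60 points, a composite of 0 to 64 rated ineffective, and 65 as the start of developing. The function name is hypothetical, and the bands above developing are not distinguished here.

```python
def overall_band(state, local, performance):
    """Return the composite score and whether it clears the 'ineffective' band."""
    if not (0 <= state <= 20 and 0 <= local <= 20 and 0 <= performance <= 60):
        raise ValueError("component score out of range")
    total = state + local + performance
    # A composite of 0-64 is 'ineffective'; higher bands are lumped together.
    label = "ineffective" if total <= 64 else "developing or better"
    return total, label


print(overall_band(3, 3, 60))  # lowest 'developing' score on both 20-point measures
print(overall_band(2, 2, 60))  # 'ineffective' on both 20-point measures
```

Even with a perfect 60 on the last component, 2 and 2 on the first two totals 64, one point short of the 65 needed for developing.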

  • 29 Leo Casey
    · Feb 23, 2012 at 9:37 am

    Bronxactivist and TeacherLady:
    The UFT gives Bloomberg what he wants? We are in Bloomberg’s pocket?

    You and I live in different universes.

    It is my view that part of the job of the union is to minimize the role of standardized exams in education generally and in teacher evaluations in particular. If you step back and take a look at the overall evaluation framework, I believe that the evidence is that we were successful at doing that. Certainly, this is far better than what other states are doing, such as Florida where standardized exams are counting for 50% of the evaluation. You may object that you don’t want any role for standardized testing for any teacher, but I think that such a goal is not realizable in the current dark times, and that it is essential to get the best possible evaluation system in the current context in which we live. There is a lot of talk of union misleadership, but in my view, we would have failed as leaders if all we did was advocate for a maximalist position we knew we would not win, and in so doing ended up with an evaluation in which standardized exams played a greater role.

    The sole basis for that line in the Chancellor’s and Commissioner’s press release are the scoring bands. You are correct that a teacher could get a 3 out of 20 (lowest possible score in the developing band) in both the state and local components and 60 on measures of teacher performance, and just make it over the line into developing. (Parenthetically, I would argue that a lot of the hypotheticals posed in this debate are unlikely to the point of implausibility; do you really think that a teacher could be so low on two different measures of student learning, at least one of which is an authentic assessment, and have a perfect score, 60 out of 60, on the measures of teaching performance?) If we were devising a system on our own, we certainly would have used different bands for the measures of student learning, but with the ability to bargain collectively the bands and cut scores for the 60 points in measures of teaching performance, we can mitigate the effects of those bands and ensure that a teacher who is effective or developing in all of the different measures will receive an overall rating that is effective or developing.

    The reality of any negotiations is that even when you possess a great deal of leverage (certainly, a whole lot more leverage than teachers have in the current climate), you never walk away from a deal getting everything you want. It just doesn’t do to pick out this point or that point on which a compromise was made, and in isolation, say that it should not have been agreed to because it falls short of what we would want to accomplish, at least not if you want to take the responsibilities of leadership (as opposed to the unfettered role of freelance critic) seriously. One needs to look at an agreement in its entire context, particularly around the defense of essential interests, and ask whether in a given balance of power, a better bargain was possible. I would argue that getting an evaluation framework which minimized the role of standardized exams far below that of other states instituting new evaluation systems is a real accomplishment, and that it can only be easily dismissed when one operates in a world of absolute principles, quite worthy in the abstract but entirely separated from the ‘on the ground’ realities of the balance of power.

    It is not very fruitful, I would argue, to consider a discrete element of an assessment out of context with the other elements. There is nothing intrinsically wrong with a scoring scale that places the passing grade at 65; that is how we grade our students all the time. The problem here is the way in which that passing grade of 65 interacts with the other elements of the system. Rather than talk about what would make a sensible score scaling for the 60 points of the measures of teacher performance as an isolated element, we need to look at how to do a distribution of those points, which are 3/5 of the total, that comes closest to the end results we want, such as the goal of having a teacher who is effective on all the discrete measures ending up with an overall effective rating.

    Finally, disagreeing with an argument is not attacking the person who makes that argument. We need to learn how to have civil but vigorous debate without questioning the motives and character of those with whom we disagree. I don’t view your criticism of NYSUT and teacher unions as an attack on those institutions or the persons who lead them, just a disagreement on the policy and strategy we have pursued in this arena. Why should criticism of your arguments be any different?

  • 30 Leo Casey
    · Feb 23, 2012 at 10:12 am

    A word to prospective commentators. While you may publish your comments under any monicker you like, we do require you to provide real names and real email addresses with your comments. Also, sending three comments from the same computer with different names suggests that either you are trolling the comments or you should see someone about a multiple personality disorder. As you can see from the above, we have no difficulty in publishing numerous comments from the same person, when s/he identifies themselves as such.

  • 31 Gary Rubinstein
    · Feb 23, 2012 at 10:27 am


    If the 40 points are graded fairly, I agree that it will be nearly impossible for a teacher to get 6 points on that and also 60 on the principal evaluation. But it seems to me that the ‘cut score’ of 65 was devised to ‘protect’ against just this nearly impossible scenario. So everyone must suffer because of this.

    I don’t agree that the number 65 has any true meaning, just because school tests are designed so that most students should score 65% or better.

    When an inaccurate component is 40% (when you write about the 20% based on other assessments and not the state tests, that has yet to be ironed out) then the passing score should be something like 50 points. I am concerned that the union did not truly understand the ramifications of agreeing to the 65 point cutoff. It will really constrict what they are able to do with the rest of the negotiation. I think they were outfoxed by the DOE.

    I would be happy to volunteer to look over some of the decisions and preemptively identify other potential flaws like this rather than just complain about them after they slip through. Is there a way I can get involved?


  • 32 Gary Rubinstein
    · Feb 23, 2012 at 10:31 am

    Also, it is not clear that we have a framework that is ‘better’ than other places. At a superficial level, D.C. uses 50% test scores compared to 40% test scores so it seems that we are weighing them less. But it all comes down to the details. If getting a high score is easier on their 50% than on our 40% then ours is not better than theirs. This is what I’m concerned about. It might seem that we are minimizing test scores, but there are still a lot of negotiations to be worked out. I do think that it is still possible for the union to ‘save the day,’ but I am worried because they seem to be off to a poor start. They have lost the first round, I think with this inflated cut score.

  • 33 BronxTeacher
    · Feb 23, 2012 at 10:32 am

    Leo, I at the very least have to respect the fact that you are in here taking the heat. But several questions do remain.

    Whatever the final agreement is, is this evaluation agreement something the rank and file gets to vote on?

    You mentioned earlier that perhaps students in art or music can be assessed by a new found appreciation of music or art. In my opinion, you have a talent for art and/or music or you don’t. Not all can pitch like Sandy Koufax, play the drums like Neil Peart, or paint like Jackson Pollock. So what to do?

    My wife is an artist. She considers Jackson Pollock puke on canvas. I, on the other hand, think he’s great. Who’s right? Me or her?

    Anyway, thank you for agreeing that this discussion forum is rather limited. I hate 2-D discussions as I am sure you do. I would like to cordially invite you to appear as a guest on my radio show ASAP. I’m sure you know how to contact me, or may I you? Let me know!

    Thanks for your time.

  • 34 Mike
    · Feb 23, 2012 at 10:56 am

    Leo, we fought hard and we lost. The forces aligned against us were simply too strong. There’s no shame in admitting that.

    We won’t know for sure until the local measurements are agreed on, but I suspect that good teachers will suffer as much as struggling ones under this new law. I think most of us prefer to focus on teaching rather than on justifying what we do. Those of us who have had to do SMART goals have little appetite for further use of pseudo-scientific performance assessments. They take up a lot of time and energy and ultimately distort our priorities in the classroom.

    Like I said, we tried, but we lost. It’s now up to the parents and principals to save us from this disaster.

  • 35 Leo Casey
    · Feb 23, 2012 at 11:11 am

    In Social Studies, there are many questions — all of the most important questions — to which there is no one right answer, just as there are debates over what constitutes good art. I require my students to make a cogent, logical argument for whatever position they take. I think that a similar principle applies to art appreciation.

    Email me off line (LCasey@UFT.ORG) and we can talk about how you can become involved.

    I don’t agree that performance assessments are pseudo-scientific. In fact, in science, a performance assessment would characteristically take the form of doing real scientific experiments. A student would learn the scientific method because he or she would have to use it in that experiment, and they would be assessed on how well they did that. If you spend some time looking at the Performance Assessment Consortium’s work, I think you will find it impressive.

    And I don’t agree that we lost. Far from it. But I now need to spend some time away from comments to finish part 2, so I won’t belabor the point. {-;

  • 36 Multiple Measured Madness « Opine I will
    · Feb 23, 2012 at 11:27 am

    […] to the contrary, ”researchers have documented a number of problems with VAM as accurate measures of teachers…  Yet a very important percentage of teacher ‘effectiveness’ will be determined based […]

  • 37 Tony
    · Feb 23, 2012 at 11:59 am


    So where does it say that spec. ed. teachers will be rated the way you suggest? I have not seen that in writing anywhere.

  • 38 Michael Wood
    · Feb 23, 2012 at 1:20 pm

    Yes, many remaining issues will be negotiated locally. However, doesn’t SED have the power to veto any provisions they deem unworthy? Does anyone doubt they will use this power? In my opinion, this concession means local negotiations will be useless in the long run.

  • 39 bronxactivist
    · Feb 23, 2012 at 2:03 pm

    What will happen to tenure? What safeguards are there against abusive principals? The union will pick and choose who gets an appeal? Is this a General Electric management model: get rid of the workforce each year?

  • 40 John Elfrank-Dana
    · Feb 23, 2012 at 2:31 pm


    The fourth bullet point in the agreement is not ambiguous – Teachers who can’t get passing marks for students’ standardized test scores will be fired.

    Furthermore, many UFT members have a reflex that they have been sold out again with just about anything the union negotiates – as we wait to see teacher rankings in newspapers. Those scores and rankings were supposed to be confidential.

    Remember Teaching for the 21st Century? It created Art. 8 of our contract, which contains the professional development and teacher assessment provisions. The collaborative spirit of the document has been dead for years. When that came out in the late ’90s it was held up as a great accomplishment in collaboration between the BoE and UFT. But the reality today is that fear is back in the observation process; gotcha is alive and well. A principal who violates the Teaching for the 21st Century protocol or Chancellor’s memo 80, which calls for individual pre-obs for struggling teachers, can still put the resulting U-rated lesson in a teacher’s file, and there’s not a damn thing the UFT can do about it because in the 2005 contract it agreed to waive the right to remove letters from the file.

    The evaluation system has a critical flaw already with bullet no. 4. I have no reason to have any confidence, given the history of bad contracts with the UFT that undermine our rights, that any redeeming value you may negotiate in the other aspects of the evaluation system won’t be gutted in the near future just like it was for Teaching for the 21st Century.

    We should reject that standardized testing threshold, and negotiate in public a multifaceted agreement that builds in real input from teachers and other stakeholders, but not educrats in Albany or their corporate overseers.

  • 41 John Q. Teacher
    · Feb 23, 2012 at 2:32 pm

    Hello Leo. I have two questions: 1) My understanding is that the union won the court case in that the state may not base 40% of a teacher’s evaluation on testing. (I believe the law was set at 20%.) Why did the union not stick with the fact that they won this court case and allow the 20%? 2) What will the purpose of teacher tenure be when this deal goes through? Many teachers at my school are asking about this. Thank you for your insight into these questions.

  • 42 Leo Casey
    · Feb 23, 2012 at 3:46 pm

    The approaches I suggested are the sort of issues that have to be negotiated locally, which we had started before the negotiations over the 33 schools broke down. What I suggested is the type of position we could take. Students with special needs is one of the most complicated issues, but we had started working out a position along the lines I laid out. I am confident that agreement could be reached with the DoE in that vein.

    It is important to distinguish between legal and political restraints on authority. SED has final regulatory power as a matter of law, but there are political restraints on abusing that authority, especially after the Governor weighed in on this agreement. I don’t think we need be overly concerned about this. Since this is a law, the state legislature could always pass a new law, too. Some possibilities are remote, and not the main issues before us that we need to address.

    John at 40:
    The reading you are giving that passage is too narrow: if it were only standardized test data, it could not have the practical impact the release suggests. It has to be both measures of student learning, including the second 20%, for it to have that impact.

    John at 41:
    You are right that the law gave the state measures based on state standardized exams 20 points, and that the court case confirmed it. Where you are mistaken is the notion that the unions gave that up in the agreement. There were very small upstate districts that said they did not have the capacity to run their own local assessments, and wanted the option to use the state exams for both 20 point measures. All that the agreement does is allow them to do that, if they can reach agreement on it in collective bargaining. For the rest of the state, the unions simply say that we are not prepared to bargain the use of the state exams for the second 20%, and it is the end of story. It was our view that we did not lose anything essential, so long as it had to go through collective bargaining.

    Teacher tenure is really nothing but due process: the idea that you can not be fired without the DoE proving incompetence or misconduct to an independent arbitrator. As I will note in my second post, this agreement secures that right to due process.

  • 43 John Q. Teacher
    · Feb 23, 2012 at 5:36 pm

    Thank you Leo for your response. However, I am still confused. Did the state originally say that only 20% of an evaluation can consist of testing or did the original law claim that there must be a 20/20 evaluation system that consists of test scores? (State and locally designed) Lastly, it seems to me that the way the law will be implemented is that if a teacher is found to be ineffective on the test result section of an evaluation, he or she will be terminated. How does tenure protect that person? Will they face a 3020a process? Once again, I and all NYC teachers appreciate your help in answering our questions. There is much confusion going on and your help is welcomed greatly.

  • 44 Tony
    · Feb 23, 2012 at 7:19 pm

    Leo, but if you are in an upstate district that has already negotiated to use the tests for 40%, it appears to me that spells trouble.

  • 45 Richard Skibins
    · Feb 23, 2012 at 7:45 pm


    What about when a principal purposely places all of the learning disabled, emotionally disturbed and the difficult to teach children in a targeted teacher’s class? What if that principal purposely observes, say, the librarian or health teacher in that class? How can that teacher get a fair shake? And what can be done to discourage principals from going after teachers in such a fashion? Will those who railroad teachers by giving them a trumped-up “ineffective” be punished in such a way as to discourage such actions by others?

  • 46 Ken Achiron
    · Feb 24, 2012 at 2:49 am

    Are you saying that teachers have no responsibility for our students’ learning? If that is the case, we disagree on a fundamental point. I am of the view — and I think most teachers would agree — that our objection has been to tests that are poor measures of student learning and to being held for factors completely beyond our control. It is possible to create performance assessments that are good measures of student learning. Not easy, not simple, but certainly possible.

    Please don’t put words in my mouth. I never said teachers have no responsibility for students’ learning. I said, in a stronger way than you, that as the system currently works, there is no validity to using student performance tests to measure what has been learned. There are any number of variables beyond our control in physical education, as in many other subjects, that make determination of a particular teacher’s impact questionable. To further complicate the issue, the requirements for passing physical education center on participation in meaningful activity. Failing a student, which means not allowing them to graduate, because they cannot throw a ball straight or cannot do a shoulder stand or cannot lift 100 pounds over their head or dance a waltz is not done. Try telling math teachers their grade is based on participation and failing to complete the math exam can have no bearing on whether they pass the class or not. At the least, they would be running to the UFT about inappropriate pressure to pass students. But it is exactly this difference that will call into question the validity and reliability of using performance assessments in physical education to determine the impact of a particular teacher. Sure, if a student is in my class he will learn more gymnastics than if he is in the volleyball coach’s section of gymnastics. But what happens when my star player is programmed for the other teacher’s section? Who gets the credit? And worse, what happens when he decides that because he is not in my section, and he can do what he needs to do to pass for graduation, he takes the performance test used for grading teachers lightly and does little in it, causing that teacher a poor grade? Never happen? Probably not for 75%. But 25% unreliability is way too high to accept. 25% a real number? You tell me. You are basing this whole house of cards on the idea that students will be motivated when they take the performance assessments. What happens when they aren’t and know there is no penalty for not taking it seriously? Actually, not much different from the complaints I already hear from math teachers about student attitudes while taking the Acuity tests. I don’t have a problem taking ownership of the learning of my students. But when you measure, make sure you are measuring me.

  • 47 Yaya Yo
    · Feb 24, 2012 at 8:59 am

    “Money is the root of all evil!” How can something so fundamentally and morally unfair be passed into law? Are we humans so filled with venom that we must look to degrade and dehumanize each other by any means necessary? When will people of power realize that there is no price that can replace the variable called the human factor?

  • 48 Leo Casey
    · Feb 24, 2012 at 1:31 pm

    The law created two categories of measures of student learning, one determined by the State Education Department and one determined locally through collective bargaining. It is explicit that the state component will include value-added measures derived from state standardized exams, where that is possible, but the local component is left entirely up to collective bargaining at the local level. It is the view of the UFT, and of every significant New York teacher union local I have seen, that the local measure NOT be based on standardized exams. Since the local component needs to be negotiated with the union locals, that would mean that the use of standardized exams is limited to the 20 points in the state component.

    Now, the scoring bands created by the State Education Department have created a situation in which a teacher who scores very low on BOTH the state and local measure of student learning (0, 1 or 2 out of a total 20 points) will necessarily receive an overall rating of ineffective. However, that would mean that a teacher would have to score very low not only on the state component with standardized tests, but also the local component, which in NYC and most places around the state, will not be based on standardized tests.

    Tenure is nothing but a guarantee of a due process hearing in front of an independent arbitrator before a person can be dismissed. You probably wouldn’t know this from reading the teacher bashers, but tenured teachers are dismissed for misconduct, incompetence and dereliction of duty. They simply have the right to a hearing before this can happen. I will discuss this more in my post on the appeals process.

    I can live with a local union making a decision that we would never make in New York City and in most teacher union locals around the state. They are accountable to their local members, and if the local leadership does not reflect the will of their members on an issue as important as teacher evaluation, those members will elect new leadership. And what is negotiated in one agreement can be renegotiated in a different way in the next agreement.

    Thank you for clarifying your position. I think we have to be particularly conscious of how we present our stance on this question, because if we only talk about what we find unacceptable, and never talk about what we would find acceptable, it is not hard to tar us with the brush of “defenders of the status quo” who want nothing to change. And change on teacher evaluations is necessary.

  • 49 Mike
    · Feb 24, 2012 at 6:55 pm

    Leo, I took a quick look at the PAC’s website and I agree that their work is impressive, but it seems designed to evaluate an individual student’s work rather than to provide the basis for proving that a teacher produced “growth” among students. My understanding of the state law is that some sort of baseline would need to be established against which each student’s progress would be measured, and then the teacher would be evaluated according to the results. If the value-added scores can’t get this right with simple standardized tests, I don’t see how it can be done with more complex performance assessments. In other words, I don’t see anything wrong with using these assessments to measure students, but I don’t think they’re a good way to measure me.

    The PAC assessments also seem designed as an alternative to a Regents-prep curriculum. I’m still teaching to the test.

    As for whether or not this is a “loss”, I have to admit that I didn’t realize that only VERY low scores on BOTH measures of student growth would result in an automatic “ineffective” rating. I see it now as more of a loss in terms of the aggravation it will cause and misplaced priorities it will create rather than being an issue of job security for good teachers. I say that, though, without fully understanding the implications of the Danielson rubric, which seems biased against teachers with difficult classes and seems to contain impossibly high standards for a “highly effective” rating.

    As a parent, though, I consider anything that places greater importance on standardized tests to be a big loss for my kids.

  • 50 Elisabeth J
    · Feb 24, 2012 at 8:23 pm

    As a NYC teacher I am afraid. I teach in a district that was identified as a DINI long before I began teaching (I began in 2001). My MS is sandwiched between four NYCHA housing projects. We often get the students other MS/IS/JHS will not take. We get the lowest performing students, students without IEPs who have serious learning and behavioral disabilities. (A student of mine was tested borderline retarded. At the IEP meeting, it was suggested that the student be moved from gen-ed to a 12:1:1 or else be retained in the same grade for a 3rd year. The parent refused ALL spec. ed. services.) So as his math teacher I am being held responsible for his test scores? I am not a special ed teacher. Many of my students come to school tired, hungry, abused, unclean and without school supplies. I have 14 year olds in my classes. A girl who only comes to school on trip days, to go on trips. I have a student whose older brother is pimping her out, so she hardly comes to school and when she does all she does is sleep in class. Students who attended summer school, did not pass their summer school classes or tests but were pushed ahead because their elementary school didn’t want them anymore. We are now getting students who have been de-certified from D75 schools without any transitional services. We tried to get rid of a student who burnt a teacher, 2 students who brought a combined total of 17 knives to school, a student who threatened to shoot a teacher 3 times and left the building to go get his gun, another student who left the building to go get his gun to shoot up the school, students who drink (Grey Goose and Absolut) in the cafeteria and the girls who are pregnant. I work my BUTT off EVERY DAY in my classroom. Yet according to these TDR I am below average!!! I am afraid. I am afraid. I am afraid. Who’s protecting me, the teacher? Who’s holding the parents and students accountable? The bureaucrats protect and empower parents and students but crap on teachers. The UFT stands by and watches, while twiddling their thumbs, as teachers are belittled, berated and abused. The superintendent of my district will not grant tenure to anyone in my school because our test scores are so low, but we are consistently sent the lowest performing students. I am afraid. My goodness, what is happening? At what point in the history of education did teaching shift from teaching to surrogate parent? I am not the bad guy.

  • 51 Bronx Teacher
    · Feb 25, 2012 at 11:15 am

    Leo appears with me this coming Tuesday night live……


  • 52 John Elfrank-Dana
    · Feb 25, 2012 at 12:36 pm

    What will the second 20 percent be then? It’s up to the UFT to negotiate. How about student portfolios? How about student surveys? Will those be allowed? I bet not!

    It will be some “objective” measure, otherwise corporatists like King won’t approve. We are doing it now at Bergtraum. They are called Performance Based Assessments (PBAs). They ARE a kind of standardized test that measures progress of students by setting a benchmark and then a summative assessment at the end of the term or year. The UFT will accede to the demands if the past is any indication.

  • 53 Dina Strasser
    · Feb 28, 2012 at 10:09 pm

    Thank you, Leo, for responding to these comments with consistent cogency and civility. Thank you, commenters, for helping me understand these issues more clearly.

    Leo, I do have to chime in with John @52 and say that I fear that your faith in the second 20% being represented by a measure other than a standardized test is misplaced. My district *immediately* responded to the request for a second measure with the proposal that another set of standardized exams be bought and used.
    One need only look to Vermont to understand that standardized exams, compared to portfolios or other performance-based measures of learning, are cheap, both in terms of money and person power. Indeed, standardized exams were developed precisely in the effort to provide a quick, cost-effective, and “scientific” means of evaluating knowledge. Given our current fiscal climate and our quintessentially American fixation with numbers, who can blame districts for looking to a second set of standardized tests?

    This is not to say that my local won’t bargain differently; however, as John points out, it doesn’t change the fact that the seduction of “efficiency” and “objectivity” will hold sway over local stakeholders as well as our DOE. As long as it does, we teachers face the possibility that our performance will be determined not, perhaps, by state standardized testing, but by a combination of standardized testing at large. This merely replicates and exacerbates the very real and very large margin of error in VAM.

    It also makes the distribution of the 60% all the more important, which is why, I surmise, Carol Burris feels it must be considered in isolation, and be thoroughly and consistently fair.

  • 54 John Elfrank-Dana
    · Feb 28, 2012 at 10:16 pm

    What? I get the last word? Where did everyone go? :) Thanks for your thoughts Leo, you have handled dozens of questions here alone.

  • 55 Dina Strasser
    · Feb 29, 2012 at 4:48 am

    I should clarify that since we’re still negotiating our APPR where I work, the word “proposal” for the use of standardized tests is not the most accurate (sorry), nor do I want to imply that our reps have not been keeping negotiations confidential. Suffice it to say, in the broadest strokes, that standardized tests have been one of the options discussed as the local 20%.

  • 56 Peter Lamphere
    · Feb 29, 2012 at 11:14 pm

    Leo –

    It is important for the full text of the agreement to be released – for clarity about what the constraints are on the locally negotiated 20%.

    For example, John King stated (at minute 27 of the press conference announcing the deal) that both sections of the student achievement portion will be based on tests. What does the agreement actually say?

    Given that King will have the power to reject or approve local deals, it seems that his interpretation will have a lot of sway.

    I think that John Elfrank-Dana is also raising an important point about Teaching for the 21st Century – the mandate that one observation be unannounced will actually eliminate a contractual right (and a right that Leo has vigorously defended in the past) for teachers in danger of an unsatisfactory rating to have a pre-observation conference focused on the content of their lesson.

    Basically, what Leo is asking us to do is trust that the union, in what he admits is a position of weak leverage, will nonetheless come out with a deal that does not make concessions – even though the people with the power to approve the deal are already saying that the concessions have been made.

    The best way of putting this to rest is to release the full text of the agreement.

    Finally, if people would like an analysis of the politics behind the agreement, check out http://socialistworker.org/2012/02/28/bitter-fruits-of-race-to-the-top.

    In Solidarity,


  • 57 Darrell McElroy
    · Mar 2, 2012 at 10:16 pm


    I am the Executive VP for my local union (Monroe-Woodbury Teachers Association). I am also an NBCT (AYA Social Studies/History). I believe in fair evaluations, but what we are about to be subjected to is neither fair nor educationally sound. I believe that the statewide union needs to change course philosophically before it is too late.

    I realize that I and other like-minded local leaders are cast aside as outliers, but I think that we represent a silent majority of teachers around the state who do not feel as though our elected union leaders have worked on our behalf or on behalf of public education.

    I admire your ability to try and answer all of these questions, but what NYSUT and the UFT will never be able to do is demonstrate to the dues paying rank and file how this system is fair.

    We are forced to do this because our leaders bought into the RTTT blood money that will do nothing to improve education, help students, or retain teacher jobs. My district is getting $130,000 ($100,000 of which goes to the local BOCES). The cost of these reforms will soon outstrip the paltry federal funds we received and no doubt people will lose their jobs in order to fund this monstrosity.

    Carol Burris is right on. She and Sean Feeney have been at the forefront of this issue, which is why I and many others have signed on to their letter.

    We are being asked to negotiate for a car that we do not want to buy. Common sense tells us that using flawed exams that were not intended to measure teacher effectiveness is problematic.

    Common sense tells us that agreeing to an evaluation model that will eventually include very unreliable Value Added formulas is problematic. While the formula used in the TDRs in NYC will not be the one adopted at the state level, what happened this week is an eye opener for the state’s teachers and parents. The NY Times had a great article about how really good teachers were unfairly scored as a result of the VAM. One teacher’s students scored 1.22 standard deviations above the mean (89th percentile), but the teacher scored a zero because the students did not score 1.84 standard deviations above the mean (97th percentile) (http://www.nytimes.com/2012/02/26/nyregion/in-new-york-teacher-ratings-good-test-scores-arent-always-good-enough.html). I understand that the teacher had high achieving students and with a VAM you have to be able to show growth. While we do not have a VAM yet, we will, and it will be worth 25% of the overall score. That further undercuts the argument that we have all of this “control” over the 80%. Who would want to teach high achieving students if this could be the outcome? Can you explain how this part of the evaluation is right in this case?

    Common sense tells us that there is no way to protect our members against the SLOs that will be generated for 20% of the evaluation. Nobody can clearly explain the SLOs, and these are not even within the ability of our local to bargain. From what I have been able to discern, the school district could base the 20% on a school-wide metric if it chose to do so. We really have no leverage.

    Common sense tells us that caving in on the scoring bands (after we won that argument in court) was highly problematic. I almost fell out of my chair when you compared the 65% to what we do when we grade our students. That is precisely the problem: we should not be generating a “grade” for teachers on a 1-100 scale. The broad category (HEDI) should be enough. Based upon your logic, since an 85 is considered mastery on the Regents exam then shouldn’t highly effective be 85-100?

    How is the collective bargaining process protected if SED has to review and accept each proposed APPR? With the leverage of state aid, I doubt even a resurrected Al Shanker could negotiate the types of things you think we can negotiate. I have already heard of an instance where SED did not approve an APPR that included attendance as a mitigating factor on the locally developed 20%. While it was a SIG school, I believe that SED is going to take a hard line approach with the rest of the state’s districts.

    To give you another example, in a recent conference call with NYSUT it was suggested that for the 60% an administrator has to give a score based upon the rubric selected. While I understand that concept in theory, I actually teach in a school building with real life administrators. They are human and evaluation, like grading, can be subjective. I understand that it is supposed to be evidence based, but in spite of those platitudes, people can look at evidence and data differently. How do we truly ensure inter-rater reliability?

    But for the sake of argument I will accept what I was told. What was said next was even more preposterous: an E would mean getting 57-58 points and H would mean getting 59-60 points. That sounds fantastic, but doesn’t pass the smell test in terms of actually appearing in a CBA. Like I said, Al Shanker could probably not negotiate scoring bands like that.

    The fact is that even if we were to negotiate fairly favorable terms, SED can reject it. Yes, they have to point out the deficiencies and theoretically NYSUT will defend us. However, I do not have a high degree of confidence that we will win those battles. As the January deadline for state aid gets closer, who will win that battle?

    I appreciate the idea of having a seat at the table, but it did not take us off the menu. In some ways, we have been complicit in our potential demise. We should not have been a party to something that we all know is wrong. We made a political calculation to play nice and this deal is getting worse all the time. We need to stand up against what we know is wrong for public education, kids, and teachers.

    I am not concerned about the perception that we are only protecting bad teachers. Bad teachers do not serve anyone any good and I am not interested in doing anything more than ensuring due process for them. This system doesn’t even guarantee the bad teacher a basic level of fairness.

    Having said that, my even greater concern is that many good teachers will lose their jobs while we blindly figure this out. Students will be no better off as we lose time to meaningless testing/test prep that has no relation to critical thinking or real life. Eventually we will conclude that this is lunacy, but what will be left of public education at that point?

  • 58 The Myth of Budget Cuts in American Education | assailedteacher
    · Mar 4, 2012 at 7:45 pm

    […] new teacher evaluation agreement promises to pour gasoline on this no-bid contract wildfire. At the very least, we know […]

  • 59 New York City’s flawed data fuels the right’s war on teachers | Hotspyer – Breaking News from around the web
    · Mar 4, 2012 at 10:37 pm

    […] principal basis for making consequential decisions about teachers.” To its credit, New York isn’t using the Teacher Data Reports currently in the news as the sole basis for its ultimate teacher […]

  • 60 New York City’s flawed data fuels the right’s war on teachers | TeaBaggers Of America
    · Mar 5, 2012 at 2:27 am

    […] principal basis for making consequential decisions about teachers.” To its credit, New York isn’t using the Teacher Data Reports currently in the news as the sole basis for its ultimate teacher […]

  • 61 New York City’s flawed data fuels the right’s war on teachers | FavStocks
    · Mar 5, 2012 at 3:19 am

    […] principal basis for making consequential decisions about teachers.” To its credit, New York isn’t using the Teacher Data Reports currently in the news as the sole basis for its ultimate teacher […]

  • 63 Simon
    · Mar 5, 2012 at 10:15 am

    I think this is less a matter of legal possibility than a way to explain that the concerns here go beyond the question of being a public figure. Theresa Pinilla points out the first example that comes to mind, but I think there’s another one that is perhaps better and addresses the fragility of any test-score derivative: suppose a news organization acquired the names of all the residents within a 50-mile radius who appeared on the federal No-Fly list. The federal government acknowledges that the list is highly imperfect, and while there are certainly some suspicious characters on the list, there are also a number of people on the list who are there by mistake. Would it be ethical for a newspaper to print such a list? (For the moment, wish away the national-security implications of such a publication. This is about the consequences for individuals.) An editor could rationalize the publication of such a list as being in the public good, for certainly you might want to know of any neighbor whom the federal government considers unreliable, or so an editor would explain. So what if the list is imperfect? The public need is great! The no-fly list is different from a set of test-score measures (one is a list of private citizens, the other data about public employees), but the imperfection of the data exists for both.

  • 64 How New York City’s Flawed Data Fuels the Right’s War on Teachers – AlterNet | New York Christian Radio
    · Mar 5, 2012 at 10:51 am

    […] principal basis for making consequential decisions about teachers.” To its credit, New York isn’t using the Teacher Data Reports currently in the news as the sole basis for its ultimate teacher […]

  • 65 SchoolBook: The Principal’s Role in Teacher Evaluations
    · Mar 6, 2012 at 6:05 pm

    […] U.F.T.’s vice president, Leo Casey, is arguing against the New York Principals’ paper, a document that thoughtfully disagrees with using student test […]

  • 66 As testing starts, critics plan post-teacher evaluation deal efforts | GothamSchools
    · Apr 18, 2012 at 3:16 pm

    […] of the state evaluation law that panel members have criticized: UFT Vice President Leo Casey dismissed Burris’s concerns about the role of state tests on the union’s blog, for […]