Politicians in Washington State are attempting to reduce educators to lifeless bits of inaccurate data. House Bill ESSB 5748, which would explicitly link teacher evaluation to standardized test scores, was recently introduced into the Washington State Legislature. Dr. Wayne Au, winner of the University of Washington Bothell’s 2015 Distinguished Teaching Award, joined scores of others in Olympia on Monday, March 30th, to express opposition to this educationally unsound proposal. In fact, so many people wanted to use their testimony to “test-defy” that, after both sides had presented arguments, 322 more people who wanted to speak against the bill could not because time ran out. After Au’s schooling, the full transcript of which appears below, at least these politicians can’t plead ignorance of how unscientific it is to tie teachers’ evaluations to test scores, or of the harm that doing so will inflict on the intellectual and emotional process of teaching and learning. Professor Au sent me the following preface to his testimony:
I was invited to testify by Representative Sharon Tomiko-Santos, Chair of the Washington State House of Representatives Education Committee. She wanted me to bring a research-based perspective to the discussion of ESSB 5748, which would explicitly link teacher evaluation to standardized test scores. The audience was the Education Committee, so one thing I want readers to know is that I was making an argument to folks who believe the tests are a valid measure. Plus I only had 2 minutes to testify, which means a lot of arguments that could have been made had to be omitted due to time constraints. So while there are so many good arguments to make against using standardized test scores to evaluate teachers, I had to choose a few particularly clear and sharp ones that I thought would be the best at directly challenging ESSB 5748. Also I want to publicly thank my partner, Dr. Mira Shimabukuro, for her help in the editing and crafting of the final statement.
Dr. Wayne Au’s Testimony to the Washington State House Education Committee Regarding House Bill ESSB 5748, March 30, 2015
Members of the House Education Committee, I am Dr. Wayne Au, an Associate Professor in the School of Educational Studies at the University of Washington Bothell. I am here today as an individual citizen, a parent of a future public school student, and a nationally and internationally known scholar with expertise in education policy and high-stakes testing. I am testifying today to share my concern about using standardized test scores to evaluate teacher performance. The logic of using test scores to evaluate teachers seems like commonsense: A teacher teaches, a student learns, a test is given, and the test score shows the effectiveness of teaching. However, this logic falls apart in the face of research. For instance, using test scores in teacher evaluation has produced large statistical errors. Based on a single year’s scores, one major study by the U.S. Department of Education found a 1 in 3 chance of mislabeling a proficient teacher as not proficient. Other research has found wild, year-to-year swings in teacher ratings based on test scores, where teachers highly rated one year dropped to the bottom, and teachers poorly rated shot to the top, the next year. This inconsistency suggests that the tests are measuring the changing, year-to-year demographics of students as opposed to measuring the ongoing effectiveness of teachers. Finally, we have known for decades that non-school, poverty-related factors like lack of adequate healthcare, food insecurity, and housing insecurity, account for up to 70% of an individual student’s test score. The impact of teachers on test scores pales in comparison to the impact of such broader social and economic issues. Given problems such as these, leading educational researchers and the American Statistical Association have warned against using standardized test scores to evaluate teachers. Unfortunately the U.S. Department of Education refuses to pay attention to these experts and continues to push such a fundamentally wrong-headed policy. The State of Washington should not follow their lead. Thank you.
Notes
Other research resources on using high-stakes, standardized testing to evaluate teachers:
Amrein-Beardsley, A. (2008). Methodological concerns about the Education Value-Added Assessment System (EVAAS). Educational Researcher, 37(2), 65-75. doi: 10.3102/0013189X08316420.
Amrein-Beardsley, A. (2014). Rethinking value-added models in education: Critical perspectives on tests and assessment-based accountability. New York: Routledge.
Amrein-Beardsley, A., & Collins, C. (2012). The SAS Education Value-Added Assessment System (SAS® EVAAS®) in the Houston Independent School District (HISD): Intended and unintended consequences. Education Policy Analysis Archives, 20(12), 1-36.
Au, W. (2010). Neither fair nor accurate: Research based reasons why high-stakes tests should not be used to evaluate teachers. Rethinking Schools, 25(2), 34–38.
Baker, B. D., Oluwole, J. O., & Green, P. C. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race-to-the-Top era. Education Policy Analysis Archives, 21(5), 1-71.
Berliner, D. C. (2014). Exogenous variables and value-added assessments: A fatal flaw. Teachers College Record, 116(1).
Briggs, D. & Domingue, B. (2011). Due diligence and the evaluation of teachers: A review of the value-added analysis underlying the effectiveness rankings of Los Angeles Unified School District Teachers by the Los Angeles Times. Boulder, CO: National Education Policy Center.
Collins, C., & Amrein-Beardsley, A. (2014). Putting growth and value-added models on the map: A national overview. Teachers College Record, 116(1).
Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures of teacher effectiveness in policy and practice. Providence, RI: Annenberg Institute for School Reform.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8-15.
Gabriel, R. & Lester, J. N. (2013). Sentinels guarding the grail: Value-added measurement and the quest for education reform. Education Policy Analysis Archives, 21(9), 1-30.
Glass, G. V. (1990). Using student test scores to evaluate teachers. In Jason Millman & Linda Darling-Hammond (Eds.), The new handbook of teacher evaluation: Assessing elementary and secondary school teachers (pp. 229-240). Newbury Park, CA: SAGE Publications.
Kennedy, M. M. (2010). Attribution error and the quest for teacher quality. Educational Researcher, 39(8), 591-598. doi:10.3102/0013189X10390804
Newton, X., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy Analysis Archives, 18(23), 1-27.
Papay, J. P. (2010). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163-193. doi:10.3102/0002831210362589
Paufler, N. A. & Amrein-Beardsley, A. (2014). The random assignment of students into elementary classrooms: Implications for value-added analyses and interpretations. American Educational Research Journal, 51(2), 328-362. doi: 10.3102/0002831213508299
Polikoff, M. S., & Porter, A. C. (2014). Instructional alignment as a measure of teaching quality. Educational Evaluation and Policy Analysis. doi:10.3102/0162373714531851
Schochet, P. Z., & Chiang, H. S. (2010). Error rates in measuring teacher and school performance based on test score gains (No. NCEE 2010-4004). Washington, D.C.: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. Retrieved from https://ies.ed.gov/ncee/pubs/20104004/pdf/20104004.pdf. These researchers found a 35% statistical error rate when using one year’s worth of data to evaluate teachers, and this error rate fell only to 25% when using three years’ worth of data.
Sass, T. R. (2008). The stability of value-added measures of teacher quality and implication for teacher compensation (Policy Brief). National Center for Analysis of Longitudinal Data in Educational Research. In the Sass study, one third of the teachers ranked in the bottom 20% one year moved to the top 40% the next year, and one third of the top-ranked teachers one year moved to the bottom 40% the next.
Berliner, D. C. (2010). Poverty and potential: Out-of-school factors and school success. Boulder, CO & Tempe, AZ: Education and the Public Interest Center & Educational Policy Research Unit. Retrieved from https://epicpolicy.org/publication/poverty-and-potential; Berliner, D. C. (2014). Effects of inequality and poverty vs. teachers and schooling on America’s youth. Teachers College Record, 116(1). Retrieved from https://www.tcrecord.org. Depending on the study, teachers account for around 17% of a student’s test score. However, this entire discussion is based upon the assumption that test scores are the most important aspect of teaching and learning, and I feel strongly that we must challenge that assumption.
There are a whole host of other issues I’ve omitted here due to time constraints. For instance, using test scores for teacher evaluation can’t account for knowledge and skill transfer between teachers and subjects. Who’s to say that the essay writing done in a social studies classroom is or is not what contributed to a student’s score on an English Language Arts section of a standardized test? Similarly, we don’t know how to tease out if the mathematics learned in a physics course contributed to a student’s standardized math test score. Further, standardized test scores can’t account for “peer effect” – where being in a classroom full of high test scorers tends to bring an individual student’s score up, and being in a classroom full of low test scorers tends to bring an individual student’s score down, regardless of past performance.
 Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., … Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. Economic Policy Institute.
 American Statistical Association. (2014). ASA statement on using value-added models for educational assessment. American Statistical Association. Retrieved from https://www.amstat.org/policy/pdfs/ASA_VAM_Statement.pdf