
A Metric Without Meaning

Evidence for a ceiling on the effect of teacher competence on standardized test scores.

Here's a thing that I see repeatedly: A teacher gets their scores after their students sit for some sort of standardized exam (e.g., an AP exam). They are not pleased with the results, so they turn to their professional network to express their frustration. They want their kids to do better. Maybe they have made various changes to their course in keeping with curricular shifts, but they haven't seen scores improve in line with those changes. Or maybe kids are doing pretty well, but the same number of kids are doing well year-over-year, and that number doesn't seem to be shifting. Whatever the symptoms, relief is sought.

The typical approach to replying to this type of post is to spitball ideas for what a teacher can do to improve their instruction. Replying teachers might talk about what they are doing in their classes that seems to be working for their students, or maybe they ask the posting teacher what sorts of things they might want to implement that they haven't yet. All of this is fine (and I would argue a necessary part of the reflective process). But all of this also ignores another critical aspect of student exam scores:

A lot of how students do on standardized exams is outside of the control of the teacher.

There is only so much that a teacher can do that will affect something like the standardized test scores of their students. Let's focus on AP scores, since it's score-release season and since the AP Biology exam is the only standardized exam that I still have to interact with in my own teaching. Let's leave aside the question of whether AP exam scores are a useful metric for gauging the success of a student's experience in an "AP" course (spoiler alert: I think their utility is quite low). Let's also accept the hypothesis that the pedagogical choices of the teacher are the most significant influence on student exam performance; I think that's probably supported by research. Taking that as a given, how much additional impact does the teacher have once they've hit a certain level of what I'll term "teacher competence"? I think that once you get to a certain level of competence, further gains in student standardized test performance attributable to the teacher tail off pretty quickly.

Here are three possible relationships between teacher competence and standardized test performance:

[Figure] Pick a model, any model.

Our usual conversations around topics like this assume that something like relationship 1 holds true: if students aren't scoring the way we'd like them to, then the solution is to do more things that help them score better. I'll suggest that the reality is much more like relationship 2 or 3: once a particular level of instructional competence has been reached, there is not much else that the instructor can do to continue to boost the standardized test performance of their students.
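If you want to play with the shapes yourself, here's a minimal sketch of what the three relationships might look like. The functional forms below are illustrative assumptions on my part, not fits to any real data: relationship 1 as a straight line, relationship 2 as smoothly diminishing returns, and relationship 3 as a hard ceiling past a competence threshold.

```python
import numpy as np
import matplotlib.pyplot as plt

# Competence on an arbitrary 0-1 scale; the three curves below are
# illustrative assumptions, not fits to any real data.
competence = np.linspace(0, 1, 200)

# Relationship 1: scores keep rising with competence, with no ceiling.
linear = competence

# Relationship 2: diminishing returns; gains taper off smoothly.
diminishing = 1 - np.exp(-4 * competence)

# Relationship 3: a hard ceiling; past a threshold, more competence
# buys essentially nothing.
hard_ceiling = np.minimum(2.0 * competence, 0.9)

fig, ax = plt.subplots()
ax.plot(competence, linear, label="Relationship 1: linear")
ax.plot(competence, diminishing, label="Relationship 2: diminishing returns")
ax.plot(competence, hard_ceiling, label="Relationship 3: hard ceiling")
ax.set_xlabel("Teacher competence (arbitrary units)")
ax.set_ylabel("Student standardized test performance")
ax.legend()
plt.show()
```

Under relationships 2 and 3, most working teachers are already on the flat part of the curve, which is the whole point of what follows.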

Maybe you reject my thinking here, in which case you probably want to stop reading now, and possibly leave me a note explaining why you think I'm wrong. But I think I'm actually in a position this year to offer some evidence to support my contention. As you know if you read a lot of what I write here, this past year I moved jobs. Whereas I was previously employed in a lovely and still-functional public school in NYS, I'm now teaching at a Fancy International School in Singapore. But usefully for this discussion, I'm teaching the same courses in the new job that I did in the old one. And I'm teaching them in the same way.

Actually, we could make a pretty compelling argument that I taught them less competently this first year in the new job than I did for many years in the old one. Rather than seeing my AP students for 80 minutes every day, as I did back on Long Island, this past year I saw them for 80 minutes every other day. We did start a bit earlier, so I had a bit more than half of the student contact time that I used to, but it was still significantly shortened. Also, I had to take a course designed for every day and reconfigure it for every other day (not an insignificant shift). Also, it was my first year in a new school and a new culture, so there were unavoidable growing pains and learning experiences. Also, I taught the course to more than twice as many students as I've ever had before, which couldn't help but limit the amount of 1:1 feedback and intervention that I could provide. I'm okay acknowledging that each of these unavoidable changes meant that this past year's AP Biology students got a less-steady version of my course than the one that I used to teach.

So how did they do?

I find it disagreeable to talk about specific numbers, but I am okay saying that this year's group did significantly better on the AP exam than any group of students I ever taught the course to in NYS. How significant? On average, more than a full point higher (the exam is scored on a 1-5 scale) than the running average of my NYS students, with more students scoring a 4 or 5 this year than the total number of 4s and 5s received by all of my NYS AP Biology students, combined. This year's "numbers" for my students are among the best most teachers will ever see in their classes.

I taught a largely-unchanged (marginally-worse!) course and my scores went up significantly. Assuming you are okay agreeing that I am a reasonably competent teacher of the course, this is exactly what a strong teacher-competence ceiling on standardized test scores would predict: past the ceiling, variation in scores is driven by factors other than the teacher, so a change in student population can move scores far more than any change in my instruction could. Fortunately for me, my personal situation has moved the ceiling higher. But given how we talk about these things every scoring season, I'm not at all sure that many other teachers or administrators in many other systems understand that they have their own ceilings beyond which very little is going to make a difference in how their students score on standardized tests.

Of course, this thinking could be extended to suggest that teachers should stop trying to boost the standardized test performance of their students. I think that only follows once a teacher has maximized their instructional competence to the point that the ceiling effect is evident. I also think it would be folly for any teacher ever to stop developing professionally, or to stop reflecting on the work that they are doing with and for their students. The possibility that the effort you invest in your work will not be reflected in your students' standardized test scores should simply be understood, accepted, and then ignored. So much of what teachers do can't be usefully measured. That doesn't make any of it any less worthy of our efforts.

Do you see a teacher-competence ceiling in your own work? Or perhaps you have been roused to violent disagreement with my entire thesis? Leave me a comment or drop me a line if you yearn to let me know.
