Writing the rules of the grading game (part III): There is no value-neutral approach to giving feedback

These three blogs (part I, part II, part III here) are based on a talk I gave at Headteachers’ Roundtable Summit in March 2019. My thoughts on this topic have been extensively shaped by conversations with Ben White, a psychology teacher in Kent. Neither of us yet knows what we think!

Our beliefs about our academic ability are often so tightly intertwined with our sense of self that we must take care in how we talk about them. It is only with a clear mental model of how feedback might alter a parent’s or student’s beliefs and goals that we can understand how to manage the risks, and enhance the potential gains, involved in communicating attainment. Inducing competitive behaviour can be enormously helpful in encouraging student effort in situations where many of the benefits of learning are long-term and poorly appreciated by the learner. People tend to be highly motivated by social comparisons, and students will make these comparisons whether you give them ranking information or not. However, the mental model I’ve described suggests that pushing the competitive focus too far risks lowering effort, or pushing students out of the game altogether, especially where they feel there is no prospect of doing well.

Risks in giving clear, cohort-referenced feedback

The model implies there are three situations where giving parents or students clear cohort- or nationally-referenced feedback can lower future effort. Firstly, unexpectedly positive feedback could lead to complacency about the effort required in the future. In fact, the simple act of learning your place on the curve with greater certainty than you had before could even be unhelpful if it gives you greater comfort that you are where you want to be. Secondly, there is a risk of demoralisation if you come to believe that the effort needed for a small improvement in ranking isn’t worth it. Thirdly, game-switching is a risk if you decide you can achieve better returns by working at something else. I think all these risks, but particularly the third, are deeply culturally situated and framed by how you communicate the value of achievement and hard work. Only you can know whether you have created a school climate where you can keep all your students playing the grading game you create for them.

Risks in giving kind and fuzzy feedback

So what are the risks around the alternative, the fuzzy language we use to talk about attainment? Teachers often suffer from rater leniency (being a little too generous with student grading) when there is room for subjectivity. The mental model shows why ‘kind’ feedback might, or might not, be helpful. On the one hand, the model highlights why we might want to be lenient. By saying “you’ve learnt a lot”, we are hoping students feel more confident in their ability to learn in the future. This is unambiguously a good thing. We are also hoping to keep vulnerable students engaged in our grading game. However, leniency in rating a skill level can equally reduce motivation, as it may signal the student has already done enough to reach the position they’d like to be in. Hence, while raising confidence in the ability to acquire a skill or achieve an outcome can be beneficial, raising confidence in the skill itself, or in the level of past achievements, can be detrimental.

Withholding clear attainment information from parents and students can also be damaging if it de-prioritises YOUR game in their minds and fails to give them the information they need to maintain the position on the bell curve they would like to hold. Primary schools are the masters of ‘fuzzy’ feedback. I suspect that, in most schools, a majority of primary parents are told their child is ‘as expected’, yet these parents would respond quite differently to discovering their child was ranked 7/30 versus 23/30 in class. What mental model of the beliefs, capabilities and desires of that child and their parents leads primary schools to believe it is in the family’s interest to withhold clear attainment information? There is nice evidence from elsewhere in the world showing powerful effects of communicating frequent and transparent attainment grades to parents of younger children. It would be great to have a trial in this country to learn which of our families respond to this information.

Downplaying the importance of prior attainment in the game

One unambiguous finding from this literature is the importance of trying to maintain strong student beliefs in their ability to climb up the rankings through making an effort. The teacher’s dilemma is how to maintain strong beliefs that learning is productive (i.e. you can do this) without telling students they’ve already done well enough (i.e. you’re already there). One implication is that you need to construct a game where feedback scores are truly responsive to changes in effort.

The problem we face is that performance in a test is frequently determined more strongly by prior knowledge/IQ than by recent student effort. Students are often rational in appreciating that effort yields few rewards. One approach to lessening the anchoring effect of prior knowledge is to encourage comparisons between students with similar prior attainment. For example, if a school has ability-setting within subjects, then within-class comparisons promote a competition in which effort is more strongly rewarded than whole-school comparisons do. Of course, this approach will only be effective if students ‘buy into’ the within-class games you’ve constructed.

I am frequently asked why we need to make comparisons with other students at all. Asking a student to compete with their own past performance avoids many of the problems I have discussed, though we tend to be less motivated by competition with ourselves! Ipsative feedback compares performance in the same assessment of the same domain over time, and we frequently use it outside school settings (e.g. my 5km speed this week compared to last). It would be great to see these ipsative comparisons encouraged more in schools, but there are good reasons why their application is limited. When we teach, we tend to continuously expand the knowledge domain we wish to assess, making ipsative comparisons less straightforward (except in restricted domains such as times tables). (And since everyone in the class must plough on with learning the curriculum at the same speed, regardless of whether they are ready, mastery assessment approaches, where we accumulate a list of competencies as they are reached, also aren’t very practical.) I think strange ‘progress’ measures, which fudge within-student comparisons between non-standardised tests from one term to the next, are attempts to encourage students to make these comparisons with themselves. They can certainly be justified by the mental model described in the previous post. (For what it’s worth, though, I am not really convinced they are credible metrics in a game that students actually care about.)
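To make the ipsative idea concrete, here is a minimal sketch of feedback in one of those restricted, stable domains: the same times-tables quiz taken each week. Everything in it, from the function name to the scores, is invented for illustration.

```python
# A minimal sketch of ipsative feedback in a restricted, stable domain:
# the same 40-question times-tables quiz taken each week. All names and
# numbers here are hypothetical.

def ipsative_feedback(history):
    """Report change against the student's own previous score,
    not against classmates."""
    if len(history) < 2:
        return "First attempt: no self-comparison available yet."
    previous, latest = history[-2], history[-1]
    change = latest - previous
    if change > 0:
        return f"Up {change} marks on last week ({previous} -> {latest})."
    if change < 0:
        return f"Down {-change} marks on last week ({previous} -> {latest})."
    return f"Level with last week ({latest})."

# Weekly scores on the same quiz, so the comparison is like-for-like.
print(ipsative_feedback([28, 31, 35]))  # Up 4 marks on last week (31 -> 35).
```

The restriction is doing all the work here: the comparison is only like-for-like because the quiz stays the same, and once the assessed domain expands each week, that property disappears.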

A meaningful game where there are more winners than losers over time

If you want to construct a game where making an effort typically yields a decent return, why on earth would you make it a zero-sum game, as ranking does, where half your class will necessarily be losers despite making an effort? One initial step to avoid this is to create benchmarks that are external to the school, i.e. national reference points (invented, if necessary), to remove the requirement for there to be losers.

National benchmarking, such as using standardised scores, still doesn’t ensure effort is typically rewarded, though, unless your school happens to be able to outpace the national benchmark. To get around this, your system for describing attainment could invent typical grades or levels that rise a little each year as students move through the school, generating a sense that pupils are getting better at your game. And so our mental model starts to explain why schools invent and re-invent arbitrary levels systems!
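A toy calculation shows why. The numbers below are invented, but the mechanics are the standard ones: a standardised score reports a student’s position relative to the national distribution (conventionally mean 100, SD 15), so a student who improves exactly in step with the nation sees no movement at all.

```python
# Toy illustration (assumed numbers) of why standardised scores only
# rise when a student outpaces the national benchmark.

def standardised(raw, national_mean, national_sd):
    """Conventional standardised score: national mean 100, SD 15."""
    return 100 + 15 * (raw - national_mean) / national_sd

# Year 7: raw score of 55 against a national mean of 50 (SD 10).
print(standardised(55, 50, 10))  # 107.5

# Year 8: the student improves to 61, but the national mean rises to 56.
print(standardised(61, 56, 10))  # 107.5, unchanged despite real progress
```

Effort that merely keeps pace with everyone else’s effort is invisible on this scale, which is exactly the motivational gap the invented rising levels are trying to fill.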

But our mental model also asserts that we need our game to feel meaningful to students, with rewards that they value (otherwise they won’t feel inclined to work hard at it). Achieving a ‘Level Triangle’ (or whatever you choose to call it) might not feel meaningful enough to some. Is this how you settle on the idea of using GCSE grades as your levels system, since we know these grades have motivational currency with students? Why not invent fictional internal scales and tell students they are at a GCSE grade 2 in Year 7, grade 3 in Year 8, and so on? Of course, technically this is a nonsense: a 12-year-old who hasn’t studied the GCSE specification cannot possibly be at a GCSE grade of any kind.

We find ourselves creating meaningless games, but ones that might be worthwhile because they have better motivational properties than any other game we could invent for our students to play! I hate meaningless data, but I’d find it hard to argue that schools shouldn’t use it if they could demonstrate it increased student effort.

The curious flightpath games we play

This gives us a new perspective from which to make sense of the flightpath: a five-year board game in which individual students are asked to keep to, or beat, the path we’ve set for them. It is easy to be dismissive of this game on grounds of validity, especially when we know how little longitudinal attainment data conforms to the paths we create. But we should also ask whether it is more or less motivating than the termly ranking game, or than any other grading game we could give students as an alternative.

I’m pretty sure the standard, fixed flightpath, which maps student attainment forward from Key Stage Two data and is impervious to effort or new information, has poor motivational properties. They are poor for students who discover they are on track for a Grade 3 before they’ve had a chance to work hard at secondary school. They might also be poor for those who are told that, in all likelihood, they’ll attain a Grade 8 or 9. The game card prioritises the signal of prior attainment (bad for motivation) and underplays the importance of effort in reaching any desired goal.

But what about the schools that use dynamic flightpaths, updating each student’s game card every term or year in light of new effort and attainment information? Suppose that, at all times, the game card also shows the non-zero probability of any grade being attained in the future, to signal the importance of effort in achieving goals. Is it possible that this type of dynamic grading creates a game it is possible to do well at, preserving useful beliefs that effort is productive, whilst also signalling that more effort is needed to reach the next position a student would like to attain?
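No such system exists off the shelf, as far as I know, so the sketch below is pure speculation: a toy game card that recentres a distribution over final grades on the student’s current working grade each term, while keeping every grade’s probability non-zero. The weighting scheme, function names and numbers are all invented for illustration.

```python
# A speculative toy 'dynamic game card': a probability over final GCSE
# grades, recentred each term on the current working grade. The weight
# function is invented; its only job is to keep every grade non-zero.

GRADES = range(1, 10)  # GCSE grades 1 to 9

def game_card(working_grade, spread=2.0):
    """Weight grades by closeness to the working grade, then normalise.
    The 1 in the denominator keeps every grade possible."""
    weights = {g: 1.0 / (1.0 + ((g - working_grade) / spread) ** 2)
               for g in GRADES}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Termly update: as new attainment information arrives, the card moves.
for term, wg in enumerate([4, 5, 6], start=1):
    card = game_card(wg)
    print(f"Term {term}: most likely grade {wg}, P(grade 9) = {card[9]:.2f}")
# P(grade 9) rises with the working grade: 0.03, then 0.04, then 0.07.
```

The two properties this is meant to surface are the ones the model cares about: the card responds to new information, so effort visibly moves it, and no grade is ever shown as impossible, so no student is told the game is already over.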

This is all speculation – there isn’t any research out there that can tell you the impact of using target grades, predictions or flightpaths in different types of schools. All we can do is to invoke mental models to think through how they might affect motivation (one nice speculation about target grades is by James Theo).

The ethics of telling un-truths

When we construct grading games that prioritise manipulating behavioural responses over telling the whole truth about attainment, we have to face up to tricky ethical dilemmas. We face these all the time in schools, whenever we tell parents and students half-truths about attainment; the exploration of mental models simply makes it more explicit why we do it, and who we might help or damage in the process.

Mental models also make it explicit that one grading system will not suit all students equally well. Slightly over-confident, competitively minded students who are able to figure out how to translate effort into learning would do well in a pure rankings system. They will have classmates who find competition stressful and, even with considerable effort, risk slipping behind each year for reasons entirely outside their control. Those researchers who showed that cohort-referenced grades can improve school exam results also showed they increased inequality in happiness amongst students overall. If there are trade-offs, whose welfare do we prioritise?

Conclusion

Choosing how to give attainment feedback to students and their parents is a minefield, but I hope by now you appreciate that choosing NOT to give clear, interpretable (i.e. often norm-referenced) feedback on how a student is doing is not a neutral position to take. It can be damaging to the motivation of certain students under certain circumstances, and you need a clear mental framework to understand why this happens.

Equally, validity of inference should not be the only concern when working out how you are going to report attainment at school. Systems that look bizarre on the face of it, such as flightpaths, might nonetheless embody an intelligible approach to motivating and managing students’ complex belief systems.

If, on getting to the end of these posts, you feel utterly confused about what it is right to do, I think that’s OK. We can be pretty sure that choosing your grading system isn’t the most important decision a leadership team makes. It is true that many of these studies identify a costless and significantly positive effect of giving attainment feedback, particularly at points in time where the stakes are high or where attainment is not yet well known. However, the overall impact of a change in attainment reporting on end-of-school outcomes will typically be quite small, on average.

Nobody can tell you how you should construct your own grading game. The findings of the literature are inconsistent because the mental models of how we are trying to change student beliefs are very complex. How your students respond to your grading system, through its manipulation of their belief systems, depends strongly on your school culture and on localised social norms amongst peers. The best you can do is take the time to learn what students believe to be true about themselves, both about their current attainment and about their capability to learn and progress. It is these existing beliefs that give you a clue as to how students might respond to your grading game.

Good luck with writing the rules of your grading game (it’s not easy)!

4 thoughts on “Writing the rules of the grading game (part III): There is no value-neutral approach to giving feedback”

  1. Really interesting series. I think this is what I was coming across and describing here: https://journals.sagepub.com/doi/10.1177/1474474015575473. The themes of surprises, effortful behaviour, motivation, use of progress data, ‘effective GCSE grades’ and so on all seem very familiar. I remain very ambivalent about what I was observing: it was working for most students, at least in keeping them playing the game, but it caused all sorts of other issues for the curriculum, making students responsible for teachers in seemingly perverse ways, and other such unintended consequences.

  2. Thanks Becky. A brilliant and clear analysis of the key issues. It makes me want to write a post on my experience of assessment and grading in STEM education policy from 2002 onwards with ACME and the Royal Society. If I do, I will link back to yours. I got so frustrated with assessment policy that I focused more on the role of specialist teachers instead, but here I think you are trying to make links across the two. The same needs to happen at all levels in the education system, especially in universities, where we have assumed a research-led expertise in these matters!

  3. Thanks for these blogs Rebecca, some interesting things still to think about. Yes, I am aware of our “need” to compete with our ‘neighbours’; I know studies on pay show this (https://warwick.ac.uk/newsandevents/pressreleases/study_says_money/). Two questions: does this relative grading help in terms of absolute achievement (so I could be top of my class but still doing ‘badly’)? I know you also link this to national norms above, but this is difficult if the data does not exist. Also, I could do the “same” in absolute achievement terms but move radically in position in class.

    Does this also depend on the types of assessment we are using, which may be reductive if we are looking to produce data that enables us to situate children on the bell curve? Does this approach encourage a multi-modal assessment approach?

    Finally, should assessment not focus on “what you can do now” and “what you need to do next” (as I understand Wiliam to be saying) rather than on this relative ranking?
