Don’t let ‘perfect’ become the enemy of ‘better’ in the revision of accountability metrics

Last week, Ed Dorrell wrote a strange editorial in TES called ‘Why attaching excluded pupils’ results to their school won’t work’. I say it was strange because he failed to address the major impediment to including off-rolled pupils in accountability metrics (i.e. finding them… for that, read on). There is no doubt that there are some complicated choices, trade-offs to consider, and new sets of undesirable behaviours that would arise from making schools accountable for the pupils they teach. However, we are not starting from a neutral position and everyone agrees that the current accountability measures reward schools that find a way to lose students from their roll AND that this is an increasing problem.

We have to act, and do so with the following principle in mind. When we construct accountability metrics, our primary goal is that they encourage schools to behave as we want them to – in their admissions, their exclusions, and the nature of the education they provide. Ensuring the accountability metric fairly represents school quality should be second order, as tough as that feels to heads. (Why? I’ll write about it another time, but essentially the mechanisms by which greater precision in estimates of school quality feeds through to improved educational standards are pretty blunt.)

The question of how we should count a student when they leave the school roll or arrive part-way through school should be viewed through the lens of the school behaviours we’d like to induce. We want the school to maximise the educational success of the community, including that student. We want them to remove students if they are disrupting others (and thus lowering likely GCSE scores). If schools do remove them, we want them to take an interest in ensuring that student then transfers to another school or alternative provision, rather than encouraging them to be ‘home schooled’. If the student is not disrupting the school community, and is more likely to be successful at their school than elsewhere, then barring specific circumstances (e.g. breaking the law or serious school rules), we want them to retain the student.

Ed poses a set of questions that suggest it is ‘unfeasibly complicated and impossible’ to explain and police a measure of Progress 8 that weights student results according to which of the 15 terms of secondary education they had spent in their secondary school (or equivalent in a middle school system). He is correct in the sense that there are literally an infinite number of choices we have to make – as there were when we made decisions about how to create the current Progress 8, incidentally. But choosing from an infinite choice set is not ‘unfeasibly complicated and impossible’. All we need to do is pick an algorithm – any algorithm – that produces BETTER behaviours than we currently have. Again, whether or not it represents precisely how ‘good’ the school is isn’t our primary consideration.

Here are four alternative choices:

  1. Status quo = each student present in Spring y11 (term 14 out of 15) is weighted with a value of 1. The behavioural response is well known – there is a strong incentive to find a way to remove from the school roll any student who is likely to have a strongly negative progress score.
  2. Year 7 base option = schools are held accountable for the results of those who were admitted, regardless of whether they complete their education there. The advantage is that this produces a strong incentive to maximise the exam outcomes of each student who is enrolled, whether on roll or off roll. The disadvantage is that students admitted after year 7 will not count and so there is no need to maximise their GCSE outcomes. That said, it would encourage schools to feel comfortable in accepting previously excluded students from other schools as a fresh start, knowing that they will not be penalised in their performance table metrics.
  3. FFT Education Datalab proposal = each student is weighted according to the number of terms they spent at the school. This means schools would need to consider the best needs of every student that passes through the school, whether they are still on-roll or not. However, it does create an incentive to accelerate moving off-roll any student that is struggling. Would this produce large numbers of students being moved off roll during years 7 and 8 in a manner that is worse than current practice? This is a judgement call.
  4. Ever seen option = every student who appears at any stage in the school is weighted with a value of 1 (this is the unweighted version of the FFT Education Datalab proposal). This fixes the problem with the weighted method whereby off-rolling early is better than off-rolling late. However, it doesn’t fix the current incentive to avoid taking on previously excluded students from other schools to give them a fresh start.
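
To make the trade-offs concrete, the four rules above can be sketched as weighting functions. This is a hypothetical illustration – the function names, the pupil record format and the 15-term convention are my assumptions, not the DfE’s actual Progress 8 methodology:

```python
# Hypothetical sketch of the four weighting rules discussed above.
# Pupil records are (progress_score, terms_attended, present_spring_y11,
# admitted_in_year7) tuples; all values are illustrative.

TOTAL_TERMS = 15  # terms of secondary education, years 7-11


def weight(option, terms_attended, present_spring_y11, admitted_in_year7):
    """Weight that a pupil's progress score carries in the school's measure."""
    if option == "status_quo":
        return 1.0 if present_spring_y11 else 0.0
    if option == "year7_base":
        return 1.0 if admitted_in_year7 else 0.0
    if option == "datalab_weighted":
        return terms_attended / TOTAL_TERMS
    if option == "ever_seen":
        return 1.0 if terms_attended > 0 else 0.0
    raise ValueError(f"unknown option: {option}")


def school_progress(option, pupils):
    """Weighted mean progress score across a school's pupils."""
    total_weight = sum(weight(option, *p[1:]) for p in pupils)
    if total_weight == 0:
        return None
    return sum(p[0] * weight(option, *p[1:]) for p in pupils) / total_weight
```

Under the status quo, a struggling pupil off-rolled before the Spring of year 11 simply vanishes from the measure; under the weighted or ever-seen rules their score still counts, which is the behavioural change being argued for.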

All other options (e.g. ever seen since year 9; weighting ks4 terms more than ks3 terms; etc…) can be viewed as a minor deviation from the above in terms of the types of behaviour they induce.

These adjustments to a progress score DO NOT ‘disproportionately punish the majority of schools – those who strive to get the best out of even the most challenging students and for whom exclusion is a last resort’. Quite the opposite; they are less punishing to those schools who take on previously excluded students to give them a second chance in mainstream education.

As FFT Education Datalab showed, the vast majority of schools who have pupils come and go as normal should not worry since any modification to Progress 8 will not materially affect them. It is only high-mobility schools where there are substantial differences between the types of students who leave the school and the types of students who arrive at the school that are likely to be affected.

None of this is difficult to implement and progress measure tweaks are entirely independent of issues around the commissioning of alternative provision. The termly pupil census means we can straightforwardly calculate progress figures and match pupils to their examinations data.

That said, Ed’s piece fails to identify the one major impediment to holding schools accountable for pupils they taught, even after they left. There are two situations where we do not want to hold them accountable for a student who receives no GCSE examination results: (1) where that student left the country; and (2) where the student died or was unable to complete their education for serious medical reasons. When students disappear from all school and examination records, central government does not know the reason why because we have no census which covers children outside of education. School accountability isn’t a good enough reason to set up a full annual census of children, using GP records as a starting point. But, given the rise in ‘home schooling’ where parents are not even present to educate the teenager, there are very clear safeguarding reasons why it is time to look again at introducing one.

In the meantime, I don’t see concerns about death and migration as so material that we should continue with the damaging incentives set up by Progress 8 which currently allows substantial off-rolling without consequence.

Remember: there isn’t a world in which accountability is perfect, but there are many accountability measures that are better than the status quo.

The social mobility challenge is not impossible

Schools-datajournogenius Christopher Cook of the FT wrote a nice blog post this week showing how forcing academy conversion for low performing schools probably wouldn’t do much to fix social inequalities in educational achievement. I agree. But I want to show some (quick and dirty) data to help keep alive the dreams of school reformers. This data refutes his suggestion that poor children do badly in the majority of England’s schools.

Schools do make a difference to the lives of poor children, far more so than for rich children who do well everywhere. Professor Simon Burgess and I noticed this when we were working on local school performance tables, which consistently showed that choice of local school appeared to matter far more for low ability children than it did for high ability children:

Equally, the variation in achievement for children from deprived neighbourhoods is greater than for children from affluent neighbourhoods. This chart shows the 10th-90th percentile range of GCSE achievement by deprivation of neighbourhood:

Which schools make a difference to poor children’s lives? Well, Chris Cook shows it isn’t the schools in the top half of the national league tables on the % 5+ A*-C measure. But why should they be? These schools aren’t necessarily ‘good’ schools, they are just relatively affluent schools. I’ve opened up my dataset to look a little harder for some high quality schools.

How do high quality schools, as judged by Ofsted, do on the social mobility challenge? This chart here shows that Ofsted-judged outstanding schools are pretty good social levellers, on average (i.e. they are much better than the average school for poor children but only moderately better than the average school for rich children)…

…And how do high value-added schools do on the social mobility challenge? Not so well overall. But this chart does show that schools who perform poorly on a CVA-style value-added measure appear to be a serious drag on social mobility…

…and, of course, if we start digging deeper into the data we can find hundreds of individual examples of outstanding schools that truly appear to transform the lives of children from deprived neighbourhoods, such as this well known north east London academy that recently lost its headteacher:

So, while I agree with Chris Cook that within-school variation in attainment remains a huge problem and that education policy can never fix all of society’s problems, I want to give a ray of hope to the policy makers, headteachers and teachers who work in education because they believe they can transform lives.

It’s not all hopeless (although it also isn’t easy). There are schools where pupil achievement isn’t entirely dependent on social background. We can’t close the social class attainment gap, but the best schools do help make it much smaller.

Why the new school league tables are much better … but could be better still

CMPO Viewpoint

Rebecca Allen (IOE) and Simon Burgess (CMPO)

Tomorrow the new school league tables are published, with the usual blitz of interest in the rise and fall of individual schools. The arguments for and against the publication of these tables are now so familiar as to excite little interest.

But this year there is a significant change in the content of the tables. For the first time, GCSE results for each school will be reported for groups of pupils within the school, groups defined by their Key Stage 2 (KS2) scores. Specifically, for each school the tables will report the percentage of pupils attaining at least 5 A*–C grades (including English and maths) separately for low-attaining pupils, high-attaining pupils and a middle group. This change has potentially far-reaching implications, which we describe below.

This is a change for the better, one that we have proposed and supported…


Reporting GCSE performance by groups is fraught with problems

This month the government is publishing school GCSE attainment data separately for groups of low (below L4 at KS2), middle (L4 at KS2) and high (above L4 at KS2) attaining pupils. This approach is to be commended and we recommended it in academic papers here and here because it provides information to parents on how a child like their own is likely to achieve in local schools. These group-based measures offer perspective on the importance of school choice, reminding parents that differences in the likely attainment of their child across local schools are often small. They also encourage desirable behaviour from schools, focusing them on the attainment of all pupils rather than just the marginal grade C/D borderline pupils.

Unfortunately there is a small problem with the way that attainment by groups is to be reported by the government. It has chosen to report average attainment across quite a large group of pupils (as many as 45% of pupils are in the middle band). Because the group of pupils is very large, reported average attainment across the group tells us partly (1) how well the school is doing, but also, importantly, (2) what the distribution of prior attainment looks like for the school’s pupils within this group. In this sense it simply replicates the problems of reporting ‘raw’ GCSE attainment: more affluent schools will appear to do better than more deprived schools, at least in part because their prior ability distribution is more favourable.

In our proposed performance table measures we also reported attainment by group, but we deliberately made the size of the groups small: our low attaining group scored in the 20th-30th percentile at KS2; our middle attainment group scored in the 45th to 55th percentile at KS2; and our high attainment group scored in the 70th to 80th percentile at KS2.*
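
The difference between a broad band and a narrow percentile band can be sketched in a few lines. Everything here – the data, the helper names, the thresholds – is hypothetical, intended only to illustrate the calculation, not to reproduce our published method:

```python
# Hypothetical sketch: averaging GCSE attainment over a prior-attainment
# band. Wide bands (the government's approach) mix the school's intake
# profile into the average; narrow bands (e.g. 45th-55th percentile)
# largely strip it out.


def percentile_rank(national_ks2, score):
    """Percentage of the national KS2 cohort scoring below `score`."""
    below = sum(1 for s in national_ks2 if s < score)
    return 100.0 * below / len(national_ks2)


def band_average(pupils, national_ks2, lo, hi):
    """Mean GCSE score of a school's pupils whose national KS2 percentile
    rank falls in the half-open band [lo, hi)."""
    in_band = [gcse for ks2, gcse in pupils
               if lo <= percentile_rank(national_ks2, ks2) < hi]
    return sum(in_band) / len(in_band) if in_band else None
```

Calling `band_average(pupils, national_ks2, 45, 55)` gives a narrow middle-band measure; widening the band towards the government’s definition lets the school’s intake distribution leak back into the average.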

The chart below illustrates how the use of very large groups of pupils in the government measures favours schools with higher prior ability pupils, relative to our measure that uses much smaller groups.** Each of the 3,000 or so tiny blue dots plots capped GCSE attainment for a group of high attaining pupils (government measure of achieving above L4 at KS2) against the average KS2 score (i.e. prior attainment) of pupils at the school. The red dots plot the same relationship for the Allen and Burgess calculation of average performance for high attaining pupils (70th to 80th percentile at KS2 pupils). The two measures are actually highly correlated (over 80%) and it isn’t informative which measure is higher or lower since they make calculations over different groups. The important difference is that on the government measure, schools with higher attaining intakes do proportionately much better than those with lower attaining intakes. On the Allen and Burgess measure this gradient is far less pronounced, suggesting that the gradient is largely due to differences in the ability profile of the high attaining groups across schools.

GCSE attainment of schools for high attaining pupils

The same relationship can be seen for the middle attaining group of pupils…

GCSE attainment of school for middle attaining pupils

…but the slope disappears for the lower attaining group of pupils, perhaps because just 17 percent of pupils are in the government’s low attaining group (though this is still a larger number than the 10 percent in the Allen and Burgess calculations).

GCSE attainment of school for low attaining pupils


While we are on the topic of reporting averages across groups, this problem of confounding the composition and performance of groups is the reason why I believe free school meals (FSM) attainment gaps should not be used as a measure of success at a school. Here the problem is that the background characteristics of pupils who are not FSM will vary considerably across schools. So, calculating the group average attainment of non-FSM students at a school tells us a lot about what these non-FSM students are like on entry, and little about how well the school serves them once they arrive.

The chart below calculates the average attainment for non-FSM pupils in each school minus the average attainment for FSM pupils in each school. This is the so-called FSM attainment gap that is used as a performance metric in the new league tables. Non-FSM pupils tend to do about one grade better in each of their eight best GCSE subjects than FSM pupils in the same school, on average. In the chart the size of this gap in a school is plotted against the school’s overall FSM proportion. There is a clear relationship showing that schools that are more deprived overall tend to have a smaller FSM attainment gap.
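
The confounding is easy to demonstrate with toy numbers. A minimal sketch – the two schools’ figures and the function name are invented for illustration:

```python
# Hypothetical sketch of the FSM attainment-gap metric. Pupils are
# (gcse_score, is_fsm) pairs; both schools' figures are invented.


def fsm_gap(pupils):
    """Mean non-FSM attainment minus mean FSM attainment in one school."""
    fsm = [score for score, is_fsm in pupils if is_fsm]
    non_fsm = [score for score, is_fsm in pupils if not is_fsm]
    if not fsm or not non_fsm:
        return None
    return sum(non_fsm) / len(non_fsm) - sum(fsm) / len(fsm)


# Uniformly deprived school: FSM status is near-random among similar
# families, so the gap is zero even though results are weak overall.
deprived_school = [(3.0, True), (3.5, True), (3.0, False), (3.5, False)]

# Mixed school serving its FSM pupils identically, but with a more
# affluent non-FSM intake: the gap looks alarming.
mixed_school = [(3.0, True), (3.5, True), (7.0, False), (7.5, False)]
```

Both toy schools serve their FSM pupils equally well, yet only the second records a large gap: the metric is driven by who the non-FSM intake happens to be.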

FSM-nonFSM attainment gap across schools

I first noticed how problematic attainment gaps were in practice as a governor of a school that was struggling to produce strong academic results but was very proud that its FSM gap was zero. All the students at the school came from low income families living on a very large and universally deprived council estate. Some of the families happened to claim benefits that made them eligible for free school meals (they probably weren’t the poorest), others didn’t or couldn’t. Not surprisingly, the GCSE performance of the FSM and non-FSM pupils in this school were no different, on average, because these pupils were no different in their social or educational background. Nothing the school was doing was contributing to this supposed ‘success’.

Attainment gaps compare groups within a school, whereas we should be comparing a group across schools. What matters to FSM pupils is that a school enables them to achieve qualifications to get on in life. If a low income student gets a low quality education from a school, it is little consolation or use for them to learn that the higher income students were equally poorly served by that school.

* The ideal is to fit a line of GCSE performance against KS2 scores separately for pupils in each school and to read off the scores at the 25th, 50th and 75th percentiles. We originally used this approach, but it is hard to communicate the method to parents and we estimated that the error caused by calculating across groups was not serious.

** The 2011 attainment data is not yet subject to public release so this analysis is carried out on an earlier year of data and is intended to be illustrative of the problem rather than a definitive relative measurement of metrics.