Reporting GCSE performance by groups is fraught with problems

This month the government is publishing school GCSE attainment data separately for groups of low (below L4 at KS2), middle (L4 at KS2) and high (above L4 at KS2) attaining pupils. This approach is to be commended, and we recommended it in academic papers here and here, because it gives parents information on how a child like theirs is likely to achieve in local schools. These group-based measures put the importance of school choice into perspective for parents, reminding them that differences in their child's likely attainment across local schools are often small. They also encourage desirable behaviour by prompting schools to focus on the attainment of all pupils, rather than just the marginal grade C/D borderline pupils.

Unfortunately there is a problem with the way the government intends to report attainment by groups. It has chosen to report average attainment across quite large groups of pupils (as many as 45% of pupils are in the middle band). Because each group is so large, the reported average attainment across the group reflects (1) partly how well the school is doing, but, importantly, (2) partly the distribution of prior attainment within the group among the school's pupils. In this sense it simply replicates the problems of reporting ‘raw’ GCSE attainment: more affluent schools will appear to do better than more deprived schools, at least in part because their prior ability distribution is more favourable.

In our proposed performance table measures we also reported attainment by group, but we deliberately made the groups small: our low attaining group scored between the 20th and 30th percentiles at KS2; our middle attaining group between the 45th and 55th percentiles; and our high attaining group between the 70th and 80th percentiles.*
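The narrow-band calculation can be sketched as follows. This is an illustrative reconstruction, not the published methodology: the column names (`school_id`, `ks2_score`, `gcse_points`) and the toy data are assumptions, and percentiles are computed nationally over all pupils, as in our measure.

```python
import numpy as np
import pandas as pd

def band_means(df, bands={"low": (20, 30), "middle": (45, 55), "high": (70, 80)}):
    """Mean GCSE points per school for pupils inside each narrow national
    KS2 percentile band (bands are the Allen and Burgess groups)."""
    pct = df["ks2_score"].rank(pct=True) * 100  # each pupil's national percentile
    out = {}
    for name, (lo, hi) in bands.items():
        in_band = df[(pct >= lo) & (pct <= hi)]
        out[name] = in_band.groupby("school_id")["gcse_points"].mean()
    return pd.DataFrame(out)

# toy data: 5 schools, GCSE points rising with KS2 score plus noise
rng = np.random.default_rng(0)
pupils = pd.DataFrame({
    "school_id": rng.integers(0, 5, 1000),
    "ks2_score": rng.normal(28, 4, 1000),
})
pupils["gcse_points"] = 8 * pupils["ks2_score"] + rng.normal(0, 20, 1000)
print(band_means(pupils).round(1))
```

Because each band spans only ten national percentiles, the prior attainment of pupils inside a band is roughly comparable across schools, which is precisely what the government's much wider bands fail to guarantee.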

The chart below illustrates how the use of very large groups of pupils in the government measures favours schools with higher prior ability pupils, relative to our measure that uses much smaller groups.** Each of the 3,000 or so tiny blue dots plots capped GCSE attainment for a group of high attaining pupils (government measure of achieving above L4 at KS2) against the average KS2 score (i.e. prior attainment) of pupils at the school. The red dots plot the same relationship for the Allen and Burgess calculation of average performance for high attaining pupils (those between the 70th and 80th percentiles at KS2). The two measures are highly correlated (over 80%), and comparing which is higher or lower is not meaningful since they are calculated over different groups. The important difference is that on the government measure, schools with higher attaining intakes do proportionately much better than those with lower attaining intakes. On the Allen and Burgess measure this gradient is far less pronounced, suggesting that the gradient is largely due to differences in the ability profile of the high attaining groups across schools.

GCSE attainment of schools for high attaining pupils

The same relationship can be seen for the middle attaining group of pupils…

GCSE attainment of school for middle attaining pupils

…but the slope disappears for the lower attaining group of pupils, perhaps because just 17 percent of pupils are in the government’s low attaining group (though this is still a larger number than the 10 percent in the Allen and Burgess calculations).

GCSE attainment of school for low attaining pupils


While we are on the topic of reporting averages across groups, this confounding of group composition with group performance is the reason why I believe free school meal (FSM) attainment gaps should not be used as a measure of a school's success. Here the problem is that the background characteristics of pupils who are not eligible for FSM vary considerably across schools. So the group average attainment of non-FSM students at a school tells us a lot about what those students are like on entry, and little about how well the school serves them once they arrive.

The chart below plots, for each school, the average attainment of non-FSM pupils minus the average attainment of FSM pupils. This is the so-called FSM attainment gap that is used as a performance metric in the new league tables. On average, non-FSM pupils do about one grade better in each of their eight best GCSE subjects than FSM pupils at the same school. In the chart the size of this gap in a school is plotted against the school's overall FSM proportion. There is a clear relationship: schools that are more deprived overall tend to have a smaller FSM attainment gap.

FSM-nonFSM attainment gap across schools
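The gap plotted above is a simple difference of within-school group means, which can be sketched as below. This is a hedged illustration with hypothetical column names and synthetic data, not the league table calculation itself; the synthetic non-FSM pupils are simply drawn from a higher-scoring distribution, mimicking the roughly one-grade-per-subject difference described in the text.

```python
import numpy as np
import pandas as pd

def fsm_gap(df):
    """Per-school mean GCSE points of non-FSM pupils minus that of FSM pupils.
    A positive gap means non-FSM pupils outscore FSM pupils at the same school."""
    means = df.groupby(["school_id", "fsm"])["gcse_points"].mean().unstack("fsm")
    return means[False] - means[True]

# toy data: 50 schools, ~20% FSM, non-FSM pupils ~8 capped points ahead
rng = np.random.default_rng(1)
pupils = pd.DataFrame({
    "school_id": rng.integers(0, 50, 5000),
    "fsm": rng.random(5000) < 0.2,
})
pupils["gcse_points"] = np.where(pupils["fsm"],
                                 rng.normal(300, 40, 5000),
                                 rng.normal(308, 40, 5000))
print(fsm_gap(pupils).describe().round(1))
```

Note what the calculation cannot see: nothing in it distinguishes a school that teaches FSM pupils well from one whose non-FSM pupils simply resemble its FSM pupils on entry, which is exactly the problem described in the anecdote below.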

I first noticed how problematic attainment gaps were in practice as a governor of a school that was struggling to produce strong academic results but was very proud that its FSM gap was zero. All the students at the school came from low income families living on a very large and universally deprived council estate. Some of the families happened to claim benefits that made them eligible for free school meals (they probably weren’t the poorest); others didn’t or couldn’t. Not surprisingly, the GCSE performance of the FSM and non-FSM pupils in this school was no different, on average, because these pupils were no different in their social or educational background. Nothing the school was doing was contributing to this supposed ‘success’.

Attainment gaps compare groups within a school, whereas we should be comparing a group across schools. What matters to FSM pupils is that a school enables them to achieve the qualifications to get on in life. If a low income student receives a low quality education from a school, it is little consolation or use for them to learn that the higher income students were equally poorly served by that school.
—————————————————————————————–

* The ideal is to fit a line of GCSE performance against KS2 scores separately for pupils in each school and to read off the predicted scores at the 25th, 50th and 75th percentiles. We originally used this approach, but it is hard to communicate the method to parents, and we estimated that the error introduced by averaging across groups instead was not serious.
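A minimal sketch of this per-school read-off, under the same assumptions as before (illustrative column names, synthetic data, percentiles taken nationally rather than per school):

```python
import numpy as np
import pandas as pd

def fitted_at_percentiles(df, pcts=(25, 50, 75)):
    """Fit GCSE points against KS2 score within each school, then read the
    fitted line off at the national 25th/50th/75th KS2 percentiles."""
    cut_points = np.percentile(df["ks2_score"], pcts)  # national cut points
    rows = {}
    for school, grp in df.groupby("school_id"):
        slope, intercept = np.polyfit(grp["ks2_score"], grp["gcse_points"], 1)
        rows[school] = intercept + slope * cut_points
    return pd.DataFrame.from_dict(rows, orient="index",
                                  columns=[f"p{p}" for p in pcts])

# toy data: 10 schools, GCSE points rising with KS2 score plus noise
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "school_id": rng.integers(0, 10, 2000),
    "ks2_score": rng.normal(28, 4, 2000),
})
df["gcse_points"] = 8 * df["ks2_score"] + rng.normal(0, 20, 2000)
print(fitted_at_percentiles(df).round(1))
```

Evaluating every school at the same national cut points removes the compositional problem entirely, at the cost of a method that is harder to explain to parents than a simple group average.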

** The 2011 attainment data is not yet subject to public release, so this analysis is carried out on an earlier year of data and is intended to illustrate the problem rather than to provide a definitive comparison of the metrics.