The pupil premium is not working (part III): Can within-classroom inequalities ever be closed?

On Saturday 8th September 2018 I gave a talk to researchED London about the pupil premium. It was too long for my 40-minute slot, and the written version is similarly far too long for one post. So I am posting my argument in three parts [pt I is here and pt II is here].

I used to think social inequalities in educational outcomes could be substantially reduced by ensuring everyone had equal access to our best schools. That is why I devoted so many years to researching school admissions. Our schools are socially stratified, and those serving disadvantaged communities are more likely to have unqualified, inexperienced and non-specialist teachers. We should fix this, but even if we do, these inequalities in access to experienced teachers are nowhere near stark enough to make a substantial dent in the attainment gap. In a rare paper to address this exact question, Graham Hobbs found that just 7% of social class differences in educational achievement at age 11 can be accounted for by differences in the effectiveness of schools attended.

Despite wishing it weren’t true for the past 15 years of my research career, I have to accept that inequalities in our schooling system largely emerge between children who are sitting in the same classroom. If you want to argue with me that it doesn’t happen in your own classroom, then I urge you to read the late Graham Nuthall’s book, The Hidden Lives of Learners, to appreciate why you are (probably) largely unaware of individual student learning taking place. This makes uncomfortable reading for teachers and presents something of an inconvenience to policy-makers because it gives us few obvious levers to close the attainment gap.

So, what should we do? We could declare it all hopeless because social inequalities in attainment are inevitable. Perhaps they arise through powerful biological and environmental forces that are beyond the capabilities of schools to overcome. If you read a few papers about genetics and IQ it is easy to start viewing schools as a ‘bit part’ in the production of intelligence. However, at least for me, there is a ray of hope. For these studies can only tell us how genetic markers have been correlated with educational success in the past, without reference to the environmental circumstances that have allowed these relationships to emerge. Similarly, children’s home lives heavily influence attainment, but how we organise our schools and classrooms is an important moderator of how and why that influence emerges. Kris Boulton has written that he now views ‘ability’ as something that determines a child’s sensitivity to methods of instruction; so the question for us should be which classroom instructional approaches help those children most at risk of falling behind.

Having made it this far through my blogs, I suspect you are hoping for an answer as to what we should do about the attainment gap. I don’t have one, but I am sure that if there were any silver bullets – universal advice that works in all subjects across all age ranges – we would have stumbled on them by now. Instead, I’d like to take the final words to persuade you that our developing understanding of the human mind provides teachers with a useful language for thinking about why attainment gaps emerge within their own classrooms. Whether or not they choose to do anything about that is another matter entirely.

Focusing on inequalities in cognitive function rather than socio-economic status

In earlier blogs I have argued that noting the letters ‘PP’ on seating plans does not provide teachers with useful information for classroom instruction. Labelling students by their educational needs is helpful (and essential for secondary teachers who encounter hundreds of children each week), and I think paying attention to variation in cognitive function within a class has far more value than noting students’ pupil premium status. Cognitive functions are top-down processes, initiated from the pre-frontal cortex of the brain, that are required for deliberate thought processes such as forming goals, planning ahead, carrying out a goal-directed plan, and performing effectively.

The neuroscience of socio-economic status is a new but rapidly growing field and SES-related disparities have already been consistently observed for working memory, inhibitory control, cognitive flexibility and attention. There is much that is still to be understood about why these inequalities emerge, but for a teacher faced with a class to teach, their origins are not particularly important. What matters is that they use instructional methods that give students in their class the best possible chances of success, given the variation in cognitive function they will possess.

Implications for the classroom

Unfortunately, translating this knowledge about social inequalities in cognitive function into actionable classroom practice is difficult and rather depends on the subject and the age of the children you teach. Maths teacher-bloggers find cognitive load theory insightful; teachers of other subjects less so. This is because developing strategies to overcome limitations in working memory through crystallised knowledge is more productive in hierarchical knowledge domains (maths, languages, handwriting, etc.) where the benefits of accumulating knowledge and fluency in a few key areas spill across the entire curriculum.

That said, I think social inequalities in attention and inhibitory control affect almost all classroom settings. Attention is the ability to focus on particular pieces of information by engaging in a selection process that allows for further processing of incoming stimuli. Again, this is a young field but there are studies (e.g. here and here) that suggest it is a very important mediator in the relationship between socio-economic status and intelligence.

When you see a child who is not paying attention in class, what are they attending to? Graham Nuthall’s New Zealand studies showed how students live in a personal and social world of their own in the classroom:

They whispered to each other and passed notes. They spread rumours about girlfriends and boyfriends, they organised their after-school social life, continued arguments that started in the playground. They cared more about how their peers evaluated their behaviour than they cared about the teacher’s judgement… Within these standard patterns of whole-class management, students learn how to manage and carry out their own private and social agendas. They learn how and when the teacher will notice them and how to give the appearance of active involvement. They get upset and anxious if they notice that the teacher is keeping more than a passing eye on them.

We tend to assume that attentiveness is an attribute of the child, rather than something it is our job to manipulate. Teacher and psychology researcher Mike Hobbiss says we should instead view ‘paying attention’ as an outcome of our instructional methods. In a blog post he urges us to create classroom conditions that are likely to engender focused attention, by making our stimuli as attractive as possible and by reducing other distractors. We could do this by having students face the front, by controlling low-level disruption, by removing mobile phones and fancy stationery materials, and so on. And since attention is limited (more so in some children than others), he points out that: ‘capturing attention is not in itself the aim. The goal is to provide the optimal conditions so that attention is captured by the exact stimuli that we have identified as most valuable’.

There are a number of very successful schools I have visited where shutting down the choices about what students get to pay attention to during class is clearly the principal instrument for success. I am glad I have visited them, despite the state of cognitive dissonance they induce in me. On the one hand, I am excited to see schools where the quality of student work is beyond anything I thought it was possible to achieve at scale. On the other hand, their culture violates all my preconceptions about what school should be like. Childhood is for living, as well as for learning, and I find it uncomfortable to imagine my own children experiencing anything other than the messy classrooms of educational, social and interpersonal interactions that I did.

However, I do now think that we have to face up to the trade-offs that exist in the way we organise our classrooms. If we care about closing the attainment gap and we accept the relationship between SES and cognitive function, then surely our first port of call should be to create classroom environments and instructional programmes that prioritise the needs of those who are most constrained by their cognitive function? In many respects, we are still working out what this means for the classroom, but I’m pretty sure that being laissez-faire about what students can choose to pay attention to in class is likely to widen the attainment gap.

Graham Nuthall was not particularly optimistic about disrupting the cultural rituals of our classroom practice to improve what children are able to learn. He believed these rituals persist across generations because we learn about what it means to be a teacher through our own schooling as a child. We have deeply embedded values about the kinds of experiences we want our students to have in our classrooms. For him, the cultural values of teachers are the glue that maintains our schooling system as it is, with the consequence that it entrenches the attainment gaps we’ve always had.

Conclusion

The pupil premium, as a bundle of cash that sits outside general school funding with associated monitoring and reporting requirements, isn’t helping us close the attainment gap. We should just roll it into general school funding, preserving the steep social gradient in funding levels that we currently have. When we teach children from households that are educationally disengaged there is a lot we can do to help by way of pastoral and cultural support. This costs money and monitoring test scores isn’t the right way to check this provision is appropriate.

We shouldn’t ring-fence funds for pupil premium students, not least because they may not be the lowest income or most educationally disadvantaged students in the school. We should stop measuring or monitoring school attainment gaps because it is a largely statistically meaningless exercise that doesn’t help us identify what is and isn’t working in our school. In any case, ‘gaps’ matter little to students from poorer backgrounds; absolute levels of attainment do.

I understand the argument that marking ‘PP’ on a seating plan or generating a ‘PP’ report introduces a language and focus around helping the most disadvantaged in the school. I have argued that this language is of little value if it distorts optimal decision-making and takes the focus away from effective classroom practice. Instead, by focusing on disadvantage in the classroom – that is, cognitive functions that place students at an educational disadvantage – we have the opportunity to better understand how our choice of instructional methods maximises the chances of success for those most at risk of falling behind. I very much doubt it enables us to close the attainment gap, but I like to think it will help us achieve more success than we’ve had so far.

I am under no illusions about how hard this is: our teachers have amongst the highest contact hours in the OECD, and this has to change if they are to have the time to modify how they teach. But more importantly, we have to decide that changing classroom practice is something we want to do, even if it disrupts our long-held cultural ideals of what education should look like.

The pupil premium is not working (part II): Reporting requirements drive short-term, interventionist behaviour

On Saturday 8th September 2018 I gave a talk to researchED London about the pupil premium. It was too long for my 40-minute slot, and the written version is similarly far too long for one post. So I am posting my argument in three parts [pt I is here and pt III is here].

Most school expenditure sustains a standardised model of education where 30 children are placed in a room with a teacher (and a teaching assistant if you are lucky). Now, to sustain its pupil premium strategy, the government makes schools evidence the impact of their pupil premium spending on attainment. But it’s hard to build evidence for that impact if you’re just spending the cash sustaining a well-established, standardised model. (Unless… you segregate all the pupil premium children into one classroom first… though you really shouldn’t, and I have only come across one school so far that is mad enough to do that.)

Instead, in their efforts to close the gap between students sitting in the same classroom, schools ‘target’ pupil premium students with activities and interventions that sit outside the standard whole-class activities of a school: tutoring, withdrawal from class with teaching assistants, breakfast clubs, extracurricular activities, and so on. Intervention-type activities suit this short-termist funding stream, which is entirely dependent on whether pupil premium eligible students happen to enrol. The chart below shows that over half of the 2,500 teachers answering the Teacher Tapp survey app reported that targeted interventions were provided to pupil premium students, a group that I’ve argued do not have a well-defined set of social or educational needs.

[Chart: Teacher Tapp survey — targeted interventions provided to pupil premium students]

In the classroom too, pupil premium students frequently receive different treatment. 63% of teachers say they are required to monitor the progress of pupil premium students more closely than that of other students; 18% say they mark their books first; and two-thirds of secondary teachers are required to mark the pupil premium status of students on their seating plans.

[Chart: Teacher Tapp survey — in-class treatment of pupil premium students]

You could argue that all this is, at worst, inefficient both in its choice of activities and targeting of pupils. But headteachers frequently explain to me the ethical dilemmas this raises in their own schools, where pupils in greater need are excluded from clubs or provision in a manner that can be impossible to explain to parents without identifying those who are disadvantaged.

History teacher, Tom Rogers, has written several posts explaining how the pupil premium has pushed ethical boundaries too far. Here he explains:

[Screenshot: extract from Tom Rogers’ TES article]

In another post, he describes how it affects classroom teachers:

[Screenshot: extract from Tom Rogers’ TES article on classroom teachers]

At this stage I know there will be some school leaders and consultants thinking “Yes, but you don’t have to do any of these things. You can spend the money supporting interventions and high quality teaching for all those who need them”. In a sense they are right: the pupil premium hypothecation is only notional and nobody asks to see an audit trail of the expenditure. But if this is our best argument for sustaining the pupil premium as it is, then surely we should just roll it into the general schools funding formula with all the other money that disproportionately flows to schools serving disadvantaged communities?

In any case, it takes a brave headteacher and governing body to explain to Ofsted that they choose to spend their pupil premium funding on non-pupil premium students in need. After all, newspaper articles such as this by Louise Tickle in the Guardian constantly remind them that expenditure must raise the attainment of pupil premium children:

[Screenshot: extract from Louise Tickle’s Guardian article]

Ofsted comment on pupil premium expenditure and attainment more often than not, even during short inspections. In a sample of 663 Ofsted reports we reviewed from the 2017/18 academic year, 51% mention the pupil premium and well over half of these assert that inspectors can see the monies are being spent effectively!

Where their comments are critical of pupil premium expenditure, they rarely make concrete recommendations that could be useful to anyone, except to the industry of consultants and conferences that help schools solve the riddle of how to spend the pupil premium. These are example quotes from inspection reports (with the one mentioning external review appearing regularly):

  • The school does not meet requirements on the publication of information about the pupil premium spending plan on its website
  • The leaders and managers do not focus sharply enough on evaluating the amount of progress in learning made by the various groups of pupils at the school, particularly the pupils eligible for the pupil premium …
  • An external review of the school’s use of the pupil premium funding should be undertaken in order to assess how this aspect of leadership and management may be improved

Governors are expected to take a central role in relation to monitoring this pot of money (one-third of Ofsted’s pupil premium comments mention governors). Not only must they be trained in how to monitor and evaluate their attainment gap, they should be capable of examining what interventions have been shown to work and be able to analyse pupil attainment data ‘forensically’ (according to an EEF employee quoted in this article).

What should money for disadvantaged pupils be spent on if we want to close the gap?

I have argued that the pupil premium is constructed in a way that encourages interventionist rather than whole class approaches to education improvement, and it does so for a group of students without a well specified set of needs.

Schools that serve more disadvantaged communities do need considerably more money to operate. Their students frequently have greater pastoral needs, and they face higher costs of dealing with safeguarding, attendance and behaviour. Equally, we want these schools to provide rich cultural experiences that their students might not otherwise be able to afford. And yet, many of these things we’d like schools to spend money on aren’t central to the question of how we should spend money to raise attainment (remember, the pupil premium is supposed to be used to raise attainment).

Beyond the obvious provision to help make home life matter less to education (e.g. attendance and homework support), we struggle to make highly evidenced and concrete recommendations, in part because ‘money’ has a poor track record in raising educational standards in general. The Education Endowment Foundation was established alongside the pupil premium with the expectation that it would identify effective programmes or widgets that schools could then spend money on. Unfortunately, most trials have shown that programmes are no more effective than existing school practice, and in any case free school meal eligible children do not disproportionately benefit from them.

And if we turn to the bigger picture, there is a large literature on the relationship between money spent and pupil outcomes. This isn’t the place to review the literature, but studies (particularly UK ones) frequently show that money does not matter to pupil attainment as much as we think it should. I wish it did, for that would give us a policy lever to improve education.

Money changes the way we educate. It changes the way that education feels to those involved and it changes the diversity of experiences we can give students in school, but that is a different thing to saying it directly affects how students learn.

The curious question is why money and attainment are not more tightly linked.

I don’t think governments help themselves here when they ring-fence money or give it an expiry date which prevents schools making efficient expenditure decisions. And, as discussed earlier in relation to EEF trials, we simply do not have good evidence that it is possible to go and purchase off-the-shelf programmes that are demonstrably effective.

But equally, schools don’t always spend money in a way that increases test scores because they have other considerations, not least making the lives of their staff more manageable. We know from the IFS paper that the majority of the increase in cash over the Labour Government period (which disproportionately went to disadvantaged schools) was spent on expanding the team of teachers who rarely teach (the senior leadership team), teaching assistants, and general wages.

Equally, from a Teacher Tapp question asked last week, we know teachers in secondary schools would choose to spend money on more classroom teachers, presumably to reduce class sizes. Primary school teachers would elect to have more teaching assistants. Both smaller class sizes and teaching assistants are resources that make the lives of teachers more manageable, but evidence says they have little immediate impact on pupil attainment. They certainly do support the long-term health of the teaching profession, which I believe is the most important determinant of pupil attainment in the future (see my Teacher Gap book). But this money does not buy us better pupil attainment today.

[Chart: Teacher Tapp survey — what teachers would choose to spend extra funding on]

To be clear, as a parent whose own children are educated in one of the most poorly funded counties in England, I am gravely concerned about how the current funding crisis is damaging both the quality of the experiences they have and the well-being of their teachers. But equally, as a researcher in this field, I would not be able to give a school well-evidenced advice about how to use money to close the attainment gap. I think this is because improved classroom instruction isn’t something it is easy to buy. Is it possible to teach in a way that disproportionately benefits those in the classroom from disadvantaged backgrounds? This is the question that we will turn to in Part III.

What’s coming up…

Part III asks whether within-classroom inequalities can ever be closed

(Punchline for the nervous… No, I don’t think the pupil premium should be removed. I suggest it should be rolled into general school funding.)

The pupil premium is not working (part I): Do not measure attainment gaps

On Saturday 8th September 2018 I gave a talk to researchED London about the pupil premium. It was too long for my 40-minute slot, and the written version is similarly far too long for one post. So I am posting my argument in three parts [pt II is here and pt III is here].

Every education researcher I have met shares a desire to work out how we can support students from disadvantaged backgrounds as they navigate the education system. I wrote my PhD thesis about why school admissions help middle class families get ahead. No politician is crazy enough to do anything about that; but they have been brave enough to put their money where their mouth is, using cash to try to close the attainment gap. This series of blog posts explains why I think the pupil premium hasn’t worked and why it diverts the education system away from things that might work somewhat better. I suggest it is time to re-focus our energies on constructing classrooms that give the greatest chance of success to those most likely to fall behind.

Money, money, money…

We think about attaching money to free school meal students as a Coalition policy, but the decision to substantially increase the amount going to schools serving disadvantaged communities came during the earlier Labour Government. The charts below come from an IFS paper that shows how increases in funding were tilted towards more disadvantaged schools from 1999 onwards. The subsequent ‘pupil premium’ (currently £1,320 for primary and £935 for secondary pupils) really was just the icing on the cake.

[Charts: IFS analysis showing funding increases tilted towards more disadvantaged schools from 1999 onwards]

However, the icing on the cake turned out to have a slightly bitter taste, for it came with pretty onerous expenditure and reporting requirements:

  1. The money must be spent on pupil premium students, and not simply placed into the general expenditure bucket
  2. Schools must develop and publish a strategy for spending the money
  3. Governors and Ofsted must check that the strategy is sound and that the school tracks the progress of the pupil premium students to show they are closing the attainment gap

The pupil premium does not target our lowest income students

Using free school meal eligibility as an element in a school funding formula is a perfectly fine idea, but translating this into a hypothecated grant attached to an actual child makes no sense. The first reason why is that free school meal eligibility does not identify the poorest children in our schools. This was well known by researchers at the time the pupil premium was introduced, thanks to a paper by Hobbs and Vignoles that showed a large proportion of free school meal eligible children (between 50% and 75%) were not in the lowest income households (see chart below from their paper). One reason why is that the very act of receiving the means-tested benefits and tax credits that in turn entitle the child to free school meals raises their household income above that of the ‘working poor’.

[Chart: Hobbs and Vignoles — free school meal eligibility across the household income distribution]

Poverty is a poor proxy for educational and social disadvantage

Even if free school meal eligibility perfectly captured our poorest children, it would still make little sense to direct resources to these children, since poverty is a poor proxy for the thing that teachers and schools care about: the educational and social disadvantage of families. Children from households that are time-poor, and whose parents haven’t themselves experienced success at school, often do need far more support to succeed at school, not least because:

  • Their household financial and time investment in their child’s education is frequently lower
  • Their child’s engagement in school and motivation could be lower
  • The child’s cognitive function might lead them to struggle (of which more in part 3)

These are social, rather than income, characteristics of the family.

Pupil premium students do not have homogeneous needs

There are pupil premium students who experience difficulties with attendance and behaviour; there are pupil premium students who do not. There are non-pupil premium students who experience difficulties with attendance and behaviour; there are those who do not. Categorising students as a means of allocating resources in schools is very sensible, if done along educationally meaningful lines (e.g. the group who do not read at home with their parents; the group who cannot write fluently; the group who are frequently late to school). Categorising students as pupil premium or not is a bizarre way to make decisions about who gets access to scarce resources in schools.

Yes, there are mean average differences by pupil premium status in attendance, behaviour and attainment. However, the group means mask the extent to which pupil premium students are almost as different from each other as they are from the non-pupil premium group of students. The DfE chart below highlights this nicely.

[Chart: DfE — attainment distributions for pupil premium and non-pupil premium students]

In his book, Factfulness, the late, great Hans Rosling implores us not to overuse this type of analysis of group mean averages to make inferences about the world. He explains that ‘gap stories’ are almost always a gross over-simplification. They encourage us to stereotype groups of people who are not as dissimilar to others as the mean average would have us believe.

Why do we like these ‘gap stories’? We like them because we humans like the pattern-forming that group analysis facilitates, and having formed the gap story, we are then naturally drawn to thinking of pupil cases that conform to the stereotypes.

Your school’s gap depends on your non-PP demographic

I’ve explained how the pupil premium group in schools do not have a homogeneous background and set of needs. Students not eligible for the pupil premium are even more diverse.

When we ask schools to monitor and report their pupil premium attainment gap, the size of the gap is largely a function of the demographic make-up of the non-pupil premium students at the school. Non-pupil premium students include the children of bus drivers and bankers; it is harder to ‘close the gap’ if yours are the latter. Many schools that boast a ‘zero’ gap (as did one where I was once a governor) simply recruit all their pupils from one housing estate where all the residents are equally financially stretched and socially struggling, though some are not free school meal eligible. Schools that serve truly diverse communities are always going to struggle on this kind of accountability metric.

Tracking whether or not ‘the gap’ has closed over time is largely meaningless, even at the national level

There are dozens of published attainment gap charts out there, all vaguely showing the same thing: the national attainment gap isn’t closing, or it isn’t closing that much. None of them are worth dwelling on too much since the difference between average FSM and non-FSM attainment is very sensitive to two things that are entirely unrelated to what students know:

  1. We regularly change the tests and other assessments that we bundle into attainment measures at age 5, 7, 11 and 16. This includes everything from excluding qualifications and changing the coursework or teacher assessment mix, to rescaling the mapping of GCSE grades to numerical values. Generally speaking, changes that disproportionately benefit higher attaining students widen the gap.
  2. The group of students labelled as pupil premium at any point in time is affected by the economic cycle, by changes in benefit entitlements and by changes to the list of benefits that attract free school meals. For example, recessions tend to close the gap because they temporarily bring children onto free school meals who have parents more attached to the labour market.

It is also worth noting that FSM eligibility falls continuously from age 4 onwards as parents gradually choose (or are forced) to return to the labour market. This means comparisons of FSP, KS1, KS2 and KS4 gaps aren’t interesting.

Don’t mind your own school gap

Your school’s attainment gap, whether compared with other schools, compared with your own school over time, or compared across Key Stages, cannot tell you the things you might think it can, for all the reasons listed above.

Moreover, it isn’t possible for a school to conduct the impact analysis required by DfE and Ofsted to ‘prove’ that their pupil premium strategy is working for all the usual reasons. Sample sizes in schools are usually far too small to make any meaningful inferences about the impact of expenditure, and no school ever gets to see the counterfactual (what would have happened without the money).

What’s coming up…

Part II explains how reporting requirements drive short-term, interventionist behaviour

Part III asks whether within-classroom inequalities can ever be closed

(Punchline for the nervous… No, I don’t think the pupil premium should be removed. I suggest it should be rolled into general school funding.)

What if we cannot measure pupil progress?

Testing and recording what students know and can do in a subject has always been part of our education system, especially in secondary schools where teachers simply cannot hold in their head accurate information about the hundreds of students they encounter each week. However, measuring progress – the change in attainment between two points in time – seems to be a rather more recent trend. The system – headteachers, inspectors, advisors – often wants to measure something quite precise: has a child learnt enough in a subject this year, relative to other children who had the same starting point?

The talks I have given recently at ResearchED Durrington and Northern Rocks set out why relatively short, standardised tests that are designed to be administered in a 45-minute/one-hour lesson are rarely going to be reliable enough to infer much about individual pupil progress. There is a technical paper and a blog post that outline some of the work we’ve been conducting on the EEF test database that led us to start thinking about how these tests are used in schools. This blog post simply sets out a few conclusions to help schools make reasonable inferences from test data.

We can say a lot about attainment, even if progress is poorly measured

No test measures attainment precisely and short tests are inevitably less reliable than long tests. The typical lesson-long tests used by schools at the end of a term or year are reliable enough to infer approximately where a student sits on a bell curve that scores all test-takers from the least good to the best in the subject. This works OK, provided all the students are studying the same curriculum in approximately the same order (a big issue in some subjects)!

Let’s take a student who scored 109 in a maths test at the start of the year. We cannot use that single score to assert that they must be better at maths than someone scoring 108 or 107. However, it is a good bet that they are better at maths than someone scoring 99. This is really useful information about maths attainment.

When we use standardised tests to measure relative progress, we often look to see whether a student has moved up (good) or down (bad) the bell curve. This student scored 114 at the end of the year. On the face of it this looks like they’ve made good progress, and learnt more than similar students over the course of the year. However, 109 is a noisy measure of what they knew at the start of the year and 114 is a noisy measure of what they knew at the end of the year. Neither test is reliable enough to say whether this individual pupil’s progress is actually better or worse than should be expected, given their starting point.

[Chart: illustrative start-of-year and end-of-year test scores on the bell curve]

Dylan Wiliam (2010) explains that the challenge of measuring annual test score growth occurs because “the progress of individual students is slow compared to the variability of achievement within the age cohort”. This means that a school will typically find that only a minority of their pupils record a test score growth statistically significantly different from zero.
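
To get a feel for the scale of the problem, here is a minimal simulation of Wiliam’s point. The class size, test reliability and spread of true progress below are illustrative assumptions on my part, not estimates from the EEF test database work:

```python
import numpy as np

rng = np.random.default_rng(42)

N = 30               # pupils in one class
COHORT_SD = 15       # standardised score scale (mean 100, SD 15)
RELIABILITY = 0.9    # a generous assumption for a lesson-long test
SEM = COHORT_SD * np.sqrt(1 - RELIABILITY)   # standard error of measurement, ~4.7

# True relative progress over a year: small compared with the spread of
# attainment within the cohort (Wiliam's observation). An SD of 3 points
# is an assumption for illustration.
true_start = rng.normal(100, COHORT_SD, N)
true_growth = rng.normal(0, 3, N)

# Each sitting of the test adds independent measurement noise.
observed_start = true_start + rng.normal(0, SEM, N)
observed_end = true_start + true_growth + rng.normal(0, SEM, N)

observed_growth = observed_end - observed_start
se_growth = np.sqrt(2) * SEM   # noise on a difference of two noisy scores

significant = np.abs(observed_growth) > 1.96 * se_growth
print(f"SEM per test: {SEM:.1f} points; SE of measured growth: {se_growth:.1f}")
print(f"Pupils whose measured progress clears the noise: {significant.sum()} of {N}")
```

With these assumptions, only a pupil or two in a class of 30 records measured growth distinguishable from zero – which is Wiliam’s point exactly.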

Aggregation is the friend of reliability

You can make a test more reliable by making it longer, sat over multiple papers, but this isn’t normally compatible with the day-to-day business of teaching and learning. However, teachers who regularly ask students to complete class quizzes and homework have the opportunity to compile a battery of data on how well a student is attaining. Although teachers will understandably worry that this ‘data’ isn’t as valid as a well-designed test, intelligently aggregating test and classwork data is likely to lead to a more reliable inference about a pupil’s attainment than relying on the short end-of-term test alone. (Of course, this ‘rough aggregation’ is exactly what teachers used to do when discussing attainment with parents, before pupil tracking was transferred from the teacher markbook to the centralised tracking software!)
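
As a back-of-the-envelope illustration of why aggregation helps, the Spearman-Brown formula describes how reliability grows as we average more (roughly parallel) assessments. The single-quiz reliability of 0.75 below is an assumption chosen for illustration:

```python
COHORT_SD = 15   # standardised score scale (mean 100, SD 15)

def aggregate_reliability(r_single, k):
    """Spearman-Brown: reliability of the average of k parallel assessments."""
    return k * r_single / (1 + (k - 1) * r_single)

r_single = 0.75  # assumed reliability of one short quiz or homework task

for k in (1, 4, 8, 12):
    r_k = aggregate_reliability(r_single, k)
    sem = COHORT_SD * (1 - r_k) ** 0.5   # standard error of measurement
    print(f"{k:2d} assessment(s): reliability {r_k:.2f}, SEM {sem:.1f} points")
```

Averaging a dozen quizzes cuts the measurement error to around a third of that of a single quiz – provided, of course, that the quizzes assess the same domain under broadly similar conditions, which is a strong assumption for everyday classwork.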

Teacher accountability is the enemy of inference

Teachers always mediate tests in schools. They might help write the test, see it in advance, warn pupils or parents about the impending test, give guidance on revision, advise pupils about the consequences of doing badly, and so on. If the tests are high-stakes for teachers (i.e. used in performance management) and yet low-stakes for the pupils, it can become difficult for the MAT or school to ensure tests are sat in standardised conditions.

For example, if some teachers see the test in advance they might distort advice regarding revision topics in a manner that improves test performance but not the wider pupil knowledge domain. Moreover, some teachers may have an incentive to try to raise the stakes for pupils in an attempt to increase test persistence. The impact of the testing environment and perception of test stakes has been widely studied in the psychometric literature. In short, we need to be sure that standardised tests (of a standardised curriculum) are sat in standardised conditions where students and teachers have standardised perceptions of the importance of the test. For headteachers to make valid inferences across classrooms, or across schools, they need to be clear that they understand how the stakes are being framed for all students taking the test, even those who are not in their own school!

I think this presents a genuine problem for teacher accountability. One of the main reasons we calculate progress figures is to try to hold teachers to account for what they are doing, but the very act of raising the stakes for teachers (and not necessarily for pupils) can create variable test environments that threaten our ability to measure progress reliably!

The longer a test is in place, the more it risks distorting the curriculum

A test can only ever sample the wider subject knowledge domain you are interested in assessing. This can create a problem where, as teachers become more familiar with the test, they ‘bend’ their teaching towards the test items. Once this happens, the test itself becomes a poor proxy for the true subject knowledge domain. There are situations where this can seriously damage pupil learning. For example, many primary teachers report that one very popular standardised test is rather weak on arithmetic compared to SATs; given how important automaticity in arithmetic is, let’s hope no year 3, 4 or 5 teachers are being judged on their class performance in this test!

Our best hopes for avoiding serious curriculum distortion (or assessment washback) are two-fold. First, lower the stakes for teachers (see above). Second, make the test less well-known or less predictable for teachers. In the extreme, we hear of schools that employ external consultants to write end-of-year tests so that the class teachers cannot see them in advance. More realistically, frequently changing the content of the test can help minimise curriculum distortion, but is clearly time-consuming to organise. Furthermore, if the test changes each year then subject departments cannot straightforwardly monitor whether year group cohorts are doing better or worse than previous years.

None of this is a good reason not to make extensive use of tests in class!

Sitting tests and quizzes is an incredibly productive way to learn. Retrieval during a test aids later retention. Testing can produce better organisation of knowledge or schemas. As a consequence of this, testing can even facilitate retrieval of material that was not tested and can improve transfer of knowledge to new contexts.

Tests can be great for motivation. They encourage students to study! They can improve metacognitive monitoring to help students make sense of what they know (and don’t yet know).

Tests can aid teacher planning and curriculum design. They can identify gaps in knowledge and provide useful feedback to instructors. Planning a series of assessments forces us to clarify what we intend students to learn and to remember in one month, one year, three years, five years, and so on.

Are we better off pretending we can measure progress?

I’m no longer sure that anybody is creating reliable termly or annual pupil progress data by subject. (If you think you are then please tell me how!) Perhaps we don’t really need to have accurate measures of pupil progress to carry on teaching in our classrooms. Education has survived for a long time without them. Perhaps SLT and Ofsted don’t really mind if we aren’t measuring pupil progress, so long as we all pretend we are. Pretending we are measuring pupil progress creates pressure on teachers through the accountability system. Perhaps that’s all we want, even if the metrics are garbage.

Moreover, I don’t know whether the English education system can live in a world where we know that we cannot straightforwardly measure pupil progress. But I am persuaded by this wonderful blogpost (written some time ago) by headteacher Matthew Evans that we must come to terms with this reality. Like many other commentators on school accountability, he draws an analogy with The Matrix film, in which Neo must decide whether to swallow the red or blue pill:

Accepting that we probably can’t tell if learning is taking place is tantamount to the factory manager admitting that he can’t judge the quality of the firm’s product, or the football manager telling his players that he doesn’t know how well they played. The blue pill takes us to a world in which leaders lead with confidence, clarity and certainty. That’s a comfortable world for everyone, not just the leader.

He goes on to argue, however, that we must swallow the red pill, because:

However grim and difficult reality is, at least it is authentic. To willingly deceive ourselves, or be manipulated by a deceitful other (like Descartes’ demon), is somehow to surrender our humanity.

And so, what if we all – teachers, researchers, heads, inspectors – accept that we are not currently measuring pupil progress?

What then?

How an economist would decide the what, when and how of reception year

Clare Sealy has written an amazing blog post explaining why rising 5s need to learn through a mixture of explicit teaching, whole class collective experiences, and play-based encounters. The early years isn’t an area of research for me, but it is a field I spend a lot of time thinking and reading about simply because my own children and those of my friends are currently so young.

Clare’s blog describes the controversies around the question of how we should educate in the reception year. However, I think questions of what and when we should teach young children are equally contentious. Reception year has moved from something that lasted only a few months for many (e.g. me) a generation ago to a de facto compulsory year of schooling and I’d like us* to conduct more empirical research on when it makes sense to teach complex skills such as reading and writing to children.

As an economist, whilst I am supporting my own children in learning new skills (potty training, arithmetic, reading, getting dressed etc…), I wonder why we don’t talk more about the opportunity costs involved in the decisions we make in reception year. What other opportunities must we give up when we decide to teach 4 year olds to read, or to learn some French words, or their number bonds to 20, or to learn a repertoire of songs by heart, or how to identify trees by their leaves?

Economists naturally think in terms of costs and benefits – here our costs are time costs. For example, we choose to potty train children about a year later than parents did a generation ago. Why? By delaying we can invest far fewer hours in the process – hours that we then get to spend doing other things. We can afford this delay because disposable nappies are now cheap enough to use for extended periods of time. Equally, we now invest hundreds of hours ensuring children can read the word ‘mat’ by the time they complete the reception year. That has great benefits for the child, but it has also cost them time which they were then not able to spend doing other things that are promoted in other cultures, such as numeracy or memorising a repertoire of songs, dances and poems.

If an economist were asked how reception year should be organised, they would want some data on these time-investment trade-offs. For example, in the case of teaching a child to read through an explicit phonics programme, they would want to know exactly how the age at which a child starts learning affects the number of teaching hours that need to be invested. The chart below illustrates a trivial example of this. Suppose my daughter started a phonics programme at age 5 and it took her 300 hours of teaching time to complete. How many hours would it have taken if we’d started at age 3? 750 hours? 1,000 hours? Would it be worth it? What about if we’d delayed until she was 7? 150 or 200 hours? Would these time gains make it worth delaying? Suppose we could draw a similar chart for a child who comes from a less book-rich home: would their chart be steeper (i.e. the gap in time investment needed, compared to my daughter, closing somewhat as the starting age rises) or flatter? I think we can be fairly sure the curve would be steeper for boys than girls, but by how much? These charts wouldn’t tell you when you should teach phonics, but they would make explicit one bit of evidence we need to decide at what age we should start teaching children to read.

[Chart: hypothetical time-investment curves for learning to read, by starting age]

Now, suppose we have two goals – learning to read well enough to pass the phonics test and achieving fluency in number bonds to and within 20. We can choose to start a phonics programme at the age of 4, but that doesn’t leave enough time to achieve fluency in number bonds as well. Which should we prioritise in the younger children and which should we concentrate on later? An economist would say it depends on the shape of the curves, as the toy calculation below illustrates. I suspect (based on my sample size of 2 children) that arithmetic has a flatter curve than reading, so that the time investment for learning number bonds at age 4 is not vastly higher than it is at age 5.
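
Here is that toy calculation. The hours are pure fiction, invented in the spirit of the chart above; the point is only that the steepness of each curve, not its level, tells you which skill is cheapest to bring forward:

```python
# Entirely fictional teaching-hours curves, one per skill, keyed by
# starting age. A steeper curve means a bigger penalty for starting early.
hours_reading = {3: 900, 4: 500, 5: 300, 6: 220, 7: 180}
hours_number_bonds = {4: 120, 5: 100, 6: 90}

def extra_hours(curve, early, late):
    """Teaching hours sacrificed by starting at `early` rather than `late`."""
    return curve[early] - curve[late]

print("Reading at 4 rather than 5 costs",
      extra_hours(hours_reading, 4, 5), "extra hours")
print("Number bonds at 4 rather than 5 costs",
      extra_hours(hours_number_bonds, 4, 5), "extra hours")
# With these (invented) curves, number bonds is far cheaper to bring
# forward into reception than reading is.
```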

These curves are fictional – I don’t know what they look like for real children. But I’d feel far more comfortable explaining to a foreigner why we teach our children to read soon after they are four if I knew the time trade-offs involved in this decision.

Economics is the coldest of the social sciences, but this analysis places every hour of children’s precious lives at its heart. It reminds us that we should take care in balancing the gains from learning new skills against the costly time investments of teaching new stuff to young children. And it reminds us that, whilst it might be more efficient to teach handwriting to four year olds through more explicit and formal methods, this fact alone doesn’t mean we should do it. We should also weigh up the relative time investments involved in choosing to bring forward the teaching of a new skill to the reception year, rather than deferring it to year 1 or 2. Indeed, I suspect much of the raging argument about how we should organise the reception year gets confused by private disagreements about the what and when.

(This is a slightly trivial New Year blog post that summarises everything economics has to say about the reception year. Economists shouldn’t decide what the reception year looks like. Don’t let them.)

—————————–

* not me

—————————–

Still reading? OK, here is the indulgent bit where I tell you about my personal views on the reception year:

  • I only did a few weeks in reception class and I did OK in life – I can’t help feeling that if it were so critical to start things young then other countries would be doing it too
  • Child-initiated play was great for my eldest in playgroup, where the adult-child ratios were high; it was pretty sub-optimal in reception year where there were necessarily frequent child-on-child interactions that could not be mediated by an adult, producing endless social/emotional issues. The thought of having to put my youngest through reception year doesn’t fill me with joy for this reason
  • We aren’t ever going to get larger physical spaces and more adults in reception classes. With that in mind, my dream reception year for my children would be 2-3 hours a day at school for collective activities (singing, learning poems, games) and structured work at tables, then back to pre-school for lunch and afternoon play.

Making Oxbridge entry matter less

Yet again, universities are under the spotlight for their admission processes. On the one hand, of course we need to do all we can to get under-represented groups into our elite universities. Alternatively, we could enquire as to why it is so important that they get into these universities in the first place. I’d[i] argue that this is largely because educational achievement is unmeasured at the end of degrees, and so the name of the university attended still acts as a (poor) signal of IQ/knowledge/effort [delete as appropriate] to employers.

One of the many reasons for this is that degree class inflation is out-of-control, with places such as the University of Surrey now awarding a first-class degree to over 40% of their students. Degree classifications clearly no longer reflect genuine attainment, either for cohorts passing through the system in different years or indeed across different institutions.

The consequence is that young people are hugely incentivised to apply to highly-selective courses, rather than ones with high quality teaching. For this is the only way they can signal their intellect in the labour market. For this reason, incidentally, the TEF alone cannot degrade the market quality of an LSE degree.

We could fix all these problems by introducing a common core examination in every degree subject, set externally by learned societies. All students would sit the exam for their subject, say, two-thirds of the way through their degree, thus allowing specialised final-year examinations to continue. Performance in this exam, by subject, would determine the number of first-class, upper-second, lower-second and third-class degrees the department is allowed to award that year. It would not determine the degree class of the individual student.

Agreeing a common core of the curriculum would be more controversial in some subjects than in others. We should try this first in subjects where this is not controversial: the sciences, maths, economics, and so on.[ii]

This degree design would still leave the majority of time free for esoteric topics set by a university (e.g. 50% of the first two years and 100% of the final year), which could choose to combine papers into a degree classification in any way it wished. It would simply be restricted in the proportion of different classifications it could award, based on the common exam results.
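
For concreteness, here is a sketch of how the quota step might be computed. The exam scores and national boundaries are invented for illustration, and this is just my reading of the proposal rather than a worked-out design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical common-exam scores for two departments in the same subject.
dept_scores = {
    "Dept A": rng.normal(62, 10, 120),
    "Dept B": rng.normal(55, 10, 80),
}

# Invented national boundaries on the common exam.
FIRST, UPPER_SECOND = 70, 60

for dept, scores in dept_scores.items():
    # Count how many of this department's students cleared each boundary;
    # this fixes the quota of each classification the department may award.
    firsts = int((scores >= FIRST).sum())
    uppers = int(((scores >= UPPER_SECOND) & (scores < FIRST)).sum())
    print(f"{dept}: may award up to {firsts} firsts and {uppers} upper seconds "
          f"(cohort of {len(scores)})")
```

The department would keep full discretion over which of its own students receives which class; the common exam would only cap how many of each classification it may hand out.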

The alternative is that we introduce some sort of IQ-style SAT entrance examination that in turn determines how degrees can be set. But this does not incentivise universities to ensure that students are learning anything.

Establishing robust and comparable degree classifications would help fix the extraordinary stratification of universities in the eyes of employers. Getting into Oxbridge rather than, say, Nottingham undoubtedly gives people an easy ride in the labour market. As someone who got one of these free passes to pretend I am clever, I used to think this was justified. I changed my mind when I had the chance to interview 17 year-olds myself.

A decade or so ago I was roped into interviewing undergraduate applicants at an Oxbridge college, not because anyone particularly valued my opinion but more because newspaper scandals meant the college didn’t want Fellows interviewing alone. The experience completely overturned my belief that university admissions were efficiently selecting students by ability.

We handed out about seven offers in the subject in each of the three years I helped out. Three were given to candidates who performed exceptionally well at interview and had great AS point scores; the other four were given rather arbitrarily from a long list of over a dozen candidates who did well at interview and on paper. I could see the consequences of the offers we made because I supervised first-year students. Those who performed exceptionally well at interview often turned out not to be genuinely interested in and motivated by their subject. In my experience, the interview didn’t help those from disadvantaged backgrounds, who clearly hadn’t been prepared for it. And the ‘thinking skills’ test that we introduced during the time I interviewed was clearly not tutor-proof; we observed striking mark inflation as it moved from a pilot to a known test with companies offering preparation.

There are weak students studying at Oxbridge; there are outstanding students studying at Nottingham. The latter group, even if they are awarded a first, find it much harder to signal their talent to the employers who understandably place little store by degree classification. If we ensured genuine comparability in achievement across universities then university attended needn’t act as a signal for anything at all.


[i] Well, technically most of this argument comes from a conversation with a very smart man who is not in the position to make these arguments publicly at the moment!

[ii] The question of how degrees should be awarded across subjects is a question for another time, but one that is debated frequently by school examination boards. Essentially, there are principles that can be applied to achieve this where subjects have similar academic characteristics; deciding the national degree awarding proportions is almost impossible for art, music, nursing and so on. School examination boards also deal with questions about how to maintain comparability over time, etc…

If Engelmann taught swimming

I have been thinking about social inequalities and education for the past decade and feel like I’m walking a well-trodden path that has a hopeful ending. Perhaps by telling you where it leads it’ll help you get to a productive destination quicker.

I’ve spent my whole research career thinking that our best hope for fixing educational inequalities is to shuffle children, teachers and money across schools and/or classrooms. That is why I have spent so much time writing about school admissions and choice, measuring school performance, school funding and the pupil premium, and the allocation of teachers to schools and to classes within schools.

We have made essentially zero progress in England in closing the attainment gap between children who live in poorer and richer households. Zero. It is easy to feel despondent about this and wonder whether no solutions lie within the education system.

But two things that have happened over the summer – my daughter learning to swim and listening to Kris Boulton talk – have given me renewed hope.

***

We have a daughter who is a low ability swimmer. Like other families, every Saturday morning we’d bargain over whose turn it was to take her to her lesson. One half-term became two. Then three. Then four. And yet still she was in Beginners One. She was fine about this – she liked the classes and only casually noted that other children were learning to swim and moving up to the next class.

Other parents said, ‘Don’t worry. Everyone learns to swim in the end. She’ll get there.’ And I knew they were right – she would get there eventually and we should just accept it’ll take her longer than other children. But then someone suggested we try another swimming class. So we did. And from the moment she got into that new pool with a new instructor it was like watching some sort of miracle. By the end of the first lesson she was doing something approaching proper swimming and by the fourth lesson she was good enough to practise on her own at the local pool with us.

Did she just ‘need time’? Was it chance that it ‘just clicked’ on that particular day? Would she ever have learnt to swim in Beginners One at the old place? Was she really a ‘low ability’ swimmer?

As I was mulling over this small miracle whilst swimming in the local leisure pool on a Saturday (not a tranquil experience conducive to deep thought), I remembered Kris Boulton’s strange picture of classroom desks, each labelled with the probability that the child sitting there learns a concept. This is a photo of him presenting at researchED, but you can also hear about it on Mr Barton’s maths podcast or read his blogpost.

[Photo: Kris Boulton presenting at researchED]

Kris thinks the problem with the accepted educational wisdom is that it deems most instructional methods fine because some children always ‘get it’. From this observation, people then deduce that other children in the same classroom must be failing for reasons outside the classroom – poverty, genetics, and so on. If you read his blogs, Kris doesn’t deny that these other things are present, but he views them all as factors that increase the sensitivity of the child to the instructional method chosen.

Kris has come around to this way of thinking through his study of the work of Zig Engelmann. Engelmann isn’t popular in many educational circles for his commitment to Direct Instruction. But you don’t have to be a fan of D.I. (I’m not particularly) to admire the scientific approach he has taken to constantly refining his programmes of instruction. And at the heart of the approach he takes is the following belief:

The best instructional methods will close the gap between those students who have a high chance of understanding a new concept and those who have a lower chance of understanding it.

This! This way of thinking about inequalities in rates of learning is simply not part of the narrative for many policy-makers and researchers. There are some children who will ‘get it’, regardless of instructional method used (the Autumn-born middle-class girls in infants and the kids in my daughter’s first swimming class who raced through and onto Beginners Two within a term). Then there are those for whom the probability that they learn the new concept is highly sensitive to methods of instruction. My daughter wasn’t a ‘low ability’ swimmer; she was just a novice swimmer who was more sensitive to instructional methods than others for whatever reason.

I don’t know whether Zig Engelmann has ever thought about swimming instruction, and I don’t know what he would make of the methods used to teach my daughter in the second swim school. Who knows whether her swim instructor has given much thought to questions of sequencing and the benefits of what Kris calls atomisation in his blogpost. I’m confident that she is not following a Direct Instruction script! But just imagine if the method of instruction she has devised through years of experience could be codified, at least in part, so that other instructors could follow it too.

There will always be differences in how easily humans are able to learn new concepts, but I’m more convinced than ever that we can reduce the size of these gaps in rates of learning by paying close attention to the instructional methods we use. An instructional method doesn’t work if only some children can succeed by it. Let’s work on developing methods that give every child the highest possible chance of succeeding.

Coda

I showed this post to Kris and he wrote:

Engelmann has applied his ideas to physical activity, including tying shoelaces, doing up buttons, and I think some aspects of sport.  I had an excellent instructor for Cuban Salsa a few years back. She created three 10 week courses, at differing levels, and broke everything up into different moves, from small components up to more complex combinations. One evening several of us went out to a dance event that had a lesson with a different instructor – he spent most of the time saying ‘No-one can really teach you how to move, you just have to feel the music.’ Utterly useless.

(I think those last two words are his less polite way of saying that he is a novice dancer who is very sensitive to instructional methods!)

Kris is a little further on in his journey than me, as he explains here. He believes so strongly that this is how we reduce inequalities in rates of learning that he is joining Up Learn, a company dedicated to this same belief, which is putting the theory into practice and believes it can guarantee an A or A* to every A Level student who learns through its programme.