Reception baseline assessment: dangerous, inappropriate and flawed data

Alice Bradbury and Guy Roberts-Holmes. 

In its response to the consultation document Primary Assessment in England, the Government announced its intention to make baseline assessment statutory (along with the existing EYFS Profile) from Autumn 2020. Justine Greening’s Ministerial foreword states that the Reception Baseline Assessment ‘must produce data that is reliable and trusted’.

However, our research into the 2015 Reception Baseline Assessment, which involved interviewing Reception teachers and a nationwide survey of teachers, found that the data it produced were unreliable and not trustworthy. Even with a newly introduced cohort-level analysis, we contend that Reception Baseline Assessment will still produce inappropriate, flawed and inaccurate data.

This announcement follows the failed policy of Reception Baseline Assessment, introduced in 2015, which saw 15,000 primary schools buy in and conduct an assessment with four-year-olds in their first six weeks of schooling in the Autumn term, at an estimated cost of £4.5 million. After the results had been compared, schools were then told that baseline scores were not useful because the different forms of assessment provided by different companies were incompatible.

At its heart, baseline assessment was an attempt to reduce all the complexity, diversity and contradiction of four-year-old children to a single number. This single number was then to be used to predict children’s progress across seven years of schooling. However inaccurate, the aim was to hold a primary school to account by recording a score for each child on entry and comparing this number with ‘what comes out’, in Sats results, seven years later. This input/output model is inappropriate in education; no wonder that teachers responding to a survey on Baseline described it as treating children like sausages in a factory.

Firstly, the Baseline is dependent on the idea that you can establish with some certainty what a child can or cannot do at a given point, reduce this to a single number, and use that number for other purposes. Baseline assessment negates children’s complexity and teachers’ professionalism through its crude binary statements and judgements, and so produces a deficit model of what children cannot do. Instead, early years teachers know that young children demonstrate what they can do in their play and in a multitude of diverse, unexpected and imaginative ways. The three forms of Baseline used in 2015 involved a mixture of activities on a tablet or with a teacher, and teacher observation and judgement against set statements.

But whatever form the assessment takes, it is impossible to decide with certainty what a child of four can or cannot do, because they are affected by so many other contextual factors. For example, teachers told us that those variables included whether children speak English as a first language; whether they were summer-born; whether they are used to working on a tablet; how confident they feel in their new class; their relationship with their teacher and whether they had settled into school happily; even whether they were hungry or were tested at the end of the day when they were tired.

Even the more complex observation-based versions required teachers to make yes/no binary judgements, within six weeks, as to whether a child had learning dispositions such as perseverance and motivation, as well as maths and literacy skills. Some teachers saw baseline assessment as damaging children’s well-being, confidence and self-esteem, as the children, particularly when using tablets, were aware that they were being tested and were upset if they were unable to do certain things asked of them. One teacher described how some children in her class had ‘come out low’ on physical development because they had wet themselves in their first weeks of school; she was concerned this normal behaviour in the first weeks in an unfamiliar environment would set them up for low expectations overall, as their resulting ‘low’ baseline followed them up the school. So, rather than being an exact science, the single-figure baseline outcomes were an often negative, inaccurate and crude approximation of what children could do.

This leads us to the second problem with any kind of baseline assessment: the setting of low expectations. Here the inaccurate information initially put into baseline and the invisibility of the mathematical algorithms used to produce a single baseline figure had harmful effects. There is a danger that if you label a child with a score at age four or five, and then measure their progress seven years later, you engage in a model of education which defines children as either having ‘ability’ or not – which tends to stick with them as they travel through the system. This model engages, explicitly or not, in the business of predicting whether a child has potential. By saying this school has done fine because their children came in ‘below expectations’ and stayed ‘below’, you define low attainment as acceptable. They may of course come out as ‘meeting expectations’ or ‘above’, in which case you praise the school as having done well, but the setting of low expectations is still there. Baseline Assessment, if used in this way, has the potential to become a self-fulfilling prophecy. Assessments which establish low expectations are dangerous, especially if they are based on faulty, incomplete and generalised data. And because the baseline figure has been produced by a mathematical algorithm by an external company, there is a tendency to believe it to be true, even when it is wrong or harmful.

Finally, there is a real temptation in any form of ‘value added’ system for teachers to manipulate results. This has already been found in relation to the EYFS Profile assessment at age five, where teachers describe being told to keep results low so that the ‘story’ of attainment as children go up the school is more attractive to Ofsted. In our research on Baseline, headteachers described this temptation to ‘limit the damage’. So, even if you could produce an accurate, reliable test, it would be affected by its role as a performance measure.

These dangers of baseline will not go away with the announcement of the new proposals, because they are based on the same flawed principles and logic as the original policy. Like many in the primary and early years education world, we are incredulous that the government has attempted to bring back a discredited policy that fails to acknowledge children’s complexities, and at the same time de-professionalises teachers as ‘scorers’. An effective coalition of education groups came together in 2015-6 as the ‘More Than a Score’ and ‘Better without Baseline’ campaigns; we hope the government will listen again to those who know about children and education and abandon its plans.


Photo by Phil Roeder via Creative Commons

Posted in Childhood & early education, Education policy
4 comments on “Reception baseline assessment: dangerous, inappropriate and flawed data”
  1. Janet rS says:

    Reception children do not legally have to be in school until the term that they are 5. Some could only join in the summer term. Missing Autumn and Spring. How would that be accounted for?

    • Emeritus Professor Rosemary Davis says:

      It is not only an issue of the legal age of entry: there are too many entry variables (e.g. individual differences, including life circumstances) for a baseline assessment to be valid. The risk, too, is that such an assessment could increase the likelihood of erroneous teacher expectations and thus self-fulfilling prophecies of failure. As a former teacher, I saw many examples of children admitted early because of adverse family circumstances achieve incredibly well. One little girl entered reception, announced ‘I’m going to learn to read’ and sat in the book corner. Three days later, she could read fluently (without synthetic phonics!). Teachers’ own observations and classroom-based monitoring are sounder ways to assess needs and strategies for supporting learning.

  2. Dr. Guy Roberts-Holmes says:

    Great comment – You ask an important structural question about the difficulties of baselining when children have so many different starting points! It will be interesting to see how all these differences can be made compatible into a single data score….

