Accountability: just what do we want to measure?

Chris Husbands

Over the last 20 years, secondary school performance measures have had an enormous impact on schools’ behaviour, parental preference and, indeed, local house prices. Published in local and national league tables, they have been based on the proportion of a school’s students who secure 5 A*-C GCSEs in English, mathematics and three other subjects.

Just before Christmas, in what appeared to be a leak of the Government’s review of secondary school accountability, the Daily Telegraph reported that this measure will go, to be replaced with an average points score. For example, 8 points would be given for an A*, 7 for an A and so on. Unfortunately, and hopefully not on the basis of leaks from the DfE, the Telegraph spoilt its story with two schoolboy howlers: first by suggesting that this provided a “more precise calculation of achievement”, rather than a measure of examination attainment, and second by arguing that this would prevent “over-generous marking” by teachers – clearly, the way scores are reported for accountability purposes says nothing about the assessment methods which make them up.

In principle, it makes good sense to hold schools to account for the progress made by all their pupils rather than the sub-set who achieve the highest grades. On this principle alone, the Telegraph-reported proposal is welcome and a big advance on current arrangements. At present, the grade C threshold encourages schools to focus attention on that threshold to the detriment of other thresholds, and certainly to the detriment of reporting the progress made by all children. Effectively, we have been using a secondary school performance measure which relates to the performance of only three fifths of young people, which may be not unrelated to our ingrained problem of the long tail of low performance. The idea of calculating a points score is an advance.
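The difference between the two measures can be sketched in a few lines of Python. This is a rough illustration only: the scale below extends the reported "8 for an A*, 7 for an A and so on" down to G and assumes 0 for an ungraded result, the threshold check ignores the English and mathematics requirement, and both cohorts are invented.

```python
# Points per grade: 8 for A*, 7 for A as reported; the rest of the scale
# (down to 1 for G, 0 for U) is an assumption based on "and so on".
POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1, "U": 0}


def meets_threshold(grades, needed=5):
    """True if the pupil has at least `needed` grades at C or above.

    Simplified: the real measure also requires English and mathematics.
    """
    return sum(POINTS[g] >= POINTS["C"] for g in grades) >= needed


def threshold_rate(cohort):
    """Proportion of pupils in the cohort who reach the threshold."""
    return sum(meets_threshold(grades) for grades in cohort) / len(cohort)


def average_points(cohort):
    """Mean points per GCSE entry across every pupil in the cohort."""
    entries = [g for grades in cohort for g in grades]
    return sum(POINTS[g] for g in entries) / len(entries)


# Two invented three-pupil cohorts that differ only in the lowest attainer.
school_a = [["C"] * 5, ["C"] * 5, ["G"] * 5]
school_b = [["C"] * 5, ["C"] * 5, ["D"] * 5]

for name, cohort in [("School A", school_a), ("School B", school_b)]:
    print(name, round(threshold_rate(cohort), 2), round(average_points(cohort), 2))
```

Both invented schools report the same threshold figure, because the third pupil misses five Cs either way. Only the average points score registers that School B's lowest attainer did better.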

As ever, though, in matters of assessment policy, things are not that simple. In the first place, the measure as reported treats grade boundaries as single steps up an equally calibrated staircase – the step from a G to an F would be treated as the same step up (one step) as from a D to a C or from an A to an A*. But this is misleading on several grounds. As any grade boundary archive makes plain, not all grade boundaries are the same size. Normally, the critical C boundary is set, and then other boundaries are derived statistically based on deviations from the C boundary. Some boundaries are then set as equal steps between marks, but not all are. The conversion of grades to numbers for the purpose of deriving a total average grade assumes a statistical pattern which is not there in pupils’ performance: not all grades are simple steps up in marks.
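To make the staircase point concrete, here is a small sketch. The mark boundaries are hypothetical, made up purely for illustration and not taken from any real paper or exam board; the points scale extends the reported "8 for an A*, 7 for an A and so on".

```python
# Hypothetical minimum marks (out of 100) for each grade, lowest to highest.
# These figures are invented for illustration only.
BOUNDARIES = {"G": 18, "F": 27, "E": 36, "D": 45, "C": 54, "B": 66, "A": 78, "A*": 90}

# Points scale assumed from the reported "8 for an A*, 7 for an A and so on".
POINTS = {"G": 1, "F": 2, "E": 3, "D": 4, "C": 5, "B": 6, "A": 7, "A*": 8}

grades = list(BOUNDARIES)  # dicts keep insertion order: G first, A* last

steps = []
for lower, upper in zip(grades, grades[1:]):
    mark_gap = BOUNDARIES[upper] - BOUNDARIES[lower]   # width of the step in marks
    point_gap = POINTS[upper] - POINTS[lower]          # width of the step in points
    steps.append((lower, upper, mark_gap, point_gap))
    print(f"{lower} -> {upper}: {mark_gap} marks, {point_gap} point")
```

With these made-up boundaries every step is worth exactly one point, yet the steps above C are wider in marks than those below it. An average of points quietly treats them as the same size.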

It gets more complex: although it is important to hold schools to account for the progress made by all pupils, in practice the C/D borderline is important for post-16 progression. Getting a C in maths allows a pupil to progress in ways that getting a D doesn’t, but a B opens very few additional progression possibilities which a C does not. Whether the C boundary should be as important as it has become can be debated, but it does matter – and internationally, the idea of thresholds for functional numeracy and facility in the national language is gaining currency. The C/D borderline measures this – crudely and ineffectively in all sorts of ways – in the way that a measure across the attainment range does not. American schools report their graduation rates – and some students take longer than others to graduate; American graduation rates are reporting a threshold, and they are often fairly incurious about performance above the threshold. High school graduation, as too many teen movies bear witness, is graduation.

There is a further difficulty. Most observers argue that schools should be held to account for the progress made by the pupils they teach, and it has long been pointed out that the performance of some schools is flattered by the focus on the proportion securing 5 A*-C GCSEs: there are schools which should be doing much better than they are given their intake. The proposed calculation, although it is a step forward, is still not a progress measure. For schools, the key indicator is not the measure of the overall attainment of a cohort, but the measure of levels of progress from entry to exit. That is a much more genuinely inclusive measure. But even that is complicated as the education system is gently tilted back towards norm- rather than criterion-referenced assessment methods, so that not all pupils may be able to make three levels of progress.

And there is one more complexity, which matters if you accept that accountability measures can drive perverse behaviours. The focus on the C threshold may encourage schools to invest considerable resources at the C/D borderline, but the concern with the number of students reaching a threshold does force schools to be concerned with the performance of individuals. Basing accountability on average scores shifts the focus from individuals to grades. Most of us want schools to be concerned with outcomes for individuals.
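A toy calculation shows how an average can hide what happens to individuals. The cohorts below are invented, the points scale is assumed from the reported "8 for an A*, 7 for an A and so on", and the threshold check is simplified to five grades at C or above in any subjects.

```python
# Assumed points scale; only the A* and A values come from the press report.
POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1, "U": 0}


def average_points(cohort):
    """Mean points per GCSE entry across the whole cohort."""
    entries = [g for grades in cohort for g in grades]
    return sum(POINTS[g] for g in entries) / len(entries)


def pupils_at_threshold(cohort, needed=5):
    """Number of pupils with at least `needed` grades at C or above."""
    return sum(
        sum(POINTS[g] >= POINTS["C"] for g in grades) >= needed
        for grades in cohort
    )


# Two invented four-pupil cohorts with identical average points.
even_cohort = [["C"] * 5 for _ in range(4)]            # everyone gets five Cs
split_cohort = [["A"] * 5] * 2 + [["E"] * 5] * 2       # two high, two low attainers

for name, cohort in [("Even", even_cohort), ("Split", split_cohort)]:
    print(name, average_points(cohort), pupils_at_threshold(cohort))
```

Both invented cohorts average 5 points per entry, but in the split cohort only two of the four pupils reach five Cs. Judged on the average alone, the two are indistinguishable.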

The Telegraph report reminds us that “accountability” for performance in education is complex. Developing measures which genuinely allow schools to demonstrate what they have achieved with young people is complex. Translating it into a readily understood format which can be communicated clearly is perhaps even more complex. At root, society needs clarity about what it wants to hold schools to account for: the progress made by individual pupils, in which case we should worry less about thresholds, or their ability to move all pupils to an agreed threshold, in which case we should worry less about above threshold performance, or their ability to push the most able to elite levels of performance, in which case we need to reflect on how to map the performance of all. Until we clarify that, we will struggle with inadequate measures in which we vest too much confidence.

10 comments on “Accountability: just what do we want to measure?”
  1. Jack Williams says:

    Completely agree. This problem becomes compounded by ‘target setting’ giving pupils levels or grades to achieve. We should be looking at getting them to aspire to the best they can possibly be. I don’t like Nat. Curric. Levels and we can’t trust GCSE grades anymore.

  2. John Oversby says:

    1. Of course teachers and schools should be accountable. To whom and for what are the central issues. At present accountability is set by the present government, whose methods have not been tested by election popularity, and indeed do not carry a consensus that they are characterising anything that resembles what is valuable.
    2. Chris Husbands is quite right to point out serious challenges in determining progress. These inadequate indicators are then used to construct inadequate league tables that emphasise a part of what we may consider to be progress and significantly distort the accountability system. Examinations designed to accredit specific kinds of learning, with all their significant limitations, are then aggregated quite inappropriately. Different school subjects have different significance for different people.
    3. The number of C-A* passes is then used by OfSTED as a core measure that can have major impacts on the way a school is run. So, the distortion carries on.

  3. This is an important debate Chris. I would suggest that your final paragraph shouldn’t be framed as a set of choices. We need to value all of those things and it is not really difficult to achieve this if we are prepared to look at a number of measures when evaluating school performance. The idea of switching from %5A*-C with EM to average points as something new is odd because the data is already there on the DFE performance website. It is important to look at both measures side by side, but we don’t have a good measure for top end attainment such as %5A*/A or % 8A*/A. If we stick with grades (which I’ve argued against on my blog), then we need to have multiple measures with average points as one of them. This is the only measure that includes all students’ attainment and also gives some value to the volume or breadth of study.

    In the current system, where the main focus is on %5A*-C with EM, we have the perverse situation where, from best 8, 5Cs and 3Es counts the same as 5As and 3Ds or even 8A*s. There are schools where students with an early entry C in Maths don’t go further to try for a higher grade. A higher profile for Average Points is necessary – but not on its own. Surely we can expect people to cope with three or four measures presented side by side?

    A progress measure such as the current Value Added score at KS4 is also important although the methodology here is more suspect. It depends on reliable KS2 data for all students and then a mapping process that has a wide error margin – the confidence intervals given with the data are sometimes so wide as to make comparisons invalid; just one large error bar lined up against another – take a look at a selection of schools on the DFE comparison website.

    The elephant in the room with much of this is that we put faith in a C being a C being a C, year on year, across subjects – and that is a major assumption. The C/D borderline is fuzzy so any system that puts intense pressure on that metric alone is deeply unsatisfactory. I agree that grades vary in width – and have argued elsewhere that we should abandon grades altogether for points; the UMS marks that later yield grades would be far more useful. On some UMS scales, say a score of 69 is a C, but a high C; nearly a B(70). Then a score of 60 is only just a C, with 59 a D. It is easy to rank 69, 60 and 59 to get a sense of relative performance. But to artificially label 60 and 69 as equal Cs with 59 a D is absurd – especially when the margin of error inherent in the marking process for some subjects is such that the actual work of a student with 59 could well be better than a student with 60…. so perverse but sadly true. Perhaps the only honest way to report attainment is by giving scores with a +/- error – and not using rigid grades at all! In this way, average points, average UMS or even total UMS would even out some of the errors and we’d get a truer picture of attainment. With a points-only system, a number of benchmarks for elite attainment and baseline attainment could also be devised very simply – and the only incentive for any school would be to get the best score for every individual.

    In the current system, we need to be very careful in assuming that a Grade A student is really much better than a Grade B student – Universities have already made that leap, looking at UMS scores closely and also using other forms of tests to triangulate with A level predictions and GCSE results. There are simply too many flaws and variables that have nothing to do with the students.

  4. Is there room here for a level of accountability that is local? As a primary head, I often wonder what it is that my local community and parents (defined by my catchment in the first instance) actually want from their schooling. Most parents are now in thrall to the apparent “neutrality” and “unbiasedness” of OFSTED, and as Chris points out above, the same problems really do attach to secondary school attainment and achievement data. I am not in favour of a national education system so much as I am for strong locally responsible schools that find meaningful and authentic ways of relating to the populations immediately around them.

  6. behrfacts says:

    The sooner we hear from DfE about the new accountability measures the better for everyone. Why they should take so long about this yet rush in other changes mystifies me. I can only assume there is too much politics involved and perhaps Cameron and Clegg have different views on it, which would not be surprising. Mastery of ‘basic’ English and Maths by age 16 has been an issue for many years and Tomlinson tried to tackle it using a ‘functional’ approach, which many went along with at the time despite their reservations. This whole area requires intelligent evidence, debate and recommendations by politicians who feel they know all about being accountable to the electorate i.e. the threat of not getting the required number of votes to hold on to your seat or worse still losing by one out of thousands!

  7. 3arn0wl says:

    This is a really good post: informative and well argued.

    Thank you for doing your bit to dispel the perception that GCSEs are simply criterion referenced, and talking about the manipulation that goes on for each paper.

    I totally agree that the A*-C fixation (and the funding that accompanies it) distorts the picture (and often the practice) considerably, and that the true measure (and reward) ought to be the ‘value added’ to the student over the five years. (Trusting teachers to award Levels might be too much for Mr. Gove though, and I don’t entirely trust ALIS et al).

    However, I’m far from convinced that league tables measure anything other than the socio-economic predicament of a school’s locality really.

