Baseline assessment 'unreliable and disruptive' - research on the pilot scheme

Monday, 29 February 2016
Research on the pilot schemes for school baseline assessments last September reveals concerns, according to the BBC, that they are ‘unreliable and disruptive’. It also states that these issues are likely to remain when baseline assessment is introduced in all schools in September 2016.
Baseline assessment will measure basic reading, numeracy and writing as well as social and emotional development. It replaces the EYFS Profile, which will no longer be completed at the end of the Reception year. The assessments will take place within the first 6 weeks of starting school in September and will be carried out by teachers.
Baseline assessment will result in a single score for each child. The score will be used to allocate funding to pupils with low prior attainment and to assess schools’ ‘added value’ between Reception and Y6.
It also introduces a new feature to primary education: the assessment is supplied by an approved list of private providers rather than by a government agency (the Standards and Testing Agency), as the current end of Key Stage tests are.

Baseline assessment will determine how schools are assessed for value added, alongside an attainment floor standard of 85%. The DfE says it is needed to ensure pupils reach their potential. The assessments will establish a child’s level of development at the start of school, so that the progress they make during primary school can be measured.

So what did the research discover?

Researchers from the University of London were commissioned by the NUT and ATL to look at the pilot scheme for Baseline Assessment in schools in September 2015. They collected data from interviews with 5 case study schools and 1131 completed online surveys.

Impact on teaching and curriculum

59% of teachers surveyed ‘agreed a lot’ (26.3%) or ‘agreed a little’ (32.6%) that the baseline assessment had disrupted the start of school for Reception children. It was felt that teachers focused on the baseline assessment rather than on getting to know the children and settling them into new routines in the same way as in previous years.
It had an impact on teaching and the curriculum: the case study schools showed that teachers organised their teaching around the statements given in the assessment. Some even delayed teaching content such as phonics until all the testing was complete, perhaps 3-4 weeks into the term. For the computer-based tests, teacher time was taken up with one-to-one assessments of each child.
Schools and teachers felt little was gained from the assessment that would not have been identified anyway. Few of the survey respondents felt the baseline assessment was as helpful as existing assessment systems in identifying SEN or EAL children, particularly where the school had information from nurseries.
93.2% of respondents to the survey said that their school already had assessment arrangements at the start of Reception that supported teaching and learning and allowed them to track progress through the year, helping them with future planning. The baseline assessment was not felt to be compatible with existing tracking systems. There was strong support for the existing EYFS Profile assessment.

Accuracy and validity of the data

Almost 60% of survey respondents disagreed with the statement ‘Scores obtained by the baseline assessment are an accurate reflection of children’s attainment at this stage.’
Some of the reasons for these concerns about the accuracy of the results included:
The children themselves
• are too young, most still only 4 years old
• may not show their ‘true potential’ as they may lack confidence in a new and unfamiliar school setting
• have different experiences prior to school that are not considered, particularly whether or not they had attended a nursery
• may change over the 6-week assessment period. Children taking the test in the first two weeks might have performed better in weeks 5-6.
The problems of yes/no judgments and subjective statements:

The yes/no judgments required by the new baseline assessment contrasted with the ‘best fit’ approach that teachers felt was more accurate and reliable. Some subjective statements linked to things such as ‘curiosity’ and ‘persistence’ could be open to a wide range of interpretations from school to school and teacher to teacher.
There was a feeling that the computer-based tests gave different results depending on whether children were tested in the morning or in the afternoon, when they were tired and easily distracted. One school also felt that those who completed the tests in the first week of starting had lower scores than those completing them in weeks 5 and 6.
Limitations of baseline assessments and KS2 testing to measure ‘value added’ in a school
Not only do children’s progress and development vary over the seven years of primary school, but the content of the assessments at age 4 and age 11 is so different that it is difficult to construct a linear relationship for progress between them. There were also concerns that the pressure of accountability on schools could encourage low scoring in the baseline assessments, although in practice this did not seem as important to teachers as ‘getting it right’.

Effect on workload and costs

Overwhelmingly, teachers felt it increased their workload both within and outside the classroom. The purpose of baseline assessment is accountability; it was not designed to give teachers data that would inform their planning. It is therefore likely that, even in the future, they will have to complete the baseline assessment for accountability purposes and then run their own system to inform teaching and learning and to support progress tracking throughout the year. Teachers felt it duplicated work they already do. Many schools already have systems for assessing pupils as part of progress tracking, often far more detailed than this assessment and more meaningful in ensuring that a child reaches their potential.
There were also increased costs for training and, in some cases, for supply cover so that class teachers could either carry out one-to-one tests or complete the data input.
Further issues were raised about baseline tests being provided by private companies, including ‘sales pitches’ and questions about value for money. There were concerns about the expertise of the providers, a lack of standardisation and the degree of consistency between companies.

One quote from the case studies came from a teacher completing part of the test:

‘Some children looked at me and said “I can’t read” when asked to read parts of the assessment. It was heartbreaking to see their reaction to it and I spent a lot of time reassuring children.’
The Government has 3 approved providers:
BASE
Computer-based test taking 15-20 minutes.
Early Excellence Baseline (EExBA)
Completed by the teacher through observation and interaction. No set test time is required, but teachers need time to record 47 assessment statements, which will take 9-12 minutes.
NFER Reception Baseline Assessment
Resource-based assessment with a mixture of tasks taking up to 30 minutes and observational checklists.
