A long Bucks CC report called "What is the impact of coaching on selective testing at 11+?" (dated Feb 2009):
Bucks CC: A Summary of the Evidence wrote:
• Historically, tests such as verbal reasoning have been used to determine who takes up a place at a grammar school because they are thought to measure fixed or underlying ability – therefore impacted very little by coaching.
• Modern interpretations view reasoning tests as reflecting the pupil’s experiences up to the time of testing rather than providing an indication of fixed potential, and therefore the impact of coaching is much more important for consideration.
• There are few studies which look specifically at the impact of coaching on selection at 11+ in the UK, reflective of the fact that there are now few authorities still operating a selective system.
• The key studies which have been referred to in the press most recently were carried out in Northern Ireland, where the majority of counties operated a selective system (Bunting and Mooney (2001), Egan and Bunting (1991)). These studies investigated the impact of coaching on verbal reasoning test scores.
• In Northern Ireland pupils are prepared in school for the tests over a period of at least a year, through regular familiarisation, coaching and practice papers. Bunting and Mooney investigated the effect of 3 hours of coaching prior to this longer period and found it led to a small but significant gain of 5 points. Most significantly, the coaching given in schools over the months following this resulted in pupils’ marks doubling in total.
• Egan and Bunting’s study compared groups of children who were coached for at least a year with those who had had no coaching. Pupils with higher ability benefited most from coaching, but even when ability was accounted for, most children could double their scores as a result of coaching. The significant gains in scores attributed to coaching would have meant that, regardless of ability, none of the pupils who had not been coached would have achieved a score in the top 70% required for selection into grammar school.
• Other studies, though not involving pupils of the same age, and involving tests of a slightly different nature show the same trend, i.e. that coaching can have a significant impact on scores in reasoning tests.
• The current research evidence would suggest that unless you can ensure equal effectiveness or access to coaching, then you cannot make assumptions about ability based on the verbal reasoning tests. [Their bold]
Bucks CC: A review of current research into practice, coaching and reasoning tests wrote:
Buckinghamshire operates a selective secondary education system which requires pupils to take a ‘verbal reasoning test’ at the beginning of their final year of primary school. The test, known as the Bucks VRT, is specially designed for Buckinghamshire by GL-assessment (formerly NFER), and draws from a bank of different question types. These include numerical and logical reasoning questions, in addition to verbal reasoning items. Pupils take two papers, one week apart; the results are standardised, and the decision as to whether a pupil is eligible for a grammar school place is based on the higher mark of the two.
Historically, tests such as verbal reasoning have been used to determine who takes up a place at a grammar school because they are thought to measure fixed or underlying ability, and therefore give an indication of a child’s potential. This is compared to more curriculum based or achievement tests which measure a child’s response to the education he/she has experienced up to that point. It was thought that this was a fairer way to allocate places to ensure that children would be selected on the basis of their potential rather than the opportunities that they had had up to that point in time. It was also assumed, therefore, that this type of test would be impacted upon very little by coaching. Early studies, e.g. Vernon (1957), found a small point difference with five practice sessions, with no further increase after that. This could be seen as the result of familiarisation with a paper, rather than any direct impact of coaching.
All pupils who take the Buckinghamshire 11+ have the opportunity to work through a familiarisation pack prior to taking the selection test. Where possible this takes place in school, from the summer term of Year 5 until immediately prior to the pupils taking the actual test. The aim of the familiarisation papers is to familiarise pupils with the different types of questions in the selection test, to give instructions on how to understand the question types and to explain how they can be answered. Pupils are given feedback on success and guided through wrong answers. The aim of the practice papers is for pupils to experience the timed nature and format of the test and answer sheet.
Coaching would constitute any kind of specific preparation of pupils for the tests, outside of the familiarisation and practice materials. It may not be qualitatively different to familiarisation, but is usually aimed at giving pupils more opportunity to work through practice questions and more opportunities for feedback and developing problem-solving approaches to questions. Buckinghamshire Guidelines for Schools in relation to the Bucks VRT (11+) has reflected advice from NFER that familiarisation and practice papers are sufficient to achieve ‘saturation familiarisation’, without any further coaching. Schools are not allowed to undertake any further familiarisation and parents are not encouraged to provide coaching because of its ‘marginal, if any, positive impact on performance’.
However, according to Strand (2004), whilst reasoning tests, such as the Bucks VRT, show consistency over time and demonstrate predictive validity in terms of academic outcomes, they are no longer seen as measures of fixed or innate ability. Modern interpretations view reasoning tests as reflecting the pupil’s experiences up to the time of testing rather than providing an indication of fixed potential (Whetton, 1995). If this view of the tests is accepted, then the question of the impact of coaching on pupils’ attainment becomes much more pertinent.
In 2008 NFER changed their advice in relation to coaching. This was directly in response to research carried out in Northern Ireland by Bunting and Mooney (2001). Although this research was not new, it came into the spotlight due to the decision in Northern Ireland to stop using the 11+ test as a form of selection for secondary education. From 1993 selection in Northern Ireland had been based on a curriculum type assessment. However, prior to this point the tests had been described as Verbal Reasoning Tests, including, like the Bucks VRT, verbal and quantitative items. Although reported after the change in nature of the tests, the research was conducted using the earlier Verbal Reasoning Tests. The expectation in Northern Ireland was that teachers in schools spend time preparing pupils for the tests over a period of at least a year, through regular familiarisation, coaching and practice papers.
The study looked at the effect of both test familiarisation/practice and coaching on 11+ test performance. Children aged between 10 and 11 were randomly assigned to conditions. In group 1 pupils were given 3 hours of coaching, followed by 5 papers. In group 2 coaching was given after the 3rd paper only. Comparison of pupils’ scores on the tests revealed that familiarisation, or practice, had no impact on mean scores, but the 3 hours of coaching led to a small but significant gain of 5 points. Both groups then continued with the school’s usual coaching practice over a period of 9 months before taking the final 2, actual 11+ papers. The scores obtained on the final 2 papers showed that this continued coaching had resulted in more substantial gains, with many pupils doubling their scores over this period of time.
An earlier study conducted in Northern Ireland by Egan and Bunting (1991) compared a number of children from 6 classes in 2 Belfast Schools (where the selective system was operative) with a similar number of children from Craigavon (where selection does not take place). The assumption was made that the Belfast group would have been coached for a period of 9 months or more, and the Craigavon group would not. The researchers used Raven’s Progressive Matrices Test, a test of non-verbal reasoning ability, to match the children in terms of ability. Indications were that the pupils with higher ability benefited most from coaching, but that even when ability was accounted for, most children could double their scores through the period of sustained coaching offered in schools in areas where selection takes place. There are also indications that a plateau is not reached with coaching as pupils in Belfast continued to make gains in the 2 weeks between papers, which the Craigavon pupils did not. The significant gains in scores attributed to coaching would have meant that, regardless of ability measured by the Raven’s Progressive Matrices, none of the pupils in Craigavon, without coaching, would have achieved a score in the top 70% required for selection into grammar school.
Egan and Bunting make the point therefore that, unless you can ensure equal effectiveness or access to coaching, then you cannot make assumptions about ability based on the verbal reasoning tests. Northern Ireland differs from Buckinghamshire in the expectation that pupils are coached in school prior to the 11+, thus eliminating some of the potential for bias. However, it is still generally reported that a large proportion of pupils seek coaching and further practice outside of school (Caul, 2000), and if there is evidence to suggest that coaching effects do not reach a plateau, this external coaching is likely to have an impact on outcomes for some pupils.
In Buckinghamshire, if we accept that coaching can have an impact, then the potential for bias is greater, in that, as all coaching takes place outside of schools, we have no indication of the extent, nature and distribution of this practice. The BBC commissioned a survey of parents of Grammar School pupils in the UK, which we can assume to include some Buckinghamshire schools, and found 68.8% of those responding had given their children extra help in preparing for the 11+. Of course, we do not know the proportion of pupils who had help and did not achieve the required mark, and there may be other significant differences in terms of those who seek coaching. For example, Kenny (2002) conducted a large scale study in Australia where pupils also undertake selective examinations at Y6. She found that there was a strong positive relationship between IQ and those who were coached. When the effect of IQ was removed, the effect of coaching, although having some positive impact, fell short of being significant.
Other than the studies described, there are relatively few studies which look specifically at the impact of coaching on selection at 11+ in the UK, reflective of the fact that there are now few authorities still operating a selective system. The remainder of the research relates mainly to the Scholastic Aptitude Test (SAT) in America, or to aptitude tests used in personnel selection by Occupational Psychologists, both of which are taken by older candidates. Hausknecht et al (2007) carried out a useful meta-analysis of 50 studies of practice effects for tests of cognitive ability, which included Bunting and Mooney’s (2001) research. This allowed them to draw conclusions about the effects of coaching, which they defined as ‘instruction aimed at improving test scores considered to fall anywhere in the broad range between the two extremes of practice and instruction, entailing some combination of test familiarization, drill and practice with feedback, training in strategies for general test taking. . . and skill development exercises’. The improvement between 1st and 3rd administration of tests was found to be on average from the 50th to the 71st percentile. Mere repetition accounted for some but not all of the practice effects and there was strong evidence to support their hypothesis that practice effects (improvements in scores) were positively related to the amount of time the candidate had spent in coaching.
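To put the 50th-to-71st percentile figure into more familiar terms, it can be converted into a shift in standard deviations, assuming test scores are roughly normally distributed. The following is only an illustrative sketch: the normality assumption and the mean-100, SD-15 reporting scale are assumptions made here, not details given in the report.

```python
from statistics import NormalDist

def percentile_to_z(percentile: float) -> float:
    """Return the z-score for a percentile (0-100) under a standard normal curve."""
    return NormalDist().inv_cdf(percentile / 100.0)

def percentile_shift_in_sd(start: float, end: float) -> float:
    """Size of a move between two percentiles, in standard deviations."""
    return percentile_to_z(end) - percentile_to_z(start)

# Improvement reported between 1st and 3rd administration: 50th -> 71st percentile.
shift_sd = percentile_shift_in_sd(50, 71)

# ASSUMPTION: scores reported on a scale with standard deviation 15.
ASSUMED_SD = 15
shift_points = shift_sd * ASSUMED_SD

print(f"{shift_sd:.2f} SD, about {shift_points:.1f} points on the assumed scale")
```

Under these assumptions the move works out at a little over half a standard deviation, or roughly 8 points on such a scale.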
Another relevant finding from Hausknecht’s analysis was that coaching time did appear to be logarithmically related to score gains, with early coaching providing the most benefit and diminishing returns for further effort (Messick and Jungeblut, 1981). Question type and presentation also make a difference to coaching impact. When there were shorter response times for questions, the gains were smaller (Powers, 1986). Wing (1980), Brounsstein and Holahai (1987), and Powers and Rock (1999) all found practice effects were much bigger for item types that could be solved by applying specific rules, rather than those which tapped into general information, for example, verbal testing requiring the acquisition of new information, e.g. vocabulary. This is perhaps why others, e.g. Becker (1990), have found more of an impact of coaching on quantitative rather than verbal reasoning tests.
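The idea of a logarithmic relationship between coaching time and score gain can be illustrated with a toy model. In the sketch below, both the functional form `scale * ln(1 + hours)` and the value of `scale` are hypothetical, chosen only to show the shape of diminishing returns; they are not estimates taken from any of the studies cited.

```python
import math

def modelled_gain(hours: float, scale: float = 5.0) -> float:
    """Hypothetical score gain after `hours` of coaching: scale * ln(1 + hours)."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return scale * math.log1p(hours)

def marginal_gains(hour_marks, scale: float = 5.0):
    """Extra gain earned in each successive interval between the given hour marks."""
    gains = [modelled_gain(h, scale) for h in hour_marks]
    return [later - earlier for earlier, later in zip(gains, gains[1:])]

# Compare the extra gain from each successive block of 10 hours.
hour_marks = [0, 10, 20, 30, 40]
for start, extra in zip(hour_marks, marginal_gains(hour_marks)):
    print(f"hours {start:>2}-{start + 10:>2}: +{extra:.1f} points")
```

Each later block of ten hours adds less than the one before it, while the total gain never quite stops growing. That is consistent with both the "diminishing returns" finding and the suggestion elsewhere in the review that a hard plateau may not be reached.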
Although the evidence is not conclusive in terms of importance of familiarisation, long term or plateau effect of coaching, and relationship between ability and coaching, there is sufficient evidence that coaching has an impact on tests of ability in order to support views such as Cole (1982) who states coaching can affect the construct validity of a test. This is true if pupils are not getting better at verbal reasoning, but are getting better at tests that attempt to measure verbal reasoning ability. The impact on selection is that there are pupils who have been coached to improve scores to a level whereby they gain entry to a school, where the pace of learning does not match their ability to learn. In addition, pupils who have not had the same opportunity to improve their scores through coaching, usually those in lower socio-economic groups, do not gain entry to grammar school. If this is the case, then it would seem that research into the impact of coaching on the Bucks VRT would be important to promote equal opportunities for pupils in Buckinghamshire.
Bucks CC: Possibilities for future research in Buckinghamshire wrote:
The issue of coaching and the 11+ is clearly very sensitive in Buckinghamshire, and research would have to be considered very carefully in terms of ethics and possible implications of outcomes. The research could be carried out, either externally or by the Buckinghamshire Educational Psychology Service, with resource and financial implications.
Ideally, an experiment would take place in which children aged 10-11 were randomly assigned to two groups, with baseline measure taken. One of the groups would then be coached specifically in techniques and content of the Bucks VRT, and the other either not coached, or given some other form of coaching, not related to the Bucks VRT. The impact of this could then be assessed by the groups’ performance on the two Bucks VRT tests.
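The design described above (random assignment, a baseline measure, then a comparison of outcomes) can be sketched in a few lines. This is a minimal illustration only: the function names, the pupil identifiers and all the scores are invented for the example, and a real analysis would also need a significance test and checks on group balance.

```python
import random
from statistics import mean

def assign_groups(pupil_ids, seed=0):
    """Randomly split pupils into two groups of (near-)equal size."""
    rng = random.Random(seed)
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_gain(group, baseline, final):
    """Average change from baseline score to final score for one group."""
    return mean(final[p] - baseline[p] for p in group)

def coaching_effect(coached, comparison, baseline, final):
    """Difference between the two groups' mean gains."""
    return mean_gain(coached, baseline, final) - mean_gain(comparison, baseline, final)

# INVENTED illustrative data: every pupil starts at 100; coached pupils are
# given a gain of 5 and comparison pupils a gain of 2, purely to show the arithmetic.
pupils = [f"pupil_{i}" for i in range(1, 9)]
coached, comparison = assign_groups(pupils, seed=42)
baseline = {p: 100 for p in pupils}
final = {p: 100 + (5 if p in coached else 2) for p in pupils}

effect = coaching_effect(coached, comparison, baseline, final)
print(f"Difference in mean gain: {effect:.1f} points")
```

Comparing gains rather than raw final scores means any chance difference between the groups at baseline is not counted as a coaching effect.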
The difficulty of carrying out such a study in Buckinghamshire, aside from ethical implications, would be knowing whether/how much coaching pupils were having outside of that included in the study, which would impact on results. Ideally, the study would be carried out outside of Buckinghamshire, where there are no implications for selection, but finding pupils to participate in a study which would involve such a lot of time investment for no real purpose (for those involved in coaching) could be problematic.
Another option may be a comparison study like that of Egan and Bunting, where schools in Buckinghamshire were matched with schools well outside the area, who would not be taking the Bucks VRT. The hypothesis would be made that a large number of pupils in Buckinghamshire are coached for the 11+, and that no pupils outside of the area would receive this type of coaching, and that this coaching would have the effect of increasing scores on the 11+. Selected schools from outside the area would give pupils the same familiarisation pack prior to them taking the two 11+ papers that the Buckinghamshire pupils would be taking. If scores of Buckinghamshire pupils were significantly higher than the comparison group, when other factors such as ability, socio-economic status, and achievement were taken into consideration, then this would be evidence to support the hypothesis.
A further option may be a smaller scale study within Buckinghamshire, where pupils in schools in which typically few pupils achieve the required score to gain places in grammar schools are given coaching sessions leading up to the 11+, or are given coaching for a shorter period of time between paper 1 and paper 2.
Alternatively, for one year only, all schools could be given the opportunity to provide coaching in between papers 1 and 2 and the effects of this could be observed.
Other non-experimental studies could make use of the data already available, for example comparing 11+ scores with CAT and SAT scores for pupils, and analysing this by school and area to see whether there are any indications of the impact of coaching. Surveys of parents and/or pupils similar to the BBC survey could be carried out to determine the nature and extent of coaching, and if this was found to differ between schools and areas, there would be opportunity to look at correlations with scores, when other factors had been taken into consideration. However, the reliability of information about coaching gained in this way is likely to be questionable, as it is unlikely that many parents will be willing to respond, with no incentive to do so. Also, any correlations observed by these methods would probably warrant further investigation before leading to a change in policy.
Becker, B. (1990) Coaching for the SAT: Further Synthesis and Appraisal. Review of Educational Research, Vol 60(3), 373-417
Bunting, B. & Mooney, E. (2001) The Effects of Practice and Coaching on Test Results for Educational Selection at Eleven Years of Age. Educational Psychology, Vol 21(3)
Caul, L, McWilliams, S & Eason, G. (2000) Coaching for the Transfer Procedure: perspectives and perceptions Paper presented at the British Educational Research Association Conference, Cardiff University, 7-10 September 2000
Cole, N (1982) The implications of coaching for ability testing. In Wigdor, A and Garner, W. R. (Eds) Ability Testing: Uses, Consequences, and Controversies, Part 2. Washington, D.C.: National Academy Press
Egan, M. & Bunting, B. (1991) The Effects of Coaching on 11+ Scores British Journal of Educational Psychology, 61, 85-91, 1991
Hausknecht, J, Halpert, J, Di Paolo, N & Moriarty Gerrard, M (2007) Retesting in Selection: A Meta-Analysis of Coaching and Practice Effects for Tests of Cognitive Ability. Journal of Applied Psychology, 92(2), 373-385
Kenny, D. T. (2002). To coach or not to coach: But what is the question? Online International Confederation of Principals’ Journal, September Issue.
Messick, S., & Jungeblut, A. (1981) Time and method in coaching for the SAT. Psychological Bulletin, 89, 191-216.
Powers, D. E. (1986) Relations of test item characteristics to test preparation/test practice effects: A quantitative summary. Psychological Bulletin, 100, 67-77
Powers, D.E., & Rock, D.A. (1999). Effects of coaching on SAT I: Reasoning Test scores. Journal of Educational Measurement, 36, 93-118.
Strand, S (2004) Consistency in reasoning test scores over time. British Journal of Educational Psychology (2004), 74, 617-631
Vernon, P. E. (ed) (1957) Secondary School Selection: A British Psychological Society Enquiry. London Methuen
Whetton, C (1995) Verbal Reasoning tests. In T.Husen & N Postlethwaite (Eds) International encyclopaedia of education (pp. 526-528) Oxford: Pergamon Press