Bad science = the end of phonics.

Although educational research was in its infancy in the early decades of the twentieth century, there was sufficient concern and curiosity among both teachers and universities about the efficacy of whole word methods versus phonics that a number of rudimentary studies were conducted. The majority of these studies compared phonic and non-phonic approaches to the teaching of reading and had little concern with the actual programmes followed. Many had small sample sizes and were conducted over short periods. Nonetheless, the results are enlightening. Currier and Duguid (1916), Buswell (1922) and Mosher and Newhall (1930) all found that children taught by phonic approaches were more accurate in their reading, especially when attempting unknown words, had superior comprehension and made fewer guesses at newly encountered words than those taught using look-and-say methods. They were, crucially, slower at reading and read with less apparent prosody. Speed and fluency were by now the well-established touchstones of ‘good reading’, no matter the age and stage of the emergent reader.

In 1924 Sexton and Herron conducted the ‘Newark Phonics Experiment’ in eight New Jersey schools, specifically investigating the value of phonics in the teaching of beginning reading. Uniquely, the sample sizes were relatively large at 220 and 244 pupils and the study took place over three years. To further control for differences in the quality of teaching, the research design had the same teacher alternate between phonic and non-phonic instruction of the groups over the period of the study. The results in the first two years indicated that children taught by non-phonic methods had made more progress in reading. However, by the end of the third year the group that had received phonics instruction outscored the other group on every test, with particularly dramatic divergence in spelling scores. The indications from these studies were that children taught using phonics methods learned to read more accurately and comprehended better than those taught using a look-and-say method…eventually. Having a longitudinal element built into the study was crucial.

The most influential study of the era was carried out by Gates at Columbia University (1928), who compared two samples of seven-year-olds: one taught by conventional phonics approaches and the other using a whole word approach with a focus on reading comprehension. The results were statistically inconclusive; however, Gates’ interpretation of the results did not reflect this.

In the test of phonemic awareness, the phonics group performed slightly better. Nevertheless, this was interpreted as a ‘moral victory for non-phonics methods’ (1927, p223) as the non-phonics group had been taught no specific phonics knowledge. In the word recognition test the results were very similar but were reported as the non-phonics group showing ‘superiority’ (1927, p223) when they encountered words they had previously learned. There were no differences in the word pronunciation tests or in the assessments designed to ascertain a child’s ability to see a word as a unit of its known parts, yet Gates’ (1927) analysis implies a difference, and his prejudice is evidenced by his use of positive comparative vocabulary. He states that the non-phonic training seemed to ‘sharpen perception’ and enable ‘rapid appraisal’, and that when these children made errors they made a ‘more detailed study’ (1927, p224). When describing the phonics group’s attempts at reading newly encountered words he describes their efforts pejoratively, as they ‘labor (sic) longer’ (1927, p224) before attempting the word. He fails to mention whether or not this added labour resulted in success.

Gates (1927) is most emphatic in his analysis where he assesses and compares the two groups’ ability in silent reading and comprehension, which he considers ‘the main objective of reading instruction’ (1927, p225). The non-phonics trained group showed ‘markedly superior attainments’ (1927, p225) and ‘a clear advantage…the non-phonics pupils were superior in silent reading by 35%...’ (1927, p225).

Despite the ambiguity of the research outcomes (except in silent reading) Gates is remarkably unequivocal when analysing the results of the studies:

‘That it will be the part of wisdom to curtail the use of phonics instruction in the first grade very greatly, is strongly implied; indeed, it is not improbable that it should be eliminated entirely.’ (1927, p226)

For one of the leading educational researchers from one of the most influential universities in the United States, writing in one of the most highly regarded journals, to draw such an emphatic conclusion, one would expect the study to be extremely robust. It was not. Firstly, the study had a sample size of twenty-five children in each group when the recommended minimum is thirty (Cohen and Manion, 2016). Secondly, the research only lasted for six months, with results after three months very similar to those at the end of the study. This period would have given the children in the phonics group sufficient knowledge to decode only words in the initial code and they would not have been close to automaticity for the vast majority of words. In contrast, children in the whole word group would have learned a number of whole words and could thus give the impression of fluency, although they guessed words far more regularly. Thirdly, and most crucially, the reading test was timed, with the total number of words read correctly being the arbiter of reading efficacy and not the percentage of words read accurately. Children decoding with phonics strategies would be far slower as they were still at a letter-by-letter decoding stage, hence Gates’ observation of their stuttering word attack strategies as opposed to the immediate and confident guessing of the non-phonics group.
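The difference between the two measures can be illustrated with a short sketch. The figures below are entirely hypothetical and are not taken from Gates’ data; they simply assume a slow, accurate decoder and a fast guesser sitting the same timed test, and the names used (TimedResult, best_by_total, best_by_accuracy) are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class TimedResult:
    """One child's result on a timed reading test (hypothetical numbers)."""
    label: str
    words_attempted: int
    words_correct: int

    @property
    def accuracy_pct(self) -> float:
        # Percentage of attempted words that were read correctly.
        return 100.0 * self.words_correct / self.words_attempted


def best_by_total(a: TimedResult, b: TimedResult) -> TimedResult:
    # Scoring by the raw count of words read correctly in the time allowed.
    return max((a, b), key=lambda r: r.words_correct)


def best_by_accuracy(a: TimedResult, b: TimedResult) -> TimedResult:
    # Scoring by the proportion of attempted words read correctly.
    return max((a, b), key=lambda r: r.accuracy_pct)


# Invented figures: the decoder attempts fewer words but rarely errs;
# the guesser attempts twice as many but misreads far more of them.
decoder = TimedResult("phonics, letter-by-letter decoding", 20, 19)
guesser = TimedResult("whole word, confident guessing", 40, 28)

for child in (decoder, guesser):
    print(f"{child.label}: {child.words_correct} correct, "
          f"{child.accuracy_pct:.0f}% accurate")

print("Best by total correct:", best_by_total(decoder, guesser).label)
print("Best by accuracy:", best_by_accuracy(decoder, guesser).label)
```

Under these assumed numbers the guesser comes out ahead on total words correct while the decoder comes out ahead on accuracy, which is why the choice of measure in a timed test matters so much.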

More damning is the unscientific, pejorative language used in the analysis of the phonics group’s word attack strategies, which suggests researcher partiality. The evidence for confirmation bias is strong. Gates (1928) was a staunch advocate of ‘word mastery’ (the word method of learning words by shape and sight) in reading instruction, recommending the use of flash cards, picture cards, picture dictionaries and word books as part of his programme of teaching and warning of the dangers of traditional phonics (Gray, 1929). He developed the concept of ‘intrinsic’ phonics, which was phonic analysis applied only after a word had been learned and only when the word could not be read from memory. This was far less formal than analytical phonics as there was an assumption that the phonetic structure of a word would be absorbed ‘intrinsically’ by the learner. Learners only received specific help where needed. He is unspecific as to when this approach would be necessary or where this help would stem from, as the teacher had been absolved of any responsibility to teach phonics in any form and was instead to concentrate entirely on word learning; the phonics would come naturally. Gray (1929) states that Gates’ pedagogical emphasis was on ‘fluency, fullness of comprehension and enjoyment of reading,’ (1929, p468) with an avoidance of the stuttering development of reading associated with traditional phonics approaches.

Gates’ research, article and subsequent book had a considerable influence on reading instruction in the United States. Gates’ position at Columbia University endowed him with significant authority, both in terms of research profile and instruction of trainee teachers, and his concepts gained traction over the century – he was inducted into The Reading Hall of Fame in 1978.

The use of phonic strategies was never denied, according to Terman and Walcutt (1958), and adherence to a variety of methods of early reading instruction, applied according to the needs of the specific child, was the ‘reasonable…open-minded’ (1958, p93) response. This ‘legend in the mythology’ (1958, p93), created by Gates, persisted despite the evidence that the myth contradicted the facts.

Nonetheless, the relegation of phonics to the periphery of initial reading instruction continued throughout the first part of the twentieth century, with more and more reading experts proselytising whole word recognition. Phonics was dealt another blow by Dolch and Bloomster’s research into phonic-readiness in 1937. The authors administered a phonics test to first and second grade children whose mental ages they had assessed and who had been instructed in phonics. The findings appeared conclusive: children with a mental age below seven scored almost nothing on the assessment despite the fact that they were able to recognise some words by sight. The authors concluded that children below the third grade were not mentally developed enough to apply the principles of phonics and should be taught words by sight only.

Dolch and Bloomster’s study (1937) contained a design flaw. There was no analysis of the phonics instruction that the participants had received, in terms of either its amount or its efficacy. They defined the phonics instruction received only in terms of generalisations about whether participants had received any instruction in how letters were sounded out. Thus, children could have been taught very little phonics and instructed in the phonics approach approved at the time: incidental phonics only after word-guessing had failed. The researchers did not attempt to instruct the participants in phonic decoding strategies relevant to their mental ages and therefore had no means of knowing whether these children had failed to retain what they had been taught or whether the instruction itself had been ineffective. By ensuring that phonics instruction was delayed by three years, the effect of the study was ultimately to ensure that phonics was only taught as a remedial technique (Terman and Walcutt, 1958).

It was not long before almost every reading expert and every book on reading instruction took one of two attitudes towards phonics instruction for early readers: it was either useless or it was detrimental.

This blog is number 12 in a series of blogs.

The Reading Ape

Nullius in verba
