Theme in discourse: 'thematic progression' and 'method of development' re-evaluated

3 Previous text-counting research on TP/MOD


The relative paucity of text-counting studies on TP/MOD may be a result of theoretical difficulties over analysing Theme. Hasan and Fries (1995) discuss some of these and conclude that "both [Theme's] definition and its recognition criteria stand in need of further clarification" (Hasan and Fries 1995:xxxviii-xxxix). The continuing debate about the correct placement of the boundary between Theme and Rheme, for example (carried on explicitly in Downing 1991, Ravelli 1995 and Berry 1996, and implicitly in Fries 1981, Hawes and Thomas 1996 and 1997, and Mauranen 1996), has practical implications for the operationalisations of Theme on which text-counting research into TP/MOD must be based.

Of the two concepts, text-counting work appears to have been done explicitly on TP rather than on MOD. This may arise from (a) the general conflation of the concepts mentioned earlier and (b) the expense and difficulty of setting up psychological testing procedures to identify MOD and non-MOD texts. Overall, research has taken the form of addressing how TP is instantiated in texts. Assuming some kind of regular relation between TP and MOD, the larger issue of whether, and if so how much, MOD is instantiated in text does not appear to have been explicitly addressed.

In considering the research on TP, I shall begin by attempting to answer the question "What predictions would Fries's (1981) hypotheses lead us to make about counts of TP in textual data?" and then compare these predictions with actual findings. Overall, it appears to me that Fries's hypothesis predicts two broad kinds of text: narrative texts structured by the mechanism of a pattern of Constant TP and non-narrative texts structured by the mechanism of a pattern of Linear TP. This prediction can usefully be broken down into three sub-predictions concerning related issues: (a) correlation between TP-type and genre, (b) patterning or sequences of homogeneous TP, and (c) the sufficiency of the TP typology. A fourth prediction – and the one of most interest to composition teachers – relates to the issue mentioned earlier of whether MOD’s status is descriptive or prescriptive (i.e. whether MOD is conceived as an unmarked textual norm or as a rhetorical ideal). For convenience's sake I shall call this (d) the quality prediction. The discussion of previous research which follows is organised around each of these predictions in turn. In each case, the prediction is stated in its most extreme form. Given that we are dealing with texts, which are not ‘rule-bound’, it is immediately unlikely that the predictions will be borne out in the way that they are formulated here. However, expressing them in simple, unhedged form has the advantage of making it clearer precisely what issues are involved and what kind of evidence is needed as a basis for reliable claims.


Prediction: In narrative texts all or most sentences will be found to have Constant TP, while in argumentative and expository texts all or most sentences will be found to have Linear TP.

Fries cites work by Enkvist (1978), who analysed fifteen sample text segments from different literary and academic texts in terms of global proportions of the various TP types, and found "two major stylistic poles": a 'static' style in which (translated into Fries’s terms) Constant TPs predominate over Linear TPs, exemplified by a Hemingway novel extract, and a 'dynamic' style in which (again translated into Fries’s terms) Linear TPs predominate over Constant TPs, exemplified by a social science article. Dubois (1987) analysed TP in independent clauses in an academic conference paper and found Linear TPs were far more common than Constant TPs. Both Francis (1990) and Gomez (1994) found Constant TPs to predominate in news stories, a narrative genre. Fries (1995b) analysed TP in three kinds of texts: obituaries, narratives, and an expository text. Overall he found that Constant TPs predominate in narrative texts (obituaries and narratives for children) and that Linear TPs predominate in descriptive sections of narrative texts and in expository texts. Constant TPs predominated in narratives for adults but less so than in narratives for children. Hawes and Thomas (1996) compared TP in the editorials of two British newspapers, The Sun and The Times. While Linear TP occurred at similar levels in each newspaper, Constant TP occurred twice as often in Sun editorials as in Times editorials. Ventola and Mauranen (1991) found similar proportions of both Linear and Constant TP in social science journal articles written in English by native English and Finnish speakers.

Overall, then, research suggests that the prediction overstates the case but that it is correct to some extent: narrative texts are likely to have greater proportions of Constant TPs than of Linear TPs; in expository and argumentative texts these relative proportions are likely to be either equal or reversed. The proportion of Derived TPs, however, is unpredicted: this is discussed further under the sufficiency prediction below.


Prediction: In texts perceived as well-structured the constituent sentences will be found to have the same type of TP. Thus there will be patterning formed by sequences of homogeneous TP.

Fries (1981:9) offers as evidence of a "strong correlation" between the TP within a paragraph and its "perceived structure" two TP example text-segments. In the text exemplifying Linear TP, Fries notes that only one of the six sentences deviates from this TP. (Fries illustrates Constant and Linear patterning but does not discuss whether it is possible for sequences of Derived TP to form patterns: I discuss this issue below in reporting my own research (§4).)

The strongest evidence in support of the patterning prediction is the findings of Gomez (1994), who looked at TP in radio news broadcasts and found a "very smooth" single type (Constant) was the norm. Researchers who looked for patterned sequences in samples from other genres had different findings. Francis (1990) analysed TP in three different text types found in newspapers – News, Editorials, Letters. In News she found Constant TPs "common…but by no means universal". In Editorials and Letters it was "difficult to see a theme-rheme pattern emerging, even at paragraph level" (1990:70). Hawes and Thomas (1997) counted the length of sequences of TPs of a single type in texts by NNES learners of English of differing proficiency levels. They report that the Advanced learners were less than half as likely as Lower Intermediates to use sequences of three or more progressions of a single type. It is worth noting in addition that most researchers have sought to distinguish between TPs which referred to the previous sentence and those which referred to an earlier sentence, categorising the latter as 'non-contiguous progressions' (Dubois 1987), 'thematic jumps' (Mauranen 1996), 'gaps' (Hawes and Thomas 1997) or 'skips' (Fries 1995b).

Overall, these findings suggest that rather than being either a discourse norm or a discourse ideal, homogeneity of TP may be associated with narrative texts, in particular oral narratives, and texts by lower proficiency learners.
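
The kind of count involved in the patterning prediction can be made concrete with a minimal sketch. It assumes that each sentence-to-sentence progression in a text has already been hand-coded with one TP label; the label names, the function name homogeneous_runs and the sample sequence are all hypothetical illustrations, not data or procedures taken from any of the studies cited. The threshold of three follows the "three or more progressions of a single type" reported by Hawes and Thomas (1997).

```python
from itertools import groupby

def homogeneous_runs(tp_labels, min_length=3):
    """Return (label, length) for every unbroken run of identical
    TP labels that is at least min_length progressions long."""
    runs = []
    for label, group in groupby(tp_labels):
        length = sum(1 for _ in group)  # size of this unbroken run
        if length >= min_length:
            runs.append((label, length))
    return runs

# Invented coding of twelve progressions, for illustration only.
labels = [
    "constant", "constant", "constant", "linear", "derived",
    "linear", "linear", "break",
    "constant", "constant", "constant", "constant",
]

print(homogeneous_runs(labels))
```

On this reading, a text fully conforming to the patterning prediction would yield a single run covering all of its progressions, whereas the findings reported above suggest that in most genres such runs are short or absent.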


Prediction: In all texts, all sentences will be found to fit one of the TP types.

In practice, the degree of fit between Fries’s adopted TP typology and the data varies according to the analytical practice of the researcher. The biggest areas of variation are in the treatment of (i) Derived TP and (ii) non-Thematic Progression.

As already noted, Fries (1981) contains no exemplar text for Derived TP and has little to say about its structural implications or its association with genre. Both Dubois (1987) and Hawes and Thomas (1996) counted Derived TPs and noted that the hyper-Theme from which such Themes derived could lack a textual exponent in the previous text and need to be inferred. Hawes and Thomas (1996) found this category of TP extremely common: Derived TP was the main TP type in every Times editorial and half of the Sun editorials. Hawes and Thomas (1997) also found a correlation between ESL proficiency and Derived TP: Derived TPs constituted a third of the TPs in Advanced learners' writing but only 5% in Lower Intermediate learners' writing. Ventola and Mauranen (1991) found that journal articles by native English speakers contained Derived TPs while similar articles in English by native Finnish speakers did not. Mauranen (1996), by contrast, decided that Derived TP was a synoptic category and that in terms of a dynamic analysis all TPs could effectively be categorised as Constant or Linear. All these findings are from argumentative genres, so the issue of whether Derived TPs also appear in narrative genres has not yet been addressed. Perhaps the safest summary of these findings is that where Derived TP was measured it was found to be associated with either style or English proficiency rather than with genre.

Several researchers encountered TPs which could not be accounted for in terms of Fries's typology. Dubois (1987) suggested the term ‘unrecoverable’ TP. Ventola and Mauranen (1991) classed TPs which did not fit as ‘unmotivated’ and found 25% were in this class. These nearly all occurred paragraph-initially, which might offer Fries’s theory the missing structural correlate of Derived TP. Mauranen (1996) found 'unmotivated' TPs rare in articles by native-speaker writers in either English or Finnish but not uncommon in articles by non-native English speaking Finns writing in English: her conclusion was that such TPs are characteristic of deviant, lower proficiency English. Hawes and Thomas (1996) classed TPs which do not fit the typology as breaks. Overall, 31% of the TPs in Sun editorials and 15% of the TPs in Times editorials were classed as breaks. In Hawes and Thomas’s (1997) ESL learner texts there was considerable variation, but for most groups breaks accounted for at least around a quarter of all progressions. Unfortunately, they do not record whether these breaks correlated with paragraph boundaries.

Many researchers (including Enkvist 1974, Mauranen 1996, Hawes and Thomas 1996, and Cloran 1995) noted another kind of omission in the TP typology: there were occasions when a Theme had no cohesive link with previous text but there was such a link in the Rheme – what we might class as Rhematic Progression (hereafter RP). However, counts of RP, where recorded, are unreliable because RP was considered only as a fall-back option when no TP could be found. In fact, it is possible that the same sentence could contain both a TP and an RP. As Enkvist (1974) noted, to avoid the in-built Theme bias of TP analysis one would need to carry out two independent text analyses. While Mauranen (1996) considers RP in her data as deviance from native English-speaker norms, Cloran (1995), looking at mother-child dialogues, reaches an almost opposite conclusion: she argues that TPs and RPs mark different kinds of relation in a hierarchy of text structure, with TPs indicating embedding and RPs expansion. This seems a similar claim in nature, though quite different in detail, to Fries's claim about the correlation between different TP types and subordination/coordination.

Overall, then, there is widespread evidence that Fries’s TP typology is insufficient to cover the numerically significant kinds of Theme-Rheme progression observable in argumentative texts.


Prediction: Texts conforming to the correlations hypothesised in the patterning and genre predictions are more likely to attract favourable judgements of quality than texts deviating from these correlations.

Of the studies cited, only Mauranen’s (1996) appears to offer indirect confirmation of this prediction. Overall, only three studies looked at texts written by non-native-English speakers and in those cases only non-narrative texts were analysed. In the absence of analyses of narrative texts by NNES writers, and given that most researchers found considerably more complexity and variation in the TP in non-narrative texts by NES writers than did Mauranen, the bulk of the research discussed cannot be regarded as confirming the quality prediction.


The genre prediction is partly supported in that the research suggests that there is something of a correlation between Constant TP and narrative segments of text. Such text is likely to have a higher proportion of Constant TPs and more extended sequences of Constant TP than non-narrative text. The predicted correlation of non-narrative and Linear TP is much less supported. The patterning prediction is little supported: extended sequences of homogeneous TP of any type are rare. The sufficiency prediction is also little supported: Derived TPs and sentences without recognisable TP are much more common than predicted. Finally, there is little evidence either way for the quality prediction.

Before leaving this discussion of previous research, it ought to be noted that theoretical differences mean that the studies considered are not entirely comparable. The commonest unit of analysis was the independent clause, with initial subordinate clauses treated as thematic (Gomez (1994) alone counting the themes of subordinate clauses). How Theme extent was determined varied more: Enkvist (1974) considered everything prior to the main verb as thematic, Mauranen (1996) everything up to the end of the subject, while Hawes and Thomas (1996; 1997) included the subject in the case of “adjunct-only” Themes. Francis (1990) adhered to Fries’s (1981) guidelines.

