About a century after Mark Twain wrote "The Adventures of Tom Sawyer", and with the emergence of the new discipline of sociolinguistics, it was claimed (Trudgill, 1995) that language has a 'clue-bearing' function: the way a person uses language reveals his/her background (social status, level of education, occupation, ethnic or regional belonging…).
This paper applies a sociolinguistic approach to "The Adventures of Tom Sawyer". It analyses the linguistic features (phonological, morpho-syntactic and lexical) of the characters' speech in the novel. It further investigates the 'clue-bearing' function of this speech, i.e. whether there is a correlation between the way Twain's characters use language and their social background, and concludes with Twain's motivation for using such linguistic variation.
Key Words: Sociolinguistic investigation, linguistic variation, dialects, social factors, language use, "The Adventures of Tom Sawyer", characterisation.
Sociolinguistics, which investigates the interrelationship between language use and society, is a relatively new discipline: it developed in the last quarter of the 20th century, after the relationship between language and society had been 'ignored' by linguistic studies for the sake of theoretical advances (Trudgill, 1995). This paper applies some sociolinguistic investigation tools to a literary work, "The Adventures of Tom Sawyer" by Mark Twain, which appeared in the last quarter of the 19th century, i.e. a century before the relationship between language and society was recognised as a field worth investigating.
After presenting the sociolinguistic framework, the literary background to the study and the method of work followed in this study, the paper proceeds to a quantitative analysis of the linguistic features in the speech of the novel's characters; it subsequently tries to find a correlation between the characters' speech and their social background; finally, it tries to show that the author was aware of this correlation and used it as a means of characterisation.
1. Background and Analytic Tools
This study is situated at the crossroads of literature and sociolinguistics as it applies a sociolinguistic approach to a literary work. This part explains the relevance of these two disciplines to the present work. The first section introduces the sociolinguistic background serving as the framework for the study; the second section presents factors that motivated the choice of "The Adventures of Tom Sawyer" as a case study; and the third section outlines the method adopted in the investigation.
1.1. Sociolinguistic Background
There is an important debate in linguistic studies about whether language should be studied as a closed system, with a focus on its internal structure, or as an open system interacting with social factors. In what follows, the main arguments of this debate are briefly presented, together with the relevance of both approaches to the present study. Subsequently, the focus shifts to a specific sociolinguistic concept used as the framework for this paper, namely the 'clue-bearing' function of language.
1.1.1. Linguistics vs. Sociolinguistics
Sociolinguistics is introduced by Trudgill as "a relatively new sub-discipline within linguistics" (1995: 20). It may be considered a new discipline because it deals with a relationship that was 'ignored' (in Trudgill's terms) in earlier linguistic studies, namely the relationship between language and society. Since the emergence of sociolinguistics, the debate between linguists has centred on whether language should be studied as a closed or as an open system. The major arguments of these two approaches are presented below, as both will be used in this paper.
On the one hand, theoretical linguists perceive language as a closed system that should be studied for its own sake. For them, emphasis should be put on studying the underlying structure of the linguistic system and "the concern of the theoretical linguist is to devise a theory of grammar" (Radford, 1997:5).
In order to achieve this goal, differences between speakers have to be overlooked. In this sense, "linguistic theory is concerned primarily with an ideal speaker-listener in a completely homogeneous speech community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors" (Chomsky, 1965: 3-4).
Chomsky holds that what should be studied is linguistic competence (I-language or Internal-Language), which is present in the mental systems of all native speakers in a speech community and is independent of the way they actually use language, i.e. performance (E-language or External-Language). In order to study linguistic competence, theoretical linguists need to idealise, that is to "ignore certain facts at certain times in the interest of articulating the theory" (Culicover, 1997: 10). According to the principle of idealisation, the linguist treats "all native speakers as exactly the same" and assumes that "there is some stable and well-defined knowledge of language in the mind of the native speaker" (Ibid: 11).
This theoretical framework will be adopted in the first stage of the corpus analysis [2.1], which treats all the characters' speech as homogeneous data and identifies the phonological, morpho-syntactic and lexical features that characterise it.
On the other hand, sociolinguists approach language as an open system interacting with a variety of factors. According to sociolinguists, "since speech is (obviously) social behaviour, to study it without reference to society would be like studying courtship behaviour without relating the behaviour of one partner to that of another" (Hudson, 1980: 3).
It can therefore be deduced from Hudson's simile that the relationship (or, more accurately, interrelationship, as each one affects and is affected by the other) between language and society is so close that the one cannot be studied without reference to the other. Sociolinguists assume that language is "a very variable phenomenon" (Trudgill, 1995: 20) and not a simple code used in the same manner by all speakers. They argue that language should be studied in its social and cultural contexts; in this sense, the focus is not on the linguistic system itself, but on the interaction of the system with a set of factors related, on the one hand, to language users (such as their sex, age, origin, social class, occupation…), and, on the other hand, to language use (context, situation, style…).
This sociolinguistic framework will be used in the second stage of the corpus analysis [2.2], which looks for a possible correlation between the characters' use of language and their social backgrounds. This analysis is based on an important sociolinguistic assumption, namely that language provides clues about its speakers' background (cf. the 'clue-bearing' function [1.1.2]).
The use of these two frameworks in this paper is motivated by a belief in the legitimacy and the complementarity of the two approaches. In fact, each side recognises both ways of approaching language but focuses on one aspect or the other. On the one hand, theoretical linguists recognise the relationship between language and society (cf. Chomsky's performance). They simply 'ignore' it in order to gain a deeper insight into the system itself, and their concentration on competence was "a necessary simplification that led to several theoretical advances" (Trudgill, 1995: 20). On the other hand, sociolinguists recognise the importance of studying language as a system, as Saville-Troike stated while introducing the ethnography of communication: "The ethnography of communication takes language first and foremost as a socially situated cultural form, while recognising the necessity to analyse the code itself and the cognitive processes of its speakers and hearers" (1982: 3-4, cited in Fasold, 1990: 60).
1.1.2. The 'Clue-bearing' Function of Language
According to sociolinguists, the way people use language may reveal information about their social background. Trudgill (1995) speaks about two aspects of language behaviour that have a social dimension; the first is the role of language in establishing relationships between people and the second is the role of language in conveying information about people's background.
This second function, which Trudgill calls the 'clue-bearing' role of language (1995: 2), is the concern of this paper. According to Trudgill, when a person speaks, s/he uses linguistic clues which allow the hearer to guess his/her background, i.e. social class, regional origin, level of education, occupation, etc. These linguistic clues include pronunciation, accent, vocabulary and sentence structure. Together, these features make up a linguistic code or a linguistic variety used by a group of people sharing some social variables (same class, same ethnic background, same studies, same occupation…).
In the same direction, Hudson (1980: 197) affirms that "people use speech in order to identify the particular social group to which they belong". This means that the focus of study is not on what people say i.e. the content of the message, but rather on the way they say it i.e. the form of the message. In this sense, linguistic features are considered as 'markers' (Chambers, 1995: 48) and people may be classified into groups according to the markers found in their speech. This paper will apply such an approach to the characters of Mark Twain's novel, "The Adventures of Tom Sawyer", trying to classify the characters of this literary work according to their way of speaking. The next section will introduce the novel and deal with the factors that have motivated its choice for the application of such a sociolinguistic approach.
1.2. Literary Background
This section introduces the novel under study and presents the reasons that motivated its choice and made it suitable for the application of this sociolinguistic approach.
1.2.1. Presentation of the novel
As its title indicates, the novel deals with the adventures of Tom Sawyer, a prototypically mischievous boy who hates school and looks for treasure. In all his adventures, Tom is accompanied by other characters with whom he interacts linguistically: he lives with his aunt Polly, his half-brother Sid and his cousin Mary; he goes with his friend Huck to the graveyard where they witness the murder of Dr. Robinson by Injun Joe; he falls in love with Becky Thatcher, and has an adventure with her when they get lost in the cave; he runs away with his friends Joe Harper and Huck Finn to an island to become pirates; and he discovers a treasure with Huck.
In addition to the adventures of the main character, the novel depicts St. Petersburg, the small Mississippi River town where these adventures take place. Although the title of the novel bears the name of the main character, i.e. Tom Sawyer, the author introduces a variety of characters belonging to different social backgrounds (the wealthy, the poor, ethnic minorities, old and young people…) and describes their beliefs, their superstitions and their social interactions.
1.2.2. Reasons for the choice of "The Adventures of Tom Sawyer"
A major reason that motivated the choice of Twain's novel as a case study is the characters' use of Direct Speech. Short (1996) argues that when presenting the speech of characters, authors have a variety of choices available to them, ranging from Direct Speech to Indirect Speech. The two differ in their linguistic form, as Direct Speech is enclosed in inverted commas whereas Indirect Speech is presented through subordination. They also differ in their effects and functions, as "the words of Direct Speech are clearly those of the character concerned". In other words, what is said between the inverted commas is "unmediated by the reporter", whereas the words of Indirect Speech "belong to the narrator" (Short, 1996: 288-9). Direct Speech occupies an important place in "The Adventures of Tom Sawyer": 22,437 words are devoted to characters' utterances out of 71,733 words for the whole text (about a third of the novel).
While opting for Direct Speech, authors may make their characters use dialects with the aim of creating "individual voices for particular characters in novels or plays" (Short, 1996: 82). Short further argues that the notion of dialect should not be restricted to geographical location and should be extended to social-class dialects. In this sense, characters are made to use different linguistic codes, which reflect a social stratification that the author wants to convey. Twain is one of those authors who opted for the use of dialects by characters.
VanSpanckeren (1994: 78) argues that "Twain was the first major author to come from the interior country, and he captured its distinctive humorous slang and iconoclasm". She further states that "Twain's style, based on vigorous, realistic, colloquial American speech, gave American writers a new appreciation of their national voice". In fact, many critics comment on the high level of accuracy of Mark Twain in recording various dialects making it possible "to present his characters in a truthful light to the reader in a language that is both vivid and clear at the same time".1 In the same direction, Fowler (1996) argues that the use of non-standard language by characters may be contrasted with the standard voice of the narrator presenting thus different world views. Fowler speaks about 'plural texts' containing "a mix of registers, dialects, and sociolects woven together", (Fowler, 1996: 197). According to him, the reader cannot remain indifferent to this plurality of voices, but has to work out the different associations that exist between different codes and world views.
"The Adventures of Tom Sawyer" is a good example of a plural text as a contrast is made on the one hand between the variety used by the narrator (having the features of Standard English) and the dialect used by the characters (having the features of spoken English); and on the other hand, between the different varieties used by the characters which may be classified into sub-varieties determined by various social factors. The analysis of the corpus [Section 2] will follow this classification; first, the features of the code used by characters will be defined in comparison with the standard variety used by the narrator. Second, the variety used by the characters will be studied more thoroughly by correlating sub-varieties with social factors.
The second major factor that motivated the choice of this novel is that many critics consider Twain a realistic writer (cf. High, 1986; Gerber, 1993; VanSpanckeren, 1994 and Wonham, 1996). Twain himself speaks about 'the native novelist' who has the ability to give an accurate description of the nation's experience: "its soul, its life, its speech, its thought" (Twain 1895, cited in Wonham, 1996: 1). Wonham (1996: 1) argues that "Literary creativity, according to Twain, depends on the unconscious accumulation of local knowledge, for the writer is ultimately less a creator than an 'Observer of Peoples'". In this sense, the writer is a 'regional specialist' who observes his nation, shares its life and reports it. Wonham further argues that Twain's best writing is that in which he dealt with village life.
Following this realistic dimension, Twain claims in the preface to "The Adventures of Tom Sawyer" (p. 1) that the events really occurred and that the characters are inspired by people he knew: "Huck Finn is drawn from life; Tom Sawyer also, but not from an individual -- he is a combination of the characteristics of three boys whom I knew, and therefore belongs to the composite order of architecture". The analysis of the corpus builds on this claim by the author. If events and characters are drawn from life, it is likely that the language spoken by the characters is also drawn from life, hence its authenticity and its suitability for a sociolinguistic approach.
All these factors, i.e. the large number of characters, the linguistic varieties used in the novel as Direct Speech, the plurality of the text and the realistic dimension claimed about the novel have motivated the choice of "The Adventures of Tom Sawyer" as a piece of fiction that is worth studying from a sociolinguistic perspective.
The study is based on a corpus consisting of all the direct speech utterances produced by characters in the novel and adopts a quantitative approach: all the linguistic features displaying language varieties are extracted and studied statistically. The analysis of the corpus adopts the methodological framework suggested by Short (1996: 105). Within this framework, the first step of the analysis shows how the dialect used in the novel varies from standard written English by defining the phonological, morpho-syntactic and lexical features of the linguistic code used by Twain's characters as opposed to the code used by the narrator. Since automatic processing may be more exhaustive than manual extraction, the investigation is based on an electronic version of the novel.
First of all, the utterances of the characters are extracted into a separate document ('Character.doc'), which makes it possible to compute the ratio of words produced by characters to the whole text. Second, 'Character.doc' is submitted to a concordancer in order to build a vocabulary list with the exact frequency of each word.
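The two steps just described can be approximated in a few lines of code. The following is a minimal sketch, not the tool used in the study: it assumes that direct speech is delimited by straight double quotation marks in the electronic text, and the function names (`extract_direct_speech`, `tokenize`, `vocabulary_list`) and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def extract_direct_speech(text):
    """Return every passage enclosed in straight double quotation marks."""
    return re.findall(r'"([^"]*)"', text)


def tokenize(text):
    """Lower-case word tokens; internal apostrophes are kept (e.g. ain't)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text.lower())


def vocabulary_list(utterances):
    """Build a word-frequency list, as a concordancer would."""
    counts = Counter()
    for utterance in utterances:
        counts.update(tokenize(utterance))
    return counts


# Hypothetical sample, not a quotation from the novel.
sample = 'Tom said, "It ain\'t any use, Huck." Huck replied, "I reckon it ain\'t."'

utterances = extract_direct_speech(sample)
vocab = vocabulary_list(utterances)

# Ratio of words in direct speech to words in the whole text.
speech_ratio = sum(vocab.values()) / len(tokenize(sample))

print(vocab.most_common(3))
print(f"{speech_ratio:.2f}")
```

A real text would need extra care for typographic quotation marks and for utterances that continue across paragraphs, which this sketch does not handle.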
Phonological features may be extracted from the vocabulary list, as Twain tried to transcribe certain words as they were pronounced, e.g. 'feller' instead of 'fellow' (from which it can be deduced that the standard back-closing diphthong is phonetically realised as the weak short central vowel).
Lexical features can also be extracted from the vocabulary list of the document; the procedure followed is to check the meaning of all the words which may seem unfamiliar to a contemporary English-speaking person. The words extracted belong to two categories: some are entered in the dictionary2 as slang words, e.g. 'goner' (person or thing in desperate straits); others do not exist in the dictionary, in which case the help of a native speaker of American English was needed to establish the meaning of the word (although it was sometimes possible to guess the meaning from the context).
As far as morpho-syntactic features are concerned, recourse to the vocabulary list is not very helpful. This type of analysis deals, most of the time, with the structure of phrases and clauses, i.e. the relationship between words, whereas in the vocabulary list words are entered separately. It was therefore necessary to go through the document manually in order to extract those features.
The second step of analysis suggested by Short consists in determining whether there is any variation in the text with respect to dialect. This proved to be the case in the corpus under study, as the vocabulary list shows that for some items two forms exist, for example 'becuz' and 'because', 'afeard' and 'afraid'. The second stage of analysis therefore focuses on finding the reasons for this variation. For this, it is necessary to link speech utterances to their origins, i.e. the characters producing them, and to look for a possible correlation between this variation and extra-linguistic factors.
2. The Analysis of the Corpus
As stated above, this analysis will follow two steps; the first one consists in a general analysis of the characters' utterances in order to discover the linguistic features that characterise their speech. A further step consists in classifying these linguistic utterances according to characters in order to look for a possible correlation between the social status of a character and his/her use of the linguistic code.
2.1. Linguistic Analysis of Characters' Utterances
This section presents the linguistic features displayed in the characters' speech, classified in three categories, namely phonological, morpho-syntactic and lexical. In this section, the principle of idealisation [cf. 1.1.1] will be applied. This methodological principle, which is a major foundation of theoretical linguistic research, makes it possible to treat all the characters of the novel as sharing the same linguistic features. In other words, throughout this section, any linguistic differences that may exist between characters will be ignored. This is considered necessary as it is not possible to study linguistic variation without first describing the features that characterise the linguistic code used.
2.1.1. Phonological features
Some phonological features are related to vowels, some are related to consonants and others to connected speech. The following tables display those features along with their frequency in the corpus:
- Features related to vowels:
Number of occurrences in the corpus
Table 1: Phonological features related to vowels
Features related to consonants:
Number of occurrences in the corpus
Table 2: Phonological features related to consonants
Features related to connected speech:
Three different types of features related to connected speech may be noticed in the characters' utterances:
Table 3: Phonological features related to connected speech
2.1.2. Morpho-syntactic features
The following table displays the morpho-syntactic features that are found in the characters' speech with their number of occurrences in the corpus:
Number of occurrences
Absence of agreement between subject and verb
A third person singular subject is used with a non-inflected verb:
- It don't hurt any more
- He come along one day
Table 4: Morpho-syntactic features
2.1.3. Lexical features
The most regular lexical pattern is the use of non-standard forms of the verbs 'to have' and 'to be':
- ain't (121 occurrences) is used as the present negative form of 'to be' in its different morphological realisations:
o In its third person singular form (is not): Why, that ain't a-going to do any good
o In its first person plural, second person and third person plural form (are not):
We ain't doing any harm
But you ain't too warm now, though
They ain't going to hurt us
o In its first person singular (am not): "Blame it, I ain't going to stir him much."
- warn't (16 occurrences) is used as the past negative form of 'to be' in its different morphological realisations:
o In its first person and third person singular form (was not):
You'd be always into that sugar if I warn't watching you."
He warn't any more responsible than a colt.
o In its second person and third person plural form (were not):
you could have come over and give me a hint some way that you warn't dead, but only run off.
Those fellows warn't likely to come again
- hain't (7 occurrences) is used as the present negative form of 'to have':
o In its third person singular form (has not):
but then he hain't ever done anything to hurt anybody.
o In its other forms (have not):
"No, I hain't. But Bob Tanner did."
Now, Tom, hain't you always ben friendly to me?
In addition to this regular pattern, it has been possible to extract from the vocabulary list of the document lexical items used by the characters and not found in Modern Standard English. Some of these items are entered in the dictionary as slang words and some other colloquial items are not entered in the dictionary. The following table displays those lexical features:
Words entered in the dictionary as slang words
Words not entered in the dictionary
Table 5: Slang words in the corpus
The following table displays the frequency of the colloquial features found in the utterances of Twain's characters:
Table 6: Distribution of colloquial features in the corpus.
Two remarks may be drawn from this classification:
The phonological features in this corpus seem to be the most productive representing 48% of the total number of linguistic features against 21% for morpho-syntactic features and 31% for lexical features as the following figure shows:
This may be accounted for by the fact that the corpus consists of spoken language, so it seems natural to find phonological processes facilitating the articulation of speech (e.g. assimilation, contraction, shortening of vowels…).
The second remark concerns the ratio of colloquial features to the total amount of characters' utterances: the 745 items displaying colloquial features represent only 3% of the total amount of speech by characters (22,437 words). Although Twain made his characters use colloquial language, he tried to make his text understandable to any reader of Standard English. This explains why the narrator sometimes intervenes to explain a character's utterance, as when Aunt Polly says: "He'll play hookey this evening" (p. 18), and the narrator states in a note: "Southwestern for 'afternoon'". This implies that Twain was not only trying to reproduce the speech of the society he depicted for the sake of realism but was also using linguistic variation for other purposes. This hypothesis will be investigated in the next section, which will show that Twain used more than one linguistic variety and that the use of language is determined by the characters' social status.
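The two proportions quoted in this paper (direct speech as a share of the novel, and colloquial items as a share of direct speech) follow directly from the reported word counts. The short check below uses only figures given in the text; the variable names are illustrative.

```python
# Word counts reported in this paper.
total_words = 71_733       # whole novel
speech_words = 22_437      # characters' direct speech
colloquial_items = 745     # items displaying colloquial features

# Share of the novel devoted to direct speech (about a third).
speech_share = speech_words / total_words

# Share of direct speech made up of colloquial items (about 3%).
colloquial_share = colloquial_items / speech_words

print(f"direct speech: {speech_share:.1%} of the novel")
print(f"colloquial items: {colloquial_share:.1%} of direct speech")
```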
2.2. Linguistic Variation in Characters' Speech
This section tries to investigate whether there are differences between the characters of the novel as far as their use of language is concerned. The goal of this investigation is to look for a possible correlation between the characters' social background and their use of language.
It should be noted, first of all, that many of the linguistic features described above are common to nearly all the characters regardless of their age, sex or social background. The following utterances by a variety of characters illustrate different linguistic features:
Ah, there ain't many left, now, that's got hope enough, or strength enough, either, to go on searching
Use of the non-standard form of 'to be'
Table 7: Colloquial features in the corpus displayed by a variety of characters.
However, a closer look at the characters' utterances reveals that in many instances, the standard forms of some items exist along with the non-standard forms listed in the previous section. The following table displays those instances:
Number of occurrences
Number of occurrences
Table 8: Co-existence of standard and non-standard forms of the same expression
Consequently, a classification of the different linguistic forms according to their source may be useful to find an underlying pattern. Such a classification becomes more informative if, in addition to the source of the utterance, the addressee to whom it is said is taken into account. The following tables give three examples of these linguistic differences classified according to these two parameters, i.e. addresser and addressee.
Example 1: because vs. becuz
Because (26 occurrences)
Becuz (13 occurrences)
Table 9: because vs. becuz
The table shows that the standard form is twice as frequent as the non-standard form. What seems interesting is that the distribution of the two forms is socially significant: the standard form 'because' is used by respectable characters belonging to a higher social class (Aunt Polly, Judge Thatcher, the Welshman). The non-standard form is used by Huck Finn, the town drunkard's son, who lives alone and has not received any education. It can therefore be deduced that the form 'becuz' is a low-class marker.
The table also shows that Tom, the main character of the novel, uses the two forms, though unevenly (18 standard forms vs. 2 non-standard ones). The fact that Tom uses the standard form can be explained by his social belonging, which determines the way he talks and the forms he is expected to use. Tom uses the non-standard form only when he speaks with Huck (2 instances). Huck was introduced by Twain as "cordially hated and dreaded by all the mothers of the town because he was idle and vulgar and bad, and because all their children admired him so, and delighted in his forbidden society, and wished they dared to be like him. Tom was like the rest of the respectable boys, in that he envied his gaudy outcast condition and was under strict orders not to play with him." ("The Adventures of Tom Sawyer": 70). In this way, the use of the non-standard form by Tom when he speaks to Huck is a form of linguistic convergence motivated by Tom's strong desire to be like Huck, i.e. Tom tries to adjust his way of speaking to that of Huck in order to sound similar to him.
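The classification by addresser and addressee used in this section amounts to a cross-tabulation of (speaker, addressee, form) records. The sketch below illustrates the idea only: the records are invented toy data, not counts from the novel, and the function names are hypothetical.

```python
from collections import Counter

# Invented toy records (speaker, addressee, form); NOT counts from the novel.
observations = [
    ("Tom", "Aunt Polly", "because"),
    ("Tom", "Huck", "becuz"),
    ("Tom", "Huck", "because"),
    ("Huck", "Tom", "becuz"),
    ("Huck", "Tom", "becuz"),
    ("Judge Thatcher", "Tom", "because"),
]


def cross_tabulate(records):
    """Count each (speaker, addressee, form) combination."""
    return Counter(records)


def forms_by_speaker(records):
    """For each speaker, count how often each form is used."""
    result = {}
    for speaker, _addressee, form in records:
        result.setdefault(speaker, Counter())[form] += 1
    return result


table = cross_tabulate(observations)
per_speaker = forms_by_speaker(observations)

for speaker, forms in per_speaker.items():
    print(speaker, dict(forms))
```

Keeping the addressee in each record is what allows a pattern such as convergence (a speaker shifting form depending on the interlocutor) to show up in the counts.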
However, this is not applicable to all situations where an interaction between Tom and Huck takes place; in some instances, Tom sticks to the use of the standard form especially when he wants to show Huck that he has more knowledge than him. This can be clearly seen in the following two examples:
1. Huck: "What's the reason he don't know it?"
Tom: "Because he'd just got that whack when Injun Joe done it. D'you reckon he could see anything? D'you reckon he knowed anything?" (p. 110)
2. Tom: "No, they think they will, but they generally forget the marks, or else they die. Anyway, it lays there a long time and gets rusty; and by and by somebody finds an old yellow paper that tells how to find the marks -- a paper that's got to be ciphered over about a week because it's mostly signs and hy'roglyphics."
Huck: "Hyro -- which?"
Tom: "Hy'roglyphics -- pictures and things, you know, that don't seem to mean anything." (p. 229)
Example 2: afeard vs. afraid
Afeard (8 occurrences)
Afraid (10 occurrences)
One of the villagers
Table 10: Afeard vs. Afraid
The same pattern appears in this example, in which the standard form is used by people from the higher class and the non-standard form by Huck and by Tom while speaking to his mates (Huck and Ben). The variation in this example may also be explained by the age factor: while adults use the standard form, the boys use the non-standard one. As in the previous example, Tom uses both forms: while using the non-standard form with his mates, he uses the standard one with Becky. This may be explained by the fact that she is a girl, resulting in linguistic variation according to the sex of the addressee (gender factor), or by the fact that she is the judge's daughter, in which case Tom's code-switching is a form of linguistic convergence (social class factor), or by a combination of both, added to Tom's desire to impress Becky and to gain her admiration. This indicates that Tom is aware of the importance of the situation in his choice of linguistic forms.
Example 3: knowed vs. knew
knowed (7 occurrences)
knew (2 occurrences)
Table 11: Knowed vs. Knew
The same pattern appears again in this example, where the non-standard form is used by Huck, who represents the lower social class.
Other instances of characters' speech confirm this pattern; for example, the form 'stiddy' is used only by Huck, 'weepon' by Muff Potter, the town drunk, and 'feller' by Huck, Potter and Tom, whereas 'fellow' is used by the Welshman and some villagers.
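The occurrence counts given for the three examples can be compared directly by computing, for each pair, the share of the non-standard form. The sketch below uses only the totals reported in this section; the helper name `nonstandard_share` is illustrative.

```python
# (standard count, non-standard count) as reported for Examples 1-3.
variant_counts = {
    ("because", "becuz"): (26, 13),
    ("afraid", "afeard"): (10, 8),
    ("knew", "knowed"): (2, 7),
}


def nonstandard_share(standard, nonstandard):
    """Proportion of all occurrences that use the non-standard form."""
    return nonstandard / (standard + nonstandard)


shares = {
    pair: nonstandard_share(*counts)
    for pair, counts in variant_counts.items()
}

for (standard, nonstandard), share in shares.items():
    print(f"{nonstandard} vs. {standard}: {share:.0%} non-standard")
```

On these totals the non-standard form accounts for about a third of the because/becuz occurrences but for more than three quarters of the knew/knowed occurrences, so the balance between standard and non-standard forms differs from item to item.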
In addition, some linguistic features are used exclusively by Jim, who represents the class of black slaves:
Jim: "Can't, Mars Tom. Ole missis, she tole me I got to go an' git dis water an' not stop foolin' roun' wid anybody. She say she spec' Mars Tom gwine to ax me to whitewash, an' so she tole me go 'long an' 'tend to my own business -- she 'lowed she'd 'tend to de whitewashin'." (p. 28)
This utterance by Jim illustrates two important features which characterise the variety of English used by black slaves in the 19th-century society depicted by Twain: 1) the use of the voiced alveolar stop instead of the voiced dental fricative ('this' pronounced 'dis'), and 2) the use of the alveolar nasal instead of the velar nasal in words ending in '-ing' ('foolin'', 'whitewashin''). These two features are still present in African American Vernacular English (AAVE), as reported by Trudgill (1995: 51-52).
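Both features can be counted automatically in Twain's respellings. The following is a minimal sketch applied to Jim's utterance quoted above; the list of respelled function words (`TH_STOPPING_FORMS`) is a hypothetical, non-exhaustive assumption, and the pattern for '-in'' simply looks for a word ending in "in" followed by an apostrophe.

```python
import re

# Jim's utterance as quoted in this paper (p. 28).
utterance = (
    "Can't, Mars Tom. Ole missis, she tole me I got to go an' git dis water "
    "an' not stop foolin' roun' wid anybody. She say she spec' Mars Tom gwine "
    "to ax me to whitewash, an' so she tole me go 'long an' 'tend to my own "
    "business -- she 'lowed she'd 'tend to de whitewashin'."
)

# Hypothetical, non-exhaustive list of respellings with d for th.
TH_STOPPING_FORMS = {"dis", "dat", "de", "dey", "dem", "wid"}


def th_stopping_tokens(text):
    """Tokens whose spelling shows a stop in place of the dental fricative."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t in TH_STOPPING_FORMS]


def g_dropping_tokens(text):
    """Tokens spelled with final -in' where the standard spelling has -ing."""
    return re.findall(r"\b[a-z]+in'", text.lower())


print(th_stopping_tokens(utterance))
print(g_dropping_tokens(utterance))
```

Matching whole tokens against a list avoids false hits inside other words (for example the "d" of "she'd"), at the cost of missing respellings that are not in the list.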
2.3. Linguistic Variation as a Means of Characterisation
The first conclusion that can be drawn from the two previous sections is that all characters share some linguistic features that give the reader an idea about the dialect spoken in the South-western society as depicted by Twain in his novel.
The analysis has also shown that this society is not linguistically homogeneous, i.e. although characters share some linguistic features, they diverge on others. The examples analysed above made it clear that the following social parameters determine the way characters use language:
Social class: there is a distinction between the speech of the majority of characters, who represent a 'higher' class, and the speech of the lower class represented by Muff Potter, the drunkard, and Huck Finn, the son of the town drunkard.
Ethnicity: Three ethnicities are represented by the characters of the novel: the whites (including Mr. Jones, the Welshman), the Indians represented by Injun Joe, and the blacks represented by Jim. The variety spoken by the black slave is quite distinct, as was shown above. As for Injun Joe, his use of language is not really distinctive of him or of the class he represents, as the morpho-syntactic features of his speech are found in other characters' speech. What characterises Injun Joe's speech, however, is the register of revenge he uses all through the novel and in his relationship with other characters, as the following examples show:
- Injun Joe (speaking to Dr. Robinson): "Five years ago you drove me away from your father's kitchen one night, when I come to ask for something to eat, and you said I warn't there for any good; and when I swore I'd get even with you if it took a hundred years, your father had me jailed for a vagrant. Did you think I'd forget? The Injun blood ain't in me for nothing. And now I've got you, and you got to settle, you know!" (p. 106)
- Injun Joe (speaking about Widow Douglas): "Kill? Who said anything about killing? I would kill him if he was here; but not her. When you want to get revenge on a woman you don't kill her -- bosh! you go for her looks. You slit her nostrils -- you notch her ears like a sow!" (p. 268)
The use of this register of violence and revenge in Injun Joe's speech is an important means of characterising him as the antagonist of the novel. In doing so, the author presents a stereotypical image of Indians prevalent at that time in South-western American society, i.e. as people who value revenge and bodily mutilation. This image is confirmed by the content of the characters' utterances, as when Joe himself says: "Did you think I'd forget? The Injun blood ain't in me for nothing", or when the Welshman says to Huck: "It's all plain enough, now. When you talked about notching ears and slitting noses I judged that that was your own embellishment, because white men don't take that sort of revenge. But an Injun! That's a different matter altogether." (p. 276)
Age: It has been shown in example 2 above that age is an important factor of linguistic variation as some non-standard forms are used exclusively by the young whereas the standard counterparts are used by the adults.
Situation: In nearly all the examples analysed above, the main character, Tom, who quantitatively has the largest amount of speech, is found to use both the standard and the non-standard forms of the same item. Tom shifts from one code to another according to the addressee because he is aware of the importance of context in the choice of the appropriate linguistic form. In sociolinguistic terms, it could be said that Tom identifies with different speech communities and that he experiences what some sociolinguists call 'shifting identity', a notion used to explain the causes of code switching; as Diana Boxer (2002: 3) puts it: "linguistic choices have to do with underlying and shifting identities".
In addition to code switching according to the addressee, Tom also switches from one form to another with the same addressee, i.e. Huck, as was shown in example 1. As has been mentioned, Tom envies Huck and wants to resemble him, and this linguistic convergence is a means of doing that. However, Tom is at the same time different from Huck: he goes to school, he reads books and he knows many things that Huck does not know; this may explain his use of the standard forms with Huck in some instances. Tom's personality evolves throughout the novel, as by the end we find him trying to convince Huck to live under the widow's protection. As High (1986: 81) argues: "Although there are many similarities between Tom and Huck, there are also important differences. Twain studies the psychology of his characters very carefully. Tom is very romantic. His view of life comes from books about knights in the Middle Ages… By the end of the novel, we can see Tom growing up. Soon, he will become a part of the adult world. Huck, however, is a real outsider".
So far, the sociolinguistic approach has shown that the linguistic variation in the characters' utterances is determined by social parameters such as class, ethnicity, age and context and that a correlation exists between the characters' social background and their use of language. However, those characters are not autonomous in the sense that their use of specific linguistic features is less a matter of their own choice than the result of a skilled work by the author who wanted to depict a specific image of society at the time of his writing. Two conclusions may be drawn from Twain's use of linguistic variation:
The first one is that Twain was not only aware that social factors determine the way people use language but also recognised language varieties as 'legitimate' and gave them voice in his novel. In this respect, Twain was going against the prescriptive trend that was prevalent in 19th-century language studies, which held that the rules of grammar dictate how people should use language instead of describing how they actually use it. From this perspective, any linguistic forms that did not conform to these purist rules were considered 'deviant', 'incorrect' and not worth studying. In a sense, Twain challenged this prescriptive view by including in his novel colloquial language which would be considered incorrect by purist grammarians. Twain even expresses his view openly in the novel through Huck, when Tom corrects the form he uses for the word 'girl', as can be seen in the following example:
"Tom, I reckon they're all alike. They'll all comb a body. Now you better think 'bout this awhile. I tell you better. What's the name of the gal?"
"It ain't a gal at all -- it's a girl."
"It's all the same, I reckon; some says gal, some says girl -- both's right, like enough. Anyway, what's her name, Tom?" (p. 232)
The second conclusion is that this awareness on the part of Twain allowed him to use linguistic variation as a means of characterisation; for in addition to the description of the characters given by the narrator, he uses, for each character, a linguistic code that helps portray the social belonging of that specific character. This portrayal of characters is an indicator of Twain's "class-consciousness", as Triki (2003) puts it, arguing further that "the physical, psychological as well as mental/ ideological profiles of a given character are part and parcel of the overall strategy of portraying that character" (Triki, 2003: 28). What Twain did in his novel is use language varieties to represent the social groups of the society he wanted to depict. This has led such critics as Holmgren to think that characterisation is the primary goal of Twain's use of language varieties: "Although Twain had a very good ear, he uses dialect variations principally for characterization and only secondarily for linguistic authenticity" (Holmgren 1986:72, cited in Fernando Romeu, 1998).
3. Conclusion
The starting point of this paper was to apply a sociolinguistic approach to "The Adventures of Tom Sawyer" by Mark Twain in order to find a possible correlation between the social background of the novel's characters and the way Twain makes them use language. The linguistic analysis, carried out in this paper after presenting the sociolinguistic and the literary backgrounds, has identified some phonological, morpho-syntactic and lexical features of the dialect used by Twain's characters, taken as a sample of South-western American society in the second half of the 19th century. Further investigation has shown that Twain's characters' use of language is determined by social factors such as class, ethnicity, age and context. This has led to the conclusion that Twain was aware of the influence of external factors on language use.
It is this awareness on the part of Twain that made him use linguistic variation as a means of characterisation. This raises a very important issue: can the characters' speech still function as a means of characterisation in another language, i.e. how do translations of the novel deal with this linguistic variation, and what is the impact of this on the literary value of the novel?3
About the Author
Dr Sellami-Baklouti works in the Faculty of Letters and Humanities, University of Sfax, Tunisia.
Acknowledgement
The author would like to thank Dr. Mounir Guirat for reading a first draft of this paper. Special thanks are also due to Pr. Mounir Triki for his valuable comments on a first version of this work.
- Primary Source:
"The Adventures of Tom Sawyer", on-line version, University of Virginia, Charlottesville, Va. Oxford Text Archive, http://etext.lib.virginia.edu/modeng/modeng0.browse.html, 1995
Boxer, D. (2002): Applying Sociolinguistics. John Benjamins Publishing Company.
Chambers, J. (1995): Sociolinguistic Theory. Oxford: Blackwell.
Chomsky, N. (1965): Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Classic Notes on Tom Sawyer: www.gradesaver.com/ClassicNotes/Titles/tomsawyer/
Culicover, P. W. (1997): Principles and Parameters: An Introduction to Syntactic Theory, Oxford Textbooks in Linguistics.
Fasold, R. (1990): The Sociolinguistics of Language, Oxford: Blackwell.
Fowler, R. (1996): Linguistic Criticism, Oxford University Press.
Gerber, J. C. (1993): The Mark Twain Encyclopedia, edited by J. R. LeMaster and James D. Wilson, Garland Publishing, Inc., New York and London, pp. 12-15.
High, Peter B. (1986): An Outline of American Literature; London: Longman.
Hudson, R. (1980): Sociolinguistics, Cambridge: Cambridge University Press.
Radford, A. (1997): Syntax: A Minimalist Introduction, Cambridge University Press.
Radford, A. (1999): Syntactic Theory and the Structure of English: A Minimalist Approach; Cambridge Textbooks in Linguistics.
Romeu, F. (1998): The Translation of American Varieties in Mark Twain's Adventures of Huckleberry Finn, (site)
Short, M. (1996): Exploring the Language of Poems, Plays and Prose; London: Longman.
Triki, M. (2003): Unveiling the Representations of Class in Literary Discourse, in Re-Reading Class, edited by Hedi Sioud, Sousse: Imprimerie Officielle.
Trudgill, P. (1995): Sociolinguistics, Penguin.
VanSpanckeren, K. (1994): American Literature, United States Information Agency.
Wonham, H. B. (1996): "Mark Twain: America's Regional Original" in U.S. Society and Values, USIA Electronic Journals, Vol. 1, No. 10.
1 (Classic Notes on Tom Sawyer: www.gradesaver.com/ClassicNotes/Titles/tomsawyer/).
2 Two dictionaries were consulted in this study: The Oxford Current Dictionary for Advanced Learners (OUP, 1974) and The New Lexicon Webster's Dictionary of the English Language (Lexicon Publications, 1992).
3 An attempt to discuss these issues can be seen in Sellami-Baklouti, A. & LeJosne, J-C. (2004): Translating Language Varieties in Literary Texts: Fiction and Drama, in Proceedings of the 5th Tunisia-Japan Symposium on Culture, Science and Technology, Sfax, May 2004.