Lost in parallel concordances

Ana Frankenberg-Garcia

Instituto Superior de Línguas e Administração, Lisbon

1. Introduction

Concordances extracted from monolingual corpora have been used in a variety of ways to promote second language learning. Parallel concordances have more typically been associated with translation studies, translator training, the development of bilingual lexicography and machine translation. Although the potential benefits of parallel concordances in second language learning have not been overlooked (for example, Roussel 1991, Barlow 2000 and Johansson & Hofland 2000), they have certainly been much less exploited than monolingual concordances.
This paper calls for a reflection on when and how parallel concordances might be used to enhance second language learning. It is centred on two main questions:
a. In what language learning situations might parallel concordances be beneficial?

b. How might language learners and teachers set about navigating through a parallel corpus?

Any attempt to answer the first question will inevitably rekindle the debate on the use of the first language in the second language classroom. Despite the growing belief that using the first language is not necessarily wrong, it is generally agreed that not every language learning situation calls for it. Given that parallel concordances encourage learners to compare mother tongue and target language, in what kind of setting and in what circumstances are they then appropriate?
How to navigate through a parallel corpus in second language learning is a question that must be posed if the fundamental structural difference between monolingual and parallel corpora is to be taken into account: while the former contemplate texts written in a single language, the latter look not only at two languages at the same time (L1 and L2) but also at two types of language (source texts and translations). In what situations is it relevant to distinguish between concordances extracted from L1 to L2 and ones extracted from L2 to L1? When are the differences between searching from source texts to translations and from translations back to source texts important? How do these four factors interact?
In this paper I shall concentrate on attempting to address these questions from the perspective of issues that have exclusively to do with parallel, as opposed to monolingual, concordances, and will ignore factors which are common to both types of concordances, such as the availability of a corpus, the representativeness of the corpus, the level of difficulty of the concordances, and the fact that, because concordances rely on a fairly sophisticated level of meta-awareness, learners should ideally be adults, literate and cognitively-oriented.

2. In what language learning situations might parallel concordances be useful?

Parallel concordances are based on translations and encourage learners to compare languages, normally their mother tongue and the language they are in the process of learning. It follows that it can only be appropriate to use parallel concordances when it is appropriate to use the first language in the second language classroom.
The idea of using the L1 is not novel. It was present in the grammar-translation method used for teaching Greek and Latin in the late eighteenth century, and this is how modern languages began to be taught in the nineteenth century. Considerable emphasis was placed on translation, and the L1 was often used to explain how the target language worked (Howatt 1984).
Modern approaches to language teaching have tipped the balance of instruction towards the target language. In doing so, while some approaches began to actively discourage the use of the L1, others took practically no notice of its existence (Atkinson 1987, Phillipson 1992). Probably the most influential and not entirely unreasonable argument behind this is the belief that the first language works against L2 fluency. In addition to this, there are a number of practical reasons for neglecting the L1: it wouldn’t work in multilingual classes, native speaker teachers might not know or might not know enough of their students’ L1, and many modern L2 teaching materials have been conceived for language learners in general rather than for learners of a single L1 background in particular.
In spite of these impediments to the use of the L1, there is a growing belief that it is not just there to impair L2 fluency, and that it can in fact be used productively in second language learning, provided that the bulk of instruction continues to be carried out in the target language. Atkinson’s (1993) book Teaching Monolingual Classes explores several different ways in which second language teachers can attempt to make the most of their students’ L1. Medgyes (1994) argues that knowledge of their students’ L1 is one of the most valuable assets second language teachers can have. For Barlow (2000:110), “learning a second language involves some use of first language schemas as templates for creation of schemas for the second language.” Cohen (2001) reports on evidence that despite ESL teachers’ general admonitions not to use the first language, learners continually resort to written or mental translation as a strategy for learning. There is also some evidence that the first language may actually contribute towards the development of a second language. Tomasello & Herron (1988, 1989), for example, report that a group of English-speaking learners of French learned more when the influence of English upon French was openly discussed in class than when instruction focused only on French.
In accordance with the above, it is believed parallel concordances can carve themselves a legitimate place in second language instruction, provided that they are used wisely. To discuss the circumstances under which parallel concordances might be beneficial, it is useful to distinguish between self-access and classroom use. Parallel concordances can be used for independent study when learners know what they want to say in the L1 and want to find out how to say it in the L2, or when they see something in the L2 and want to understand what it means in the L1. According to Barlow (2000:114), a parallel corpus is like an “on-line contextualized bilingual dictionary” that gives learners access to concentrated, natural examples of language usage. Parallel concordances can therefore be used during writing in a foreign language to complement bilingual and language production dictionaries. They can not only help learners find foreign words they don’t know, but they can also give them the contexts in which these words are appropriate. Moreover, parallel concordances can help learners come to terms with the fact that there are certain words in their L1 for which there are simply no direct translations available. When reading in a foreign language, learners might also find it useful to resort to parallel concordances to help them understand foreign words, meanings and grammar that they are unfamiliar with. Extracting concentrated examples of chunks of the foreign language that they don’t quite understand matched to equivalent forms in their mother tongue can help learners grasp what is going on in the L2. The main point here, however, is that when learners resort to concordances on a self-access basis, their queries are initiated by themselves (Aston 2001). This means that they are engaged¹ in looking for demonstrations of language use that might help them solve problems that are in the forefront of their minds. In this sense, learner-initiated concordances are likely to be meaningful, relevant and conducive to successful language learning.

The picture changes when it comes to using parallel concordances in the classroom, with a group of learners. It is self-evident that parallel concordances will work best with monolingual classes and with teachers who know their students’ L1. What is not so obvious is when it is appropriate to resort to them. The idea of looking at differences between L1 and L2 as a basis for teaching L2 is not novel: it was the main line of inquiry of the Contrastive Analysis Hypothesis (Lado 1957). The problem with Contrastive Analysis, however, is that not all differences between languages are relevant to L2 learning (Wardhaugh 1970, Odlin 1989). Moreover, even if there are language contrasts that are relevant, drawing attention to them might not be unconditionally helpful to all learners at all times. As Sharwood-Smith (1994:184) points out, “consciousness-raising techniques may be counterproductive where the insight has already been gained at a subconscious, intuitive level”. Language contrasts that are no longer or have never been a problem to learners could provoke overmonitoring and inhibit spontaneous performance. Indeed, those who defend L2-only approaches to language teaching would, in these circumstances, be right to affirm that the first language can undermine second language fluency.

Instead of presenting learners with L1-L2 contrasts that do not affect and could even be detrimental to their learning, Granger and Tribble (1996) propose that what is important are the differences between the learner’s interlanguage and the L2, which they call Contrastive Interlanguage Analysis. However, this does not mean to say that the idea of comparing L1 and L2 need be altogether abandoned. For Wardhaugh (1970), although L1-L2 differences might not be useful to predict errors, as originally proposed in the Contrastive Analysis Hypothesis, they do help to explain learner errors. Indeed, if you look at the L2 problems that students actually have, while it is true that not all of them have to do with their L1, it is also true that students who share the same native language often experience a significant number of second language problems that can be traced back to the influence of their first language. Lott (1983), for example, describes negative transfer errors that are common to Italian learners of English. Frankenberg-Garcia & Pina (1997) describe problems of crosslinguistic influence that are typical of Portuguese learners of English, which include not only negative transfer, but also the avoidance of transfer, whereby students avoid using perfectly acceptable English forms simply because they perceive them as being too Portuguese-like.
Problems of crosslinguistic influence like the ones described in the studies above can open the door to the use of parallel concordances in the second language classroom. Instead of drawing attention to language contrast per se, or predicting problems of language learning that may fail to materialize, parallel concordances can be brought to the classroom to help learners focus on real interlanguage problems that can be traced back to the first language.
The link between problems of crosslinguistic influence and the pedagogical use of parallel concordances seems to have been first established by Roussel (1991), who showed how French learners of English tend to have problems with English tonic auxiliaries and how parallel concordances could help sensitise these students to certain prosodic features of English. Following a similar line of thought, Johansson & Hofland (2000) report that the overuse of shall is a common error among Norwegian learners of English caused by the influence of Norwegian, and proceed to show how these learners can explore the English-Norwegian Parallel Corpus to find out that the etymologically equivalent Norwegian modal auxiliary skal does not always correspond to the English shall. Frankenberg-Garcia (2000) provides several further examples of Portuguese learners of English making inappropriate use of the prepositions with, in and of as a complement to certain English verbs and adjectives because of the influence of Portuguese, and proceeds to show how a parallel corpus can be a useful source of authentic data for exercises that will help these learners become aware of problematic areas such as this one, in which they tend to get the first and the second language mixed up.
I cannot stress enough, however, that before using parallel concordances in the classroom with a group of learners, it is important for teachers to find out, through observation, whether these learners are experiencing L2 problems that can be traced back to their L1. Parallel corpora enable us to access so many comparable facts of linguistic performance that it is easy to lose sight of the language contrasts that really matter, and to overburden learners with contrasts that bear no relation, and can even be detrimental, to their learning processes. Detecting negative transfer and other forms of crosslinguistic influence can help inform teachers which parallel concordances are likely to be pedagogically relevant to their students.

3. Navigating through a parallel corpus

When using parallel concordances in second language learning, it is not enough to know what language contrasts might be helpful to students. It is also important to consider how to focus on them, for unlike monolingual corpora, which deal with a single language, parallel corpora involve not only two languages – L1 and L2 – but also two types of language – source texts and translations. This means that it is possible for one to extract concordances taken from L1 to L2, or from L2 to L1, and from source texts (ST) to translations (TT) or from translations back to source texts. In other words, the following four types of parallel concordances are possible:

L1→L2 or L2→L1
ST→TT or TT→ST


Given these possibilities, one must ask: in what language learning situations is it relevant to distinguish between L1→L2 and L2→L1 concordances? In what language learning situations is it relevant to distinguish between ST→TT and TT→ST concordances? How do these factors combine?

3.1 L1→L2 or L2→L1 concordances?

When using parallel concordances for pedagogical purposes, the most basic choice that has to be made is deciding whether the starting point for a KWIC (key-word-in-context) search should be an L1 or an L2 term. If the aim of instruction is to promote the development of language production skills, it makes sense to use L1 search terms, which will render concordances in L1 aligned with L2 (L1→L2 concordances). This will enable learners to see how the meanings they formulate in L1 can be expressed in L2. Conversely, if the aim of instruction is to help learners with language reception skills, then the logical thing to do is to use L2 search expressions, which will produce L2 concordances aligned with L1 (L2→L1 concordances). This will enable learners to see how certain meanings they have seen in L2 translate into their L1.

Of course, it may be argued that the ultimate aim of instruction is to help learners with both language production and reception, and that for this reason it is important to look at both L1→L2 and L2→L1 parallel concordances. And indeed, this is an entirely reasonable argument when learners happen to experience the same types of difficulties at the level of language production and reception. False cognates, for instance, often have a negative impact on both language reception and production. Portuguese learners of English, for example, frequently assume that words like actually and actualmente, eventually and eventualmente, pretend and pretender and resume and resumir mean the same, and this causes them problems not only when speaking and writing, but also when listening and reading (Frankenberg-Garcia & Pina 1997). In cases such as these, I believe it is fine to use both L1→L2 and L2→L1 parallel concordances, as long as the problems of reception and production occur at the same time. As shown in figure 1, looking up actualmente will help these learners see that the equivalent in English can be rendered as present, nowadays, these days, now, and so on². Figure 2 shows that looking up actually can help these same learners find out that it is a word that translates into de resto, na verdade, or, most importantly, that it is a word that is often left out in Portuguese.

Figure 1

L1→L2 concordances for language production (query: “actualmente”)

Com os rendimentos que actualmente tenho, podia dar 10 000 libras por ano sem grande esforço.

I could afford ten thousand a year from my present income, without much pain.

Claro que actualmente tenho posses para mandar fazer camisas por medida, mas o ar snob dos camiseiros de Picadilly dissuade-me de lá entrar e as popelines às riscas expostas nas montras são demasiado afectadas para o meu gosto.

Of course, I could afford to have my shirts made to measure nowadays, but the snobby-looking shops around Picadilly where they do it put me off and the striped poplins in the windows are too prim for my taste.

O meu irmão mais novo, o Ken, emigrou para a Austrália no princípio dos anos 70, quando era mais fácil do que actualmente, e foi a melhor decisão que tomou na vida.

My young brother Ken emigrated to Australia in the early seventies, when it was easier than it is now, and never made a better decision in his life.

Figure 2

L2→L1 concordances for language reception (query: “actually”)

I actually went so far as to blindfold myself, with a sleeping mask British Airways gave me once on a flight from Los Angeles.

Fui, de resto, ao ponto de pôr uma venda nos olhos, que me deram em tempos num voo da British Airways vindo de Los Angeles.

«So you're actually making a positive contribution to the nation's trading balance?»

-- Então, você está na verdade fazendo uma contribuição positiva para a balança comercial do país.

Well, when I imagined them, I never saw myself as actually experiencing them later on.

Pois bem: nunca me vi ao fantasiá-las, como existindo-as mais tarde.

It is not always the case, however, that the problems that learners experience at the level of language reception are the same or occur at the same time as the ones they experience at the level of language production. Generally speaking, reception comes before production. Portuguese learners of English, for example, don’t seem to have much difficulty understanding the English words lose and miss. When producing the language, however, a common error is for them to say lose when they mean miss:

* I’m sorry I’m late. I lost the train.
This particular problem seems to stem from the fact that both concepts are normally expressed by a single Portuguese verb, perder. Looking up miss in the English to Portuguese direction of a parallel corpus would not tell learners what they need to know and neither would looking up lose. In both cases, the alignment results in Portuguese are bound to render perder. Figures 3 and 4 illustrate this.

Figure 3

L2→L1 concordances (query: “los.*”)

But somewhere, sometime, I lost it, the knack of just living, without being anxious and depressed.

Mas houve um momento, uma altura qualquer, em que perdi o treino de viver, viver apenas, sem andar ansioso nem deprimido.

I was rapidly losing faith in this hospital.

Eu estava perdendo rápido a confiança no hospital.

He savors his freedom but doesn't lose sight of his master.

Saboreia a liberdade, mas não perde o amo de vista.

Figure 4

L2→L1 concordances (query: “miss.*”)

I agreed enthusiastically, but I spent most of the flight home wondering what I'd missed.

Concordei entusiasticamente, mas passei a maior parte da viagem de regresso a pensar no que teria perdido.

I meant to catch the 4.40, but just missed it.

A minha intenção era apanhar o das 4.40, mas acabei de perdê-lo.

But he had found a guide, and didn't want to miss out on an opportunity.

Mas tinha encontrado um guia, e não ia perder esta oportunidade.

However, as shown in figure 5, looking up perder in the Portuguese to English direction renders results that can help learners notice the difference between lose and miss, which can help imprint the contrast in their minds.

Figure 5

L1→L2 concordances (query: “perd.*”)

Mas houve um momento, uma altura qualquer, em que perdi o treino de viver, viver apenas, sem andar ansioso nem deprimido.

But somewhere, sometime, I lost it, the knack of just living, without being anxious and depressed.

Passou uma hora, depois outra; a neve juntava-se nas dobras das roupas; perderam-se.

An hour passed, then another; snow gathered thickly in the folds of their clothes; they missed their road.

-- Ser adoptada e depois perder a mãe.

` To be adopted and then to lose your mother?

Second language problems that affect reception but not production are not as common, and detecting them is not as simple, for they do not always result in visible errors. Still, language reception problems can sometimes be spotted through reading comprehension and translation exercises, or during conversations, when communication breaks down. Whatever the problems learners of a given native language seem to have, what is important is to be aware that L1→L2 parallel concordances are different from L2→L1 parallel concordances, and that the two directions serve different purposes in language teaching. While L1→L2 concordances are more likely to enhance language production, L2→L1 concordances are better suited to improving language reception.
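
The difference between the two search directions can be made concrete with a small sketch. The code below is only an illustration and does not describe how COMPARA or any other concordancer is actually implemented: the function name parallel_concordance, the dictionary keys "pt" and "en", and the decision to match the query against whole word tokens are all assumptions made for the example. The three aligned pairs are taken from Figures 4 and 5.

```python
import re

# Three aligned sentence pairs copied from Figures 4 and 5 of this paper.
# A real parallel corpus would of course be loaded from aligned files.
PAIRS = [
    {"pt": "Mas houve um momento, uma altura qualquer, em que perdi o treino "
           "de viver, viver apenas, sem andar ansioso nem deprimido.",
     "en": "But somewhere, sometime, I lost it, the knack of just living, "
           "without being anxious and depressed."},
    {"pt": "Passou uma hora, depois outra; a neve juntava-se nas dobras das "
           "roupas; perderam-se.",
     "en": "An hour passed, then another; snow gathered thickly in the folds "
           "of their clothes; they missed their road."},
    {"pt": "A minha intenção era apanhar o das 4.40, mas acabei de perdê-lo.",
     "en": "I meant to catch the 4.40, but just missed it."},
]

# Assumption: a token is a run of word characters, optionally hyphenated
# (so that forms such as "perderam-se" count as a single token).
TOKEN = re.compile(r"\w+(?:-\w+)*")


def parallel_concordance(pairs, query, search_lang, aligned_lang):
    """Return (search side, aligned side) for every pair in which some
    token on the search side matches the whole of `query` (a regex)."""
    pattern = re.compile(query, re.IGNORECASE)
    hits = []
    for pair in pairs:
        tokens = TOKEN.findall(pair[search_lang])
        if any(pattern.fullmatch(token) for token in tokens):
            hits.append((pair[search_lang], pair[aligned_lang]))
    return hits


# L1 to L2 (production): one Portuguese query surfaces both lose and miss.
production = parallel_concordance(PAIRS, "perd.*", "pt", "en")

# L2 to L1 (reception): each English query leads back to forms of perder.
reception_lose = parallel_concordance(PAIRS, "los.*", "en", "pt")
reception_miss = parallel_concordance(PAIRS, "miss.*", "en", "pt")

for portuguese, english in production:
    print(portuguese, "|", english)
```

Searching from the Portuguese side returns all three pairs, so the learner sees lost and missed side by side, whereas each English query returns only the pairs containing that one verb and therefore cannot expose the contrast.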

3.2 ST→TT or TT→ST concordances?

Learners using parallel concordances are exposed to source texts on one side of the corpus and to translations on the other. This means that, in the same way as it is possible to extract concordances from L1 to L2 or from L2 to L1, it is also possible to present learners with parallel concordances taken from source texts to translations (ST→TT concordances), or from translations to source texts (TT→ST concordances).

In unidirectional parallel corpora, the relationship between these four factors is constant. If the learners’ L1 happens to be the language of the source texts, the L2 will have to be the language of the translations, and vice versa: if the L1 is the language of the translations, then the source texts will necessarily be in the L2. St John (2001) describes a case study of an English-speaking learner of German using the German-English INTERSECT corpus (Salkie 1995), where the source texts are in German and the translations in English. For this learner, the L1 side of the concordances consists of translations while the L2 side consists of source texts. For a German learner of English using the same corpus, the exact opposite would be the case.
For learners using bi-directional parallel corpora like COMPARA (Frankenberg-Garcia & Santos, forthcoming), CEXI (Zanettin, 2002) or the English-Norwegian part of the ENPC (Johansson et al, 1999), the part of the corpus in their L1 is made up of both translations and source texts. Conversely, the part of the corpus in their L2 also contains both translations and source texts. This means that when searching from L1 to L2, it is possible for these learners to work from translations to source texts, from source texts to translations, or even from both to both. And the same applies to the situations in which learners are working with concordances taken from L2 to L1, which may again consist of translations to source texts, source texts to translations or both to both. Given these possibilities, one must ask in what language learning situations it is relevant to distinguish between them.
It is a well documented fact in the literature that translational language is not quite the same as language which is not constrained by source texts from another language (for example, see Baker 1996). According to Gellerstam (1996), the differences between translational and non-translational language weigh against the use of parallel corpora in language learning. Indeed, exposing language learners to translational language can be problematic. If one looks at the distribution of the adverb already in COMPARA 1.6, only 35% of its occurrences come from texts originally written in English, whereas the remaining 65% of its occurrences come from English translated from the Portuguese. This suggests a need for the explicitation of already in translated English which is not present in English source texts. Portuguese learners of English, in turn, also tend to use the English adverb already in situations in which it is not required. You can often hear them say Have you already had lunch? when what they mean is simply Have you had lunch?. In other words, they might use already to ask whether or not lunch has taken place, without intending to convey the idea that it took place earlier than expected. This particular problem seems to stem from the fact that in Portuguese translation there is no grammatical difference between the two sentences. The Portuguese adverb já, which translates literally into the English adverb already, is used in both translations: Já almoçaste?

Presenting Portuguese learners of English who overuse already with parallel concordances containing this adverb in translated English is not such a good idea, for already appears a lot more frequently in translational English than in non-translational English. The concordances would not help the learners in question develop a feeling for the situations in which already might be left out.
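
A comparison of this kind can be sketched as follows. The example is purely illustrative: the function name occurrences_by_status and the two toy segments are invented, and they do not reproduce the COMPARA figures quoted above. The sketch reports the share of occurrences found in each part of the corpus, which is the kind of figure given for already, alongside a per-token rate, since raw shares also depend on how large the source-text and translation parts happen to be.

```python
import re
from collections import Counter


def occurrences_by_status(segments, word):
    """Count `word` in text segments labelled 'source' or 'translation'.

    `segments` is an iterable of (status, text) tuples. For each status
    the result gives the raw count, the share of all occurrences, and
    the rate per token.
    """
    counts = Counter()
    sizes = Counter()
    target = word.lower()
    for status, text in segments:
        tokens = re.findall(r"\w+", text.lower())
        sizes[status] += len(tokens)
        counts[status] += tokens.count(target)
    total = sum(counts.values())
    report = {}
    for status in sizes:
        report[status] = {
            "count": counts[status],
            "share": counts[status] / total if total else 0.0,
            "per_token": counts[status] / sizes[status] if sizes[status] else 0.0,
        }
    return report


# Invented toy segments, for illustration only.
SEGMENTS = [
    ("source", "Have you had lunch? I have already told him."),
    ("translation", "Have you already had lunch? He has already gone."),
]

report = occurrences_by_status(SEGMENTS, "already")
for status, figures in report.items():
    print(status, figures)
```

When the two parts of a corpus differ in size, the per-token rate is the safer basis for comparison.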

Having said this, the fact that parallel concordances expose learners to translational language does not necessarily mean that they cannot be used constructively. In fact, parallel concordances can (and should) be used in such a way that the translational/non-translational language distinction is put to good use. If there happens to be a need to shelter learners from translational renderings of the target language, one can restrict the L2 side of parallel concordances to source texts. This might be of consequence when parallel concordances are used to draw attention to elements that exist both in the L1 and the L2, but which occur more typically in only one of the languages, as in the case of the English adverb already and the Portuguese já. Figure 6 below illustrates how Portuguese-English TT→ST concordances can be used precisely to show Portuguese learners of English that they needn’t say already in English every time they mean já in Portuguese³.
Figure 6

TT(L1)→ST(L2) concordances used to shelter learners from translational L2 (query: “[Jj]á”)

Agora é a conferencista principal.

Now, she's Principal Lecturer.

Quando espreitei outra vez às 7.30 da manhã, se fora embora.

When I looked again at half-past seven this morning, he had gone.

Não acha que ele estudou muito, ficou nisso o dia inteiro, Sonny, ele deveria fechar os livros e ir dormir cedo.

Don't you think he's done enough, he's been at it all day, Sonny, he should close his books and have an early night.

Having translations in the L2 side of the corpus can in turn be useful to help learners come to grips with L1 terms, such as culturally-bound concepts, that are difficult to express in L2, or for which there are no straightforward L2 translations. Figure 7 shows how Portuguese-English ST→TT concordances can be used to help Portuguese learners of English describe the Brazilian carnival in English.

Figure 7

ST(L1)→TT(L2) concordances used to help learners with the translation of culturally-bound concepts (query: “carnaval.*”)

Parecia um sujeito vestido para um baile de carnaval dos anos 1920.

He looked like someone dressed for a Carnival dance in the 1920s.

Um sujeito de nome Áureo de Negromonte, «famoso carnavalesco e campeão de desfiles», segundo a TV, afirmava que a morte de Angélica era uma perda irreparável para o carnaval brasileiro.

A man by the name of Áureo de Negromonte - ' a famous Carnival figure and competition winner ', according to the TV – stated that Angélica's death was an irreparable loss for Carnival in Brazil.

«São esses blocos carnavalescos», disse o motorista de mau humor, «os filhos da puta gostam de desfilar pelas ruas movimentadas...

' It's those Carnival groups, ' the driver said ill-humouredly. ' The sons of bitches like to parade down the busy streets...

There are times, however, when distinguishing between source texts and translations is less important. When the aim of instruction is simply to draw attention to certain isolated morphological, syntactic and even lexical contrasts, TT→ST concordances can be just as helpful as ST→TT concordances. Figure 8 below shows how both types of parallel concordances can be used to focus on the contrastive use of English and Portuguese prepositions.

Figure 8

ST+TT(L1)→TT+ST(L2) concordances used to help learners with contrastive prepositions (query: various)

O último deles consistia em ficar de cabeça para baixo por uns minutos para fazer o sangue ir à cabeça.

The last one consisted of hanging upside down for minutes on end to make the blood rush to your head.

Não acreditei no que estava acontecendo comigo.

I couldn't believe what was happening to me.

Mais tarde, na cama, depois do sexo, Fúlvia me encheu de elogios, disse que eu era muito bom naquilo.

Later, in bed, after sex, Melissa showered me with praise, told me I was very good at it.

3.3 Putting it all together

Navigating through a parallel corpus involves deciding whether an L1 or an L2 search term is to be used and deciding whether the search term in question is to be in translational or non-translational language or in a mix of both. The most basic of these decisions is the first one. In section 3.1 I argued that L1→L2 concordances (based on L1 search terms) are best for promoting language production and that L2→L1 concordances (based on L2 search terms) are more suitable for language reception.
It is only after this decision has been made that one should worry about the translational/non-translational language distinction. In section 3.2 I argued that there are situations in which it is best to shelter learners from translational L2, situations in which translational L2 can be especially useful to learners, and situations in which the distinction between translational and non-translational L2 is not so important.
When putting it all together, this means that, if the distinction between translational and non-translational language is not an issue, then unidirectional and bi-directional parallel corpora can be used in any direction. However, should the need arise to shelter learners from translational L2, then unidirectional parallel corpora can be used in only one search direction, which will depend on whether the learner’s L1 is the source text or the translation language of the corpus. The same applies to situations in which parallel concordances are used to deliberately expose learners to translational L2. In contrast, bi-directional corpora can still be interrogated in any direction, provided only the part of the corpus which shelters learners from (or exposes them to) translational L2 is used.
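
The two decisions can be summarised in a short sketch. It is an illustration under stated assumptions, not a description of any existing concordancer: the class AlignedPair, its source_side field and the functions search_side and filter_by_l2_status are hypothetical names introduced here. The two pairs are taken from Figures 6 and 7, with Portuguese as the learners' L1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    l1_text: str       # learner's first language (here Portuguese)
    l2_text: str       # target language (here English)
    source_side: str   # "L1" or "L2": which side is the original text


def search_side(skill):
    """First decision: L1 search terms for production, L2 for reception."""
    if skill == "production":
        return "L1"
    if skill == "reception":
        return "L2"
    raise ValueError(f"unknown skill: {skill!r}")


def filter_by_l2_status(pairs, l2_status):
    """Second decision: restrict the corpus by the status of its L2 side.

    'source'      keeps pairs whose L2 side is an original text
                  (sheltering learners from translational L2);
    'translation' keeps pairs whose L2 side is a translation;
    'any'         keeps everything.
    """
    if l2_status == "any":
        return list(pairs)
    if l2_status == "source":
        wanted = "L2"
    elif l2_status == "translation":
        wanted = "L1"
    else:
        raise ValueError(f"unknown L2 status: {l2_status!r}")
    return [pair for pair in pairs if pair.source_side == wanted]


# One pair from Figure 6 (English original) and one from Figure 7
# (Portuguese original), as found in a bi-directional corpus.
CORPUS = [
    AlignedPair("Quando espreitei outra vez às 7.30 da manhã, se fora embora.",
                "When I looked again at half-past seven this morning, he had gone.",
                source_side="L2"),
    AlignedPair("Parecia um sujeito vestido para um baile de carnaval dos anos 1920.",
                "He looked like someone dressed for a Carnival dance in the 1920s.",
                source_side="L1"),
]

sheltered = filter_by_l2_status(CORPUS, "source")
exposed = filter_by_l2_status(CORPUS, "translation")
print(search_side("production"), len(sheltered), len(exposed))
```

In a unidirectional corpus every pair has the same source_side, so one of the two restrictive filters necessarily returns nothing, whereas a bi-directional corpus can supply pairs for either setting.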

4. Conclusion

In addition to the undeniable utility of parallel concordances in translation studies, translator education, the development of bilingual lexicography and machine translation, I have argued in this paper that there is also room for the use of parallel concordances in second language learning. However, I also hope to have made it clear that it is important to consider carefully when parallel concordances are useful, and to give serious thought as to how to use them. Users must make conscious decisions on whether or not parallel concordances are called for, on whether to use L1 or L2 search terms, and on whether to distinguish between translational and non-translational L2.


1 The term engagement is borrowed from Smith (1982:171), who defines it as “the way a learner and a demonstration come together on those occasions when learning takes place”.

2 All parallel concordances shown in this paper were taken from COMPARA 1.6 - http://www.portugues.mct.pt/COMPARA/ [9-Jul-2002].

3 The fact that the translational, Portuguese side of TT→ST concordances such as these may sound odd or unnatural to native speakers of Portuguese can even help Portuguese learners of English develop a better grasp of the differences between Portuguese and English.

