- the whole set of alternative analyses of the sentence, including the analysis at stake, or
- only the analysis at stake, independently of all others.
We will call formalisms of the first type dependent-filtering formalisms, and the others independent-filtering formalisms. Examples of both types are described in the literature. We apologize for introducing such strange terms as dependent and independent filtering, but no term has yet been proposed for the first type, and the term already proposed for the second, monotonic (K. Koskenniemi 1990), is in fact more evocative of the general approach which consists in filtering analyses by applying formal constraints.
9.1. Dependent-filtering formalisms
Our definition of this type of ambiguity reduction formalism is that the decision to reject or keep a given analysis of a sentence can depend on that analysis, as well as on other analyses of the same sentence, or at least on those that have not been eliminated by previously applying other rules. An example of such behaviour is analysed in É. Laporte & A. Monceaux (1999, pp. 357-359). When a grammatical constraint recognizes a grammatical sequence in a text (what we will call a target), it can rule out analyses belonging to the target as well as alternative analyses of the same word sequence. Several disambiguators are based on dependent-filtering formalisms (M. Silberztein 1994; A. Voutilainen 1994; É. Tzoukermann et al. 1995); in Ph. Laval (1995), one of the two formalisms described (p. 103) is of the same type.
In order to assess the potential of this type of formalism for applications, let us examine some formal properties deducible from its definition.
The first property is the existence of interactions and interdependencies among constraints included in the same grammar. Since the decision to remove or retain a given analysis of a given sentence may depend on other analyses of the same sentence that would not have been rejected by applying other rules before, the result of applying a grammatical constraint depends on other constraints previously applied. In other words, the interpretation and effects of a grammatical constraint may be different in the absence or in the presence of another grammatical constraint.
This first property has three consequences.
a) When you add new constraints to an existing ambiguity reduction grammar with a known performance, the rate of reduction may decrease instead of increasing: if a new constraint applies to the text and blocks some constraint of the former version, the new constraint may discard fewer analyses than the one it blocks.
b) The objective of maintaining recall at the level of 100% gets more and more difficult to reach as grammatical constraints accumulate, since when you introduce a new constraint into an existing grammar, the effects of others can change, so that constraints that used to keep valid analyses can now rule them out.
c) It is possible to determine whether the application of a complete grammar to a given input is correct or not: you just compare the output with the linguistic analysis you decided to adopt. But, in the framework of a dependent-filtering formalism, the combination of two correct grammars is not certain to produce correct output, since the result of applying a given constraint to sentences depends on the other constraints applied before. Therefore, even if the effects of applying constraints are explicitly specified with the formalism, it is not possible to define what a correct grammatical constraint is.
The second property of dependent-filtering formalisms is the necessity to define one or several ways of combining grammatical constraints. Indeed, we already mentioned that the effect of applying a constraint to a sentence has to be defined; now, it is equally necessary to define the effect of applying several constraints formalized separately. Two kinds of modes of combination are possible:
a) With combination by composition, the output of a constraint is the input to the next constraint. This mode of combination involves a composition of functions in the mathematical sense. Final output depends on the order of composition, as is easy to check with a priority rule, for instance (É. Laporte & A. Monceaux 1999, p. 357). A variant of this mode of combination consists in iterating the application of a series of rules, always in the same order, until a complete series of applications has no effect: then the set of analyses obtained is called a fixed point, because it cannot be changed any more by applying the same series of rules again. There is a mathematical certainty that a fixed point is reached after a finite number of iterations, because all sets of analyses of a given sentence are finite and can only be reduced. However, it is easy to check that the fixed point obtained depends on the order of composition, i.e. the process can reach several different fixed points from the same text, depending on the order of composition.
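The order dependence of combination by composition can be made concrete in a small computational sketch of our own (it does not reproduce any of the cited systems): a sentence's analyses are modelled as a set of tag tuples, a dependent-filtering rule maps the whole surviving set to a subset of it, and two priority rules of the kind discussed reach different fixed points depending on the order of composition.

```python
# Toy model (ours, for illustration): a rule sees the whole set of
# surviving analyses and returns the subset it keeps.

def prefer_det(analyses):
    # Priority rule: if some analysis tags the first word as DET,
    # discard the competing PRO analyses of that word.
    if any(a[0] == "DET" for a in analyses):
        return {a for a in analyses if a[0] != "PRO"}
    return analyses

def prefer_pro(analyses):
    # Symmetric priority rule in favour of PRO.
    if any(a[0] == "PRO" for a in analyses):
        return {a for a in analyses if a[0] != "DET"}
    return analyses

def fixed_point(rules, analyses):
    # Iterate the whole series of rules, always in the same order,
    # until one complete pass leaves the set unchanged.
    while True:
        new = analyses
        for rule in rules:
            new = rule(new)
        if new == analyses:
            return analyses
        analyses = new

ambiguous = {("DET", "N"), ("PRO", "V")}
print(fixed_point([prefer_det, prefer_pro], ambiguous))  # {('DET', 'N')}
print(fixed_point([prefer_pro, prefer_det], ambiguous))  # {('PRO', 'V')}
```

The same input thus reaches two different fixed points, one per order of composition.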
b) Modes of simultaneous combination can also be defined. The simplest is the following: the result of applying a set of rules to a sentence is defined as the set of analyses that would be kept by all rules if each of them was applied to the same set of analyses as if none of the other rules existed. When all constraints are applied in this way, all interactions between rules disappear; defining what a correct grammatical constraint is becomes possible, and the formalism ensures that an accumulation of correct constraints is correct; the construction of an ambiguity reduction grammar by accumulation of grammatical constraints becomes simpler and safer; and the application of constraints is not restricted by conditions regarding the disposition of application zones of other constraints. In fact, the formal properties of this kind of formalism make it close to independent-filtering formalisms.
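This simplest mode of simultaneous combination can be sketched as follows (again a toy model of ours, not the notation of any cited system): every rule is applied to the original set of analyses, as if the other rules did not exist, and an analysis survives only if every rule keeps it, so the order of the rules is irrelevant.

```python
# Toy model (ours): analyses of a two-word sequence are pairs of tags.

def not_det_before_verb(analyses):
    return {a for a in analyses if not (a[0] == "DET" and a[1] == "V")}

def not_pro_before_noun(analyses):
    return {a for a in analyses if not (a[0] == "PRO" and a[1] == "N")}

def combine_simultaneously(rules, analyses):
    kept = set(analyses)
    for rule in rules:
        kept &= rule(set(analyses))  # each rule sees the original set
    return kept

analyses = {("DET", "N"), ("DET", "V"), ("PRO", "N"), ("PRO", "V")}
combine_simultaneously([not_det_before_verb, not_pro_before_noun], analyses)
# keeps {('DET', 'N'), ('PRO', 'V')}, whatever the order of the rules
```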
c) Other modes of simultaneous combination can be defined on the basis of conventions determining whether an analysis discarded by a constraint and preserved by another is discarded or preserved, according to various criteria, e.g. the disposition of application zones and overlaps between them, or a network of rule/exception relations between grammatical constraints. Such modes of simultaneous combination introduce new interactions and dependencies between grammatical constraints, with the consequences mentioned above. In the example of a network of rules and exceptions, the decision about a given analysis depends on conditions regarding other rules applied to the same analysis and marked as exceptions. Dependency relations among grammatical constraints make it difficult to maintain and extend the grammar, since any modification or new element can modify the behaviour of all constraints linked by direct or indirect dependency relations.
Whatever the mode of combination of rules, a detailed and complex specification is required. When such a specification exists, it is possible to define what a correct system of constraints is, but even so it is not always possible to define what a correct grammatical constraint is. In addition, writers of ambiguity reduction grammars must understand and digest the specification, since the correctness of their grammars depends on it. This is a cause of errors, because the description of grammatical constraints is based on linguistic intuition: intuitions are consistent with what writers think the system does, and if they have not correctly digested the formalism, their formulation of grammatical constraints can be wrong. Of course, this is the main pragmatic reason why simplicity is an essential quality of a formalism.
9.2. Independent-filtering formalisms
Our definition of this type of ambiguity reduction formalism is that the decision to rule out or retain a given analysis of a given sentence depends only on the analysis at stake, not on other analyses of the same sentence. These formalisms, therefore, have less expressive power, since the conditions of application of grammatical constraints can only include properties of the analysis in question.
The following grammatical constraints (E. Roche 1992) belong to this kind of filtering:
(29) <le,DET> does not occur before a verb at a finite tense
(30) <le,PRO> does not occur before a noun
These constraints correctly remove <le,DET:fs> from (31) and <le,PRO:3fs> from (32):
(31) Il la prononce naturellement
(32) La prononciation est naturelle
When a verb and a noun are homographs, (29) and (30) correctly reject the analyses in <le,DET:fs> <V:3s> and <le,PRO:3fs> <N:fs>:
On recherche par quel service la fiche a transité
The analyses with <le,DET:fs> <fiche,N:fs> and <le,PRO:3fs> <ficher,V:3s> are preserved. This is an example of independent filtering because (29) applies just to the analyses with the verb, and (30) just to those with the noun, as though the processing of each analysis were entirely independent. When a constraint recognizes a target, it can only remove analyses belonging to this target, not alternative analyses. In other words, when a constraint is written in order to resolve a type of ambiguity, the writer does not need to care about the others. Because of this limitation of the expressive power of the formalism, some options are useless, e.g. recognizing a context only if all ambiguity has already been resolved in it. If a writer of grammatical constraints elaborates the recognition of the target, making it safer and more precise, the output of the rule itself, i.e. the selection of analyses to be discarded, becomes safer and more precise. Writers can thus achieve better control over their rules.
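The behaviour of (29) and (30) on this example can be sketched computationally (a model of ours, using the tag notation above): each analysis of the sequence is tested alone against every constraint, with no reference to the alternative analyses.

```python
# Toy model (ours): an analysis is a tuple of (form, tag) pairs, and an
# independent-filtering constraint is a predicate on a single analysis.

def c29(analysis):
    # (29) <le,DET> does not occur before a verb at a finite tense
    return not any(t1.startswith("DET") and t2.startswith("V")
                   for (_, t1), (_, t2) in zip(analysis, analysis[1:]))

def c30(analysis):
    # (30) <le,PRO> does not occur before a noun
    return not any(t1.startswith("PRO") and t2.startswith("N")
                   for (_, t1), (_, t2) in zip(analysis, analysis[1:]))

def apply_grammar(constraints, analyses):
    # An analysis is kept iff it is kept by every constraint.
    return {a for a in analyses if all(c(a) for c in constraints)}

analyses = {
    (("la", "DET:fs"), ("fiche", "N:fs")),
    (("la", "DET:fs"), ("ficher", "V:3s")),
    (("la", "PRO:3fs"), ("fiche", "N:fs")),
    (("la", "PRO:3fs"), ("ficher", "V:3s")),
}
apply_grammar([c29, c30], analyses)
# keeps <le,DET:fs> <fiche,N:fs> and <le,PRO:3fs> <ficher,V:3s>
```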
In addition to E. Roche 1992, one of the two formalisms proposed by Ph. Laval (1995, p. 101) and ELAG (É. Laporte & A. Monceaux 1999) are examples of independent-filtering formalisms. The system of A. Voutilainen (1994) allows the user to build independent-filtering disambiguators by restricting himself to a part of the expressive power of the formalism, e.g. by writing REMOVE rules and avoiding SELECT rules. All these systems are based on finite automata and explore general approaches defined by M. Gross (1989) and K. Koskenniemi (1990).
The effect of the application of a combination of grammatical constraints is simple to define and does not present the variants we observed in the case of dependent-filtering formalisms: an analysis is kept by the ambiguity reduction grammar if, and only if, it is kept by each of the grammatical constraints included in it. The fixed point in the iterative application of rules, for example, is reached after the first application, which makes the notion of fixed point useless.
An ambiguity reduction grammar stated with an independent-filtering formalism automatically has two other important applications.
- It can be used to detect non-lexical errors: when a grammar is correct and rejects all analyses of a sentence, this implies that the sentence does not have any valid analysis, and therefore that it is ill-formed. In the case of a dependent-filtering formalism, this secondary use is hardly likely to work, because invalid analyses are frequently recognized through the existence of alternative analyses, which may be valid or invalid; but this situation of ambiguity has no reason to arise in the case of non-lexical errors.
- It can be used to resolve homophonies of the type of freine/frêne, which disturb speech recognition due to the lack of phonetic reasons for selecting the right spelling. For instance, if a word is transcribed freine/frêne, and if an ambiguity reduction grammar establishes by exploring the context that the word is a verb, the hypothesis frêne can be ruled out. Independent-filtering formalisms allow for such a use because they are based on a discrimination between right and wrong grammatical analyses, no matter whether forms are ambiguous. The same secondary use with dependent-filtering formalisms is much less realistic, since they organize description on the basis of sets of homographic forms, which are quite different from sets of homophonous forms.
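The first of these secondary uses, error detection, can be sketched as follows (the two constraints here are invented for illustration, simplified from French clitic syntax, and are not the paper's): when every analysis of a sentence is rejected by a grammar assumed correct, the sentence is flagged as ill-formed.

```python
# Toy sketch (ours): analyses are tuples of (form, tag) pairs.

def no_final_det(analysis):
    # invented constraint: a determiner cannot end a sentence
    return not analysis[-1][1].startswith("DET")

def no_clitic_after_verb(analysis):
    # invented, simplified constraint: the clitic pronoun does not
    # follow the verb (imperatives left aside)
    return not any(t1.startswith("V") and t2.startswith("PRO")
                   for (_, t1), (_, t2) in zip(analysis, analysis[1:]))

def ill_formed(constraints, analyses):
    # no analysis survives every constraint => non-lexical error
    return not any(all(c(a) for c in constraints) for a in analyses)

# *"Il prononce la": both analyses of the final "la" are rejected
analyses = {
    (("il", "PRO:3ms"), ("prononce", "V:3s"), ("la", "DET:fs")),
    (("il", "PRO:3ms"), ("prononce", "V:3s"), ("la", "PRO:3fs")),
}
print(ill_formed([no_final_det, no_clitic_after_verb], analyses))  # True
```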
Ambiguity reduction seems a simple task at first sight, but the examples above make it obvious that one can realistically expect more consistent, reliable and useful results than those now considered successful. Such progress will not be reached without considering more elaborate linguistic analyses: the informative content of linguistic data is as important here as in other fields of computational linguistics. In addition, we hope we have shown that reflection about the formal and abstract aspects of the problem is also absolutely necessary if we want to design and implement tools simple and efficient enough to allow the elaboration of good ambiguity reduction grammars.
Université de Marne-la-Vallée - CNRS
5, bd Descartes
F-77454 Marne-la-Vallée CEDEX 2
Benello, J.; A.W. Mackie; J.A. Anderson. 1989. Syntactic category disambiguation with neural networks. Computer Speech and Language 3, London: Academic Press, pp. 203-217.
Brill, Eric. 1992. A simple rule-based part-of-speech tagger. In Proceedings of the 3rd Conference on Applied Natural Language Processing, Trento, Italy, pp. 152-155.
Cloeren, Jan. 1999. Tagsets. In Syntactic Wordclass Tagging, H. van Halteren (ed.), Dordrecht/Boston/London: Kluwer, pp. 37-54.
Francis, Nelson W.; Henry Kucera. 1982. Frequency Analysis of English Usage: Lexicon and Grammar. Boston: Houghton Mifflin.
Garrigues, Mylène. 1997. Une méthode de désambiguïsation locale nom/adjectif pour l'analyse automatique de textes. In Langages 126, La description syntaxique des adjectifs pour les traitements informatiques, Nam Jee-sun (ed.), Paris: Larousse, pp. 60-78.
Garside, Roger; Geoffrey Leech; Geoffrey Sampson (eds.). 1987. The computational analysis of English: a corpus-based approach. Harlow: Longman.
Greene, Barbara B.; Gerald M. Rubin. 1971. Automated grammatical tagging of English. Providence, Rhode Island.
Gross, Maurice. 1977. Grammaire transformationnelle du français. Syntaxe du nom, Paris: Larousse, 256 p.
Gross, Maurice. 1989. The use of finite automata in the lexical representation of natural language. In Lecture Notes in Computer Science 377, Electronic Dictionaries and Automata in Computational Linguistics, M. Gross & D. Perrin (eds.), Berlin: Springer, pp. 34-50.
Gross, Maurice. 1994a. Constructing Lexicon-Grammars. In Computational Approaches to the Lexicon, B.T.S. Atkins, A. Zampolli (eds.), Oxford Univ. Press, pp. 213-263.
Gross, Maurice. 1994b. Lexicon-Grammar: Application to French. In The Encyclopedia of Language and Linguistics, R.E. Asher (ed.), Oxford/New York/Seoul/Tokyo: Pergamon, vol. 4, pp. 2195-2205.
Gross, Maurice. 1997. The construction of local grammars. In Finite-State Language Processing, E. Roche & Y. Schabes (eds.), Cambridge, Mass.: MIT, pp. 329-352.
Harris, Zellig S. 1962. String Analysis of Language Structure. Mouton.
Johansson, Stig; Eric S. Atwell; Roger Garside; Geoffrey Leech. 1986. The Tagged LOB Corpus: Users' manual. ICAME, The Norwegian Computing Centre for the Humanities, Bergen University, Norway.
Joshi, Aravind K.; Philip Hopely. 1996. A Parser from antiquity, Natural Language Engineering 2(4), Cambridge Univ. Press, pp. 291-294.
Klein, Sheldon; Robert F. Simmons. 1963. A Computational approach to grammatical coding of English words. Journal of the Association for Computing Machinery 10(3), pp. 334-337.
Koskenniemi, Kimmo. 1990. Finite-state parsing and disambiguation. In Proceedings of COLING-1990, H. Karlgren (ed.), Helsinki, pp. 229-232.
Laporte, Éric; Anne Monceaux. 1999. Elimination of lexical ambiguities by grammars: the ELAG system. Lingvisticae Investigationes XXII, Cédrick Fairon (ed.), Amsterdam/Philadelphia: Benjamins, pp. 341-367.
Laval, Philippe. 1995. Un système simple de levée des homographies. Lingvisticae Investigationes XIX(1), Amsterdam/Philadelphia: Benjamins, pp. 97-105.
Leech, Geoffrey; Roger Garside; Michael Bryant. 1994. CLAWS4: The Tagging of the British National Corpus. In Proceedings of COLING-94, Kyoto.
Leech, Geoffrey; Andrew Wilson. 1999. Standards for Tagsets. In Syntactic Wordclass Tagging, H. van Halteren (ed.), Dordrecht/Boston/London: Kluwer, pp. 55-80.
Marcus, Mitchell P.; Beatrice Santorini; Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics 19(2), Cambridge, Mass.: MIT, pp. 315-330.
Marshall, Ian. 1983. Choice of grammatical word-class without global syntactic analysis: Tagging words in the LOB Corpus. Computers and the Humanities 17, Dordrecht/Boston/London: Kluwer, pp. 139-150.
Merialdo, Bernard. 1994. Tagging English text with a probabilistic model. Computational Linguistics 20(2), Cambridge, Mass.: MIT, pp. 155-171.
Milne, Robert W. 1986. Resolving lexical ambiguity in a deterministic parser. Computational Linguistics 12(1), Cambridge, Mass.: MIT, pp. 1-12.
Roche, Emmanuel. 1992. Text disambiguation by finite state automata, an algorithm and experiments on corpora. In Proceedings of COLING-92, Nantes.
Salkoff, Morris. 1999. A study of ambiguity using Intex. Lingvisticae Investigationes XXII, Cédrick Fairon (ed.), Amsterdam/Philadelphia: Benjamins, pp. 143-154.
Senellart, Jean. 1999. Outils de reconnaissance d'expressions linguistiques complexes dans de grands corpus. PhD thesis, LADL, University Paris 7, 290 p.
Silberztein, Max. 1994. INTEX: a corpus processing system. In Proceedings of COLING-94, Kyoto.
Tzoukermann, Évelyne; Dragomir R. Radev; William A. Gale. 1995. Combining linguistic knowledge and statistical learning in French part-of-speech tagging. In Proceedings of the SIGDAT Workshop "From Texts to Tags: Issues in Multilanguage Analysis", European Chapter of the ACL, Dublin.
Voutilainen, Atro; Pasi Tapanainen. 1993. Ambiguity resolution in a reductionistic parser. In Proceedings of the 6th Conference of the European Chapter of the ACL, Utrecht, pp. 394-403.
Voutilainen, Atro. 1994. Morphological disambiguation. In Constraint Grammar: a Language-Independent System for Parsing Unrestricted Text, F. Karlsson, A. Voutilainen, J. Heikkilä, A. Anttila (eds.), Berlin/New York: Mouton de Gruyter.
We examine various issues faced during the elaboration of lexical disambiguators, e.g. issues related to the linguistic analyses underlying disambiguators, and we exemplify these issues with grammatical constraints. We also examine computational problems and show how they are connected with linguistic problems: the influence of the granularity of tagsets, the definition of realistic and useful objectives, and the construction of the data required for the reduction of ambiguity. We show why a formalism is required for automatic ambiguity reduction, we analyse its function, and we present a typology of such formalisms.