Asymmetries in Natural Languages and the Performance Systems

Major Collaborative Research Project

Funded by the Social Sciences and Humanities Research Council of Canada

1998-2003



SUMMARY

This major collaborative research project seeks to define the asymmetries proper to the grammar of natural languages, which hold in spite of linguistic diversity, and the treatment of these asymmetries by the performance systems, that is, the systems that assign conceptual and acoustic interpretations to linguistic expressions.

Asymmetry is an essential concept in current linguistic research, as it offers an explanation for the existence of constraints on the fusion of linguistic categories, their dependency, and their linear order (Chomsky, 1995; Kayne, 1994). It is also important in computational linguistics, which integrates current developments in linguistic theory into computational models (Berwick, 1991; Berwick and Weinberg, 1997). A typical model includes an axiomatization of a grammar and a set of heuristics that associate one or more formal representations with the expressions submitted for processing. Notwithstanding the progress achieved in these fields, the nature of the configurations interpretable by the performance systems has yet to be defined. Our research advances a strong hypothesis in this respect: only asymmetrical configurations are interpretable in an optimal way.

While syntactic subject/object or complement/non-complement asymmetries (Chomsky, 1981; Huang, 1982; Hornstein, 1995) as well as phonological head/dependent and head/complement asymmetries (Dresher and Hulst, 1991; Rice and Avery, 1993; Guerssel and Lowenstam, 1995) are well attested, the discovery of morphological asymmetries (Di Sciullo, 1993, 1995, 1996; Kayne, 1994) opens a new line of inquiry, which proposes that the concept of asymmetry is basic to grammar and is instantiated in each component in specific ways. If, as we have proposed, the modular architecture of grammar is based on asymmetries in the derivations of linguistic expressions, and if there are asymmetries in morphology, syntax and phonology, the question that arises is whether the concept of asymmetry is basic to grammar. A closely related question is whether the performance systems, in their human or technological instantiations, make direct use of asymmetry in processing natural language. The purpose of this project is to consider these questions in a systematic way, using expertise from linguistics and computer science in Canada, Europe and North America.

Our first hypothesis with respect to the theory of grammar is that the derivation of linguistic expressions is driven by the conceptual necessity to obtain what we call a "target configuration" at the interfaces with the performance systems, these configurations being restricted to asymmetrical configurations. The second hypothesis is that the feature systems defining the different grammatical categories are also defined in terms of asymmetrical relations, which provides a rationale for the restrictiveness of their paradigms. The third core hypothesis is that only certain asymmetrical configurations can be interpreted at certain interfaces, which provides an explanation for the distinctiveness of these systems.

Our first hypothesis with respect to computational linguistics is that asymmetry is directly implemented in the computational model: the properties of the competence grammar are used directly by the processor. The second hypothesis is that the computational operations are oriented by the asymmetries of the grammar of natural languages and not by language-independent heuristics. The third hypothesis is that the computational treatment of natural languages is incremental, mimicking human natural language understanding: expressions are processed and interpreted rapidly as they are heard or read.

The originality of the proposed contribution to linguistic theory lies in the formulation of a model whose parts are based on asymmetrical relations. Consequently, the empirical manifestations of the asymmetries do not follow from a set of heterogeneous principles, as is generally assumed, but are the effects of the basic asymmetry of the grammar, which is instantiated in a specific way in each component. This means that, in spite of linguistic diversity, there are regular phenomena that depend on the asymmetrical relations of the language faculty, shedding new light on the relations between language, grammar and the mind.

The originality of the proposed contribution to computational linguistics lies in the formulation of a computational model integrating a grammar based on asymmetry. The project will contribute to the formulation of fast parsers and learning systems, which will overcome the overgeneration and slow parsing problems typical of computational models that integrate a heterogeneous set of principles or rules. At present, no computational model based on asymmetrical grammar or oriented by asymmetry is available. Therefore, the construction of prototypes based on such models will provide a major contribution to this field. Prototypes for morpho-syntactic and syntactic analyses of natural languages, as well as learning systems, will be developed and parametrized to handle prima facie very different sorts of languages, ranging from languages with concatenative morphology, such as the Romance and Germanic languages, to languages with non-concatenative morphology, such as the Semitic languages.

This major work on asymmetry will also contribute to the field of information retrieval, as its computational aspect includes the elaboration of a search engine based on natural language processing techniques. The integration of morpho-syntactic and syntactic parsers based on asymmetry into an information retrieval system will enhance the precision of the system on large data collections such as the Internet. This will constitute a major improvement over the performance of widely used search systems based exclusively on statistical methods.

Anna Maria Di Sciullo 30/06/97

 


