RESEARCH IN LANGUAGE AND LITERATURE: OLD PROBLEMS, NEW SOLUTIONS?

Luis Guerra Salas
Departamento de Filología Española
Universidad Europea de Madrid
Villaviciosa de Odón 28670 Madrid (España)
luis.guerra@esp.fil.uem.es


This talk will deal with the effects of the new technologies on the traditional Translation and Interpreting curriculum at the Universidad Europea de Madrid. It will focus, in particular, on Industrias de la Lengua, a third-year course at UEM.

Our objective in developing the programme was to emphasize the humanistic aspects over the purely technological ones, and highlight the advantages that the new technology can bring to traditional methods of research.

We divided the 60-hour semester course into the following topics:

  1. Introduction: Las industrias de la lengua (language industries)
  2. Computational Linguistics
  3. Digital Files
  4. Speech Processing
  5. Word and Morpheme Processing
  6. Automatic Syntactic Analysis (parsing)
  7. Meaning Processing
  8. Automatic Translation
  9. Electronic Books
  10. Philology and Computers

We were especially interested in the relationship between linguistic and literary research and the new technologies, so not all the topics were presented in the same manner. Instead, we emphasized the use of tools which facilitate research and allow the results to be observed. In the case of Automatic Translation, we limited the topic to a brief overview because this subject is dealt with in far greater detail elsewhere in the curriculum of the Translation and Interpreting degree programme.

Our most successful programmes to date have been the following:

- The Corpus de Referencia del Español Actual (CREA), developed by the Real Academia Española (RAE). Although this corpus will not be fully operative until October 1998, it has provided us with a framework for studying the design, structure and codification of a linguistic corpus. CREA offers researchers a representative sample of current standard Spanish and allows them to select texts with specific characteristics: geographic origin, subject matter, genre, etc. We plan to use this corpus as a regular tool in our 1998/9 courses.

- The Corpus Diacrónico del Español (CORDE), developed by the Real Academia Española. This corpus will include 125,000,000 words in its final version and will span the period from the beginnings of the language up to 1975; the years thereafter are covered by CREA. CORDE divides the history of the language into three stages (Middle Ages, Siglos de Oro and Contemporary Age) and allows the researcher to specify more than one criterion when consulting the corpus.

- The Data Base Testuale (DBT), developed by Eugenio Picchi at the Istituto di Linguistica Computazionale del Consiglio Nazionale delle Ricerche in Pisa, is a programme which allows the researcher to consult text files. This same programme was used in the new critical edition of Don Quijote de la Mancha edited by Francisco Rico (Barcelona, Instituto Cervantes - Crítica, 1998). The programme allows the researcher to search for words and families of words, graphic elements, morphosyntactic elements and stylistic elements, as well as to produce indices and statistics, generate concordances and analyze the use of prepositions in a text.

- The style and grammar checker Grammatik (San Francisco, Microsoft, 1992) gives us an objective measure of style for English. This programme gives a new perspective on the controversial topic of style, and it clearly distinguishes one style from another. This function permits the linguist to reflect on these differences and look for the best solution for each communicative act.

Work with these and other programmes is opening new horizons in the field of language and literature. The talk will conclude with a short evaluation of the consequences of these new tools for the professional preparation of translators and interpreters.