Joseba Abaitua: Undergraduate courses

Office: 323-D  E-mail: abaitua@fil.deusto.es

Office hours

 Mon 12.00, Wed 11.00, Thu 16.00
English Language and New Technologies

Objectives Chronology Methodology Readings Evaluation Grades


  1. Become acquainted with new technologies applied to natural languages in general and to English in particular.
  2. Learn the contribution of linguistics to the development of information technologies.
  3. Apply information technologies to your own educational curriculum.
  4. Review the main techniques developed in computational linguistics.
  5. Focus on practical applications of theoretical and computational linguistics within the new information society. We will consider applications such as information extraction, natural language interfaces, multilingual electronic publishing, machine translation, digital document localization, and the multilingual Internet.
  6. Study the connection between formal syntax and computational parsers.
  7. Review the structure of computational lexicons.
  8. Experiment with some translation software and corpus processors.


  1. Human Language Technologies and the information society (Presentation of Action Line, by the EC: cached copy)
  2. Language Engineering (Brochure by HLTCentral: cached copy)
  3. Information overload
  4. Information management, retrieval and extraction
  5. Translation technology (Paper)
  6. Text collections and corpora (Reference page)
  7. Automatic text processing and tagging (Paper)
  8. Grammar formalisms (Short description by Hans Uszkoreit & Annie Zaenen)
  9. Applications: Information management, retrieval and extraction (PPT presentation by Julio Gonzalo)
  10. Applications: Machine translation
  11. Applications: The multilingual Internet
  12. Applications: Electronic publishing


Weekly questionnaire
  1. Feb 12-14. First contact with the browser and on-line materials.
  2. Feb 17-21. Language technologies and the Information Society.
  3. Feb 24-28. "Having too much information can be as dangerous as having too little"
  4. Mar 3-7. Language technologies and resources.
  5. Mar 10-14. Multilinguality is a barrier to communication. We will review translation technology and its potential to help overcome that problem.
  6. Mar 17-21. Machine Translation (MT) is a possible way of overcoming language barriers. However, quality is often a problem for MT systems.
  7. Mar 24-28. We are going to learn more about MT: history, methods, approaches, best-known systems, etc. Linguistic Diversity on the Internet: Assessment of the Contribution of Machine Translation. Multilingual resources.
  8. Mar 31-Apr 4. Corpus Linguistics.
  9. Apr 7-11. Submission of Report A. See the list of submitted reports (Apr. 16).
  10. Apr 14. Evaluation of Machine Translation systems
  11. Apr 28. Evaluation of Machine Translation systems
  12. May 5-9. Evaluation of Machine Translation systems
  13. May 12-16. Submission of Report B.
    Evaluation of search and content retrieval engines on the Internet (1)
  14. May 19-23. Evaluation of search and content retrieval engines on the Internet (2)
  15. May 26-30. Submission of Report C. Review.


We will learn how to use electronic documentation as reference material for the course. Every week a questionnaire will be set, and students will work through the documentation to answer it. This will be combined with practical exercises and the use of dedicated software. References and documentation will be provided mainly in the form of hypertext.


In addition to regular attendance and participation, students will submit three reports (see below). A written examination will be set for students who have not completed the regular evaluation. Grading in this course will depend on class attendance and participation (10%), group projects (30%), and an individual project or exam (60%).

  1. By April 11th: Report based on notes taken from on-line references (optional and individual)
  2. By May 12th: Evaluation of Machine Translation systems
  3. By May 27th: Evaluation of search and content retrieval engines
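
As an illustration of how the weights stated above (10%, 30%, 60%) combine, here is a minimal sketch in Python. The component names, the 0-10 scale, and the sample scores are assumptions made for this example only; the course does not prescribe this calculation.

```python
# Weights come from the grading paragraph above.
# The component names and the 0-10 scale are assumptions for illustration.
WEIGHTS = {
    "attendance_participation": 0.10,
    "group_projects": 0.30,
    "individual_project_or_exam": 0.60,
}


def final_grade(scores):
    """Return the weighted sum of the component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 8 for participation, 7 for group work, 6 for the exam.
example = {
    "attendance_participation": 8.0,
    "group_projects": 7.0,
    "individual_project_or_exam": 6.0,
}
print(round(final_grade(example), 2))
```

With these sample scores the components contribute 0.8, 2.1 and 3.6 points respectively, so the individual project or exam dominates the result.
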
Report A

Grades May 22nd, 2003

This report could have a title along the lines of "A review of Human Language Technologies and their role in the Information Society".

It can contain as many text fragments or quotations taken (copied and pasted) from the on-line documentation as necessary. However, the author must clarify and justify the relevance of the selected fragments in his or her own words. The main line of argumentation must be original, not copied!

It is very important that quotations are acknowledged, i.e. do not forget to provide the source:

  • Name of the author, group or institution
  • Date
  • Publisher's name, or URL

In addition to the main body, the report must also include:

  • Abstract (no more than 100 words)
  • Introduction (around 500 words)
  • Conclusion (around 200 words)

It is very important that these sections contain original text. Try to convey your own personal view and style.

The report should be 3,000 to 7,000 words long (i.e. 15-25 pages).

Have a look at these useful recommendations:

How to Write a Good Report, by Prof. G. Yadigaroglu, ETHZ (cached copy)

Student Handbook and Essay Writing Guide, Brock University (cached copy)

A letter from Professor Michael Stubbs to his students (Englische Sprachwissenschaft, Universität Trier) (cached copy)

Check these documents:

Academic paper: Is it worth learning translation technology?

Technical report: Linguistic Diversity on the Internet: Assessment of the Contribution of Machine Translation (cached copy)


Reports B and C: These will illustrate experiments with on-line resources, and can be developed in groups of four people at most.



In print
  • Arnold, Douglas, L. Balkan, Lee Humphreys, and S. Meijer, eds. 1994. Machine translation: introductory guide. Blackwell Publishers.
  • Gazdar, Gerald & Chris Mellish. 1989. Natural Language Processing In Prolog. An Introduction to Computational Linguistics. Addison-Wesley Publishing Company.
  • Hutchins, W. J. and Harold Somers. 1992. An introduction to machine translation. Academic Press.
  • Jones, Daniel. 1996. Analogical Natural Language Processing. University College London Press.
  • McEnery, Tony. 1992. Computational Linguistics: A handbook and toolbox for natural language processing. Sigma Press.
  • McEnery, Tony and Andrew Wilson. 1996. Corpus Linguistics. Edinburgh University Press.
  • Melby, Alan K. and C.T. Warner. 1995. The Possibility of Language. A discussion of the nature of language, with the implications for human and machine translation. John Benjamins. Amsterdam.
© Universidad de Deusto 2003
Last modified: June 2003