Human Language Technologies


Lecture 1: Introduction to Applications of Language Technology

[Excerpts from Edinburgh-Stanford link (E-S.l), EuroMap (EM), Centre for Language Technology (CLT - Macquarie University)]

"Language technology refers to a range of technologies that have been developed over the last 40 years to enable people to more easily and naturally communicate with computers, through speech or text and, when called for, receive an intelligent and natural reply in much the same way as a person might respond." (E-S.l)

"Human Language Techology is the term for the language capabilities designed into the computing applications used in information and communication technology systems." (EM)

"Human Language Technology is sometimes quite familiar, e.g. the spell checker in your word processor, but can often be hidden away inside complex networks – a machine for automatically reading postal addresses, for example." (EM)

"From speech recognition to automatic translation, Human Language Technology products and services enable humans to communicate more naturally and more effectively with their computers – but above all, with each other." (EM)

"Language Technology is all about getting computers to do useful things with human language, whether in spoken or written form." (CLT)

"It's a key technology that will drive advances in computing in the next decade. Imagine being able to talk to your car and have it respond intelligently, giving detailed advice on routes or summarising up to date news you just missed on the radio." (CLT)

"Or, being able to speak or type queries to your Web search engine in ordinary language, just as you would ask a person, and have it return just the document you were looking for, perhaps in summarized form for easy reading, translated from another language and with the key points for your purposes highlighted."

" Or imagine having an intelligent agent in your electronic mailbox that scans incoming mail for requests for lunch appointments and books your favourite restaurant automatically." (CLT)

Speech technology

Speech recognition: "This technology involves recognising spoken language and transforming it into text. The technology ranges from extremely accurate continuous dictation by systems trained to recognise an individual voice, through to systems that work in specific domains and with any user." (E-S.l)

Speech synthesis: "By breaking down human language into all its key parts, scientists have been able to recreate human voices that even have regional accents. Such synthesised voices can be used to instantly read out any text." (E-S.l)

Spoken dialogue systems: "The latest advances now allow users to have natural dialogue with a computer or 'agent' that can react to unexpected events in an apparently intelligent way. Scientists are now working on spoken group dialogue systems that involve numerous agents." (E-S.l)

"These systems enable you to talk to a computer via a telephone in order to enact some transaction or information-seeking task. For example, you might call up on the phone and talk to a machine in order to buy or sell stocks and shares, or to get route directions from one city to another." (CLT)

Speech-to-speech machine translation: These systems recognise spoken input, analyse and translate it, and then utter the translation in the target language. Some are bidirectional and cope with spontaneous dialogue. However, the majority can only cope with restricted topics and limited vocabularies.

Information management

"In today's "information society" there is an increasing need for effective management of the information available; through local networks and the Internet. Improving access to and efficient use of on-line information mainly involves "Information Retrieval" (IR), i.e. identifying the documents containing relevant information; and "Information Extraction" (IE), i.e. extracting relevant information from documents, and "Text Summarisation" (TS), i.e. presenting condensed information." (CCL-UMIST)

Information extraction: "Computers can now be taught to automatically recognise entities in unstructured data, such as picking out an applicant's name and skills from a resume, and then working out the various relationships between data sets, before finally storing the information in specially indexed databases." (E-S.l)
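To make the resume example concrete, here is a minimal Python sketch of rule-based extraction. It is not the method used by any of the quoted sources: the skill list KNOWN_SKILLS, the "Name:" line convention and the function extract_resume_fields are all assumptions made for illustration, and real systems rely on trained models, not hand-written patterns.

```python
import re

# Hypothetical skill vocabulary; a real system would learn or load this.
KNOWN_SKILLS = {"python", "java", "sql", "translation"}

def extract_resume_fields(text):
    """Pull a name and a list of skills out of free-form resume text."""
    # Assumption: the name appears on a line of the form "Name: ...".
    name_match = re.search(r"^Name:\s*(.+)$", text, flags=re.MULTILINE)
    name = name_match.group(1).strip() if name_match else None

    # Keep only the words that appear in the known skill vocabulary.
    words = set(re.findall(r"[a-z]+", text.lower()))
    skills = sorted(KNOWN_SKILLS & words)
    return {"name": name, "skills": skills}

resume = """Name: Ana Example
Five years of experience with Python and SQL.
Also worked on translation projects."""

record = extract_resume_fields(resume)
print(record)
```

The resulting dictionary is the kind of structured record that could then be stored in a database and indexed by name or by skill.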

Text summaries: "Summarisation technology produces shortened versions of longer documents for those situations where you don’t have the time to read the whole thing: an essential tool for dealing with information overload. Linked with smart technologies like optical character recognition, the photocopier that reduces a 10 page document to a 1 page document is not far away!" (CLT)
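One simple way to shorten a document is extractive summarisation: score each sentence by how frequent its words are in the whole text and keep the best ones. The sketch below illustrates that idea only; the stop-word list, the scoring formula and the function name summarise are assumptions, not a description of the CLT technology.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "for", "that"}

def content_words(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def summarise(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        # Average document-wide frequency of the sentence's content words.
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)

sample = ("Summarisation shortens documents. "
          "Summarisation tools help readers handle long documents. "
          "The weather was nice.")
print(summarise(sample))  # Summarisation shortens documents.
```

Averaging over the sentence length keeps long sentences from winning merely because they contain more words.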

Intelligent Text Processing: "Ever been frustrated by a search engine? Find out how they work, but more importantly, find out how to make them intelligent. This unit also covers sophisticated web-based language technologies like document summarization, information extraction and machine translation. If you want to know about the Semantic Web, this is the unit for you." (CLT)

Question-answering systems: "Search engines are ready to get more intelligent. Web-based systems can use natural language processing techniques to better understand what information you’re looking for, and natural language generation can provide more carefully crafted answers." (CLT)

Cross-lingual information retrieval: "Users can formulate, expand and disambiguate queries, filter the search results and read the retrieved documents by using only their native language. This multilingual functionality is achieved by the use of dictionary-based query translation, multilingual document categorisation and automatic translation of summaries and documents." (MULINEX)
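The dictionary-based query translation mentioned above can be sketched in a few lines of Python. Everything here is hypothetical: the tiny Spanish-to-English dictionary ES_EN, the functions translate_query and search, and the sample documents are illustrative assumptions and do not reproduce the MULINEX system.

```python
# Hypothetical Spanish-to-English dictionary; entries are illustrative only.
ES_EN = {
    "banco": ["bank", "bench"],
    "traducción": ["translation"],
    "automática": ["automatic"],
}

def translate_query(query, dictionary):
    """Map each query term to its candidate translations.

    Terms missing from the dictionary are kept unchanged.
    """
    return [dictionary.get(term, [term]) for term in query.lower().split()]

def search(documents, term_groups):
    """Return indices of documents containing a candidate from every term group."""
    hits = []
    for i, doc in enumerate(documents):
        words = set(doc.lower().split())
        if all(any(c in words for c in group) for group in term_groups):
            hits.append(i)
    return hits

docs = [
    "automatic translation of documents",
    "the bank opens at nine",
    "a bench in the park",
]
groups = translate_query("traducción automática", ES_EN)
print(search(docs, groups))  # [0]
```

A query such as "banco" matches both the bank and the bench document, which is why the quoted description also mentions letting users disambiguate their queries.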

Machine translation

Machine Translation: "Machine translation technology takes a document in one language and translates it into a document in another language." (CLT)


© Joseba Abaitua, Universidad de Deusto