This first report is based on Human Language Technologies and their influence on our society. Human Language Technologies (HLT) help people access information and communicate across languages; in other words, they help people communicate with their computers easily and naturally. HLT is the result of many related fields, such as Language Engineering and its techniques, Speech Recognition, Speech Generation, Speech Synthesis and Machine Translation, among many others.


In this part of the report, I will analyse the aspects that enable HLT to improve communication in our society.

1·1 Human Language Technologies

Human Language Technologies will help to build bridges across languages and cultures and provide natural access to information and communication services. They will enable an active use and assimilation of multimedia content, and further strengthen Europe's position at the forefront of language-enabled digital services. They will support business activities in a global context and promote a truly human-centred infostructure, ensuring equal access and usage opportunities for all. The ultimate goal of Human Language Technologies is an optimal use of human capital, maximising businesses' competitiveness and empowering people.

1·2 Natural language processing

A natural language is one that evolved along with a culture of human native speakers who use the language for general-purpose communication. Languages like English, American Sign Language and Japanese are natural languages, while languages like Esperanto are called constructed languages, having been deliberately created for a specific purpose.

Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form.

Some people view NLG as the opposite of natural language understanding. The difference can be put this way: whereas in natural language understanding the system needs to disambiguate the input sentence to produce the machine representation language, in NLG the system needs to take decisions about how to put a concept into words.

From Wikipedia, the free encyclopaedia.

Version 1.2, November 2002

1·3 What's computational linguistics?

Computational linguistics (CL) is a discipline between linguistics and computer science which is concerned with the computational aspects of the human language faculty. It belongs to the cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer science aiming at computational models of human cognition. Computational linguistics has applied and theoretical components.


1·4 Language engineering

Language Engineering is the application of knowledge of language to the development of computer systems which can recognise, understand, interpret, and generate human language in all its forms. In practice, Language Engineering comprises a set of techniques and language resources. The former are implemented in computer software and the latter are a repository of knowledge which can be accessed by computer software.

hltteam (.at.) Last updated: 16.02.04

2 Does the notion of "Information Society" have any relation to human language?

The term Information Society has been around for a long time now and, indeed, has become something of a cliché. The notion of the coming Information Society reminds me of the way the idea of the Sydney 2000 Olympics shimmers in the distance. We look towards the Olympics and resolve to prepare hard for it. We must rapidly transform ourselves, our city, our demeanour to be ready and worthy. Time is of the essence in making ourselves ready for the challenge. There is a certain breathlessness in all of this rhetoric.

3 Is there any concern in Europe with Human Language Technologies?

In the European Union, the concept of the Information Society has been evolving strongly over the past few years, building on the philosophy originally spelled out by Commissioner Martin Bangemann in 1994. Bangemann argued that the Information Society represents a "revolution based on information ... [which] adds huge new capacities to human intelligence and constitutes a resource which changes the way we work together and the way we live together..." (European Commission, 1994:4). One of the main implications of this "revolution" for Bangemann is that the Information Society can secure badly needed jobs (Europe and the Global Information Society, 1994:3). In other words, a driving motivation for the Information Society is the creation of employment for depressed economies.

Closer to home, it is instructive to look at just a few policy (or would-be policy) documents to see the views of the Information Society dominant here. The Goldsworthy report sees the Information Society as a "societal revolution based around information and communication technologies and about the role of these in developing global competitiveness and managing the transition to a globalised free trade world" (Department of Industry, Science and Tourism, 1997). In short, Goldsworthy's idea of the Information Society is entirely an economic one. At a broader level Barry Jones, the author of the House of Representatives Standing Committee's 1991 report 'Australia as an Information Society', sets out a definition of the Information Society which sees it as simply "a period when use of time, family life, employment, education and social interaction are increasingly influenced by access to Information Technology" (Australia as an Information Society: Grasping New Paradigms, 1991).

These are just a few examples of ideas underpinning information policy drives in the developed world, where the concept is accepted almost without challenge, and there is an inherent belief that, like the Olympics, the Information Society is real - or will be very soon if only we can get ourselves organised properly. Some claim, of course, that the Information Society is here already and not just on its way. But one way or the other "it" exists and is a "good thing". By and large, national and regional Information Society documents do not question the belief that the Information Society will bring prosperity and happiness if a few basic safeguards are put in place. Some of the very few notes of serious caution in the practice of information policy have come through the influence of the Scandinavian countries, which joined the European Union when the EU was already in full flight with implementing the actions flowing from the Bangemann report.

Interestingly, in recent travels in India I noticed an extraordinary level of hope and trust in that developing country in the potential of information technology to transform India into a modern fully developed economy. The push to develop information and technological infrastructure initiated by Rajiv Gandhi is seen as positive and a necessary step for the goal of a universally prosperous society in India. Effectively there is the same acceptance of the goodness of an Information Society and the absolute necessity to be one, that is found in the West.

Given this blind faith in the existence and the desirability of an Information Society among diverse nations, it is instructive to look at the theoretical literature which has spawned the idea to see what it claims for the Information Society. The term Information Society has many synonyms: Information Age, Information Revolution, Information Explosion and so on and it is found across a wide spectrum of disciplines. Fortunately the task of unravelling many of these ideas has been accomplished in a masterly way by Frank Webster. He has categorised the variety of concepts of the Information Society, Information Revolution, or whatever, and provided an analysis of five common conceptions of the Information Society (Webster, 1995).

4 What is the current situation of the office?

The overall objective of HLT is to support e-business in a global context and to promote a human-centred infostructure ensuring equal access and usage opportunities for all. This is to be achieved by developing multilingual technologies and demonstrating exemplary applications providing features and functions that are critical for the realisation of a truly user-friendly Information Society. Projects address generic and applied RTD from a multi- and cross-lingual perspective, and undertake to demonstrate how language-specific solutions can be transferred to and adapted for other languages. (2001)

1·Which are the main techniques used in Language Engineering?

*Techniques

There are many techniques used in Language Engineering and some of these are described below.


*Speaker Identification and Verification

A human voice is as unique to an individual as a fingerprint. This makes it possible to identify a speaker and to use this identification as the basis for verifying that the individual is entitled to access a service or a resource. The types of problems which have to be overcome are, for example, recognising that the speech is not recorded, selecting the voice through noise (either in the environment or the transfer medium), and identifying the speaker reliably despite temporary changes (such as those caused by illness).

*Speech Recognition

The sound of speech is received by a computer in analogue wave forms which are analysed to identify the units of sound (called phonemes) which make up words. Statistical models of phonemes and words are used to recognise discrete or continuous speech input. The production of quality statistical models requires extensive training samples (corpora), and vast quantities of speech have been collected, and continue to be collected, for this purpose.

There are a number of significant problems to be overcome if speech is to become a commonly used medium for dealing with a computer. The first of these is the ability to recognise continuous speech rather than speech which is deliberately delivered by the speaker as a series of discrete words separated by pauses. The next is to recognise any speaker, avoiding the need to train the system to recognise the speech of a particular individual. There is also the serious problem of the noise which can interfere with recognition, either from the environment in which the speaker uses the system or through noise introduced by the transmission medium, the telephone line, for example. Noise reduction, signal enhancement and key word spotting can be used to allow accurate and robust recognition in noisy environments or over telecommunication networks. Finally, there is the problem of dealing with accents, dialects, and language spoken, as it often is, ungrammatically.

*Character and Document Image Recognition

Recognition of written or printed language requires that a symbolic representation of the language is derived from its spatial form of graphical marks. For most languages this means recognising and transforming characters.

*Natural Language Understanding

The understanding of language is obviously fundamental to many applications. However, perfect understanding is not always a requirement. In fact, gaining a partial understanding is often a very useful preliminary step in the process because it makes it possible to be intelligently selective about taking the depth of understanding to further levels.

Trivial or partial analysis of texts is used to obtain a robust initial classification of unrestricted texts efficiently. One use for this initial analysis can be to focus on 'interesting' parts of a text for a deeper semantic analysis which determines the content of the text within a limited domain.

Semantic models are used to represent the meaning of language in terms of concepts and the relationships between them. Combinations of analysis and generation with a semantic model allow texts to be translated. At the current stage of development, applications where this can be achieved need to be limited in vocabulary and concepts so that adequate Language Engineering resources can be applied.

*Natural Language Generation

A semantic representation of a text can be used as the basis for generating language. An interpretation of basic data or the underlying meaning of a sentence or phrase can be mapped into a surface string in a selected fashion: either in a chosen language or according to stylistic specifications by a text planning system.

*Speech Generation

Speech is generated from filled templates, by playing 'canned' recordings or by concatenating units of speech (phonemes, words) together. Generated speech has to account for aspects such as intensity, duration and stress in order to produce a continuous and natural response.

Dialogue can be established by combining speech recognition with simple generation, either from concatenation of stored human speech components or synthesising speech using rules.

Providing a library of speech recognisers and generators, together with a graphical tool for structuring their application, allows someone who is neither a speech expert nor a computer programmer to design a structured dialogue which can be used, for example, in the automated handling of telephone calls.

2·Which language resources are essential components of Language Engineering?

The essential components of Language Engineering are the ones that follow:

*Language Resources

Language resources are essential components of Language Engineering. They are one of the main ways of representing the knowledge of language, which is used for the analytical work leading to recognition and understanding.

The work of producing and maintaining language resources is a huge task. Resources are produced, according to standard formats and protocols to enable access, in many EU languages, by research laboratories and public institutions. Many of these resources are being made available through the European Language Resources Association (ELRA).

*Lexicons

A lexicon is a repository of words and knowledge about those words. This knowledge may include details of the grammatical structure of each word (morphology), its sound structure (phonology), and the meaning of the word in different textual contexts, e.g. depending on the word or punctuation mark before or after it. A useful lexicon may have hundreds of thousands of entries. Lexicons are needed for every language of application.

*Specialist Lexicons

There are a number of special cases which are usually researched and produced separately from general-purpose lexicons:

Proper names: Dictionaries of proper names are essential to effective understanding of language, at least so that they can be recognised within their context as places, objects, persons, or perhaps animals. They take on a special significance in many applications, however, where the name is key to the application, such as in a voice-operated navigation system, a holiday reservations system, or a railway timetable information system based on automated telephone call handling.

Terminology: In today's complex technological environment there are a host of terminologies which need to be recorded, structured and made available for language-enhanced applications. Many of the most cost-effective applications of Language Engineering, such as multilingual technical document management and machine translation, depend on the availability of the appropriate terminology banks.

Wordnets: A wordnet describes the relationships between words; for example, synonyms, antonyms, collective nouns, and so on. These can be invaluable in such applications as information retrieval, translator workbenches and intelligent office automation facilities for authoring.

*Grammars

A grammar describes the structure of a language at different levels: word (morphological grammar), phrase, sentence, etc. A grammar can deal with structure both in terms of surface (syntax) and meaning (semantics and discourse).
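The kind of lookup a wordnet supports can be sketched with a toy relation table. This is a minimal illustration only: the words and relations below are invented for the example, not drawn from any real wordnet.

```python
# A toy wordnet: each word maps relation names to sets of related words.
# Real wordnets (e.g. Princeton WordNet) are far larger and organise
# words into synsets, but the lookup idea is the same.
TOY_WORDNET = {
    "car": {"synonym": {"automobile"}, "hypernym": {"vehicle"}},
    "happy": {"synonym": {"glad"}, "antonym": {"sad"}},
}

def related(word, relation):
    """Return the words linked to `word` by `relation` (empty set if none)."""
    return TOY_WORDNET.get(word, {}).get(relation, set())

# An information retrieval system might expand a query this way:
expanded = {"car"} | related("car", "synonym")
```

A query for "car" expanded with its synonyms would then also match documents mentioning "automobile", which is the information retrieval use mentioned above.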

*Corpora

A corpus is a body of language, either text or speech, which provides the basis for:

analysis of language to establish its characteristics

training a machine, usually to adapt its behaviour to particular circumstances

verifying empirically a theory concerning language

a test set for a Language Engineering technique or application to establish how well it works in practice.

 There are national corpora of hundreds of millions of words but there are also corpora which are constructed for particular purposes. For example, a corpus could comprise recordings of car drivers speaking to a simulation of a control system, which recognises spoken commands, which is then used to help establish the user requirements for a voice operated control system for the market.
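The first of these uses, establishing the characteristics of a corpus, can be sketched as a simple word-frequency count. The two-utterance corpus below is invented for illustration:

```python
from collections import Counter

def corpus_statistics(utterances):
    """Tokenise each utterance by whitespace and count word frequencies,
    the most basic characterisation of a corpus."""
    counts = Counter()
    for utterance in utterances:
        counts.update(utterance.lower().split())
    return counts

# A toy corpus of spoken commands to a simulated control system.
corpus = ["turn the radio on", "turn the heating off"]
stats = corpus_statistics(corpus)
```

Here `stats["turn"]` is 2 and `stats["radio"]` is 1; on a real corpus of millions of words, the same counts feed directly into training statistical models.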



 3·Check for the following terms (choose at least five): 

*Stemmer

A stemmer is a program or algorithm which determines the morphological root of a given inflected (or, sometimes, derived) word form - generally a written word form.

A stemmer for English, for example, should identify the string "cats" (and possibly "catlike", "catty" etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based on "stem".
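A naive suffix-stripping stemmer along these lines can be sketched in a few lines. The suffix list is illustrative only; a real stemmer such as Porter's algorithm uses ordered rule sets with conditions on the remaining stem.

```python
# Suffixes tried in order; longer ones first so "ies" wins over "s".
SUFFIXES = ["ies", "ing", "ed", "s"]

def stem(word):
    """Strip the first matching suffix, keeping at least three letters
    of stem so that short words like "is" are left alone."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word
```

stem("cats") gives "cat" and stem("walking") gives "walk", but stem("stemming") gives "stemm" rather than "stem" - exactly the kind of case where real stemmers need extra rules.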

English stemmers are fairly trivial (with only occasional problems, such as "dries" being the third-person singular present form of the verb "dry", or "axes" being the plural of "ax" as well as of "axis"); but stemmers become harder to design as the morphology, orthography, and character encoding of the target language become more complex. For example, an Italian stemmer is more complex than an English one (because of more possible verb inflections), a Russian one is more complex still (more possible noun declensions), a Hebrew one is even more complex (a hairy writing system), and so on.

Stemmers are common elements in query systems, since a user who runs a query on "daffodils" probably also cares about documents that contain the word "daffodil" (without the s).

*Shallow parser

Software which parses language to a point where a rudimentary level of understanding can be realised; this is often used in order to identify passages of text which can then be analysed in further depth to fulfil the particular objective.


*Formalism

A means to represent the rules used in the establishment of a model of linguistic knowledge.


*Domain

Usually applied to the area of application of the language-enabled software, e.g. banking, insurance, travel, etc.; the significance in Language Engineering is that the vocabulary of an application is restricted, so the language resource requirements are effectively limited by limiting the domain of application.

*Translator's workbench

A software system providing a working environment for a human translator, which offers a range of aids such as on-line dictionaries, thesauri, translation memories, etc.

*Authoring tools

Facilities provided in conjunction with word processing to aid the author of documents, typically including an on-line dictionary and thesaurus; spell-, grammar-, and style-checking; and facilities for structuring, integrating and linking documents.

1.2.2 State of the Art

Comments about the state-of-the-art need to be made in the context of specific applications which reflect the constraints on the task. Moreover, different technologies are sometimes appropriate for different tasks. For example, when the vocabulary is small, the entire word can be modeled as a single unit. Such an approach is not practical for large vocabularies, where word models must be built up from subword units.

Performance of speech recognition systems is typically described in terms of word error rate, E, defined as:

E = 100 × (S + I + D) / N

where N is the total number of words in the test set, and S, I, and D are the total numbers of substitutions, insertions, and deletions, respectively.
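Counting S, I, and D requires aligning the recognised word sequence against the reference; the standard way is a Levenshtein (edit distance) alignment over words. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """WER = 100 * (S + I + D) / N, where the minimal number of
    substitutions, insertions and deletions is found by dynamic
    programming over the two word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    n, m = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i                      # i deletions
    for j in range(m + 1):
        d[0][j] = j                      # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match/substitution
    return 100.0 * d[n][m] / n
```

For the reference "turn the radio on" and the hypothesis "turn radio off", there is one deletion ("the") and one substitution ("on" for "off"), giving E = 100 × 2/4 = 50%.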

The past decade has witnessed significant progress in speech recognition technology. Word error rates continue to drop by a factor of two every two years. Substantial progress has been made in the basic technology, leading to the lowering of barriers to speaker independence, continuous speech, and large vocabularies. There are several factors that have contributed to this rapid progress. First, there is the coming of age of the hidden Markov model (HMM). The HMM is powerful in that, with the availability of training data, the parameters of the model can be trained automatically to give optimal performance.

Second, much effort has gone into the development of large speech corpora for system development, training, and testing. Some of these corpora are designed for acoustic-phonetic research, while others are highly task specific. Nowadays, it is not uncommon to have tens of thousands of sentences available for system training and testing. These corpora permit researchers to quantify the acoustic cues important for phonetic contrasts and to determine parameters of the recognizers in a statistically meaningful way. While many of these corpora (e.g., TIMIT, RM, ATIS, and WSJ; see section 12.3) were originally collected under the sponsorship of the U.S. Defense Advanced Research Projects Agency (ARPA) to spur human language technology development among its contractors, they have nevertheless gained world-wide acceptance (e.g., in Canada, France, Germany, Japan, and the U.K.) as standards on which to evaluate speech recognition.

Third, progress has been brought about by the establishment of standards for performance evaluation. Only a decade ago, researchers trained and tested their systems using locally collected data, and had not been very careful in delineating training and testing sets. As a result, it was very difficult to compare performance across systems, and a system's performance typically degraded when it was presented with previously unseen data.
The recent availability of a large body of data in the public domain, coupled with the specification of evaluation standards, has resulted in uniform documentation of test results, thus contributing to greater reliability in monitoring progress (corpus development activities and evaluation methodologies are summarized in chapters 12 and 13 respectively).

Finally, advances in computer technology have also indirectly influenced our progress. The availability of fast computers with inexpensive mass storage capabilities has enabled researchers to run many large-scale experiments in a short amount of time. This means that the elapsed time between an idea and its implementation and evaluation is greatly reduced. In fact, speech recognition systems with reasonable performance can now run in real time using high-end workstations without additional hardware - a feat unimaginable only a few years ago.

One of the most popular, and potentially most useful, tasks with low perplexity (PP=11) is the recognition of digits. For American English, speaker-independent recognition of digit strings spoken continuously and restricted to telephone bandwidth can achieve an error rate of 0.3% when the string length is known.

One of the best-known moderate-perplexity tasks is the 1,000-word so-called Resource Management (RM) task, in which inquiries can be made concerning various naval vessels in the Pacific Ocean. The best speaker-independent word error rate on the RM task is less than 4%, using a word-pair language model that constrains the possible words following a given word (PP=60). More recently, researchers have begun to address the issue of recognizing spontaneously generated speech. For example, in the Air Travel Information Service (ATIS) domain, word error rates of less than 3% have been reported for a vocabulary of nearly 2,000 words and a bigram language model with a perplexity of around 15.
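The bigram language models and perplexity figures mentioned here can be illustrated with a toy model. The training sentences below are invented, and add-one smoothing stands in for the more careful smoothing real systems use:

```python
import math
from collections import Counter

def bigram_perplexity(train_sentences, test_sentence):
    """Train an add-one-smoothed bigram model, then return the
    perplexity PP = exp(-(1/N) * sum log P(w_i | w_{i-1}))."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sentence in train_sentences:
        words = ["<s>"] + sentence.split()
        vocab.update(words)
        unigrams.update(words[:-1])            # context counts
        bigrams.update(zip(words[:-1], words[1:]))
    v = len(vocab)
    words = ["<s>"] + test_sentence.split()
    log_prob = sum(
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + v))
        for prev, cur in zip(words[:-1], words[1:]))
    return math.exp(-log_prob / (len(words) - 1))

train = ["show flights to boston", "show fares to denver"]
pp = bigram_perplexity(train, "show flights to denver")
```

A lower perplexity means the model finds the test sentence less surprising; constraining the recogniser to low-perplexity language is what makes tasks like ATIS tractable.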

High-perplexity tasks with a vocabulary of thousands of words are intended primarily for the dictation application. After working on isolated-word, speaker-dependent systems for many years, the community has since 1992 moved towards very-large-vocabulary (20,000 words and more), high-perplexity, speaker-independent, continuous speech recognition. The best system in 1994 achieved an error rate of 7.2% on read sentences drawn from North American business news.

With the steady improvements in speech recognition performance, systems are now being deployed within telephone and cellular networks in many countries. Within the next few years, speech recognition will be pervasive in telephone networks around the world. There are tremendous forces driving the development of the technology; in many countries, touch tone penetration is low, and voice is the only option for controlling automated services. In voice dialing, for example, users can dial 10--20 telephone numbers by voice (e.g., call home) after having enrolled their voices by saying the words associated with telephone numbers. AT&T, on the other hand, has installed a call routing system using speaker-independent word-spotting technology that can detect a few key phrases (e.g., person to person, calling card) in sentences such as: I want to charge it to my calling card.

At present, several very large vocabulary dictation systems are available for document generation. These systems generally require speakers to pause between words. Their performance can be further enhanced if one can apply constraints of the specific domain such as dictating medical reports.

Even though much progress is being made, machines are a long way from recognizing conversational speech. Word recognition rates on telephone conversations in the Switchboard corpus are around 50%. It will be many years before unlimited-vocabulary, speaker-independent continuous dictation capability is realized.

Speech recognition

(Or voice recognition) The identification of spoken words by a machine. The spoken words are digitised (turned into a sequence of numbers) and matched against coded dictionaries in order to identify the words.

Most systems must be "trained," requiring samples of all the actual words that will be spoken by the user of the system. The sample words are digitised, stored in the computer and used to match against future words. More sophisticated systems require voice samples, but not of every word. The system uses the voice samples in conjunction with dictionaries of larger vocabularies to match the incoming words. Yet other systems aim to be "speaker-independent", i.e. they will recognise words in their vocabulary from any speaker without training.
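The "match against stored samples" idea can be sketched as nearest-template classification. The two-word vocabulary and the three-number feature vectors below are invented stand-ins for real acoustic features:

```python
import math

# Toy stored templates: one feature vector per enrolled word.
TEMPLATES = {
    "yes": [0.2, 0.9, 0.4],
    "no":  [0.8, 0.1, 0.3],
}

def recognise(sample):
    """Return the enrolled word whose template is closest to the
    digitised sample, by Euclidean distance."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(TEMPLATES, key=lambda word: distance(TEMPLATES[word], sample))
```

Real recognisers compare whole sequences of such vectors, using dynamic time warping or HMMs to cope with words being spoken faster or slower than the stored sample.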

Another variation is the degree to which systems can cope with connected speech. People tend to run words together, e.g. "next week" becomes "neksweek" (the "t" is dropped). For a voice recognition system to identify words in connected speech it must take into account the way words are modified by the preceding and following words.

It has been said (in 1994) that computers will need to be something like 1000 times faster before large vocabulary (a few thousand words), speaker-independent, connected speech voice recognition will be feasible.

Speech synthesis

The generation of a sound waveform of human speech from a textual or phonetic description. See also speech recognition.


It is worth remembering that most prototypes developed within research projects are currently only capable of processing a few hundred sentences (around 300), on very specific topics (accommodation booking, trip planning, etc.) and for a small group of languages—English, German, Japanese, Spanish, Italian. It seems unlikely that any application will be able to go beyond these boundaries in the near future.

The direct incorporation of speech translation prototypes into industrial applications is at present too costly. However, the growing demand for these products leads us to believe that they will soon be on the market at more affordable prices. The systems developed in projects such as Verbmobil, EuTrans or Janus—despite being at the laboratory phase—contain in practice thoroughly evaluated and robust technologies. A manufacturer considering their integration may join R&D projects and take part in the development of prototypes with the prospect of a fast return on investment. It is quite clear that we are witnessing the emergence of a new technology with great potential for penetrating the telecommunications and microelectronics market in the not too distant future.

Another remarkable aspect of the EuTrans project is its methodological contribution to machine translation as a whole, both in speech and written modes. Although these two modes of communication are very different in essence, and their respective technologies cannot always be compared, speech-to-speech translation has brought prospects of improvement for text translation. Traditional methods for written texts tend to be based on grammatical rules. Therefore, many MT systems show no coverage problem, although this is achieved at the expense of quality. The most common way of improving quality is by restricting the topic of interest. It is widely accepted that broadening coverage immediately endangers quality. In this sense, learning techniques that enable systems to adapt automatically to new textual typologies, styles, structures, and terminological and lexical items could have a radical impact on the technology.

Due to the differences between oral and written communication, rule-based systems prepared for written texts can hardly be re-adapted to oral applications. This is an approach that has been tried, and has failed.
On the contrary, example-based learning methods designed for speech-to-speech translation systems can easily be adapted to the written texts, given the increasing availability of bilingual corpora. One of the main contributions of the PRHLT-ITI group is precisely in its learning model based on bilingual corpora. Herein lie some interesting prospects for improving written translation techniques.

Effective speech-to-speech translation, along with other voice-oriented technologies, will become available in the coming years, albeit with some limitations, e.g. in the number of languages, linguistic coverage, and context. It could be argued that EuTrans' main contribution has been to raise the possibilities of speech-to-speech translation to the levels of speech recognition technology, making any new innovation immediately accessible.

Speech Recognition (SR)

This involves the computer taking the user's speech and interpreting what has been said. This allows the user to control the computer (or certain aspects of it) by voice, rather than having to use the mouse and keyboard, or alternatively just dictating the contents of a document.

The complex nature of translating the raw audio into phonemes involves a lot of signal processing and is not focused on here. These details are taken care of by an SR engine that will be installed on your machine. SR engines are often called recognisers and these days typically implement continuous speech recognition (older recognisers implemented isolated or discrete speech recognition, where pauses were required between words).

Speech recognition usually means one of two things. The application can understand and follow simple commands that it has been educated about in advance. This is known as command and control (sometimes seen abbreviated as CnC, or simply SR).

Alternatively an application can support dictation (sometimes abbreviated to DSR). Dictation is more complex as the engine has to try to identify arbitrary spoken words, and will need to decide which spelling of similar-sounding words is required. It develops context information based on the preceding and following words to try to help decide. Because this context analysis is not required with command and control recognition, CnC is sometimes referred to as context-free recognition.


A Text-To-Speech (TTS) synthesizer is a computer-based system that should be able to read any text aloud, whether it was directly introduced in the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system. Let us try to be clear. There is a fundamental difference between the system we are about to discuss here and any other talking machine (such as a cassette player) in the sense that we are interested in the automatic production of new sentences. This definition still needs some refinement. Systems that simply concatenate isolated words or parts of sentences, denoted as Voice Response Systems, are only applicable when a limited vocabulary is required (typically a few hundred words), and when the sentences to be pronounced respect a very restricted structure, as is the case for the announcement of arrivals in train stations, for instance. In the context of TTS synthesis, it is impossible (and luckily useless) to record and store all the words of the language. It is thus more suitable to define Text-To-Speech as the automatic production of speech, through a grapheme-to-phoneme transcription of the sentences to utter.
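Grapheme-to-phoneme transcription can be sketched as longest-match rewrite rules. The rule list below is a tiny invented fragment; real TTS front ends combine a large pronunciation lexicon with trained letter-to-sound rules:

```python
# Rules tried in order, so multi-letter graphemes come before their parts.
RULES = [("sh", "SH"), ("ee", "IY"), ("a", "AE"),
         ("c", "K"), ("t", "T"), ("p", "P"), ("s", "S"), ("h", "HH")]

def to_phonemes(word):
    """Scan left to right, emitting the phoneme of the first rule whose
    grapheme matches at the current position."""
    phonemes, i = [], 0
    while i < len(word):
        for grapheme, phoneme in RULES:
            if word.startswith(grapheme, i):
                phonemes.append(phoneme)
                i += len(grapheme)
                break
        else:
            i += 1          # no rule matched: skip the letter
    return phonemes
```

to_phonemes("cat") yields ["K", "AE", "T"] and to_phonemes("sheep") yields ["SH", "IY", "P"]; the notorious irregularity of English spelling is exactly why real systems also need a lexicon of exceptions.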

Speech recognition

From Wikipedia, the free encyclopedia.

Speech recognition technologies allow computers equipped with microphones to interpret human speech, e.g. for transcription or as a control method.

Such systems can be classified as to whether they require the user to "train" the system to recognise their own particular speech patterns or not, whether the system can recognise continuous speech or requires users to break up their speech into discrete words, and whether the vocabulary the system recognises is small (in the order of tens or at most hundreds of words), or large (thousands of words).

Systems requiring a short amount of training can (as of 2001) capture continuous speech with a large vocabulary at normal pace with an accuracy of about 98% (getting two words in one hundred wrong), and different systems that require no training can recognize a small number of words (for instance, the ten digits of the decimal system) as spoken by most English speakers. Such systems are popular for routing incoming phone calls to their destinations in large organisations.

Commercial systems for speech recognition have been available off-the-shelf since the 1990s. However, it is interesting to note that despite the apparent success of the technology, few people use such speech recognition systems.

It appears that most computer users can create and edit documents more quickly with a conventional keyboard, despite the fact that most people are able to speak considerably faster than they can type. Additionally, heavy use of the speech organs results in vocal loading.

Some of the key technical problems in speech recognition are that:

Inter-speaker differences are often large and difficult to account for. It is not clear which characteristics of speech are speaker-independent.

The interpretation of many phonemes, words and phrases is context sensitive. For example, phonemes are often shorter in long words than in short words. Words have different meanings in different sentences, e.g. "Philip lies" could be interpreted either as Philip being a liar, or as Philip lying on a bed.

Intonation and speech timbre can completely change the correct interpretation of a word or sentence, e.g. "Go!", "Go?" and "Go." can clearly be recognised by a human, but not so easily by a computer.

Words and sentences can have several valid interpretations such that the speaker leaves the choice of the correct one to the listener.

Written language may need punctuation according to strict rules that are not strongly present in speech, and are difficult to infer without knowing the meaning (commas, ending of sentences, quotations).

The "understanding" of the meaning of spoken words is regarded by some as a separate field, that of natural language understanding. However, there are many examples of sentences that sound the same, but can only be disambiguated by an appeal to context: one famous T-shirt worn by Apple Computer researchers stated:

I helped Apple wreck a nice beach.

A general solution of many of the above problems effectively requires human knowledge and experience, and would thus require advanced artificial intelligence technologies to be implemented on a computer. In particular, statistical language models are often employed for disambiguation and improvement of the recognition accuracies.
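The role of such a statistical language model can be sketched in miniature: each candidate transcription is scored with bigram statistics, and the more plausible one is kept. The counts below are invented for illustration; a real model is trained on a large corpus and uses proper smoothed probabilities:

```python
# Disambiguating acoustically similar transcriptions with a toy bigram
# language model. The counts are invented for the example.

from math import log

BIGRAM_COUNTS = {
    ("recognize", "speech"): 50,
    ("wreck", "a"): 2,
    ("a", "nice"): 30,
    ("nice", "beach"): 5,
}

def score(sentence):
    """Average log bigram count, add-one smoothed for unseen pairs."""
    words = sentence.lower().split()
    pairs = list(zip(words, words[1:]))
    return sum(log(BIGRAM_COUNTS.get(p, 0) + 1) for p in pairs) / len(pairs)

# The engine keeps whichever candidate the language model finds more plausible.
best = max(["recognize speech", "wreck a nice beach"], key=score)
print(best)   # "recognize speech"
```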

Speech Synthesis

Speech synthesis is the computer-generated simulation of human speech. It is used to translate written information into aural information where it is more convenient, especially for mobile applications such as voice-enabled e-mail and unified messaging. It is also used to assist the vision-impaired so that, for example, the contents of a display screen can be automatically read aloud to a blind user. Speech synthesis is the counterpart of speech or voice recognition. The earliest speech synthesis effort was in 1779 when Russian Professor Christian Kratzenstein created an apparatus based on the human vocal tract to demonstrate the physiological differences involved in the production of five long vowel sounds. The first fully functional voice synthesizer, Homer Dudley's VODER (Voice Operating Demonstrator), was shown at the 1939 World's Fair. The VODER was based on Bell Laboratories' vocoder (voice coder) research of the mid-thirties.


Information fatigue syndrome

David Lewis coined the term "information fatigue syndrome" for what he expects will soon be a recognized medical condition.

"Having too much information can be as dangerous as having too little. Among other problems, it can lead to a paralysis of analysis, making it far harder to find the right solutions or make the best decisions."

"Information is supposed to speed the flow of commerce, but it often just clogs the pipes."

David Lewis

Dr. David Lewis is a British psychologist and the author of the report Dying for Information?, commissioned by London-based Reuters Business Information. He coined the term "information fatigue syndrome" for what he expects will soon be a recognized medical condition, and as a consultant he has studied the impact of data proliferation in the corporate world.




I. Summary of Findings

How much new information is created each year? Newly created information is stored in four physical media – print, film, magnetic and optical – and seen or heard in four information flows through electronic channels – telephone, radio and TV, and the Internet. This study of information storage and flows analyzes the year 2002 in order to estimate the annual size of the stock of new information recorded in storage media, and heard or seen each year in information flows. Where reliable data was available we have compared the 2002 findings to those of our 2000 study (which used 1999 data) in order to describe a few trends in the growth rate of information.

Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks.

How big is five exabytes? If digitized, the nineteen million books and other print collections in the Library of Congress would contain about ten terabytes of information; five exabytes of information is equivalent in size to the information contained in half a million new libraries the size of the Library of Congress print collections.

Hard disks store most new information. Ninety-two percent of new information is stored on magnetic media, primarily hard disks. Film represents 7% of the total, paper 0.01%, and optical media 0.002%.

The United States produces about 40% of the world's new stored information, including 33% of the world's new printed information, 30% of the world's new film titles, 40% of the world's information stored on optical media, and about 50% of the information stored on magnetic media.

How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper.
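The study's per-person figure follows directly from the totals above, as a quick back-of-the-envelope check shows:

```python
# Checking the report's arithmetic: 5 exabytes of new stored information
# shared among a world population of 6.3 billion people.

EXABYTE = 10**18                  # bytes, decimal convention
new_information = 5 * EXABYTE
population = 6.3e9

per_person = new_information / population
print(f"{per_person / 10**6:.0f} MB per person per year")   # ~794 MB
```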

We estimate that the amount of new information stored on paper, film, magnetic, and optical media has about doubled in the last three years.

Information explosion? We estimate that new stored information grew about 30% a year between 1999 and 2002.

Paperless society? The amount of information printed on paper is still increasing, but the vast majority of original information on paper is produced by individuals in office documents and postal mail, not in formally published titles such as books, newspapers and journals.

Information flows through electronic channels -- telephone, radio, TV, and the Internet -- contained almost 18 exabytes of new information in 2002, three and a half times more than is recorded in storage media. Ninety-eight percent of this total is the information sent and received in telephone calls, including both voice and data, on both fixed lines and wireless.

Telephone calls worldwide – on both landlines and mobile phones – contained 17.3 exabytes of new information if stored in digital form; this represents 98% of the total of all information transmitted in electronic information flows, most of it person to person.

Most radio and TV broadcast content is not new information. About 70 million hours (3,500 terabytes) of the 320 million hours of radio broadcasting is original programming. TV worldwide produces about 31 million hours of original programming (70,000 terabytes) out of 123 million total hours of broadcasting.

The World Wide Web contains about 170 terabytes of information on its surface; in volume this is seventeen times the size of the Library of Congress print collections.

Instant messaging generates five billion messages a day (750GB), or 274 Terabytes a year.

Email generates about 400,000 terabytes of new information each year worldwide.

P2P file exchange on the Internet is growing rapidly. Seven percent of users provide files for sharing, while 93% of P2P users only download files. The largest files exchanged are video files larger than 100 MB, but the most frequently exchanged files contain music (MP3 files).

How we use information. Published studies on media use say that the average American adult uses the telephone 16.17 hours a month, listens to radio 90 hours a month, and watches TV 131 hours a month. About 53% of the U.S. population uses the Internet, averaging 25 hours and 25 minutes a month at home, and 74 hours and 26 minutes a month at work – about 13% of the time.


SECTION 3: HUMAN LANGUAGE TECHNOLOGIES

The overall objective of HLT is to support e-business in a global context and to promote a human-centred infostructure ensuring equal access and usage opportunities for all. This is to be achieved by developing multilingual technologies and demonstrating exemplary applications providing features and functions that are critical for the realisation of a truly user-friendly Information Society. Projects address generic and applied RTD from a multi- and cross-lingual perspective, and undertake to demonstrate how language-specific solutions can be transferred to and adapted for other languages.

While elements of the three initial HLT action lines - Multilinguality, Natural Interactivity and Crosslingual Information Management are still present, there has been periodic re-assessment and tuning of them to emerging trends and changes in the surrounding economic, social, and technological environment. The trials and best practice in multilingual e-service and e-commerce action line was introduced in the IST 2000 work programme (IST2000) to stimulate new forms of partnership between technology providers, system integrators and users through trials and best practice actions addressing end-to-end multi-language platforms and solutions for e-service and e-commerce. The fifth IST call for proposals covered this action line.

Human language technologies

"Language technology refers to a range of technologies that have been developed over the last 40 years to enable people to more easily and naturally communicate with computers, through speech or text and, when called for, receive an intelligent and natural reply in much the same way as a person might respond." (E-S.l)

"Human Language Technology is the term for the language capabilities designed into the computing applications used in information and communication technology systems." (EM)

"Human Language Technology is sometimes quite familiar, e.g. the spell checker in your word processor, but can often be hidden away inside complex networks – a machine for automatically reading postal addresses, for example." (EM)

"From speech recognition to automatic translation, Human Language Technology products and services enable humans to communicate more naturally and more effectively with their computers – but above all, with each other." (EM)



Intelligent Text Processing: "Ever been frustrated by a search engine? Find out how they work, but more importantly, find out how to make them intelligent. This unit also covers sophisticated web-based language technologies like document summarization, information extraction and machine translation. If you want to know about the Semantic Web, this is the unit for you." (CLT)



Semantic Web

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001
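The RDF model underlying this can be illustrated in a few lines of Python: data is held as (subject, predicate, object) triples and queried by pattern matching. The URIs here are invented for the example:

```python
# A minimal triple store in the spirit of RDF: every fact is a
# (subject, predicate, object) triple; queries are patterns with wildcards.

triples = {
    ("ex:SemanticWeb", "ex:ledBy", "ex:W3C"),
    ("ex:SemanticWeb", "ex:basedOn", "ex:RDF"),
    ("ex:RDF", "ex:usesSyntax", "ex:XML"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(t for t in triples
                  if (subject is None or t[0] == subject)
                  and (predicate is None or t[1] == predicate)
                  and (obj is None or t[2] == obj))

print(query(subject="ex:SemanticWeb"))   # every fact about the Semantic Web
```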


1. What are the most usual interpretations of the term "machine translation"?

Definition of MT

The term machine translation (MT) is normally taken in its restricted and precise meaning of fully automatic translation. However, in this chapter we consider the whole range of tools that may support translation and document production in general, which is especially important when considering the integration of other language processing techniques and resources with MT. We therefore define Machine Translation to include any computer-based process that transforms (or helps a user to transform) written text from one human language into another. We define Fully Automated Machine Translation (FAMT) to be MT performed without the intervention of a human being during the process. Human-Assisted Machine Translation (HAMT) is the style of translation in which a computer system does most of the translation, appealing to a (mono- or bilingual) human for help in case of difficulty. Machine-Aided Translation (MAT) is the style of translation in which a human does most of the work but uses one or more computer systems, mainly as resources such as dictionaries and spelling checkers, as assistants.

Traditionally, two very different classes of MT have been identified. Assimilation refers to the class of translation in which an individual or organization wants to gather material written by others in a variety of languages and convert it all into his or her own language. Dissemination refers to the class in which an individual or organization wants to broadcast his or her own material, written in one language, in a variety of languages to the world. A third class of translation has also recently become evident. Communication refers to the class in which two or more individuals are in more or less immediate interaction, typically via email or otherwise online, with an MT system mediating between them. Each class of translation has very different features, is best supported by different underlying technology, and is to be evaluated according to somewhat different criteria.

Where We Were Five Years Ago

Machine Translation was the first computer-based application related to natural language, starting after World War II, when Warren Weaver suggested using ideas from cryptography and information theory. The first large-scale project was funded by the US Government to translate Russian Air Force manuals into English. After a decade of initial optimism, funding for MT research became harder to obtain in the US. However, MT research continued to flourish in Europe and then, during the 1970s, in Japan. Today, over 50 companies worldwide produce and sell translations by computer, whether as translation services to outsiders, as in-house translation bureaux, or as providers of online multilingual chat rooms. By some estimates, MT expenditure in 1989 was over $20 million worldwide, involving 200—300 million pages per year (Wilks 92).

Ten years ago, the typical users of machine translation were large organizations such as the European Commission, the US Government, the Pan American Health Organization, Xerox, Fujitsu, etc. Fewer small companies or freelance translators used MT, although translation tools such as online dictionaries were becoming more popular. However, ongoing commercial successes in Europe, Asia, and North America continued to illustrate that, despite imperfect levels of achievement, the levels of quality being produced by FAMT and HAMT systems did address some users' real needs. Systems were being produced and sold by companies such as Fujitsu, NEC, Hitachi, and others in Japan, Siemens and others in Europe, and Systran, Globalink, and Logos in North America (not to mention the unprecedented growth of cheap, rather simple MT assistant tools such as PowerTranslator).

In response, the European Commission funded the Europe-wide MT research project Eurotra, which involved representatives from most of the European languages, to develop a large multilingual MT system (Johnson, et al., 1985). Eurotra, which ended in the early 1990s, had the important effect of establishing Computational Linguistics groups in several countries where none had existed before. Following this effort, and responding to the promise of statistics-based techniques (as introduced into Computational Linguistics by the IBM group with their MT system CANDIDE), the US Government funded a four-year effort, pitting three theoretical approaches against each other in a frequently evaluated research program. The CANDIDE system (Brown et al., 1990), taking a purely-statistical approach, stood in contrast to the Pangloss system (Frederking et al., 1994), which initially was formulated as a HAMT system using a symbolic-linguistic approach involving an interlingua; complementing these two was the LingStat system (Yamron et al., 1994), which sought to combine statistical and symbolic/linguistic approaches. As we reach the end of the decade, the only large-scale multi-year research project on MT worldwide is Verbmobil in Germany (Niemann et al., 1997), which focuses on speech-to-speech translation of dialogues in the rather narrow domain of scheduling meetings.

2. What do FAHQT and ALPAC mean in the evolution of MT?

FAHQT: fully automatic high quality translation.
ALPAC: Automatic Language Processing Advisory Committee.

3. List some of the major methods, techniques and approaches.

Major Methods, Techniques and Approaches

Statistical vs. Linguistic MT

One of the most pressing questions of MT results from the recent introduction of a new paradigm into Computational Linguistics. It had always been thought that MT, which combines the complexities of two languages (at least), requires highly sophisticated theories of linguistics in order to produce reasonable quality output.

As described above, the CANDIDE system (Brown et al., 1990) challenged that view. The DARPA MT Evaluation series of four MT evaluations, the last of which was held in 1994, compared the performance of three research systems, more than 5 commercial systems, and two human translators (White et al., 1992—94). It forever changed the face of MT, showing that MT systems using statistical techniques to gather their rules of cross-language correspondence were feasible competitors to traditional, purely hand-built ones. However, CANDIDE did not convince the community that the statistics-only approach was the optimal path; in developments since 1994, it has included steadily more knowledge derived from linguistics. This left the burning question: which aspects of MT systems are best approached by statistical methods, and which by traditional, linguistic ones?

Since 1994, a new generation of research MT systems is investigating various hybridizations of statistical and symbolic techniques (Knight et al., 1995; Brown and Frederking, 1995; Dorr, 1997; Nirenburg et al., 1992; Wahlster, 1993; Kay et al., 1994). While it is clear by now that some modules are best approached under one paradigm or the other, it is a relatively safe bet that others are genuinely hybrid, and that their best design and deployment will be determined by the eventual use of the system in the world. Given the large variety of phenomena inherent in language, it is highly unlikely that there exists a single method to handle all the phenomena--both in the data/rule collection stage and in the data/rule application (translation) stage--optimally. Thus one can expect all future non-toy MT systems to be hybrids. Methods of statistics and probability combination will predominate where robustness and wide coverage are at issue, while generalizations of linguistic phenomena, symbol manipulation, and structure creation and transformation will predominate where fine nuances (i.e., translation quality) are important. Just as we today have limousines, trucks, passenger cars, trolley buses, and bulldozers, so too we will have different kinds of MT systems that use different translation engines and concentrate on different functions.
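The statistical side of such hybrids can be sketched in miniature: word-translation preferences are estimated from co-occurrence counts in a parallel corpus. The three sentence pairs below are invented, and real systems such as CANDIDE used far richer alignment models:

```python
# Toy statistical translation lexicon: count how often each source word
# co-occurs with each target word across aligned sentence pairs, then pick
# the most frequent partner. The corpus is invented for illustration.

from collections import Counter, defaultdict

parallel_corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]

cooccurrence = defaultdict(Counter)
for english, french in parallel_corpus:
    for e in english.split():
        for f in french.split():
            cooccurrence[e][f] += 1

def best_translation(word):
    """Target word that co-occurs most often with the source word."""
    return cooccurrence[word].most_common(1)[0][0]

print(best_translation("house"))   # "maison"
print(best_translation("the"))     # "la"
```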

One way to summarize the essential variations is as follows:

Feature                Symbolic    Statistical
robustness/coverage    lower       higher
quality/fluency        higher      lower
representation         deeper      shallower

How exactly to combine modules into systems, however, remains a challenging puzzle. As argued in (Church and Hovy, 1993), one can use MT function to identify productive areas for guiding research. The `niches of functionality’ provide clearly identifiable MT goals. Major applications include:

assimilation tasks: lower quality, broad domains – statistical techniques predominate

dissemination tasks: higher quality, limited domains – symbolic techniques predominate

communication tasks: medium quality, medium domain – mixed techniques predominate

Ideally, systems will employ statistical techniques to augment linguistic insights, allowing the system builder, a computational linguist, to specify the knowledge in the form most convenient to him or her, and have the system perform the tedious work of data collection, generalization, and rule creation. Such collaboration will capitalize on the (complementary) strengths of linguist and computer, and result in much more rapid construction of MT systems for new languages, with greater coverage and higher quality. Still, how exactly to achieve this optimal collaboration is far from clear. Chapter 6 discusses this tradeoff in more detail.

4. What was MT ten years ago?

New directions and challenges

Within the last ten years, research on spoken translation has developed into a major focus of MT activity. Of course, the idea or dream of translating the spoken word automatically was present from the beginning (Locke 1955), but it has remained a dream until now. Research projects such as those at ATR, at CMU, and on the Verbmobil project in Germany are ambitious. But they do not make the mistake of attempting to build all-purpose systems. The constraints and limitations are clearly defined by definition of domains, sublanguages and categories of users. That lesson has been learnt. The potential benefits, even if success is only partial, are clear for all to see, and it is a reflection of the standing of MT in general, and a sign that it is no longer suffering from old perceptions, that such ambitious projects can receive funding.

5. New directions and foreseeable breakthroughs of MT in the short term.

In the future, much MT research will be oriented towards the development of `translation modules' to be integrated in general `office' systems, rather than the design of systems to be self-contained and independent. It is already evident that the range of computer-based translation activities is expanding to embrace any process which results in the production or generation of texts and documents in bilingual and multilingual contexts, and it is quite possible that MT will be seen as the most significant component in the facilitation of international communication and understanding in the future `information age'.

In this respect, the development of MT systems appropriate for electronic mail is an area which ought to be explored. Those systems which are in use (e.g. DP/Translator on CompuServe) were developed for quite different purposes and circumstances. It would be wrong to assume that existing systems are completely adequate for this purpose. They were not designed for the colloquial and often ungrammatical and incomplete dialogue style of the discussion lists on networks.


Computer/Machine Aided Translation (CAT or MAT) is an interesting complement to Machine Translation. Many translation professionals view it as an alternative to MT, and they prefer this approach because their role is maintained while their productivity and capacity are considerably increased.

1. How would you describe CAT? Talk about its main functions.

Computer Aided Translation (CAT)

Computer Aided Translation (CAT) is intended for professional translators who are already fluent in the two languages they are translating between. CAT tools often include "terminology management tools" and "translation memory" to enhance the efficiency and accuracy of translations.
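The translation-memory function can be sketched as a fuzzy lookup: the tool searches previously translated segments for the closest match to the new source segment and offers its stored translation. The memory contents and threshold below are invented for illustration:

```python
# Translation-memory lookup sketched with difflib: return the stored
# translation of the most similar previously translated segment, provided
# the similarity clears a fuzzy-match threshold.

from difflib import SequenceMatcher

memory = {
    "Press the start button.": "Appuyez sur le bouton de démarrage.",
    "Save the file before closing.": "Enregistrez le fichier avant de fermer.",
}

def fuzzy_match(segment, threshold=0.7):
    """Best (source, translation, score) above the threshold, else None."""
    best = max(memory, key=lambda s: SequenceMatcher(None, s, segment).ratio())
    score = SequenceMatcher(None, best, segment).ratio()
    return (best, memory[best], score) if score >= threshold else None

print(fuzzy_match("Press the stop button."))   # near-match: "start" vs. "stop"
```

The translator then accepts, edits, or rejects the proposed translation, which is how CAT keeps the human in control while reusing earlier work.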

2. What are Translation Memories (TM)? Define TMX and talk about its advantages.

Every so often someone says DV is excellent except for the interface and suggests the developers adopt a more Trados-like look and feel. What they don't know is that early DV versions had a different, more Word-like interface. The interface was changed because the developer felt it hindered further development of the program.

If you use DV (or other programs using proprietary interfaces, such as SDLX or TRANS Suite, for that matter) you will soon notice that the Word interface, although very simpático to beginners, is really a hindrance. No way I can explain it here. You will have to trust me for that.

Also some translators with a technical turn of mind claim the Trados memory concept is also exhausted and that future versions won't bring many improvements. I don't know. I wish I did, but I am a simple working translator, not a programmer. Unfortunately.

3. Give references for at least five TM systems.

Machine Translation (MT) software reviews (co-)authored by Jeff Allen

ALLEN, Jeffrey and Thomas WASSMER. 2004. Review of @promt Standard, @promt Professional and @promt Expert Machine Translation software. In Multilingual Computing and Technology magazine, Number 62, March 2004.

ALLEN, Jeffrey. 2003. Review of Systran Premium 4.0 Machine Translation software. In Multilingual Computing and Technology magazine, Number 58, Vol. 14, Issue 6, September 2003, pp. 19-22.

ALLEN, Jeffrey. 2002. Review of Reverso Pro 5 and Reverso Expert Machine Translation software. In Multilingual Computing and Technology magazine, Number 50, Vol. 13, Issue 6, September 2002, pp. 18-21.


Machine Translation (MT) Postediting publications and presentations by Jeff Allen

ALLEN, Jeffrey. 2003. Post-editing. In Computers and Translation: A Translator's Guide. Edited by Harold Somers. Benjamins Translation Library, 35. Amsterdam: John Benjamins. (ISBN 90 272 1640 1).

ALLEN, Jeffrey. 2002. Review of "Repairing Texts: Empirical Investigations of Machine Translation Post-Editing Processes" (KRINGS, Hans, edited by Geoffrey KOBY; translated from German by Geoffrey Koby, Gregory Shreve, Katja Mischerikow and Sarah Litzer; Translation Studies series, Ohio: Kent State University Press, 2001). In Multilingual Computing and Technology magazine, Number 46, March 2002, pp. 27-29.

ALLEN, Jeffrey. 2001. Postediting: an integrated part of a translation software program. In Language International magazine, April 2001, Vol. 13, No. 2, pp. 26-29.

ALLEN, Jeffrey. 2001. Post-editing or no post-editing? In International Journal for Language and Documentation, Issue 8, December 2000/January 2001, pp. 41-42.

ALLEN, Jeffrey and Christopher HOGAN. 2000. Toward the development of a post-editing module for Machine Translation raw output: a new productivity tool for processing controlled language. Presented at the Third International Controlled Language Applications Workshop (CLAW2000), Seattle, Washington, 29-30 April 2000.

ALLEN, Jeff. 2001. Postediting answers (was: questions on MT usage). Appeared on the MT-List discussion forum.

ALLEN, Jeff. 2004. Who should be doing MT post-editing? Appeared on the MT-List discussion forum.

ALLEN, Jeff. 2003. MT is worthwhile for...? Appeared on the MT-List discussion forum.

Machine Translation (MT) MT business case - return on investment

ALLEN, Jeff. 2003. The Business Case for MT: The Breakthrough Is for Real. Appeared on the MT-List discussion forum.

ALLEN, Jeff. 2003. Pricing Post-editing of MT. Appeared on the MT-List discussion forum.

The Bible and Translation technologies

ALLEN, Jeff. 2002.

English version: The Bible as a Resource for Translation Software: A proposal for Machine Translation (MT) development using an untapped language resource database. In Multilingual Computing and Technology magazine, Number 51, Vol. 13, Issue 7, October/November 2002, pp. 40-45.

French version: La Bible comme Ressource pour les Logiciels de Traduction: Une proposition de développement des systèmes de traduction automatique (TA) en utilisant une ressource linguistique inexploitée.

ALLEN, Jeff. 2003.

English version: Review of Online Bible Millennium Edition 1.11. In Multilingual Computing and Technology magazine, Number 53, Vol. 14, Issue 1, January/February 2003, pp. 20-22.

French version: Evaluation de La Bible Online édition Millenium v1.11.

Matching the MT system to the document type

ALLEN, Jeff. 2003. MT systems used for literary translation purposes. Appeared on the MT-List discussion forum.

ALLEN, Jeff. 2003. Re: MT for literary purposes. Appeared on the MT-List discussion forum.

Types of MT systems

ALLEN, Jeff. 1999. Example-based MT. Appeared on the MT-List discussion forum.

See also the LANTRA-L discussion list postings in another section of this site; the discussions on MT misconceptions and on MT vs. TM provide explanations of the different types of systems.



After writing this report I have realized that, thanks to Human Language Technologies, the relationship between people and computers is improving. Communication between people and computers has improved greatly in the last few years, although some problems remain before we reach perfect communication. The development of these new techniques is valuable because it strengthens the communication system. I think that new technologies are essential for our present daily work, and we will need them even more in the future.

Language Technologies and the Information Society, Language Technologies and Resources, Multilinguality, Translation Technology, Machine Translation (its methods, approaches and problems), and multilingual resources: these are some themes of great importance for new technologies.

The most developed theme is the one about Machine Translation. Although new technologies help greatly with the translation of texts, novels and poems, humans are better equipped for this kind of work, because we can see things from more than one perspective, and that is necessary for a good job. As far as I am concerned, this is one of the weak points of Machine Translation.

I would say that HLT is helping us break down the big barriers that we have with the new technologies. In the not so distant future, we will be able to manage most of the information thanks to computers and our improvements.

Finally, I would say that we should keep on improving HLT, because language is the basic means of communication.