HUMAN LANGUAGE TECHNOLOGIES AND THE INFORMATION SOCIETY

By: Tatiana Vegas

 

ABSTRACT

This report is a brief review of the questionnaires we worked on during class time. What I explain here is based on the information taken from the web addresses given by our teacher in the course English Language and New Technologies, but also on searches carried out in engines such as Google or Yahoo. The structure of the report follows the weekly questionnaires we had to answer during this term.

The objective of this report is to give a general overview of the relation between new technologies and language, and of how the communication problems we face can be addressed with the technologies available these days.

 

INTRODUCTION

Computers and new technologies have been considered a problem by many people, because they are not easy to learn to use. However, these technologies bring many advantages to our lives. One of the main advantages is not only the possibility of keeping in touch with people all over the world, but also immediate access to information about almost any topic. Although these new technologies have brought us many advantages, we also find a major problem: the multilinguality of the World Wide Web (WWW). Researchers are trying to solve this problem by developing technologies such as Machine Translation, which allow us to translate the information we need into our native language.

This report tries to be a simple overview of some of the most important aspects of these new technologies. I will briefly define some terminology such as Language Engineering, Speech Recognition and Speech Technology.

The report is divided into several parts (at least four), in the form of questionnaires, following the way our course was organised. The first part is related to the information society: it covers the importance of the new communication technologies and the effect they have on our society. The second part is related to the excess of information. One problem is that people may receive so much different information that they are not able to choose; another is multilinguality, since the information may appear in a language the reader does not understand. This second part deals with terms such as Language Engineering, which may help us.

The third part of this report deals with terminology related to Speech Technology, where we will see some good definitions of interesting terms. The last part is about Human Language Technology, the amount of information that is created every year, the problems this can create, and how Human Language Technology can help to solve them.

 

QUESTIONNAIRE I (17-20 February)

The questions I try to answer in this first part of the report are the following:

1· Describe the different senses and usages of the terms

1·1 Human Language Technologies

1·2 Natural Language Processing

1·3 Computational Linguistics

1·4 Language Engineering

2· Does the notion of "Information Society" have any relation to Human Language?

3· Is there any concern in Europe with Human Language Technologies?

4· What is the current situation of the HLTCentral.org office?

1·1 Human language technologies

Human Language Technologies will help to build bridges across languages and cultures and provide natural access to information and communication services. It will enable an active use and assimilation of multimedia content, and further strengthen Europe's position at the forefront of language-enabled digital services. It will support business activities in a global context and promote a truly human-centred info structure ensuring equal access and usage opportunities for all. The ultimate goal of Human Language Technologies is an optimal use of the human capital, maximising businesses' competitiveness and empowering people.

HLTTeam@HLTCentral.org

1·2 Natural language processing:

A natural language is one that evolved along with a culture of human native speakers who use the language for general-purpose communication. Languages like English, American Sign Language and Japanese are natural languages, while languages like Esperanto are called constructed languages, having been deliberately created for a specific purpose.

Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form.

Some people view NLG as the opposite of natural language understanding. The difference can be put this way: whereas in natural language understanding the system needs to disambiguate the input sentence to produce the machine representation language, in NLG the system needs to take decisions about how to put a concept into words.

From Wikipedia, the free encyclopaedia.

Version 1.2, November 2002

What is NLP?

Natural Language Processing (NLP) is both a modern computational technology and a method of investigating and evaluating claims about human language itself. Some prefer the term Computational Linguistics in order to capture this last function, but NLP is a term that links back into the history of Artificial Intelligence (AI), the general study of cognitive function by computational processes, normally with an emphasis on the role of knowledge representations, that is to say the need for representations of our knowledge of the world in order to understand human language with computers.

Natural Language Processing (NLP) is the use of computers to process written and spoken language for some practical, useful, purpose: to translate languages, to get information from the web or text data banks so as to answer questions, to carry on conversations with machines, so as to get advice about, say, pensions and so on. These are only examples of major types of NLP, and there is also a huge range of lesser but interesting applications, e.g. getting a computer to decide if one newspaper story has been rewritten from another or not. NLP is not simply applications but the core technical methods and theories that the major tasks above divide up into, such as Machine Learning techniques for automating the construction and adaptation of machine dictionaries, the modelling of human agents' beliefs and desires, etc. This last is closer to Artificial Intelligence, and is an essential component of NLP if computers are to engage in realistic conversations: they must, like us, have an internal model of the humans they converse with.

http://www.dcs.shef.ac.uk/nlp/

 

1·3 What’s computational linguistics?

Computational linguistics (CL) is a discipline between linguistics and computer science which is concerned with the computational aspects of the human language faculty. It belongs to the cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer science aiming at computational models of human cognition. Computational linguistics has applied and theoretical components.

http://www.coli.uni-sb.de/~hansu/

 

1·4 Language engineering:

Language Engineering is the application of knowledge of language to the development of computer systems which can recognise, understand, interpret, and generate human language in all its forms. In practice, Language Engineering comprises a set of techniques and language resources. The former are implemented in computer software and the latter are a repository of knowledge which can be accessed by computer software.

hltteam (.at.) HLTCentral.org  Last updated: 16.02.04

2 Does the notion of "Information Society" have any relation to human language?

The term Information Society has been around for a long time now and, indeed, has become something of a cliché. The notion of the coming Information Society reminds me of the idea of the Sydney 2000 Olympics and the way it shimmers in the distance. We look towards the Olympics and resolve to prepare hard for it. We must rapidly transform ourselves, our city, our demeanour to be ready and worthy. Time is of the essence in making ourselves ready for the challenge. There is a certain breathlessness in all of this rhetoric.

http://www.gu.edu.au/centre/cmp/Papers_97/Browne_M.html

The term Information Society has a variety of connotations and concepts which have been conveniently categorized by Frank Webster (1995):

1. Technological

The initial concept focuses on the development, modernization and increasing complexity of computers and telecommunications and their capacity to manipulate, store and transmit data. As to when our society becomes an Information Society, whether it will be based on the number or technological promotion of computer-based appliances in the house or office is still unclear.

2. Economic

Fritz Machlup (1960) studied the size and effect of US information industries demonstrating that education, the media, computing and information services accounted for 30% of the GDP. However, the influence of economy on the growth of the Information Society has been demonstrated by other studies which suggest the early exponential growth of information activities as a proportion of economic activities has actually slowed with little change from 1958 to 1980.

3. Occupational

The apparent growth of 'information' workers relative to 'production' workers, as stipulated by Daniel Bell in 'The Coming of Post-Industrial Society' (1974), which argued that professional and technical classes would dominate the new era with theoretically-based work, is still a confusing notion. Defining the job task becomes difficult when almost all employment positions involve some level of information processing.

(taken from a report made last year)

3 Is there any concern in Europe with Human Language Technologies?

In the European Union, the concept of the Information Society has been evolving strongly over the past few years building on the philosophy originally spelled out by Commissioner Martin Bangemann in 1994. Bangemann argued that the Information Society represents a "revolution based on information ... [which] adds huge new capacities to human intelligence and constitutes a resource which changes the way we work together and the way we live together..." (European Commission, 1994:4). One of the main implications of this "revolution" for Bangemann is that the Information Society can secure badly needed jobs (Europe and the Global Information Society, 1994:3). In other words, a driving motivation for the Information Society is the creation of employment for depressed economies.

 

Closer to home it is instructive to look at just a few policy (or would-be policy) documents to see the views of the Information Society dominant here. The Goldsworthy report sees the Information Society as a "societal revolution based around information and communication technologies and about the role of these in developing global competitiveness and managing the transition to a globalised free trade world" (Department of Industry, Science and Tourism, 1997). In short, Goldsworthy's idea of the Information Society is entirely an economic one. At a broader level Barry Jones, the author of the House of Representatives Standing Committee's 1991 report 'Australia as an Information Society', sets out a definition of the Information Society which sees it as simply "a period when use of time, family life, employment, education and social interaction are increasingly influenced by access to Information Technology" (Australia as an Information Society: Grasping New Paradigms, 1991).

These are just a few examples of ideas underpinning information policy drives in the developed world where the concept is accepted almost without challenge, and there is an inherent belief that like the Olympics, the Information Society is real - or will be very soon if only we can get ourselves organised properly. Some claim, of course, that the Information Society is here already and not just on its way. But one way or the other "it" exists and is a "good thing". By and large, national and regional Information Society documents do not question the belief that the Information Society will bring prosperity and happiness if a few basic safeguards are put in place. Some of the very few notes of serious caution in the practice of information policy have come through the influence of the Scandinavian countries which joined the European Union when the EU was already in full flight with implementing the actions flowing from the Bangemann report.

Interestingly, in recent travels in India I noticed an extraordinary level of hope and trust in that developing country in the potential of information technology to transform India into a modern fully developed economy. The push to develop information and technological infrastructure initiated by Rajiv Gandhi is seen as positive and a necessary step for the goal of a universally prosperous society in India. Effectively there is the same acceptance of the goodness of an Information Society and the absolute necessity to be one, that is found in the West.

Given this blind faith in the existence and the desirability of an Information Society among diverse nations, it is instructive to look at the theoretical literature which has spawned the idea to see what it claims for the Information Society. The term Information Society has many synonyms: Information Age, Information Revolution, Information Explosion and so on and it is found across a wide spectrum of disciplines. Fortunately the task of unravelling many of these ideas has been accomplished in a masterly way by Frank Webster. He has categorised the variety of concepts of the Information Society, Information Revolution, or whatever, and provided an analysis of five common conceptions of the Information Society (Webster, 1995).

http://www.gu.edu.au/centre/cmp/Papers_97/Browne_M.html

 

4 What is the current situation of the HLTCentral.org office?

The overall objective of HLT is to support e-business in a global context and to promote a human centred infostructure ensuring equal access and usage opportunities for all. This is to be achieved by developing multilingual technologies and demonstrating exemplary applications providing features and functions that are critical for the realisation of a truly user friendly Information Society. Projects address generic and applied RTD from a multi- and cross-lingual perspective, and undertake to demonstrate how language specific solutions can be transferred to and adapted for other languages.

HLTCentral is a dedicated server providing a gateway to speech and language technology opportunities on the Web. HLTCentral web site is an online information resource of human language technologies and related topics of interest to the HLT community at large. It covers news, R&D, technological and business developments in the field of speech, language, multilinguality, automatic translation, localisation and related areas. Its coverage of HLT news and developments is worldwide - with a unique European perspective.

mailto:HLTTeam@HLTCentral.org (2001)

 

QUESTIONNAIRE II (24-27 February)

In this second part I am going to deal with:

 

1· Which are the main techniques used in Language Engineering?

2· Which language resources are essential components of Language Engineering?

3· Check for the following terms:

1· Which are the main techniques used in Language Engineering?

*Techniques

There are many techniques used in Language Engineering and some of these are described below.

*Speaker Identification and Verification

A human voice is as unique to an individual as a fingerprint. This makes it possible to identify a speaker and to use this identification as the basis for verifying that the individual is entitled to access a service or a resource. The types of problems which have to be overcome are, for example, recognising that the speech is not recorded, selecting the voice through noise (either in the environment or the transfer medium), and identifying reliably despite temporary changes (such as caused by illness).

*Speech Recognition

The sound of speech is received by a computer in analogue wave forms which are analysed to identify the units of sound (called phonemes) which make up words. Statistical models of phonemes and words are used to recognise discrete or continuous speech input. The production of quality statistical models requires extensive training samples (corpora) and vast quantities of speech have been collected, and continue to be collected, for this purpose.

There are a number of significant problems to be overcome if speech is to become a commonly used medium for dealing with a computer. The first of these is the ability to recognise continuous speech rather than speech which is deliberately delivered by the speaker as a series of discrete words separated by a pause. The next is to recognise any speaker, avoiding the need to train the system to recognise the speech of a particular individual. There is also the serious problem of the noise which can interfere with recognition, either from the environment in which the speaker uses the system or through noise introduced by the transmission medium, the telephone line, for example. Noise reduction, signal enhancement and key word spotting can be used to allow accurate and robust recognition in noisy environments or over telecommunication networks. Finally, there is the problem of dealing with accents, dialects, and language spoken, as it often is, ungrammatically.
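
As a toy illustration of the statistical modelling mentioned above, the following Python sketch runs Viterbi decoding over a tiny hand-made hidden Markov model of three phonemes; the states, transition and emission probabilities are all invented for the example, whereas a real recogniser would train models of this kind on large speech corpora.

```python
# Toy illustration of statistical decoding in speech recognition.
# A real recogniser models phonemes with HMMs trained on large corpora;
# here the model and the "acoustic" observations are invented by hand.

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state (phoneme) sequence for the observations."""
    V = [{s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], V[-1][prev][1] + [s])
                for prev in states
            )
            column[s] = (prob, path)
        V.append(column)
    return max(V[-1].values())

# Hypothetical phoneme states and acoustic symbols (pure invention for the example).
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k":  {"k": 0.1, "ae": 0.8, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.2, "t": 0.7},
    "t":  {"k": 0.2, "ae": 0.2, "t": 0.6},
}
emit_p = {
    "k":  {"burst": 0.7, "vowel": 0.1, "stop": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "stop": 0.1},
    "t":  {"burst": 0.2, "vowel": 0.1, "stop": 0.7},
}

prob, phonemes = viterbi(["burst", "vowel", "stop"], states, start_p, trans_p, emit_p)
print(phonemes, prob)   # ['k', 'ae', 't'] -> the word "cat"
```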

*Character and Document Image Recognition

Recognition of written or printed language requires that a symbolic representation of the language is derived from its spatial form of graphical marks. For most languages this means recognising and transforming characters.

 

*Natural Language Understanding

The understanding of language is obviously fundamental to many applications. However, perfect understanding is not always a requirement. In fact, gaining a partial understanding is often a very useful preliminary step in the process because it makes it possible to be intelligently selective about taking the depth of understanding to further levels.

Trivial or partial analysis of texts is used to obtain a robust initial classification of unrestricted texts efficiently. One use for this initial analysis can be to focus on 'interesting' parts of a text for a deeper semantic analysis which determines the content of the text within a limited domain.

Semantic models are used to represent the meaning of language in terms of concepts and relationships between them.

Combinations of analysis and generation with a semantic model allow texts to be translated. At the current stage of development, applications where this can be achieved need to be limited in vocabulary and concepts so that adequate Language Engineering resources can be applied.

*Natural Language Generation

A semantic representation of a text can be used as the basis for generating language. An interpretation of basic data or the underlying meaning of a sentence or phrase can be mapped into a surface string in a selected fashion; either in a chosen language or according to stylistic specifications by a text planning system.
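
A minimal sketch of the mapping from a semantic representation to a surface string, using simple templates; the attribute-value "meaning", the templates and the two languages chosen are assumptions of the example, and real text planning systems are of course far more elaborate.

```python
# Minimal sketch of generating a surface string from a semantic representation.
# The "meaning" dictionary and the templates are invented for illustration only.

templates = {
    "en": "The train to {destination} leaves at {time} from platform {platform}.",
    "es": "El tren a {destination} sale a las {time} del andén {platform}.",
}

def generate(meaning, language="en"):
    """Map a simple attribute-value meaning representation onto a surface string."""
    return templates[language].format(**meaning)

meaning = {"destination": "Bilbao", "time": "10:45", "platform": 3}
print(generate(meaning, "en"))
print(generate(meaning, "es"))
```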

*Speech Generation

Speech is generated from filled templates, by playing 'canned' recordings or concatenating units of speech (phonemes, words) together. Speech generated has to account for aspects such as intensity, duration and stress in order to produce a continuous and natural response.

Dialogue can be established by combining speech recognition with simple generation, either from concatenation of stored human speech components or synthesising speech using rules.

Providing a library of speech recognisers and generators, together with a graphical tool for structuring their application, allows someone who is neither a speech expert nor a computer programmer to design a structured dialogue which can be used, for example, in automated handling of telephone calls.

http://www.hltcentral.org/usr_docs/project-source/en/index.html
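
The structured dialogue described above (speech recognition combined with simple generation) can be sketched as a small state machine; in this sketch keyboard input stands in for the speech recogniser and printed text for the speech generator, and the states, prompts and keywords are invented.

```python
# Sketch of a structured dialogue for automated call handling.  A real system
# would plug in a speech recogniser and recorded or synthesised prompts; here
# keyboard input and printed text stand in for both.

dialogue = {
    "start":      {"prompt": "Do you want timetables or prices?",
                   "next": {"timetable": "timetables", "price": "prices"}},
    "timetables": {"prompt": "Which destination?", "next": {}},
    "prices":     {"prompt": "Which destination?", "next": {}},
}

def run_dialogue(state="start"):
    while True:
        print(dialogue[state]["prompt"])
        answer = input("> ").lower()            # stand-in for speech recognition
        matches = [s for keyword, s in dialogue[state]["next"].items() if keyword in answer]
        if not matches:                         # leaf state or nothing recognised
            print("Thank you, goodbye.")        # stand-in for speech generation
            return
        state = matches[0]

if __name__ == "__main__":
    run_dialogue()
```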

2· Which language resources are essential components of Language Engineering?

 The essential components of Language Engineering are the ones that follow:

*Language Resources

Language resources are essential components of Language Engineering. They are one of the main ways of representing the knowledge of language, which is used for the analytical work leading to recognition and understanding.

The work of producing and maintaining language resources is a huge task. Resources are produced, according to standard formats and protocols to enable access, in many EU languages, by research laboratories and public institutions. Many of these resources are being made available through the European Language Resources Association (ELRA).

*Lexicons

A lexicon is a repository of words and knowledge about those words. This knowledge may include details of the grammatical structure of each word (morphology), the sound structure (phonology), the meaning of the word in different textual contexts, e.g. depending on the word or punctuation mark before or after it. A useful lexicon may have hundreds of thousands of entries. Lexicons are needed for every language of application.
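
At its simplest, a lexicon of this kind is a look-up table from word forms to morphological, phonological and semantic information; the two entries below are hand-made assumptions, whereas a useful lexicon would hold hundreds of thousands of them.

```python
# Tiny sketch of a lexicon: each entry records part of speech, morphology,
# a rough phoneme sequence and word senses.  Both entries are invented.

lexicon = {
    "bank": {
        "pos": "noun",
        "morphology": {"plural": "banks"},
        "phonemes": ["b", "ae", "ng", "k"],
        "senses": ["financial institution", "side of a river"],
    },
    "run": {
        "pos": "verb",
        "morphology": {"past": "ran", "participle": "running"},
        "phonemes": ["r", "ah", "n"],
        "senses": ["move quickly on foot", "operate (a machine, a business)"],
    },
}

def lookup(word):
    """Return the lexical entry for a word form, if we know it."""
    return lexicon.get(word.lower())

entry = lookup("Bank")
print(entry["pos"], entry["senses"])
```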

*Specialist Lexicons

There are a number of special cases which are usually researched and produced separately from general purpose lexicons:

Proper names: Dictionaries of proper names are essential to effective understanding of language, at least so that they can be recognised within their context as places, objects, persons, or maybe animals. They take on a special significance in many applications, however, where the name is key to the application, such as in a voice operated navigation system, a holiday reservations system, or a railway timetable information system based on automated telephone call handling.

Terminology: In today's complex technological environment there are a host of terminologies which need to be recorded, structured and made available for language enhanced applications. Many of the most cost-effective applications of Language Engineering, such as multi-lingual technical document management and machine translation, depend on the availability of the appropriate terminology banks.

Word nets: A word net describes the relationships between words; for example, synonyms, antonyms, collective nouns, and so on. These can be invaluable in such applications as information retrieval, translator workbenches and intelligent office automation facilities for authoring.
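
A word net can be pictured as a small graph of labelled relations between words, which an application such as information retrieval can use for query expansion; the handful of relations below is invented for illustration.

```python
# Sketch of a tiny word net: words linked by labelled relations.
# The relations below are invented for illustration.

wordnet = {
    ("car", "synonym"):  ["automobile"],
    ("car", "hypernym"): ["vehicle"],
    ("hot", "antonym"):  ["cold"],
    ("tree", "hyponym"): ["oak", "pine"],
}

def related(word, relation):
    return wordnet.get((word, relation), [])

# Example use: expanding a retrieval query with synonyms.
query = ["car"]
expanded = query + [w for term in query for w in related(term, "synonym")]
print(expanded)   # ['car', 'automobile']
```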

*Grammars

A grammar describes the structure of a language at different levels: word (morphological grammar), phrase, sentence, etc. A grammar can deal with structure both in terms of surface (syntax) and meaning (semantics and discourse).
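
A grammar at the phrase and sentence level can be written down as a handful of rewrite rules; the toy rules and the naive recogniser below are assumptions of the example, not a description of any particular Language Engineering system.

```python
# Toy context-free grammar and a naive recogniser for it.
# Rules, categories and the test sentence are invented for illustration.

grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["sees"], ["sleeps"]],
}

def covers(symbol, words, start):
    """Yield every end position 'symbol' can reach when parsing from 'start'."""
    if symbol not in grammar:                          # terminal word
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for expansion in grammar[symbol]:                  # non-terminal: try each rule
        positions = {start}
        for part in expansion:
            positions = {end for p in positions for end in covers(part, words, p)}
        for end in positions:
            yield end

sentence = "the dog sees a cat".split()
print(len(sentence) in set(covers("S", sentence, 0)))  # True: sentence is grammatical
```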

*Corpora

A corpus is a body of language, either text or speech, which provides the basis for analysing the characteristics of a language and for training and testing Language Engineering systems.

There are national corpora of hundreds of millions of words but there are also corpora which are constructed for particular purposes. For example, a corpus could comprise recordings of car drivers speaking to a simulation of a control system, which recognises spoken commands, which is then used to help establish the user requirements for a voice operated control system for the market.

http://www.hltcentral.org/usr_docs/project-source/en/index.html

 

 

3· Check for the following terms (choose at least five):

 

 

*Stemmer

A stemmer is a program or algorithm which determines the morphological root of a given inflected (or, sometimes, derived) word form -- generally a written word form.

A stemmer for English, for example, should identify the string "cats" (and possibly "catlike", "catty" etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based on "stem".

English stemmers are fairly trivial (with only occasional problems, such as "dries" being the third-person singular present form of the verb "dry", "axes" being the plural of "ax" as well as "axis"); but stemmers become harder to design as the morphology, orthography, and character encoding of the target language becomes more complex. For example, an Italian stemmer is more complex than an English one (because of more possible verb inflections), a Russian one is more complex (more possible noun declensions), a Hebrew one is even more complex (a hairy writing system), and so on.

Stemmers are common elements in query systems, since a user who runs a query on "daffodils" probably cares about documents that contain the word "daffodil" (without the s).

http://en.wikipedia.org/wiki/Main_Page
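
The suffix stripping described above can be sketched in a few lines of Python; the rules below are a deliberately crude assumption (a real English stemmer, such as the Porter algorithm, has many more rules and exceptions), but they reproduce the "cats"/"stemming" examples and also show why forms like "dries" cause trouble.

```python
# Crude suffix-stripping stemmer, far simpler than a real one such as the
# Porter algorithm, just to show the idea of mapping inflected forms onto
# a common root.  The suffix list and the doubling rule are assumptions.

SUFFIXES = ["ing", "ed", "es", "s"]

def stem(word):
    """Strip one common suffix and undo consonant doubling ('stemm' -> 'stem')."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            if len(word) > 2 and word[-1] == word[-2]:
                word = word[:-1]
            break
    return word

for w in ["cats", "stemming", "stemmed", "daffodils", "dries"]:
    print(w, "->", stem(w))
# 'dries' comes out as 'dri', exactly the kind of case for which a real
# stemmer needs extra rules.
```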

*Shallow parser

Shallow parser: software which parses language to a point where a rudimentary level of understanding can be realised; this is often used in order to identify passages of text which can then be analysed in further depth to fulfil the particular objective.
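
Shallow parsing can be sketched as pattern matching over part-of-speech tags, grouping words into noun-phrase chunks without building a full parse; the tagged sentence and the tag set used here are assumptions of the example.

```python
# Sketch of shallow parsing: group part-of-speech-tagged words into noun-phrase
# chunks without building a full parse tree.  Tags and the sentence are hand-made.

tagged = [("the", "DET"), ("new", "ADJ"), ("translation", "NOUN"),
          ("system", "NOUN"), ("works", "VERB"), ("well", "ADV")]

def noun_phrases(tagged_words):
    """Collect maximal runs of determiner/adjective/noun as NP chunks."""
    chunks, current = [], []
    for word, tag in tagged_words:
        if tag in ("DET", "ADJ", "NOUN"):
            current.append(word)
        else:
            if current:
                chunks.append(" ".join(current))
                current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

print(noun_phrases(tagged))   # ['the new translation system']
```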

*Formalism

Formalism: a means to represent the rules used in the establishment of a model of linguistic knowledge.

*Domain

Domain: usually applied to the area of application of the language-enabled software, e.g. banking, insurance, travel, etc.; the significance in Language Engineering is that the vocabulary of an application is restricted, so the language resource requirements are effectively limited by limiting the domain of application.

*Translator's workbench

Translator's workbench: a software system providing a working environment for a human translator, which offers a range of aids such as on-line dictionaries, thesauri, translation memories, etc.

 

*Authoring tools

Authoring tools: facilities provided in conjunction with word processing to aid the author of documents, typically including an on-line dictionary and thesaurus, spell-, grammar-, and style-checking, and facilities for structuring, integrating and linking documents.

http://www.hltcentral.org/usr_docs/project-source/en/index.html

QUESTIONNAIRE III (2-5 March)

 

In this third part I deal with the state of the art in speech technology and with the definitions of some related terms: speech recognition, speech synthesis and speech-to-speech translation.

 

State of the Art

Comments about the state-of-the-art need to be made in the context of specific applications which reflect the constraints on the task. Moreover, different technologies are sometimes appropriate for different tasks. For example, when the vocabulary is small, the entire word can be modeled as a single unit. Such an approach is not practical for large vocabularies, where word models must be built up from subword units.

The past decade has witnessed significant progress in speech recognition technology. Word error rates continue to drop by a factor of 2 every two years. Substantial progress has been made in the basic technology, leading to the lowering of barriers to speaker independence, continuous speech, and large vocabularies. There are several factors that have contributed to this rapid progress. First, there is the coming of age of the hidden Markov model (HMM). The HMM is powerful in that, with the availability of training data, the parameters of the model can be trained automatically to give optimal performance.

Second, much effort has gone into the development of large speech corpora for system development, training, and testing. Some of these corpora are designed for acoustic phonetic research, while others are highly task specific. Nowadays, it is not uncommon to have tens of thousands of sentences available for system training and testing. These corpora permit researchers to quantify the acoustic cues important for phonetic contrasts and to determine parameters of the recognizers in a statistically meaningful way. While many of these corpora (e.g., TIMIT, RM, ATIS, and WSJ; see section 12.3) were originally collected under the sponsorship of the U.S. Defense Advanced Research Projects Agency (ARPA) to spur human language technology development among its contractors, they have nevertheless gained world-wide acceptance (e.g., in Canada, France, Germany, Japan, and the U.K.) as standards on which to evaluate speech recognition.

Third, progress has been brought about by the establishment of standards for performance evaluation. Only a decade ago, researchers trained and tested their systems using locally collected data, and had not been very careful in delineating training and testing sets. As a result, it was very difficult to compare performance across systems, and a system's performance typically degraded when it was presented with previously unseen data. The recent availability of a large body of data in the public domain, coupled with the specification of evaluation standards, has resulted in uniform documentation of test results, thus contributing to greater reliability in monitoring progress (corpus development activities and evaluation methodologies are summarized in chapters 12 and 13 respectively).

Finally, advances in computer technology have also indirectly influenced our progress. The availability of fast computers with inexpensive mass storage capabilities has enabled researchers to run many large scale experiments in a short amount of time. This means that the elapsed time between an idea and its implementation and evaluation is greatly reduced. In fact, speech recognition systems with reasonable performance can now run in real time using high-end workstations without additional hardware---a feat unimaginable only a few years ago.

One of the most popular, and potentially most useful tasks with low perplexity is the recognition of digits. For American English, speaker-independent recognition of digit strings spoken continuously and restricted to telephone bandwidth can achieve an error rate of 0.3% when the string length is known.

One of the best known moderate-perplexity tasks is the 1,000-word so-called Resource Management (RM) task, in which inquiries can be made concerning various naval vessels in the Pacific ocean. The best speaker-independent performance on the RM task is less than 4%, using a word-pair language model that constrains the possible words following a given word. More recently, researchers have begun to address the issue of recognizing spontaneously generated speech. For example, in the Air Travel Information Service (ATIS) domain, word error rates of less than 3% have been reported for a vocabulary of nearly 2,000 words and a bigram language model with a perplexity of around 15.
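
The perplexity figure quoted here (around 15 for the ATIS bigram model) measures, roughly, how many words the language model treats as equally likely at each position; the sketch below shows how bigram perplexity is computed over one test sentence, with all probabilities invented for illustration.

```python
import math

# Sketch of bigram language-model perplexity.  The probabilities and the test
# sentence are invented; real models are estimated from large corpora and smoothed.

bigram_prob = {
    ("<s>", "show"): 0.4, ("show", "me"): 0.5, ("me", "flights"): 0.3,
    ("flights", "to"): 0.4, ("to", "denver"): 0.1, ("denver", "</s>"): 0.6,
}

def perplexity(sentence):
    words = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(words, words[1:]):
        p = bigram_prob.get((prev, word), 1e-4)    # crude floor instead of smoothing
        log_prob += math.log2(p)
    n = len(words) - 1                             # number of predicted words
    return 2 ** (-log_prob / n)

print(round(perplexity("show me flights to denver"), 2))   # about 2.98
```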

High perplexity tasks with a vocabulary of thousands of words are intended primarily for the dictation application. After working on isolated-word, speaker-dependent systems for many years, the community has since 1992 moved towards very-large-vocabulary (20,000 words and more), high-perplexity, speaker-independent, continuous speech recognition. The best system in 1994 achieved an error rate of 7.2% on read sentences drawn from North American business news.
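
Error rates such as the 7.2% quoted above are word error rates, obtained by aligning the recogniser output with a reference transcription and counting substitutions, insertions and deletions; a minimal sketch of that calculation, with two made-up sentences, is:

```python
# Word error rate (WER): edit distance between reference and hypothesis words,
# divided by the length of the reference.  The two sentences are invented.

def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

ref = "show me flights from boston to denver"
hyp = "show me the flights from austin to denver"
print(round(word_error_rate(ref, hyp), 3))   # 0.286: one insertion and one substitution
```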

With the steady improvements in speech recognition performance, systems are now being deployed within telephone and cellular networks in many countries. Within the next few years, speech recognition will be pervasive in telephone networks around the world. There are tremendous forces driving the development of the technology; in many countries, touch tone penetration is low, and voice is the only option for controlling automated services. In voice dialing, for example, users can dial 10--20 telephone numbers by voice (e.g., call home) after having enrolled their voices by saying the words associated with telephone numbers. AT&T, on the other hand, has installed a call routing system using speaker-independent word-spotting technology that can detect a few key phrases (e.g., person to person, calling card) in sentences such as: I want to charge it to my calling card.

At present, several very large vocabulary dictation systems are available for document generation. These systems generally require speakers to pause between words. Their performance can be further enhanced if one can apply constraints of the specific domain such as dictating medical reports.

Even though much progress is being made, machines are a long way from recognizing conversational speech. Word recognition rates on telephone conversations in the Switchboard corpus are around 50%. It will be many years before unlimited vocabulary, speaker-independent continuous dictation capability is realized.

http://cslu.cse.ogi.edu/HLTsurvey/ch1node2.html#Chapter1

Speech recognition

(Or voice recognition) The identification of spoken words by a machine. The spoken words are digitised (turned into a sequence of numbers) and matched against coded dictionaries in order to identify the words.

Most systems must be "trained," requiring samples of all the actual words that will be spoken by the user of the system. The sample words are digitised, stored in the computer and used to match against future words. More sophisticated systems require voice samples, but not of every word. The system uses the voice samples in conjunction with dictionaries of larger vocabularies to match the incoming words. Yet other systems aim to be "speaker-independent", i.e. they will recognise words in their vocabulary from any speaker without training.

Another variation is the degree with which systems can cope with connected speech. People tend to run words together, e.g. "next week" becomes "neksweek" (the "t" is dropped). For a voice recognition system to identify words in connected speech it must take into account the way words are modified by the preceding and following words.

It has been said (in 1994) that computers will need to be something like 1000 times faster before large vocabulary (a few thousand words), speaker-independent, connected speech voice recognition will be feasible.

http://www.hyperdictionary.com/index.html

 

Speech Recognition (SR)

This involves the computer taking the user's speech and interpreting what has been said. This allows the user to control the computer (or certain aspects of it) by voice, rather than having to use the mouse and keyboard, or alternatively just dictating the contents of a document.

The complex nature of translating the raw audio into phonemes involves a lot of signal processing and is not focused on here. These details are taken care of by an SR engine that will be installed on your machine. SR engines are often called recognisers and these days typically implement continuous speech recognition (older recognisers implemented isolated or discrete speech recognition, where pauses were required between words).

Speech recognition usually means one of two things. The application can understand and follow simple commands that it has been educated about in advance. This is known as command and control (sometimes seen abbreviated as CnC, or simply SR).

Alternatively an application can support dictation (sometimes abbreviated to DSR). Dictation is more complex as the engine has to try and identify arbitrary spoken words, and will need to decide which spelling of similarly sounding words is required. It develops context information based on the preceding and following words to try and help decide. Because this context analysis is not required with Command and Control recognition, CnC is sometimes referred to as context-free recognition.

http://www.blong.com/Conferences/DCon2002/Speech/Speech.htm

 

Another definition

From Wikipedia, the free encyclopedia.

Speech recognition technologies allow computers equipped with microphones to interpret human speech, e.g. for transcription or as a control method.

Such systems can be classified as to whether they require the user to "train" the system to recognise their own particular speech patterns or not, whether the system can recognise continuous speech or requires users to break up their speech into discrete words, and whether the vocabulary the system recognises is small (in the order of tens or at most hundreds of words), or large (thousands of words).

Systems requiring a short amount of training can (as of 2001) capture continuous speech with a large vocabulary at normal pace with an accuracy of about 98% (getting two words in one hundred wrong), and different systems that require no training can recognize a small number of words (for instance, the ten digits of the decimal system) as spoken by most English speakers. Such systems are popular for routing incoming phone calls to their destinations in large organisations.

Commercial systems for speech recognition have been available off-the-shelf since the 1990s. However, it is interesting to note that despite the apparent success of the technology, few people use such speech recognition systems.

It appears that most computer users can create and edit documents more quickly with a conventional keyboard, despite the fact that most people are able to speak considerably faster than they can type. Additionally, heavy use of the speech organs results in vocal loading.

Speech recognition still presents a number of key technical problems.

The "understanding" of the meaning of spoken words is regarded by some as a separate field, that of natural language understanding. However, there are many examples of sentences that sound the same, but can only be disambiguated by an appeal to context: one famous T-shirt worn by Apple Computer researchers stated:

I helped Apple wreck a nice beach.

A general solution of many of the above problems effectively requires human knowledge and experience, and would thus require advanced artificial intelligence technologies to be implemented on a computer. In particular, statistical language models are often employed for disambiguation and improvement of the recognition accuracies.

http://en.wikipedia.org/wiki/Main_Page

Speech Synthesis

A Text-To-Speech (TTS) synthesizer is a computer-based system that should be able to read any text aloud, whether it was directly introduced in the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system. Let us try to be clear. There is a fundamental difference between the system we are about to discuss here and any other talking machine (such as a cassette player, for example) in the sense that we are interested in the automatic production of new sentences. This definition still needs some refinement. Systems that simply concatenate isolated words or parts of sentences, denoted as Voice Response Systems, are only applicable when a limited vocabulary is required (typically a few hundred words), and when the sentences to be pronounced respect a very restricted structure, as is the case for the announcement of arrivals in train stations for instance. In the context of TTS synthesis, it is impossible (and luckily useless) to record and store all the words of the language. It is thus more suitable to define Text-To-Speech as the automatic production of speech, through a grapheme-to-phoneme transcription of the sentences to utter.

http://tcts.fpms.ac.be/synthesis/introtts.html
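
The grapheme-to-phoneme transcription mentioned above can be sketched as a longest-match rule table applied left to right; the rules and the phoneme symbols below are a tiny hand-made assumption, and a real TTS front end combines large pronunciation lexicons, rules and statistical models.

```python
# Sketch of a grapheme-to-phoneme front end for TTS: rewrite letters as phonemes
# using longest-match rules.  The rule table is a tiny hand-made assumption.

rules = {"sh": "S", "ee": "i:", "ch": "tS", "th": "T",
         "p": "p", "s": "s", "t": "t", "e": "e", "c": "k", "h": "h"}

def to_phonemes(word):
    phonemes, i = [], 0
    while i < len(word):
        for length in (2, 1):                       # prefer two-letter graphemes
            chunk = word[i:i + length]
            if chunk in rules:
                phonemes.append(rules[chunk])
                i += length
                break
        else:
            i += 1                                  # skip letters we have no rule for
    return phonemes

print(to_phonemes("speech"))   # ['s', 'p', 'i:', 'tS']
```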

Another definition


Speech synthesis is the computer-generated simulation of human speech. It is used to translate written information into aural information where it is more convenient, especially for mobile applications such as voice-enabled e-mail and unified messaging. It is also used to assist the vision-impaired so that, for example, the contents of a display screen can be automatically read aloud to a blind user. Speech synthesis is the counterpart of speech or voice recognition. The earliest speech synthesis effort was in 1779 when Russian Professor Christian Kratzenstein created an apparatus based on the human vocal tract to demonstrate the physiological differences involved in the production of five long vowel sounds. The first fully functional voice synthesizer, Homer Dudley's VODER (Voice Operating Demonstrator), was shown at the 1939 World's Fair. The VODER was based on Bell Laboratories' vocoder (voice coder) research of the mid-thirties.

http://www.whatis.com/definition/0,,sid9_gci773595,00.html

Speech-to-Speech translation

It is worth remembering that most prototypes developed within research projects are currently only capable of processing a few hundred sentences (around 300), on very specific topics (accommodation-booking, planning trips, etc.) and for a small group of languages—English, German, Japanese, Spanish, Italian. It seems unlikely that any application will be able to go beyond these boundaries in the near future.

The direct incorporation of speech translation prototypes into industrial applications is at present too costly. However, the growing demand for these products leads us to believe that they will soon be on the market at more affordable prices. The systems developed in projects such as Verbmobil, EuTrans or Janus—despite being at the laboratory phase—contain in practice thoroughly evaluated and robust technologies. A manufacturer considering their integration may join R&D projects and take part in the development of prototypes with the prospect of a fast return on investment. It is quite clear that we are witnessing the emergence of a new technology with great potential for penetrating the telecommunications and microelectronics market in the not too distant future.

Another remarkable aspect of the EuTrans project is its methodological contribution to machine translation as a whole, both in speech and written modes. Although these two modes of communication are very different in essence, and their respective technologies cannot always be compared, speech-to-speech translation has brought prospects of improvement for text translation. Traditional methods for written texts tend to be based on grammatical rules. Therefore, many MT systems show no coverage problem, although this is achieved at the expense of quality. The most common way of improving quality is by restricting the topic of interest. It is widely accepted that broadening of coverage immediately endangers quality. In this sense, learning techniques that enable systems to automatically adapt to new textual typologies, styles, structures, terminological and lexical items could have a radical impact on the technology.

Due to the differences between oral and written communication, rule-based systems prepared for written texts can hardly be re-adapted to oral applications. This is an approach that has been tried, and has failed. On the contrary, example-based learning methods designed for speech-to-speech translation systems can easily be adapted to the written texts, given the increasing availability of bilingual corpora. One of the main contributions of the PRHLT-ITI group is precisely in its learning model based on bilingual corpora. Herein lie some interesting prospects for improving written translation techniques.
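
The example-based learning mentioned here can be illustrated, in its very simplest form, as retrieving the most similar source sentence from a bilingual corpus and reusing its stored translation; the three sentence pairs and the word-overlap similarity measure are assumptions of the sketch, not a description of the PRHLT-ITI model.

```python
# Bare-bones sketch of example-based translation: find the most similar source
# sentence in a bilingual corpus and return its stored translation.  The three
# sentence pairs and the word-overlap similarity are invented for illustration.

corpus = [
    ("i would like to book a room", "quisiera reservar una habitación"),
    ("what time is breakfast", "a qué hora es el desayuno"),
    ("how much does the room cost", "cuánto cuesta la habitación"),
]

def similarity(a, b):
    """Word-overlap (Jaccard) similarity between two sentences."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def translate(sentence):
    source, target = max(corpus, key=lambda pair: similarity(sentence, pair[0]))
    return target

print(translate("i would like to book a quiet room"))
```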

Effective speech-to-speech translation, along with other voice-oriented technologies, will become available in the coming years, albeit with some limitations, e.g. in the number of languages, linguistic coverage, and context. It could be argued that EuTrans' main contribution has been to raise the possibilities of speech-to-speech translation to the levels of speech recognition technology, making any new innovation immediately accessible.

http://www.hltcentral.org/page-1086.0.shtml

 

QUESTIONNAIRE IV (9-12 March)

 

Information fatigue syndrome

David Lewis coined the term "information fatigue syndrome" for what he expects will soon be a recognized medical condition.

"Having too much information can be as dangerous as having too little. Among other problems, it can lead to a paralysis of analysis, making it far harder to find the right solutions or make the best decisions."

"Information is supposed to speed the flow of commerce, but it often just clogs the pipes."

David Lewis

Dr. David Lewis is a British psychologist, author of the report Dying for Information?, commissioned by London based Reuters Business Information. Lewis has coined the term "information fatigue syndrome" for what he expects will soon be a recognized medical condition. Lewis is a consultant who has studied the impact of data proliferation in the corporate world.

http://sirio.deusto.es/abaitua/konzeptu/fatiga.htm

 

Information fatigue

An interesting consequence of the existence of the information society, in which administrators are forced to deal with gigantic amounts of information, has been recognised by David Lewis as 'information fatigue syndrome'. He points out stress-related problems among information processors in the report 'Dying for Information?', commissioned by Reuters Business Information. The stated origin of the disorder, arising from the proliferation of data in the corporate world, is the inability of individuals to cope with processing a multitude of information in the decision-making process, and it manifests itself in symptoms such as tension, irritability and feelings of hopelessness. The specific training of individuals in the management of data, including separating the relevant from the non-relevant, is viewed as the key to the management of the syndrome.

(Taken from a Report from last year)

EXECUTIVE SUMMARY

I. Summary of Findings

How much new information is created each year? Newly created information is stored in four physical media – print, film, magnetic and optical – and seen or heard in four information flows through electronic channels – telephone, radio and TV, and the Internet. This study of information storage and flows analyzes the year 2002 in order to estimate the annual size of the stock of new information recorded in storage media, and heard or seen each year in information flows. Where reliable data was available we have compared the 2002 findings to those of our 2000 study (which used 1999 data) in order to describe a few trends in the growth rate of information.

 

1. Print, film, magnetic, and optical storage media produced about 5 exabytes of new information in 2002. Ninety-two percent of the new information was stored on magnetic media, mostly in hard disks.

2. We estimate that the amount of new information stored on paper, film, magnetic, and optical media has about doubled in the last three years.

3. Information flows through electronic channels -- telephone, radio, TV, and the Internet -- contained almost 18 exabytes of new information in 2002, three and a half times more than is recorded in storage media. Ninety-eight percent of this total is the information sent and received in telephone calls - including both voice and data on both fixed lines and wireless.

http://www.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm#summary

     

HUMAN LANGUAGE TECHNOLOGIES


While elements of the three initial HLT action lines - Multilinguality, Natural Interactivity and Crosslingual Information Management - are still present, there has been periodic re-assessment and tuning of them to emerging trends and changes in the surrounding economic, social, and technological environment. The trials and best practice in multilingual e-service and e-commerce action line was introduced in the IST 2000 work programme (IST2000) to stimulate new forms of partnership between technology providers, system integrators and users through trials and best practice actions addressing end-to-end multi-language platforms and solutions for e-service and e-commerce. The fifth IST call for proposals covered this action line.

http://sirio.deusto.es/abaitua/konzeptu/nlp/hlt.htm

Human language technologies

"Language technology refers to a range of technologies that have been developed over the last 40 years to enable people to more easily and naturally communicate with computers, through speech or text and, when called for, receive an intelligent and natural reply in much the same way as a person might respond." (E-S.l)

"Human Language Technology is the term for the language capabilities designed into the computing applications used in information and communication technology systems." (EM)

"Human Language Technology is sometimes quite familiar, e.g. the spell checker in your word processor, but can often be hidden away inside complex networks – a machine for automatically reading postal addresses, for example." (EM)

"From speech recognition to automatic translation, Human Language Technology products and services enable humans to communicate more naturally and more effectively with their computers – but above all, with each other." (EM)

http://sirio.deusto.es/abaitua/konzeptu/nlp/hlt/hlt0304_t1.htm#top

Intelligent Text Processing

Intelligent Text Processing: "Ever been frustrated by a search engine? Find out how they work, but more importantly, find out how to make them intelligent. This unit also covers sophisticated web-based language technologies like document summarization, information extraction and machine translation. If you want to know about the Semantic Web, this is the unit for you." (CLT)

http://sirio.deusto.es/abaitua/konzeptu/nlp/hlt/hlt0304_t1.htm#top

What Is The Semantic Web?

The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale. You can think of it as being an efficient way of representing data on the World Wide Web, or as a globally linked database.

The Semantic Web was thought up by Tim Berners-Lee, inventor of the WWW, URIs, HTTP, and HTML. There is a dedicated team of people at the World Wide Web Consortium (W3C) working to improve, extend and standardize the system, and many languages, publications, tools and so on have already been developed. However, Semantic Web technologies are still very much in their infancy, and although the future of the project in general appears to be bright, there seems to be little consensus about the likely direction and characteristics of the early Semantic Web.

What's the rationale for such a system? Data that is generally hidden away in HTML files is often useful in some contexts, but not in others. The problem with the majority of data on the Web that is in this form at the moment is that it is difficult to use on a large scale, because there is no global system for publishing data in such a way as it can be easily processed by anyone. For example, just think of information about local sports events, weather information, plane times, Major League Baseball statistics, and television guides... all of this information is presented by numerous sites, but all in HTML. The problem with that is that, in some contexts, it is difficult to use this data in the ways that one might want to do so.

So the Semantic Web can be seen as a huge engineering solution... but it is more than that. We will find that as it becomes easier to publish data in a repurposable form, so more people will want to publish data, and there will be a knock-on or domino effect. We may find that a large number of Semantic Web applications can be used for a variety of different tasks, increasing the modularity of applications on the Web. But enough subjective reasoning... onto how this will be accomplished.

The Semantic Web is generally built on syntaxes which use URIs to represent data, usually in triple-based structures: i.e. many triples of URI data that can be held in databases, or interchanged on the World Wide Web using a set of particular syntaxes developed especially for the task. These syntaxes are called "Resource Description Framework" syntaxes.

http://infomesh.net/2001/swintro/
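
The triple-based structures mentioned above can be sketched without any RDF syntax at all: every statement is a (subject, predicate, object) triple and a query is a pattern match over the set of triples; the identifiers and data below are invented for illustration.

```python
# Sketch of the triple model behind the Semantic Web: statements as
# (subject, predicate, object) and queries as simple pattern matching.
# The identifiers and data below are invented for illustration.

triples = {
    ("ex:game42", "ex:sport", "baseball"),
    ("ex:game42", "ex:city", "Boston"),
    ("ex:game42", "ex:date", "2003-07-12"),
    ("ex:forecast7", "ex:city", "Boston"),
    ("ex:forecast7", "ex:weather", "sunny"),
}

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the given pattern (None acts as a wildcard)."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# Everything known about the things located in Boston:
for subj, _, _ in query(predicate="ex:city", obj="Boston"):
    print(subj, query(subject=subj))
```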

CONCLUSION

Through this report I have learnt that our society develops in such a way that new technologies must adapt to the needs people have. Information is really important in our lives because through it we can deal with many everyday problems. There is one big problem people may find with these technologies, which is language; this problem can be addressed through the use of so-called Machine Translation, something really useful for those who are not able to understand the information they get. (We, as students of English Philology, should not have these kinds of troubles.)

In this report I have also tried to show that new technologies can become a problem because of the amount of information you receive: you must be able to distinguish what you really need from what you must reject. Nowadays we must realise that having information is extremely important for us as students. Our degree requires a lot of external information that we may find on the web, but we sometimes have problems with the language; in our particular case, we are expected to understand the information we get.