A REVIEW OF HUMAN LANGUAGE TECHNOLOGIES AND THEIR ROLE IN THE INFORMATION SOCIETY

Abstract
Introduction
1. Language Technologies and the Information Society
2. Information overload and methods to improve data-management
3. Language Technology and Engineering
4. Multilinguality
5. Machine Translation
6. Machine Translation II
7. Assessment of the contribution of Machine Translation to linguistic diversity on the Internet
Conclusion
References

ABSTRACT

This report contains information gathered from the Internet and analyzes what the Information Society is, the focus of HLTCentral.org, and why languages are important for this kind of society. The report shows how the application of telematics and the use of language technology will benefit our way of life by offering us a great deal of information, although problems can arise, such as selecting the important information or understanding the content of documents. The second section shows that having too much information can be as dangerous as having too little, and introduces some methods for coping with information overload, because what matters is knowledge.

The third section discusses how language technology and engineering improve the use of language: language technologies are specialized for dealing with the most complex information medium in our world and provide the interface to the fast-growing area of knowledge technologies. The report also shows how Computational Linguistics is concerned with the computational aspects of the human language faculty, while Language Engineering uses that knowledge to enhance the application of computer systems.
Language Engineering can improve the quality of information services by using techniques which not only give more accurate results to search requests, but also increase greatly the possibility of finding all the relevant information available. The use of techniques like concept searches, i.e. using a semantic analysis of the search criteria and matching them against a semantic analysis of the database, gives far better results than simple keyword searches.
A problem that the report analyzes is multilinguality as a barrier to communication.
Localization companies are now investing heavily in new technologies to provide a total globalization solution. Multilingual content management, translation management systems, translation memory technology and machine translation systems are now integrated in a workflow environment to provide real-time localization and connectivity.

Another topic of the report is machine translation. Using machine translation facilities, the person seeking information will be able to complete an information request in his or her native language and receive the information in that same language, regardless of the language in which the information is recorded. There is a drawback, however: natural language is sometimes so complex that machine translation becomes impracticable. The report gives some examples of lexical ambiguity, structural ambiguity, lexical and structural mismatches, collocations ...
Moreover, the report contains information about the evolution of machine translation, above all about ALPAC and FAHQT (the second and third decades of its evolution), and about its methods, techniques and approaches, such as the interlingua approach, the transfer approach, corpus-based methods ...

The last part of the report deals with the features of the Internet, which can be summed up in four points: its operation is efficient; its extension is global; its use is flexible; and its form is electronic.
Finally, this part closes with the ways in which machine translation can be applied on the Internet.


INTRODUCTION

The term Information Society has been around for a long time now and, indeed, has become something of a cliché.
In the European Union, the concept of the Information Society has been evolving strongly over the past few years, building on the philosophy originally spelled out by Commissioner Martin Bangemann in 1994. Bangemann argued that the Information Society represents "a revolution based on information ... [which] adds huge new capacities to human intelligence and constitutes a resource which changes the way we work together and the way we live together..." (European Commission, 1994:4).

Our information society is developing rapidly into a society based on knowledge, where what matters is not only access to information but also having tools that allow us to obtain, exploit, spread and share knowledge. This is especially important where telematics makes it possible to develop virtual working and development environments that are independent of distance and geographic location.

Information is one of those terms we use a dozen times a day with probably a dozen different meanings. This is not surprising as our language has not yet provided us with ways of distinguishing the variety of meanings we wish to impart when we use the word.

The report introduces some methods for coping with information overload and shows how the development of telecommunication technologies and language technologies can help us manage information. Moreover, multilingual services are being developed in many areas, so automated translation combined with the management of documentation, including technical manuals and user handbooks, will help to improve the quality of service in a global marketplace.
The increase of information in electronic format is linked to advances in computational techniques for dealing with it. Because of information webs on the Internet, we can see a growing number of search and retrieval devices, some of which integrate translation technology.

Information and communication technologies are necessary to "be global" and to maintain efficient levels of local management.
Language technologies can make important contributions by improving the processing and retrieval of information, providing more natural interfaces with computers and helping to overcome language barriers.
They can be applied to a wide range of problems in business and administration to produce better, more effective solutions, and can also be used in education, to help the disabled, and to bring new services both to organisations and to consumers.
Language Engineering can improve the quality of information services by using techniques which not only give more accurate results to search requests, but also increase greatly the possibility of finding all the relevant information available. The use of techniques like concept searches, i.e. using a semantic analysis of the search criteria and matching them against a semantic analysis of the database, gives far better results than simple keyword searches.

The report shows that translation is not an easy task, above all in a multilingual environment, where the linguistic and computational complexities of machine translation are not always apparent to all users or potential purchasers of systems. As a consequence, they are sometimes unable to distinguish between the failings of particular systems and the problems which even the best system would have. The goal of machine translation must therefore be the development of fully automatic systems that produce high-quality translations and can be used on the Internet.

An important consequence of the popularization of the Internet is that access to information is now truly global and the demand for localizing institutional and commercial Web sites is growing fast. In the localization industry, developing adequate tools has economic benefits.

The report is mainly based on current topics such as the Information Society, machine translation and the Internet. It is divided into short chapters whose content follows a weekly questionnaire; the answers can be found in on-line references such as web pages, or in books.
The first chapter talks about language technologies and the Information Society. It describes the role of HLTCentral.org, which is to support e-business in a global context and to promote a human-centred infostructure ensuring equal access and usage opportunities for all, and shows the importance of language technologies for the Information Society.
The second chapter deals with information overload and methods to improve data management. This part of the report shows that the quantity of information is not what matters, because it is knowledge, not information, that is power. It also shows how computer science and language technologies help to manage information.
The next chapter describes three very popular terms: Language Technology, Computational Linguistics and Language Engineering, which all relate human language, linguistics and computer science. The report describes their meaning with some examples, together with the main techniques of Language Engineering.
The fourth chapter describes how technical documentation is becoming electronic, in the form of CD-ROM, on-line manuals, intranets, etc. An important consequence of the popularization of the Internet is that access to information is now truly global and the demand for localizing institutional and commercial Web sites is growing fast. In the localization industry, the utilization of technology is congenital, and developing adequate tools has immediate economic benefits.
After reading this chapter, the reader will be able to define the terms internationalization, globalization and localization and to explain how they affect the design of software products. He or she will also know the terms Machine Translation, Human Translation and Translation Workstation.
The last chapters demonstrate that translation is a difficult task and that, for specific content, human participation is sometimes necessary because different types of ambiguity can appear. They also describe part of the evolution of machine translation and the techniques, methods and approaches it uses. Finally, we look at the ways machine translation can be applied on the Internet.

1. LANGUAGE TECHNOLOGIES AND THE INFORMATION SOCIETY

The "Information Society"

The "Information Society" has this name because we can now get information immediately, regardless of time and space. The development and convergence of computer and telecommunication technologies has led to a revolution in the way that we work, communicate with each other, buy goods and use services, and even the way we entertain and educate ourselves. One of the results of this revolution is that large volumes of information will increasingly be held in a form which is more natural for human users than the strictly formatted, structured data typical of computer systems of the past. Information presented in visual images, as sound, and in natural language, either as text or speech, will become the norm. We all deal with computer systems and services, either directly or indirectly, every day of our lives. This is the information age and we are a society in which information is vital to economic, social, and political success as well as to our quality of life. There are still many new ways in which the application of telematics and the use of language technology will benefit our way of life, from interactive entertainment to lifelong learning. Although these changes will bring great benefits, it is important that we anticipate difficulties which may arise, and develop ways to overcome them. Examples of such problems are:

The information society, by its very nature, cuts across traditional boundaries. This website is a guide through its many and various aspects, covering for example:

The last few years have witnessed a transformation in the industrial landscape of the developed world. Telecommunications liberalisation, the explosive growth of the Internet and a growing tide of mergers between computer, media and telecommunications companies all point to one thing - the birth of the information society.
For more details you can go to this page:
http://europa.eu.int/iformation_society/index_en.html Feb 2003

The European Union has opened its WWW servers on the Internet, and nowadays we can read or consult its databases not only through a dedicated telephone line but through the Net too. The world of telecommunications has changed, so the way we access information is influenced by the universalization that the information highway entails. The passage from an industrial society to the Information Society is not only a prophecy; it is a verifiable reality too.
So the Internet, by means of its famous tool, the Web, becomes a fundamental element in launching this new Information Society. The resources guide has been created to make it easier to locate the abundant information on the Internet about the European Union and its information policies. You can visit:
http://www.bib.uc3m.es/~mendez/politicas/europaframe.htm Feb 2003

The Information Society (TIS) journal, published since 1981, is a key critical forum for leading edge analysis of the impacts, policies, system concepts, and methodologies related to information technologies and changes in society and culture. Some of the key information technologies include computers and telecommunications; the sites of social change include homelife, workplaces, schools, communities and diverse organizations, as well as new social forms in cyberspace. TIS is a refereed journal that publishes scholarly articles, position papers, debates, short communications and book reviews. Rob Kling serves as TIS's Editor-in-Chief. TIS is published by Taylor & Francis, who has a long tradition of publishing fine journals. The Information Society editorial office is located in the Center for Social Informatics at the School of Library and Information Science (SLIS) at Indiana University.
If you want more information about this journal (title, editor and TIS editorial office members), click here:
Information.html Feb 2003
Moreover, you can read about the history of The Information Society journal at this address: http://www.slis.indiana.edu/TIS/contact/history.html Feb 2003
http://www.slis.indiana.edu/TIS/ Feb 2003

Would you like to know the Information Society topics? Click here:
Topics.html Feb 2003

For more information about Information Society, read these pages:
http://www.cordis.lu/ist/ Feb 2003
http://europa.eu.int/information_society/newsroom/documents/catalogue_en.pdf Feb 2003
http://www.ics.uci.edu/~kling/tistopic.html Feb 2003
http://www.unesco.org/webworld/observatory/indexs.html Feb 2003
http://www.slis.indiana.edu/TIS/contact/history.html Feb 2003

The most relevant technology for language (computers, satellites, printing machines, writing systems)

The technology most relevant to language is the writing system because, as Walter J. Ong's text says, this technology, by contrast with natural, oral speech, "is completely artificial. There is no way to write 'naturally'. [...] Technologies are artificial, but - paradox again - artificiality is natural to human beings."
Ong's quotation: "Writing makes it possible to separate logic (thought structure of discourse) from rhetoric (socially effective discourse). The invention of logic, it seems, is tied not to any kind of writing system but to the completely vocalic phonetic alphabet and the intensive analytic activity which such an alphabet demands of its inventors and subsequently encourages in all sorts of noetic fields."

The importance of language technologies for the Information Society

Language technologies are so important for the Information Society because Human Language Technologies RTD contributes to enhancing usability and accessibility of digital content and services while supporting linguistic diversity in Europe. The overarching aim of HLT is to maximise the effectiveness and competitiveness of global business activities and to promote a truly human-centred infostructure ensuring equal access and usage opportunities for all. HLT actions initially addressed three intertwined areas centred around how people interact with information, with information services and with each other.

For more detailed information you can visit:
http://www.hltcentral.org/page-842.0.shtml Feb 2003
Human Languaje Technologies.html Feb 2003

The role of HLTCentral.org

HLTCentral is a Gateway to Speech & Language Technology Opportunities on the Web. The HLTCentral web site was established as an online information resource of human language technologies and related topics of interest to the HLT community at large. It covers news, R&D, technological and business developments in the field of speech, language, multilinguality, automatic translation, localisation and related areas. Its coverage of HLT news and developments is worldwide - with a unique European perspective.
Two EU funded projects, ELSNET and EUROMAP, are behind the development of HLTCentral. EUROMAP ("Facilitating the path to market for language and speech technologies in Europe") - aims to provide awareness, bridge-building and market-enabling services for accelerating the rate of technology transfer and market take-up of the results of European HLT RTD projects.
ELSNET ("The European Network of Excellence in Human Language Technologies") - aims to bring together the key players in language and speech technology, both in industry and in academia, and to encourage interdisciplinary co-operation through a variety of events and services. If you want to know about its projects, click here:
Projecs.html Feb 2003
http://www.hltcentral.org Feb 2003

2. INFORMATION OVERLOAD AND METHODS TO IMPROVE DATA-MANAGEMENT

As David Lewis says, an excess of information is strangling many businesses and causing mental anguish and even physical illness in managers at all levels, so having too much information can be as dangerous as having too little. The problem is expected to worsen as more use is made of the Internet. More information in:
Information overload.html Feb 2003 (On this page we can see how Peter Guilford, a spokesman for the European Commission in Brussels, sometimes feels that he is suffering from information overload.)

Others:
Information fatigue syndrome. html
Strategies for dealing with information and solution .html
"Books you can read"

We should consider knowledge more valuable than information because, as Dr. David Lewis puts it, "Knowledge is power, but information is not. It's like the detritus that a gold-panner needs to sift through in order to find the nuggets." Sometimes the more information we have, the less we understand it and the harder it is to find what we need; that is, "having too much information can be as dangerous as having too little. Among other problems, it can lead to a paralysis of analysis, making it far harder to find the right solutions or make the best decisions." "Information is supposed to speed the flow of commerce, but it often just clogs the pipes." "Dealing with the information burden is one of the most urgent challenges facing businesses. Unless we can discover ways of staying afloat amidst the surging torrents of information, we may end up drowning in them."

Some notes of curiosity

The representation of the information (getting available information)

One of the key features of an information service is its ability to deliver information which meets the immediate, real needs of its client in a focused way. It is not sufficient to provide information which is broadly in the category requested, in such a way that the client must sift through it to extract what is useful. Equally, if the way that the information is extracted leads to important omissions, then the results are at best inadequate and at worst they could be seriously misleading. Information is available throughout the world, on the World Wide Web, for example, in different languages. In reality, however, it is only available to a client who can firstly request the information in the language in which it is recorded and then understand the language in which the information is presented. Using machine translation facilities the person seeking information will be able to complete an information request in his or her native language and receive the information in that same language, regardless of the language in which the information is recorded.
Language Engineering can improve the quality of information services by using techniques which not only give more accurate results to search requests, but also increase greatly the possibility of finding all the relevant information available. The use of techniques like concept searches, i.e. using a semantic analysis of the search criteria and matching them against a semantic analysis of the database, gives far better results than simple keyword searches.
To sum up this part, the most convenient way of presenting information could be to publish it in a place that everybody can access. Providing it in different languages is important because more people can then understand it.
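
The difference between a simple keyword search and a concept search, as described in this part, can be illustrated with a very small Python sketch. Everything in it is an assumption made for illustration: the concept table, the three documents and the function names are invented, and a real system would rely on a much richer semantic analysis than a word-to-concept table.

```python
# Toy illustration only: the concept table and the documents are invented
# for this sketch and do not come from any real information service.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "physician": "doctor",
    "doctor": "doctor",
}

DOCUMENTS = {
    "doc1": "automobile repair manual",
    "doc2": "how to find a physician",
    "doc3": "car insurance offers",
}


def keyword_search(query, documents):
    """Return ids of documents that contain every query word literally."""
    words = query.lower().split()
    return sorted(
        doc_id for doc_id, text in documents.items()
        if all(word in text.lower().split() for word in words)
    )


def to_concepts(text):
    """Map each word to its concept; unknown words stand for themselves."""
    return {CONCEPTS.get(word, word) for word in text.lower().split()}


def concept_search(query, documents):
    """Return ids of documents whose concepts cover all query concepts."""
    wanted = to_concepts(query)
    return sorted(
        doc_id for doc_id, text in documents.items()
        if wanted <= to_concepts(text)
    )
```

With these toy data, keyword_search("car", DOCUMENTS) finds only doc3, while concept_search("car", DOCUMENTS) also finds doc1, because "car" and "automobile" are mapped to the same concept.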

How computer science and language technologies help to manage information

Distance learning has become an important part of the provision of education services. It is especially important to the concept of 'life-long learning' which is expected to become an important feature of life in the Information Age. The effectiveness of distance learning and self-study is improved by using telematics services and computer aided learning; this means that computer and language technologies can help manage information. The quality and success of computer aided learning can be greatly enhanced by the use of Language Engineering techniques. If the computer aided learning package can understand the answers which its users give to questions, rather than simply recognise that the answer is right or wrong, it can direct them down a path which is more appropriate to their needs. In this way, students are likely to learn more effectively and have a longer concentration span, because a more sensitive package is inherently more comfortable to work with.

The possibility of tele-presence in virtual environments such as museums, art galleries and libraries will provide a rich cultural experience, available to a wide section of society in the comfort and convenience of their own homes. Virtual visits to such cultural archives will be aided by language technology enabling the research and selection of all forms of digitised language based records, indexing and retrieval of images, dubbing of films and automatic production of sub-titles and providing translation of library and archive material.

Sometimes languages can be a barrier to communication

Communication is probably the most obvious use of language. On the other hand, language is also the most obvious barrier to communication. Across cultures and between nations, difficulties arise all the time not only because of the problem of translating accurately from one language to another, but also because of the cultural connotations of words and phrases. A typical example in the European context is the word 'federal' which can mean a devolved form of government to someone who already lives in a federation, but to someone living in a unitary sovereign state, it is likely to mean the imposition of another level of more remote, centralised government.

As the application of language knowledge enables better support for translators, with electronic dictionaries, thesauri, and other language resources, and eventually when high quality machine translation becomes a reality, so the barriers will be lowered. Agreements at all levels, whether political or commercial, will be better drafted more quickly in a variety of languages. International working will become more effective with a far wider range of individuals able to contribute. An example of a project which is successfully helping to improve communications in Europe is one which interconnects many of the police forces of northern Europe using a limited, controlled language which can be automatically translated, in real-time. Such a facility not only helps in preventing and detecting international crime, but also assists the emergency services to communicate effectively during a major incident.


3. LANGUAGE TECHNOLOGY AND ENGINEERING

The ways in which Language Engineering improves the use of language

Language Engineering provides ways in which we can extend and improve our use of language to make it a more effective tool. It is based on a vast amount of knowledge about language and the way it works, which has been accumulated through research. It uses language resources, such as electronic dictionaries and grammars, terminology banks and corpora, which have been developed over time. The research tells us what we need to know about language and develops the techniques needed to understand and manipulate it. The resources represent the knowledge base needed to recognise, validate, understand, and manipulate language using the power of computers. By applying this knowledge of language we can develop new ways to help solve problems across the political, social, and economic spectrum.
Language Engineering is a technology which uses our knowledge of language to enhance our application of computer systems:


New opportunities are becoming available to change the way we do many things, to make them easier and more effective by exploiting our developing knowledge of language.

The differences and similarities of Language Technology, Language Engineering and Computational Linguistics

Differences

In their definition

Language Technology comprises the techniques, tools and methods of analysis; Language Engineering is the implementation of the technology, the development of tools and services that put the knowledge into practice. Computational Linguistics is the discipline concerned with the computational aspects of the human language faculty.

Examples:
Between Language Technology and Language Engineering
Between Computational Linguistics and the others

Similarities
Between Language Technology and Computational Linguistics

Between Language Engineering and Computational Linguistics
Between Language Technology and Language Engineering
Between three of them

The main techniques used in Language Engineering

Definitions of some terms:

Natural language processing
It is a term in use since the 1980s to define a class of software systems which handle text intelligently.
Translator's workbench
It is a software system providing a working environment for a human translator, which offers a range of aids such as on-line dictionaries, thesauri, translation memories, etc.
Shallow parser
It is software which parses language to a point where a rudimentary level of understanding can be realised; this is often used in order to identify passages of text which can then be analysed in further depth to fulfil the particular objective.
Formalism
It is a means to represent the rules used in the establishment of a model of linguistic knowledge.
Speech recognition
It is a technique used in Language Engineering (read: The main techniques used in Language Engineering).
Text alignment
It is the process of aligning different language versions of a text in order to be able to identify equivalent terms, phrases, or expressions.
Authoring tools
They are facilities provided in conjunction with word processing to aid the author of documents, typically including an on-line dictionary and thesaurus, spell-, grammar-, and style-checking, and facilities for structuring, integrating and linking documents.
Controlled language
It is a language (also called an artificial language) which has been designed to restrict the number of words and the structure of the language used, in order to make language processing easier; typical users of controlled language work in an area where precision of language and speed of response is critical, such as the police and emergency services, aircraft pilots, air traffic control, etc.
Domain
It is usually applied to the area of application of the language enabled software e.g. banking, insurance, travel, etc.; the significance in Language Engineering is that the vocabulary of an application is restricted so the language resource requirements are effectively limited by limiting the domain of application.
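
The last two definitions, controlled language and domain, can be made more concrete with a small Python sketch. It is only an illustration under stated assumptions: the word list, the limit of eight words per sentence and the function name check_controlled are all hypothetical and are not taken from any real controlled language.

```python
import re

# Hypothetical controlled vocabulary for an emergency-services domain.
ALLOWED_WORDS = {
    "send", "ambulance", "police", "fire", "unit", "to", "at",
    "the", "station", "bridge", "now", "stop", "traffic",
}
MAX_WORDS_PER_SENTENCE = 8  # assumed structural restriction


def check_controlled(sentence):
    """Return a list of problems; an empty list means the sentence conforms."""
    words = re.findall(r"[a-z]+", sentence.lower())
    problems = []
    # Structural rule: sentences must stay short.
    if len(words) > MAX_WORDS_PER_SENTENCE:
        problems.append("too many words: %d" % len(words))
    # Vocabulary rule: only words of the restricted domain are allowed.
    for word in words:
        if word not in ALLOWED_WORDS:
            problems.append("word not in vocabulary: %s" % word)
    return problems
```

Under these assumptions, "Send ambulance to the bridge now" is accepted, while "Dispatch ambulance to the bridge" is rejected because "dispatch" is outside the vocabulary. Restricting the input in this way is what keeps the language resources small.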

4. MULTILINGUALITY

Factors that make technology indispensable in the translation curricula

When discussing the relevance of technological training in the translation curricula, it is important to clarify the factors that make technology more indispensable and show how the training should be tuned accordingly. The relevance of technology will depend on the medium that contains the text to be translated. This particular aspect is becoming increasingly evident with the rise of the localization industry, which deals solely with information in digital form. There may be no other imaginable means for approaching the translation of such things as on-line manuals in software packages or CD-ROMs with technical documentation than computational ones.

The need (or lack of need) for translation technology among professional interpreters and literary translators

With the exception of a few eccentrics or maniacs, it will be rare in the future to see good professional interpreters and literary translators not using more or less sophisticated and specialized tools for their jobs, comparable to the familiarization with tape recorders or typewriters in the past. In any case, this might be something best left to the professional to decide.
What professional translators need are tools to assist them to translate: access to dictionaries and terminological databanks, multilingual word processing, management of glossaries and terminology resources, input and output communication (e.g. OCR scanners, electronic transmission, high-class printing). For these reasons, the most appropriate and successful developments of the last few years have been the translator workstations.
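
One of the aids usually found in a translator workstation is the translation memory, mentioned earlier among the tools of a translator's workbench. The following Python sketch shows the basic idea of looking up a new segment in a memory of earlier translations. It is a minimal sketch under stated assumptions: the stored segments, the function name lookup and the threshold of 0.75 are invented for illustration, and the similarity measure is the generic one from Python's standard difflib module, not the method of any particular commercial product.

```python
import difflib

# Hypothetical memory of previously translated segments (English -> Spanish).
MEMORY = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Close the file.": "Cierre el archivo.",
    "Open the file.": "Abra el archivo.",
}


def lookup(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the best match, or None."""
    best = None
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and lower for partial matches.
        score = difflib.SequenceMatcher(
            None, segment.lower(), source.lower()
        ).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, target, score)
    return best
```

An exact match returns the stored translation with a score of 1.0; a near match such as "Open the files." still returns the entry for "Open the file." so that the translator can adapt it; a segment with no similar entry returns None and has to be translated from scratch.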

The conversion of documentation into electronic form

Technical documentation is becoming electronic, in the form of CD-ROM, on-line manuals, intranets, etc. An important consequence of the popularization of Internet is that the access to information is now truly global and the demand for localizing institutional and commercial Web sites is growing fast. In the localization industry, the utilization of technology is congenital, and developing adequate tools has immediate economic benefits.

The focus of the localization industry

The main role of localization companies is to help software publishers, hardware manufacturers and telecommunications companies with versions of their software, documentation, marketing, and Web-based information in different languages for simultaneous worldwide release. The recent expansion of these industries has considerably increased the demand for translation products and has created a new burgeoning market for the language business. According to a recent industry survey by LISA (the Localization Industry Standards Association), almost one third of software publishers, such as Microsoft, Oracle, Adobe, Quark, etc., generate above 20 percent of their sales from localized products, that is, from products which have been adapted to the language and culture of their targeted markets, and the great majority of publishers expect to be localizing into more than ten different languages.
Localization is not limited to the software-publishing business and it has infiltrated many other facets of the market, from software for manufacturing and enterprise resource planning, games, home banking, and edutainment (education and entertainment), to retail automation systems, medical instruments, mobile phones, personal digital assistants (PDA), and the Internet. Doing business in an integrated global economy, with growing electronic transactions, and world wide access to products and services means an urgent need to break through language barriers.
I believe that there might be a job for me in that sector, because the localization business is intimately connected with the software industry.

Internationalization, globalization and localization

The differences between translation and localization

They are not the same thing:
Localization was originally intended to set software (or information technology) translators apart from 'old fashioned' non-technical translators of all types of documents. Localization is the process of adapting software for a specific region or language by adding locale-specific components and translating text. Usually, the most time-consuming portion of the localization phase is the translation of text. Other types of data, such as sounds and images, may require localization if they are culturally sensitive. Localizers also verify that the formatting of dates, numbers, and currencies conforms to local requirements. Software translation required a different skill set: software translators had to understand programming code, work under tremendous time pressure, and be flexible about product changes and updates. Originally there was only a select group, the localizers, who knew how to respond to the needs of the software industry. From these beginnings, pure localization companies emerged, focusing on testing, engineering, and project management.
The main goal of the LEIT initiative is to introduce localization courseware into translation studies, with versions ready for the start of the 1999 academic year. More information at: http://java.sun.com/docs/books/tutorial/i18n/intro/index.html
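
The locale checks mentioned above (dates, numbers and currencies) can be illustrated with a minimal Python sketch. The convention table is hand-written and hypothetical, covering only two locales; the names CONVENTIONS, format_number, format_date and format_currency are assumptions of this sketch, and real software would normally take such rules from a maintained locale database rather than from a table like this.

```python
from datetime import date

# Hypothetical, hand-written conventions for two locales (illustration only).
CONVENTIONS = {
    "en_US": {"date": "{m:02d}/{d:02d}/{y}", "thousands": ",", "decimal": ".",
              "currency": "${amount}"},
    "de_DE": {"date": "{d:02d}.{m:02d}.{y}", "thousands": ".", "decimal": ",",
              "currency": "{amount} EUR"},
}


def format_number(value, locale_id):
    """Format a number with the locale's thousands and decimal separators."""
    conv = CONVENTIONS[locale_id]
    text = f"{value:,.2f}"  # always produces the en_US shape, e.g. 1,234.50
    # Swap the separators through a placeholder so they do not clash.
    return (text.replace(",", "\0")
                .replace(".", conv["decimal"])
                .replace("\0", conv["thousands"]))


def format_date(day, locale_id):
    """Format a date in the order the locale expects."""
    return CONVENTIONS[locale_id]["date"].format(d=day.day, m=day.month, y=day.year)


def format_currency(value, locale_id):
    """Place the currency marker where the locale expects it."""
    amount = format_number(value, locale_id)
    return CONVENTIONS[locale_id]["currency"].format(amount=amount)


if __name__ == "__main__":
    for locale_id in CONVENTIONS:
        print(locale_id,
              format_date(date(2003, 2, 5), locale_id),
              format_currency(1234.5, locale_id))
```

The same date and amount come out differently for each locale, which is the kind of difference a localizer has to verify in the finished product.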

Translation workstation

The ideal workstation for the translator would combine the following features:
Full integration in the translator's general working environment, which comprises the operating system, the document editor (hypertext authoring, desktop publisher or the standard word-processor), as well as the emailer or the Web browser. These would be complemented with a wide collection of linguistic tools: from spell, grammar and style checkers to on-line dictionaries, and glossaries, including terminology management, annotated corpora, concordances, collated texts, etc.
The system should comprise all advances in machine translation (MT) and translation memory (TM) technologies, be able to perform batch extraction and reuse of validated translations, enable searches into TM databases by various keywords (such as phrases, authors, or issuing institutions). These TM databases could be distributed and accessible through Internet. There is a new standard for TM exchange (TMX) that would permit translators and companies to work remotely and share memories in real-time.
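
The reuse of validated translations from a translation memory can be sketched as follows. This is only an illustration under stated assumptions: the class TranslationMemory, its methods and the 0.75 similarity threshold are hypothetical, and the similarity measure is Python's general-purpose difflib.SequenceMatcher, not the matching algorithm of any particular TM product or of the TMX standard.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy translation memory: validated (source, target) segment pairs."""

    def __init__(self):
        self.units = []

    def add(self, source, target):
        self.units.append((source, target))

    def lookup(self, segment, threshold=0.75):
        """Return (target, score) for the most similar stored source segment,
        or None when nothing reaches the threshold."""
        best_target, best_score = None, 0.0
        for source, target in self.units:
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_target is not None and best_score >= threshold:
            return best_target, best_score
        return None


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.add("Click the Save button.", "Haga clic en el botón Guardar.")
    tm.add("Open the file menu.", "Abra el menú Archivo.")
    print(tm.lookup("Click the Save button."))     # exact match
    print(tm.lookup("Click the Cancel button."))   # fuzzy match, needs revision
    print(tm.lookup("Restart the computer now."))  # nothing similar enough
```

An exact match can be reused directly, whereas a fuzzy match is only a suggestion that the translator must still revise.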

Machine translation vs human translation

Machine Translation:
Machine translation is never plug-and-play. It requires a huge effort in preparation, evaluation, and maintenance. Suitability of technology depends on many factors, but fundamentally text type. Without these considerations, the technology may be seen as a fiasco. Few informed people still see the original ideal of fully automatic high-quality translation of arbitrary texts as a realistic goal. Translation technology suppliers are now working under the assumption that, rather than batch processes, man-machine interaction together with the integration of tools into the translator's working environment is the solution.

Human translation:
Human creativity becomes indispensable. Translators of the highest quality are only obtainable from first-class raw materials and constant and disciplined training. The potentially good translator must be a sensitive, wise, vigilant, talented, gifted, experienced, and knowledgeable person.
Even for skilled human translators, translation is often difficult. One clear example is when linguistic form, as opposed to content, becomes an important part of a literary piece. Conveying the content but missing the poetic aspects of the signifier may considerably hinder the quality of the translation. This is a challenge to any translator. Professional translators need to become acquainted with technology, because good use of technology will make their jobs more competitive and satisfactory. But they should not dismiss craftsmanship. Technology enhances productivity, but translation excellence goes beyond technology. It is important to delimit the roles of humans and machines in translation. Martin Kay's (1987) words in this respect are most illustrative:
A computer is a device that can be used to magnify human productivity. Properly used, it does not dehumanize by imposing its own Orwellian stamp on the products of human spirit and the dignity of human labor but, by taking over what is mechanical and routine, it frees human beings for what is essentially human. Translation is a fine and exacting art, but there is much about it that is mechanical and routine and, if this were given over to a machine, the productivity of the translator would not only be magnified but his work would become more rewarding, more exciting, more human.

The profiles that a person with a university degree in Translation should be qualified for

The profile that employers look for in translators is an excellent knowledge of computer technology and superb linguistic ability in both the source and target languages. Translators must know how to use the leading CAT [computer assisted translation] tools and applications and be flexible. The information technology and localization industries are evolving very rapidly and translators need to move with them.

5. MACHINE TRANSLATIONS

Translation is a difficult task

Technology enhances productivity, but translation excellence goes beyond technology. It is important to delimit the roles of humans and machines in translation. Translation is a difficult task because it is a fine and exacting art, but there is much about it that is mechanical and routine; if this were given over to a machine, the productivity of the translator would not only be magnified, but the work would become more rewarding, more exciting, more human.

The main problems of machine translation

Machine translation is the process of automatically translating from one language to another by a computer. Natural language is complex, so in many cases MT is impracticable and human creativity becomes indispensable. Translators of the highest quality are only obtainable from first-class raw materials and constant and disciplined training.

The parts of Linguistics that are most relevant for Machine Translation

(a) the use of other specific words in the same phrase or sentence
(b) the use of morphological information
(c) the use of information about syntactic functions and relations
(d) the use of semantic features and relations
(e) the use of knowledge about the subject domain
(f) the use of stylistic preferences

Different types of ambiguity

There are two types of ambiguity:
Lexical ambiguity: when a word has more than one meaning.
Structural ambiguity: when a phrase or sentence can have more than one structure.

Examples of lexical ambiguity

Example 1:

In the first sentence use is a verb, and in the second a noun, that is, we have a case of lexical ambiguity. An English-French dictionary will say that the verb can be translated by (inter alia) se servir de and employer, whereas the noun is translated as emploi or utilisation. One way a reader or an automatic parser can find out whether the noun or verb form of use is being employed in a sentence is by working out whether it is grammatically possible to have a noun or a verb in the place where it occurs. For example, in English, there is no grammatical sequence of words which consists of the + V + PP --- so of the two possible parts of speech to which use can belong, only the noun is possible in the second sentence.
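
The idea of choosing the category of use from its grammatical position can be sketched in a few lines of Python. This is a deliberately crude illustration, not a parser: the word lists, the function names and the two test sentences are assumptions made for this sketch, and only the French equivalents are taken from the text above.

```python
# Hypothetical word lists: a determiner before "use" suggests a noun,
# a modal, pronoun, "to" or "not" before it suggests a verb.
DETERMINERS = {"the", "a", "an", "this", "that", "its"}
VERB_CONTEXTS = {"to", "not", "must", "can", "should", "will", "you", "we", "they", "i"}

# French equivalents quoted in the text for each category of "use".
FRENCH = {
    "noun": ["emploi", "utilisation"],
    "verb": ["se servir de", "employer"],
}


def category_of(tokens, index):
    """Guess the category of tokens[index] from the word that precedes it."""
    previous = tokens[index - 1].lower() if index > 0 else ""
    if previous in DETERMINERS:
        return "noun"
    if previous in VERB_CONTEXTS:
        return "verb"
    return "unknown"


def candidates_for_use(sentence):
    """Return the guessed category of 'use' and the French candidates left."""
    tokens = sentence.rstrip(".").split()
    index = [t.lower() for t in tokens].index("use")
    category = category_of(tokens, index)
    # When the category is unknown, every translation remains a candidate.
    return category, FRENCH.get(category, FRENCH["noun"] + FRENCH["verb"])


if __name__ == "__main__":
    print(candidates_for_use("You must not use the printer."))
    print(candidates_for_use("The use of the printer is restricted."))
```

A real system would rely on a full syntactic analysis instead of one preceding word, but the effect is the same: the grammatical context halves the list of candidate translations.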

Example 2

The word button, for example, can be either a verb or a noun.

More examples in:
Lexical Ambiguity.html

Example of structural ambiguity

It occurs when a word is assigned to more than one category in the grammar. For example, assume that the word cleaning is both an adjective and a verb in our grammar. This will allow us to assign two different analyses to the following sentence. One of these analyses will have cleaning as a verb, and one will have it as an adjective. In the former (less plausible) case the sense is `to clean a fluid may be dangerous', i.e. it is about an activity being dangerous. In the latter case the sense is that fluids used for cleaning can be dangerous. Choosing between these alternative syntactic analyses requires knowledge about meaning. It may be worth noting, in passing, that this ambiguity disappears when can is replaced by a verb which shows number agreement by having different forms for third person singular and plural. For example, the following are not ambiguous in this way:

We have seen that syntactic analysis is useful in ruling out some wrong analyses, and this is another such case, since, by checking for agreement of subject and verb, it is possible to find the correct interpretations. A system which ignored such syntactic facts would have to consider all these examples ambiguous, and would have to find some other way of working out which sense was intended, running the risk of making the wrong choice. For a system with proper syntactic analysis, this problem would arise only in the case of verbs like can which do not show number agreement.
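
The agreement check described above can be sketched as a filter over competing analyses. The sketch assumes a subject of the form cleaning fluids followed by can, is or are; these test inputs are reconstructed from the description and are an assumption of this sketch, as are the names ANALYSES, VERB_AGREEMENT and readings.

```python
# Two hypothetical analyses of the subject "cleaning fluids".
ANALYSES = [
    {"reading": "the activity of cleaning fluids",  # cleaning as a verb
     "cleaning": "verb", "subject_number": "singular"},
    {"reading": "fluids used for cleaning",         # cleaning as an adjective
     "cleaning": "adjective", "subject_number": "plural"},
]

# Which subject numbers each verb form agrees with.
VERB_AGREEMENT = {
    "is": {"singular"},
    "are": {"plural"},
    "can": {"singular", "plural"},  # modal: shows no number agreement
}


def readings(verb):
    """Keep only the analyses whose subject agrees in number with the verb."""
    allowed = VERB_AGREEMENT[verb]
    return [a["reading"] for a in ANALYSES if a["subject_number"] in allowed]


if __name__ == "__main__":
    for verb in ("can", "is", "are"):
        print(verb, "->", readings(verb))
```

With is or are only one reading survives, while can leaves both, so the choice has to be made on other grounds such as meaning.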

Another source of syntactic ambiguity is where whole phrases, typically prepositional phrases, can attach to more than one position in a sentence. For example, in the following example, the prepositional phrase with a Postscript interface can attach either to the NP the word processor package, meaning ``the word-processor which is fitted or supplied with a Postscript interface'', or to the verb connect, in which case the sense is that the Postscript interface is to be used to make the connection.

Notice, however, that this example is not genuinely ambiguous at all: knowledge of what a Postscript interface is (in particular, the fact that it is a piece of software, not a piece of hardware that could be used for making a physical connection between a printer and an office computer) serves to disambiguate. Similar problems arise in sentences that could mean either that the printer and the word processor both need Postscript interfaces, or that only the word processor needs them.

This kind of real world knowledge is also an essential component in disambiguating the pronoun it in examples such as the following

In order to work out that it is the printer that is to be switched on, rather than the paper, one needs to use the knowledge of the world that printers (and not paper) are the sort of thing one is likely to switch on.
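
How such world knowledge might be used to resolve a pronoun can be sketched as follows. The knowledge base is a hypothetical toy (a set of things that can plausibly be switched on), and the function name resolve_pronoun is an assumption of this sketch; real systems need far richer knowledge than a lookup table.

```python
# Hypothetical world knowledge: what kinds of thing a predicate applies to.
PLAUSIBLE_OBJECTS = {
    "switch on": {"printer", "computer", "scanner"},
}


def resolve_pronoun(candidates, predicate):
    """Keep the antecedent candidates that the predicate can plausibly apply to.

    When nothing is known about the predicate, all candidates remain possible.
    """
    known = PLAUSIBLE_OBJECTS.get(predicate)
    if known is None:
        return list(candidates)
    return [c for c in candidates if c in known]


if __name__ == "__main__":
    # Both nouns are grammatically possible antecedents of "it".
    print(resolve_pronoun(["paper", "printer"], "switch on"))
    print(resolve_pronoun(["paper", "printer"], "remove"))
```

The first call leaves only the printer; the second leaves both candidates, because the toy knowledge base says nothing about what can be removed.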

Lexical mismatches

English chooses different verbs for the action/event of putting on, and the action/state of wearing. Japanese does not make this distinction, but differentiates according to the object that is worn. In the case of English to Japanese, a fairly simple test on the semantics of the NPs that accompany a verb may be sufficient to decide on the right translation.
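
The "fairly simple test on the semantics of the NPs" can be sketched as a lookup on the semantic class of the object noun. The classes and the small noun list are hypothetical simplifications made for this sketch; the Japanese verbs (given in romanized form) are common choices for each class, but actual usage has more cases than are shown here.

```python
# Hypothetical semantic classes for a few English object nouns.
SEMANTIC_CLASS = {
    "shirt": "upper_body",
    "jacket": "upper_body",
    "trousers": "lower_body",
    "shoes": "lower_body",
    "hat": "head",
    "glasses": "eyewear",
}

# Japanese verb (romanized) chosen according to the class of the thing worn.
JAPANESE_VERB = {
    "upper_body": "kiru",
    "lower_body": "haku",
    "head": "kaburu",
    "eyewear": "kakeru",
}


def japanese_wear_verb(object_noun):
    """Return the Japanese verb for wearing or putting on the given object,
    or None when the noun is not in the toy lexicon."""
    semantic_class = SEMANTIC_CLASS.get(object_noun)
    return JAPANESE_VERB.get(semantic_class)


if __name__ == "__main__":
    for noun in ("shirt", "shoes", "hat", "glasses", "umbrella"):
        print(noun, "->", japanese_wear_verb(noun))
```

The English verb (wear or put on) plays no part in the choice; only the object does, which is why the test can stay simple in this direction.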

Example 1

A particularly obvious example of this involves problems arising from what are sometimes called lexical holes --- that is, cases where one language has to use a phrase to express what another language expresses in a single word. Examples of this include the `hole' that exists in English with respect to French ignorer (`to not know', `to be ignorant of').

Example 2

Some of the colour examples are similar, but more generally, investigation of colour vocabulary indicates that languages actually carve up the spectrum in rather different ways, and that deciding on the best translation may require knowledge that goes well beyond what is in the text, and may even be undecidable. In this sense, the translation of colour terminology begins to resemble the translation of terms for cultural artifacts (e.g. words like English cottage, Russian dacha, French château, etc., for which no adequate translation exists, and for which the human translator must decide between straight borrowing, neologism, and providing an explanation). In this area, translation is a genuinely creative act which is well beyond the capacity of current computers.

Example 3

Example where one language has to use a phrase to express what another language expresses in a single word.
Se suicider (`to suicide', i.e. `to commit suicide', `to kill oneself')
The problems raised by such lexical holes have a certain similarity to those raised by idioms: in both cases, one has phrases translating as single words.

Structural mismatches

One kind of structural mismatch occurs where two languages use the same construction for different purposes, or use different constructions for what appears to be the same purpose.
Cases where the same structure is used for different purposes include the use of passive constructions in English and Japanese. In the example below, the Japanese particle wa, which we have glossed as `TOP' here, marks the `topic' of the sentence --- intuitively, what the sentence is about.

Example 1

The example indicates that Japanese has a passive-like construction, i.e. a construction where the PATIENT, which is normally realized as an OBJECT, is realized as SUBJECT. It is different from the English passive in the sense that in Japanese this construction tends to have an extra adversive nuance, which might make the example rather odd, since it suggests an interpretation where Mr Satoh did not want to be elected, or where election is somehow bad for him. This is not suggested by the English translation, of course. The translation problem from Japanese to English is one of those that looks unsolvable for MT, though one might try to convey the intended sense by adding an adverb such as unfortunately. The translation problem from English to Japanese is, on the other hand, within the scope of MT, since one must just choose another form. This is possible, since Japanese allows SUBJECTs to be omitted freely, so one can say the equivalent of elected Mr Satoh, and thus avoid having to mention an AGENT.

Example 2 :

This example shows how English, German and French choose different methods for expressing `naming'. The other two examples show one language using an adverbial ADJUNCT (just, or Dutch graag, `likingly' or `with pleasure'), where another uses a verbal construction.

Example 3 :

These representations are relatively abstract (e.g. the information about tense and aspect conveyed by the auxiliary verb have has been expressed in a feature), but they are still rather different. In particular, notice that while the main verb of (a) is see, the main verb of (b) is venir-de.

Example 4:

A slightly different sort of structural mismatch occurs where two languages have `the same' construction (more precisely, similar constructions, with equivalent interpretations), but where different restrictions on the constructions mean that it is not always possible to translate in the most obvious way. The following is a relatively simple example of this.

Idiomatic expressions

Idioms are expressions whose meaning cannot be completely understood from the meanings of the component parts.

Example 1

`If Sam kicks the bucket, her children will be rich' means `If Sam dies, her children will be rich': kick the bucket is an idiom.

Example 2

A typical way in which idioms can vary is in the form of the verb, which changes according to tense, as well as person and number. For example, with bury the hatchet (`to cease hostilities and become reconciled'), one gets He buries/buried/will bury the hatchet, and They bury/buried/shall bury the hatchet.
Notice that variation in the form one gets here is exactly what one would get if no idiomatic interpretation was involved --- i.e. by and large idioms are syntactically and morphologically regular --- it is only their interpretations that are surprising.
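
Because idioms inflect regularly, one way to recognise them is to compare lemmas rather than surface forms. The sketch below illustrates this with a hypothetical mini-lexicon; the names LEMMAS, IDIOMS and find_idioms are assumptions of this sketch, and a real system would use a proper morphological analyser instead of a hand-written table.

```python
# Hypothetical mini-lexicon mapping inflected forms to their lemma.
LEMMAS = {"buries": "bury", "buried": "bury", "burying": "bury", "bury": "bury"}

# Idioms stored as lemma sequences, with the meaning quoted in the text.
IDIOMS = {
    ("bury", "the", "hatchet"): "cease hostilities and become reconciled",
}


def find_idioms(sentence):
    """Return (matched words, meaning) for each idiom found in the sentence."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    lemmas = [LEMMAS.get(t, t) for t in tokens]
    found = []
    for idiom, meaning in IDIOMS.items():
        size = len(idiom)
        for start in range(len(lemmas) - size + 1):
            if tuple(lemmas[start:start + size]) == idiom:
                found.append((" ".join(tokens[start:start + size]), meaning))
    return found


if __name__ == "__main__":
    print(find_idioms("He buried the hatchet."))
    print(find_idioms("They will bury the hatchet."))
    print(find_idioms("He buried the treasure."))
```

A single lexicon entry covers every tense and person of the idiom, while the literal sentence about the treasure is left alone.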

Example 3

A second common form of variation is in the form of the possessive pronoun in expressions like to burn one's bridges (meaning `to proceed in such a way as to eliminate all alternative courses of action'). This varies in a regular way with the subject of the verb.
More examples of idioms in: Idioms.html

Collocations

Example 1

Here the meaning can be guessed from the meanings of the parts. What is not predictable is the particular words that are used.
For example, the fact that we say rancid butter, but not *sour butter, and sour cream, but not *rancid cream, does not seem to be completely predictable from the meaning of butter or cream and the various adjectives.

Example 2

The choice of take as the verb for walk is not simply a matter of the meaning of walk (for example, one can either make or take a journey).

One may think that make, in make an attempt, has little meaning of its own, and serves merely to `support' the noun (such verbs are often called light verbs, or support verbs). This suggests one can simply ignore the verb in translation, and have the generation or synthesis component supply the appropriate verb. For example, in Dutch, this would be doen, since the Dutch for make an attempt is een poging doen (`do an attempt').
One way of doing this is to have analysis replace the lexical verb (e.g. make) with a `dummy verb' (e.g. VSUP). This can be treated as a sort of interlingual lexical item, and replaced by the appropriate verb in synthesis (the identity of the appropriate verb has to be included in the lexical entry of nouns, of course --- for example, the entry for poging might include the feature support_verb=doen). The advantage is that support verb constructions can be handled without recourse to the sort of rules required for idioms (one also avoids having rules that appear to translate make into doen `do').
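
The VSUP treatment described above can be sketched as a two-step pipeline: analysis replaces the support verb with the dummy VSUP, and synthesis reads the right verb from the noun's lexical entry. The dictionaries and function names below are hypothetical and cover only the make an attempt example; only the feature support_verb=doen for poging comes from the text.

```python
# Hypothetical lexical entries. Each noun records its own support verb.
ENGLISH_NOUNS = {
    "attempt": {"concept": "ATTEMPT", "support_verb": "make"},
}
DUTCH_NOUNS = {
    "ATTEMPT": {"noun": "poging", "article": "een", "support_verb": "doen"},
}


def analyse(phrase):
    """Analyse an English 'verb article noun' phrase, replacing a support
    verb with the dummy verb VSUP."""
    verb, _article, noun = phrase.split()
    entry = ENGLISH_NOUNS[noun]
    if verb == entry["support_verb"]:
        verb = "VSUP"
    return {"verb": verb, "concept": entry["concept"]}


def synthesise_dutch(representation):
    """Generate Dutch, taking the support verb from the noun's entry."""
    if representation["verb"] != "VSUP":
        raise ValueError("this sketch only handles support verb constructions")
    entry = DUTCH_NOUNS[representation["concept"]]
    # Dutch word order in the citation form: article, noun, verb.
    return f'{entry["article"]} {entry["noun"]} {entry["support_verb"]}'


if __name__ == "__main__":
    intermediate = analyse("make an attempt")
    print(intermediate)
    print(synthesise_dutch(intermediate))
```

No rule ever pairs make with doen directly; the verb is chosen by the target noun, which is the advantage the text points out.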
More examples of collocations in: Collocations.html

6. MACHINE TRANSLATIONS II

The most usual interpretations of the term "Machine translation" (MT)

The term machine translation (MT) is normally taken in its restricted and precise meaning of fully automatic translation.
We can consider the whole range of tools that may support translation and document production in general, which is especially important when considering the integration of other language processing techniques and resources with MT.

Fully Automated Machine Translation (FAMT): MT performed without the intervention of a human being during the process.
Human-Assisted Machine Translation (HAMT): the style of translation in which a computer system does most of the translation, appealing in case of difficulty to a (mono- or bilingual) human for help.
Machine-Aided Translation (MAT): the style of translation in which a human does most of the work but uses one or more computer systems, mainly as resources such as dictionaries and spelling checkers, as assistants.

The meaning of FAHQT and ALPAC in the evolution of MT

FAHQT

It became widely assumed that the goal of MT must be the development of fully automatic systems producing high quality translations. The use of human assistance was regarded as an interim arrangement: post-editing should wither away as systems improved. The emphasis of research was therefore on the search for theories and methods for the achievement of 'perfect' translations.

There were of course dissenters from the dominant 'perfectionism'. Researchers at Georgetown University and IBM were working towards the first operational systems, and they accepted the long-term limitations of MT in the production of usable translations. More influential was the well-known dissent of Bar-Hillel. In 1960, he published a survey of MT research at the time which was highly critical of the theory-based projects, particularly those investigating interlingua approaches, and which included his demonstration of the non-feasibility of fully automatic high quality translation (FAHQT) in principle.

ALPAC

The best known event in the history of machine translation is without doubt the publication in November 1966 of the report by the Automatic Language Processing Advisory Committee (ALPAC 1966). Its effect was to bring to an end the substantial funding of MT research in the United States for some twenty years. While the ALPAC report brought to an end many MT projects, it did not banish the public perception of MT research as essentially the search for fully automatic solutions. The subsequent history of MT is in part the story of how this mistaken emphasis of the early years has had to be repaired and corrected. The neglect of the translation profession has been made good eventually by the provision of translation tools and translator workstations.
The impact of ALPAC is undeniable. Such was the notoriety of its report that from time to time in the next decades researchers would discuss among themselves whether "another ALPAC" might not be inflicted upon MT.
More information about ALPAC at:
http://ourworld.compuserve.com/homepages/WJHutchins/Alpac.htm
See:
References of Alpaca.html

The major methods, techniques and approaches of FAHQT and ALPAC

The main methods that machine translation uses can be divided into two groups:

1. Rule-based Machine Translation (RBMT)
2. Analogy-based Machine Translation (ABMT)
Analogy-based MT .HTML
Ruled-based MT.html

In Rule-based Translation there are two methods that use intermediate representations. Because of that, they are known as indirect methods in Rule-based Machine Translation.
Direct methods do not use intermediate representations, and the translation is done in a single step.
There are two approaches among the indirect methods:

The change of focus in the nineties was favoured by the falling prices of computers and storage units. In the 1990s, the corpus-based paradigm, with stochastic and example-based methodologies, is the focus of much activity. After 1990, Rule-based translation started losing supporters to Analogy-based translation. Four aspects of this focus are:

Since 1994, a new generation of research MT systems has been investigating various hybridizations of statistical and symbolic techniques.

The place where MT was 9 years ago

Within the last ten years, research on spoken translation has developed into a major focus of MT activity. Of course, the idea or dream of translating the spoken word automatically was present from the beginning (Locke 1955), but it has remained a dream until now. Research projects such as those at ATR, CMU and on the Verbmobil project in Germany are ambitious. But they do not make the mistake of attempting to build all-purpose systems. The constraints and limitations are clearly defined by definition of domains, sublanguages and categories of users. That lesson has been learnt. The potential benefits even if success is only partial are clear for all to see, and it is a reflection of the standing of MT in general and a sign that it is no longer suffering from old perceptions that such ambitious projects can receive funding.

Assessment of the Contribution of Machine Translation to Linguistic Diversity on the Internet

New directions and foreseeable breakthroughs of MT in the short term.

Some of the most ambitious projects at present are those involving spoken language translation. These systems are inevitably highly restricted in domain and range. The Japanese ATR project, underway already for seven years and set to continue into the next century, is a system for registration by telephone at international conferences and for hotel booking by telephone. The German Verbmobil project aims to develop a transportable aid for face to face English-language commercial negotiations by Germans and Japanese who do not know English fluently. The JANUS project - a collaboration involving ATR, Carnegie Mellon and Karlsruhe University - is also restricted to conference registration negotiations. Each group is developing speech recognition and speech synthesis modules for their own languages (Japanese, English, German) and the translation programs linking their language to the other two. A successful public demonstration of an early prototype of JANUS was given in January 1993.

A further feature of the last five years is the recognition of a demand for types of translations which have not previously been considered. In the past, systems were built generally for bilingual users, for translators and for those knowing both source and target languages. The needs of those not knowing the target language were neglected, e.g. businessmen engaged in foreign trade needing to communicate fairly simple standard messages in an unknown language (e.g. confirmation of an order, booking of accommodation, etc.). In recent years, there have been experiments on `dialogue-based MT', where the text to be translated is composed in a collaborative process between man and machine, i.e. another approach to `control' of input. In this way it is possible to construct a text which the system is known to be capable of translating without further reference to the author, which needs no revision, and for which good quality output can be assured.

An important limitation affecting the possible uses of Machine Translation technology, already referred to in the first section, concerns the quality of the translations it is able to provide. "Quality of translation" refers to how accurately an MT application performs a number of technically distinguishable tasks which together make up the complex process of translation: appropriate interpretation of the source text, production of a correct, intelligible target text, and of course perfect, or at least satisfactory, equivalence of meaning between the original text and the translation. For most purposes it is fair to say that the quality we would ideally like to obtain in a machine translation would be such that it is indistinguishable from what an expert human translator would produce.

7. ASSESSMENT OF THE CONTRIBUTION OF MACHINE TRANSLATION TO LINGUISTIC DIVERSITY ON THE INTERNET

Internet's essential features

The Internet is a channel allowing information to be transmitted or stored. The essential features of this channel can be summed up in four points: its operation is efficient; its extension is global; its use is flexible; and its form is electronic.

The role of minority languages on the Internet (Catalan, Basque...)

They make the Internet more multilingual, and they influence the machines that are to be developed to facilitate communication between speakers of different languages.

The Internet in its present form has many advantages for minority language communities, as indeed for all small communities. We should remember, however, that some minority-language areas coincide with European Objective 1 areas, that is to say, they are the poorest in the EU in terms of average income, and are therefore likely to have fewer personal computers.

There are many courses teaching minority languages on the Internet. The most ambitious is likely to be HABENET, a three-year project for teaching Basque on the Internet, costing some 1.8m euros. Internet courses in minority languages have new possibilities but also face new challenges. Most face-to-face courses and course materials for learning minority languages assume a knowledge of the local majority language, and this is undoubtedly where the main demand will be, on and off the Internet. But it seems to us that there would also be room to develop a multi-media language-learning package that was language-independent or language-adaptable so far as the language of instruction went. Such a course would make each language approachable from any other language, at least at an elementary level.

The ways in which Machine Translation can be applied on the Internet

For synchronous communication (e.g. chat in real time) to be possible across languages, translation itself should be synchronous. This narrows down the options regarding translation procedure to two: an on-line human interpreter (theoretically feasible, but extremely expensive) or fully automatic machine translation. With MT technology at present and for the foreseeable future, the quality of automatic translation performance in such an environment will be imperfect, except in those cases where the language or domain of communication is restricted and the translation system appropriately specialized.

We shall need to make distinctions when speaking of the Internet or Machine Translation. The Internet, unlike traditional media, serves a number of quite different purposes. Machine Translation, too, is not a single process which either succeeds or does not succeed by some single absolute standard. Different systems of machine translation may be suited to different user requirements. The use that can be made of Machine Translation on the Internet in turn depends on the underlying range of language resources available to a given language. The ultimate purpose of making these distinctions is so as to match particular uses and technologies with the particular needs of the smaller language communities. But first we must give an overview of the present state of the Internet and of Machine Translation, together with our estimate of future developments - for these are fast-moving fields.

Non-specialists thinking about Machine Translation (MT) often apply one of two widespread but mistaken notions concerning the nature of translation itself. According to the "naive fallacy", translation is a straightforward matter of substituting for each word in the source language the corresponding word in the target language; thus the ability to translate merely consists of "knowing all the words". According to the "erudite fallacy", on the other hand, translation is such a dauntingly complex and subtle task that accurate translation is almost beyond even the expert human, while for a machine to translate reliably is inconceivable.

CONCLUSION

We have seen the development of the Information Society and the importance of having the skills and tools to manage and organize knowledge.
It is clear that word processors, on-line dictionaries and all sorts of background documentation, such as concordances or collated texts, as well as e-mail and other ways of network interaction with colleagues anywhere in the world, may substantially help the literary translator's work. On the other hand, we must not forget that translating is a difficult task, and if a translation is done by a machine, problems such as lexical ambiguity and collocations can appear.
In my opinion, for a good translation and for the development of real machine translation, there are three conditions to follow:

An important topic that the report has treated is multilinguality and minority languages, together with the way documentation is becoming electronic. Nowadays the Internet in its present form has many advantages for minority language communities, as indeed for all small communities.
I believe that minority languages make the Internet more multilingual, and that they influence the machines that are to be developed to facilitate communication between speakers of different languages. In my opinion, knowing a language improves a person's culture.

REFERENCES

http://www.bib.uc3m.es/~mendez/politicas/europaframe.htm Feb 2003
Information.html Feb 2003
http://www.slis.indiana.edu/TIS/contact/history.html Feb 2003
http://www.slis.indiana.edu/TIS/ Feb 2003
Topics.html Feb 2003
http://www.cordis.lu/ist/ Feb 2003
http://europa.eu.int/information_society/newsroom/documents/catalogue_en.pdf Feb 2003
http://www.ics.uci.edu/~kling/tistopic.html Feb 2003
Language Technology.html March 2003
http://www.unesco.org/webworld/observatory/indexs.html Feb 2003
http://europa.eu.int/iformation_society/index
Information overload.html Feb 2003 (On this page we can see how Peter Guilford, a spokesman for the European Commission in Brussels, sometimes feels that he is suffering from information overload.)
Information fatigue syndrome.html
Strategies for dealing with information and solution.html
http://www.hltcentral.org/page-842.0.shtml Feb 2003
Human Languaje Technologies.html Feb 2003
Projecs.html Feb 2003
http://www.hltcentral.org Feb 2003
Globalization.html
http://www.globalization.com/index.cfm?MyCatID=3&My SubcatID=11&pageID=1312l
http://java.sun.com/docs/books/tutorial/i18n/intro/index.html
http://cslu.cse.ogi.edu/HLTsurvey (1996) Editorial Board: Ronald A. Cole (Editor in Chief), Joseph Mariani, Hans Uszkoreit, Annie Zaenen, Victor Zue. Managing Editors: Giovanni Battista Varile, Antonio Zampolli
Lexical Ambiguity.html March 2003
Idioms.html
Collocations.html
http://www.ilc.pi.cnr.it
Project funded by the European Commission
http://people.vanderbilt.edu/~i.fernandez/lino/index.html.
http://www2.echo.lu/langeng/en/infoage.html
References of Alpaca.html
Analogy-based MT .HTML
Ruled-based MT.html
http://ourworld.compuserve.com/homepages/WJHutchins/Alpac.htm

On paper: