Can you imagine having fun with your computer?
This report tries to summarize the importance of new technologies in our society and, at the same time, how they are developing. Instead of giving a lot of general information, it focuses on speech recognition and speech synthesis.
The objective is to bring the reader closer to these technologies by looking directly at projects currently being developed in universities and in established companies.
This is an interesting current topic for everyone, because by following the links to the web demonstrations suggested here, readers can form a picture of future technologies.
Enjoy the trip!
The internet is creating a need for people to manage information from all over the world in many different languages. Some people know several languages, but there are too many on the internet to know them all, and all of them are important.
New technologies for multilingual information access are currently being developed. This may raise some ethical and moral questions about the native culture of each user.
However, this need not be a problem, because multilingual technology allows us to share the world's knowledge and resources while preserving our individual human qualities.
The PC became a powerful general-purpose machine when the Graphical User Interface matured. Now, youngsters are growing up with an intuitive understanding of pointing and clicking, menus, and the desktop metaphor.
The Voice User Interface—powered by speech recognition, text-to-speech synthesis, and speaker authentication—will be the way we all think of telephones and wireless devices in a few years. The unfriendly image of the touch-tone pad will dissolve into the accommodating image of a virtual assistant at the other end of the line.
Human language technologies are based on written or spoken language, and they facilitate communication between humans, machines and information sources.
This communication does not rely only on spoken or written language; it also includes combinations of language and speech with other means (pictures, sounds, facial expressions, gestures and many others, depending on the technological system).
Text-to-speech conversion is a language technology that lets a computer convert written text into sound, as similar as possible to the voice of a person reading that text aloud.
This makes it possible, for example, to access any digital text, such as a web page or an email, by telephone. It is also very helpful for people with special needs who use computer systems.
Interaction between people and computers requires three components: one that processes spoken utterances, another that lets the computer give spoken information, and a last one that manages the communication.
- The first task uses voice recognition: a computer system converts speech into a symbolic representation. (For example, you can talk to the machine and the written text appears on the screen.)
- The second component performs exactly the opposite operation: it converts written text into speech (text-to-speech).
- Finally, the "conversation" between the person and the machine is handled by the dialog manager, which supervises the sequence of questions and answers.
These technologies are integrated into dialog systems or voice recognition systems that let users obtain information or carry out transactions using only their voice and the help of a computer system.
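The way the three components fit together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `recognize` and `synthesize` are hypothetical stand-ins that work on plain text instead of audio, and `DialogManager` is an invented, very simple manager that asks a fixed list of questions. None of these names come from any of the systems described in this report.

```python
from dataclasses import dataclass, field
from typing import Optional


def recognize(utterance: str) -> str:
    """Hypothetical stand-in for speech recognition: text in, normalized text out."""
    return utterance.strip().lower()


def synthesize(text: str) -> str:
    """Hypothetical stand-in for text-to-speech: marks the text that would be spoken."""
    return f"[spoken] {text}"


@dataclass
class DialogManager:
    """Supervises the sequence of questions and stores each answer."""
    questions: list                      # list of (slot name, question text) pairs
    answers: dict = field(default_factory=dict)

    def next_prompt(self) -> Optional[str]:
        # Return the first question that has no answer yet, or None when done.
        for slot, question in self.questions:
            if slot not in self.answers:
                return question
        return None

    def handle(self, recognized_text: str) -> None:
        # Store the recognized text as the answer to the first open question.
        for slot, _ in self.questions:
            if slot not in self.answers:
                self.answers[slot] = recognized_text
                return


def run_dialog(manager: DialogManager, user_utterances: list) -> list:
    """Alternate machine prompts and user answers until the questions run out."""
    transcript = []
    for utterance in user_utterances:
        prompt = manager.next_prompt()
        if prompt is None:
            break
        transcript.append(synthesize(prompt))   # machine speaks (text-to-speech)
        manager.handle(recognize(utterance))    # machine listens (recognition)
    return transcript
```

In a real system the two stand-in functions would be replaced by an actual recognizer and synthesizer, while the dialog manager would keep the same role of deciding what to ask next.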
The Macintosh system:
You do not need to be a scientist to know that the computer of the future will talk, listen and understand. Even today, however, some technologies, such as the one on the Apple Macintosh, let you hold a "conversation" with the machine. You give voice commands to the computer, and it speaks back to you in English.
This is a very good example of speech recognition. You can ask "What time is it?" and hear whether you're late for dinner. You can open your spreadsheet by saying "open the September forecast". Or you can use voice commands to control the keyboard in your favourite game. The possibilities are as many as the human imagination allows.
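Once speech has been turned into text, mapping that text to an action can be quite simple. The sketch below uses the two example phrases above; it is not Apple's implementation, and the function names (`tell_time`, `open_document`, `handle_command`) are hypothetical. It assumes the recognizer has already produced a text string.

```python
import re
from datetime import datetime


def tell_time(now: datetime) -> str:
    """Build the spoken answer to a time question."""
    return f"It is {now:%H:%M}."


def open_document(name: str) -> str:
    """Pretend to open a document; a real system would launch an application."""
    return f"Opening {name}."


def handle_command(text: str, now: datetime) -> str:
    """Map recognized text to an action and return the reply to be spoken."""
    # Normalize case and drop trailing punctuation added by the recognizer.
    normalized = text.strip().lower().rstrip("?.!")
    if normalized == "what time is it":
        return tell_time(now)
    # Accept "open X" and "open the X".
    match = re.fullmatch(r"open (?:the )?(.+)", normalized)
    if match:
        return open_document(match.group(1))
    return "Sorry, I did not understand."
```

The hard part of a real system is the recognition step itself; this sketch only covers what happens after the words are known.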
Apple Speech Recognition lets your Macintosh understand what you say, and it lets you see the computer a little differently from other machines, because you are in charge and give all the orders using only your voice.
The system has been developed with attention to detail. This means that you can speak naturally without stopping and the system will understand you. (This is a recent advance: with older systems it was necessary to speak very clearly, with many pauses, and even then the computer did not understand you from the first word.)
Speech recognition in Microsoft:
Voice recognition is currently one of the most actively researched fields in computing. Its development may be very important for future communication with computers.
Microsoft has been working on it, and the result is the "Microsoft .NET Speech SDK".
This kit will allow developers to create web applications with voice and images that are easy to maintain or change. These applications will bring profits and business opportunities. Companies will be able to:
-Reduce the cost of building voice response systems (IVR).
-Extend telephone applications (GUI).
-Make use of new web technologies to generate new business opportunities.
-Increase the number of customers by letting them access information on the web from a telephone, a computer or a mobile phone.
-Increase employee productivity and customer satisfaction.
The development of the new technologies:
There are many examples of new technologies on the internet. I have chosen some projects, products and web pages that explain each system in detail.
The development of text translation and voice recognition is a topical subject in our society, so I have tried to find interesting interactive demonstrations and their explanations.
* Eutrans Project:
The Eutrans Project is based on the use of techniques for developing machine translation systems that accept written text and/or speech input.
The project is divided into two main phases. The first one has already been completed; in it, prototypes were developed for text and speech input. The demonstrations tend to integrate acoustic-phonetic, lexical and syntactic models.
"Verbmobil is a speaker-independent and bidirectional speech-to-speech translation system for spontaneous dialogs in mobile situations." With this software system you do not need to know your listener's language. When the speaker talks, the system recognizes the spoken input, analyses it and translates it for the other person. With a mobile phone, users get a simultaneous dialog interpretation service for certain topics. Bidirectional translation is carried out in three major languages: German, English and Japanese. Verbmobil produces correct translations roughly 80% of the time, and its success rate is about 10% higher for dialog tasks.
Verbmobil currently tries to translate by understanding the context, unlike other projects that translate sentence by sentence. The dialog context helps to resolve ambiguities and to produce a good translation for the situation.
A presentation of the project is available here-> http://www.dfki.de/~wahlster/vm-final/ppframe.htm
The InterNostrum project builds an automatic translation system between languages such as Spanish and Catalan.
A demonstration of the project is available here-> http://www.torsimany.ua.es/
The Janus project is based on translating spoken language from one language to another, like a real interpreter. It is very useful for many administrative tasks, for example hotel reservations or travel planning.
Its objective is to make human communication easier. Moreover, Janus can give the user additional information about the subjects that interest them (train stations, bus stops, maps...).
The Janus project is built from systems that work with the voice (the keyboard is unnecessary), although speech recognition can sometimes work better when combined with other input modes such as handwriting and gesture recognition.
Many factors affect the translation, such as noise and fluent speech, so the system always tries to interpret sentences in the best possible way (not literally).
The vocabulary involved is roughly 3000 to 5000 words, and the system runs in less than twice real time.
Specifically, the Janus project covers English, German, Japanese, Korean and Spanish.
A group from the Universities of Vigo and Santiago de Compostela has developed a bilingual text-to-speech system for Galician and Spanish.
A small demonstration that converts written text into speech is accessible on the web. -> http://www.gts.tsc.uvigo.es/cotovia/
A research group at the University of Granada has developed an automatic ordering system in natural language called SAPLEN. The system is based on a set of rules and goals that try to make it behave as much as possible like a counter clerk in a fast-food restaurant.
A demonstration of the project -> http://ceres.ugr.es/info/~ramon/saplen.htmln
The Dihana project studies topics such as design methodology and the development of a dialog system for information access using spontaneous speech in different environments.
It focuses on design options that make the system robust to spontaneous speech, to different telephone channels, and to non-telephone environments such as cars.
In particular, the Dihana project aims to improve knowledge about speech and to develop technologies for better speech processing, speech comprehension and dialog management.
*Global Strike Team (GST):
A web page called "Con ciencia y tecnología" has an article about videogames with voice recognition:
Vivendi Universal Games (VU Games) put ScanSoft's voice recognition system into its most recent SWAT game, Global Strike Team (GST), for the PlayStation 2.
Players can use their voice to give orders to the computer instead of using manual controls. The game has many orders, depending on the action. They fall into several categories:
-Team commands: used to give orders to the members of the team.
-Order commands: these state the missions that the players must carry out.
-Compliance commands: used to secure hostages and suspects and to make them drop their weapons.
Advantages: the "ScanSoft ASR 1600 SDK" increases player interactivity and improves the whole game experience. Voice recognition is well suited to carrying out complex operations just by speaking.
Systran products offer state-of-the-art translation technology for needs ranging from personal to business. The features depend on the version. In general, though, they can translate texts, web pages, messages, emails, documents... Systran's main features are fast translation, a user-friendly interface, an intuitive dictionary manager and side-by-side display of the translation and the original document.
A web address for trying out the Systran product is available -> http://www.systranbox.com/systran/box
Voice recognition, a good solution in ultrasound scans:
Philips has developed a new generation of products for diagnostic imaging using ultrasound. The iU22 system offers many advantages, for example real-time diagnosis in four dimensions, control and record management by voice recognition, and image-enhancement technologies.
Although the voice recognition system has not yet been properly evaluated (this has to be done with users), all the doctors who have used it are very satisfied.
On "Entradas.com" it is very easy to buy tickets for a variety of entertainment (cinema, theatre, football, bullfights...) through automatic call centres and the internet. The website also gives real-time information on the latest programme changes, showing the times, the position of the seats, the city in which the show is available...
Go to www.entradas.com
This web page has many programs that can be very useful on your computer. I visited the page and it is very easy to search for what you need, find it and then download it (paid or freeware). There are some interesting programs for voice recognition and for translating text from one language to another.
If you would like to download some of those programs, go to www.softonic.com
There are many bank web pages. I have chosen the BBK web page because I am one of its users. It is a very clear example of the development of new technologies.
It is very convenient to check the money on your credit card using only its number and a secret password. The bank gives you the password so that you can carry out several operations with your money (such as viewing your latest purchases in shops or making a transfer to another account).
To enter BBK go to www.bbk.es
This is a very complete page, a paradise for cinema fans. When you are not sure about the film you are going to see, here you can find a summary of each film, its running time, the actors, the nearest cinemas in your city, the screening days...
Visit it at www.queponen.com
In a forum at the University of Deusto some weeks ago, a speaker discussed the advantages and disadvantages of the development of the internet.
I found many positive things in the development of the internet and, likewise, of new technologies. We must always be aware that some people will do illegal things on the web, but I trust the police to do their job as well as possible.
On the other hand, the improvements brought by the internet and new technologies are endless. You can obtain information on any topic by voice recognition, understand any language quickly through translation by phone, have your medical results in your hands as soon as possible, check your money without going to the bank, or buy tickets without going to the theatre...
I find the fast development of technology in recent years exciting. For this reason the work of today's research groups is very important, because... Can you imagine yourself speaking with your computer?
Thanks to all the internet sites that provided up-to-date information about these topics. Thanks also to ESIDE for letting us use the computers in its classrooms to write the report.
http://sirio.deusto.es/abaitua/ Facultad de Filosofía y Letras - Universidad de Deusto (2003)
http://www.tmaa.com TMA Associates (2003)
http://www.apple.com/ Apple Computer, Inc.(2004)
http://liceu.uab.es/ Joaquim Llisterri, Universitat Autònoma de Barcelona (2003)
http://www.hltcentral.org/ Eutrans Project (2003)
http://verbmobil.dfki.de/ww.html Wolfgang Wahlster, DFKI GmbH, Saarbrücken, Germany (2000)
www.dlsi.ua.es/ InterNostrum, Depto. de Lenguajes y Sistemas Informáticos (2004)
http://www.is.cs.cmu.edu/js/janus.html Janus, Céline Morel
http://www.gts.tsc.uvigo.es/cotovia/ Cotovía, Campus Universitario Lagoas - Marcosende
http://ceres.ugr.es/ Saplen, Grupo Investigación en Señales, Telemática y Comunicaciones (2003)
http://www.ayacnet.com/ GST, ConCiencia y Tecnología - AyacNet México (2004)
http://www.systranbox.com/ Information and translation Technologies (2002)
www.dihana.upv.es Sistema de Diálogo para el Acceso a la Información mediante habla espontánea en diferentes entornos (2003)
http://www.microsoft.com Microsoft Ibérica S.R.L. (2003)
http://www.diariomedico.com/ Grupo de comunicación (2003)
Janire Salvador López 2004