In July 2003, Microsoft released the first public beta of its Speech Server, which allows computers to better handle spoken commands. The technology will be built not only into computers but also into phones, MP3 players, and cars. Spoken commands are already used in many cellular devices, but with the updates Speech Server should allow fewer flaws in human-to-computer interaction, not only on these devices but also on personal computers.
Speech technology of this kind is anything but cheap. Kai-Fu Lee, vice president of Microsoft’s speech technologies group, stated, “Automated response systems such as those used by many airlines can cost as much as $1 million – too expensive for the bulk of the business market.” Regardless, several institutions such as IBM are now letting customers perform simple banking transactions through spoken requests, both to advance the technology in their businesses and to make banking easier for customers.
This technology is also used in language translation. There are now computers that can facilitate conversation between speakers of different languages. These computers have generally been used in the business market. For example, Americans doing business with the Chinese would be able to translate business deals from one language to the other. Additionally, software maker ScanSoft is expanding its deal with IBM by adding RealSpeak text-to-speech software. This program can read entered text aloud in 20 different languages. Another IBM project is a system intended to transcribe speech into text more accurately than humans can. At the moment, the error rate of machines when translating language is 10 times higher than that of the average human being.
One challenge for these computers is the way humans talk to each other. For example, people use words such as “yep,” “ya,” and “uh-huh,” all meaning “yes.” Slang and other variables such as speaking speed and accent can also cause problems for the computers. As a result, when the computers respond to people, they speak a version of broken English.
Sources: “Web speech spec gets tongue-tied”
Paul Festa, news.cnet.com, January 29, 2003
“IBM, ScanSoft pair up for speech software”
Lisa M. Bowman, news.cnet.com, April 25, 2003
“Talking Computers Nearing Reality”
Michael Kanellos, news.cnet.com, July 9, 2003
Student Researchers: Emily Bichler, Jordan Niespodziany, JB McCollum
Faculty Instructor: Kevin Howley, Ph.D.
Evaluator: Scott Thede, Ph.D.