DOCOMO is about to launch an on-demand translator-phone service, showing its increasing ability to deliver high-value services through cloud solutions. With this mobile cloud service, a customer simply speaks into their phone, and at the other end the receiver hears the message promptly interpreted into the language of their choice.
Trials have shown that processing takes about two seconds on average, fast enough for a reasonably natural conversation under most circumstances. In theory, this will make it possible for two people to have a conversation without understanding each other’s language. DOCOMO and some 400 monitors are testing the service in Japan through March 2012, with tourist facilities, retail companies and hospitals also participating. The trial system, which interprets between Japanese and English, understands what users say in Japanese with about 90 percent accuracy. Accuracy for English is currently about 80 percent.
“Since neither voice recognition nor interpretation is at 100%, we still want to improve accuracy. Nevertheless, we are already considering scenarios in which customers could accept a certain level of inaccuracy,” said Hideharu Suzuki, a manager at DOCOMO’s research and development center. Suzuki hopes that someday the service will interpret conversations instantaneously. For now, however, DOCOMO intends to propose applications and observe how customers themselves apply mobile auto-interpretation in their daily and professional lives.
It is difficult to say when this new technology will be available internationally, but it suggests that devices like the Babel Fish in The Hitchhiker's Guide to the Galaxy and the Translator Microbes in Farscape, which previously sounded like pure science fiction, might not be that far from reality. If all goes well, a commercial service will be offered to customers after fiscal 2012. Chinese and Korean services are already available to customers, and other languages will be introduced sequentially. Potential applications are envisioned in fields such as tourism, retail, health care and education.
The interpretation system runs in the cloud, where it integrates technologies for voice recognition, machine translation and voice synthesis with mobile communication. Because mobile devices have relatively limited processing power, the phone relies on the cloud to reach powerful interpretation machines and other server-based data-processing resources.
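The three stages described above (voice recognition, machine translation, voice synthesis) can be pictured as a simple server-side chain. The following Python sketch is purely illustrative and is not DOCOMO's implementation: the class name `InterpretationPipeline`, the stage functions and the tiny phrasebook are all hypothetical stand-ins, and the "audio" is just UTF-8 text so that the example runs without any speech libraries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical phrasebook standing in for a real machine-translation engine.
PHRASEBOOK: Dict[Tuple[str, str], Dict[str, str]] = {
    ("ja", "en"): {"こんにちは": "hello"},
    ("en", "ja"): {"hello": "こんにちは"},
}


def fake_recognize(audio: bytes, lang: str) -> str:
    """Stand-in for voice recognition: treats the 'audio' as UTF-8 text."""
    return audio.decode("utf-8").strip().lower()


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in for machine translation: phrasebook lookup, else unchanged."""
    return PHRASEBOOK.get((source, target), {}).get(text, text)


def fake_synthesize(text: str, lang: str) -> bytes:
    """Stand-in for voice synthesis: encodes the text back to bytes."""
    return text.encode("utf-8")


@dataclass
class InterpretationPipeline:
    """Chains the three stages that would run on the server side."""
    recognize: Callable[[bytes, str], str]
    translate: Callable[[str, str, str], str]
    synthesize: Callable[[str, str], bytes]

    def interpret(self, audio: bytes, source: str, target: str) -> bytes:
        text = self.recognize(audio, source)                 # speech -> text
        translated = self.translate(text, source, target)    # text -> text
        return self.synthesize(translated, target)           # text -> speech


pipeline = InterpretationPipeline(fake_recognize, fake_translate, fake_synthesize)
reply = pipeline.interpret("こんにちは".encode("utf-8"), "ja", "en")
print(reply.decode("utf-8"))
```

In a deployment like the one described, the handset would only capture and play back audio, while each stage would run on server hardware. Keeping the stages as separate callables reflects that division and would let any one of them be swapped out independently.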