Helping Save Languages By Teaching Machines to Speak
Go ahead, teach the Microsoft Translator Hub a new language (perhaps Hmong, a language in danger of extinction) or teach it some specialized words in Spanish (say, agricultural or engineering terms) – it’s entirely customizable.
And, just like a person learning a language, the Microsoft Translator Hub learns and improves its translation as it receives feedback, says Vikram Dendi, director of product strategy and marketing for Microsoft Translator.
“We wanted to push the envelope and change the conversation on how machine translation is perceived in business and enterprise today,” Dendi says. “People look at machine translation as being great for general purpose use, but don’t trust it for business docs or specific tasks. What we’re doing is changing all of that.”
At the Worldwide Partner Conference today, Microsoft announced the commercial availability of its Microsoft Translator Hub, which enables language communities (such as Hmong speakers), businesses, developers and partners not only to translate content, but also to build their own customized machine translation systems.
Whether it’s a business looking for a highly customizable translation tool or someone looking to preserve their native language, the Windows Azure-powered Microsoft Translator Hub – used in combination with the Microsoft Translator service – can help with translation across the Web, PCs and mobile devices.
Though machine translation has passed a number of milestones and transitions over the years at Microsoft, it was one particular moment two years ago that reminded some employees of the power translation tools could have.
Shortly after the 7.0-magnitude earthquake that devastated Haiti in January 2010, the Microsoft Translator team got a call from colleagues who were there trying to help with rescue and recovery. In addition to the challenge of helping millions of displaced Haitians and navigating the country’s destroyed infrastructure, the aid workers were finding another major obstacle: language.
“These people were there looking to help, and encountering millions of Creole speakers who the helpers had a hard time communicating with,” Dendi says.
None of the world’s machine translation services supported Haitian Creole, and the aid workers were hoping the Microsoft Translator team could help. Five days (and nearly sleepless nights) later, the Microsoft Translator team had built a basic but effective Creole translator that aid workers then used to help people, and to help rebuild the devastated country.
“There has always been the theory that if we had enough data we could build machine translation overnight,” Dendi says. “It was very rewarding to see Creole become available so quickly. In a way, I see that as a key turning point in our thinking.
“It shouldn’t have taken an earthquake to create support for one of these small languages, but it did. We needed to change that.”
A Shrinking Planet
The technology called “machine learning” relies on a large amount of data, and on algorithms to help the machine create a model and learn about the data – in this case, learning a language. At Microsoft, the technological threads of machine translation go back nearly 20 years to the company’s Natural Language Processing Group.
Machine translation got a push a decade ago when then-CEO Bill Gates read a paper on machine translation written by a group at Microsoft already working on it, and decided it was an important area – one in which Microsoft should further invest.
Microsoft Translator is now translating billions of words a day and being used in more and more ways, inside and outside the company.
Internet Explorer users have the option to translate any page they’re visiting with the click of a button. In Lync, people speaking different languages can message each other with instant translation. Windows Phone users can take a photo of a sign or a café menu in a foreign language and have the words translated.
In addition to being used in some of Microsoft’s most popular products, Microsoft Translator is used by businesses including Facebook, TripAdvisor and Twitter to help those businesses communicate with customers in many languages (and to help their customers communicate with each other).
Developers and partners are incorporating Microsoft Translator’s application programming interface (API) to build even more translation tools and services.
The translation business is brisk, says Ben Enosh, CEO of PLYmedia.
PLYmedia uses a combination of human and machine translation services to provide localization – captions, subtitles and transcripts – for live and on-demand video. In the nearly three years since the company started its video localization services, demand for that service has grown 10,000 percent.
“Translation is significantly shrinking the planet,” Enosh says.
According to Enosh, in addition to the surge in demand for the company’s localization services, there’s been a sharp increase in demand for East Asian languages such as Mandarin, Cantonese, Chinese and Korean.
PLYmedia has been using the Microsoft Translator API not only for its wide array of available languages, but also because it’s faster and more cost-effective than human translation. Translating video using humans can be a rigorous, time-consuming process – one that involves a language speaker watching the video and translating it verbatim aloud, and another language speaker recording the translation with a stenograph machine.
“The Microsoft Translator Web service increases the business opportunity for us because it reduces cost,” Enosh says. “Prior to using the service we could not have provided even the basic, lower-level translation that we can now provide.”
Machine translation is still not as accurate as human translation, particularly for jargon-heavy industries such as medicine. But Microsoft Translator works well for giving people the basic gist of a message.
“I think it could significantly lower the financial challenge for different suppliers of content, to the level where you could start deploying it in mass format around the world,” Enosh says. “With the Microsoft platform, every audio file could be translated to all different languages, and it could be at zero or close to zero cost. There are cases where accuracy is essential, and would be worth the cost of human translation, but for the masses I think it’s a good solution.”
Companies like PLYmedia no longer have to wait for the “custom” quality translation that Enosh talks about. The general-purpose Microsoft Translator API may not initially recognize medical terminology for translation into Cantonese, or environmental engineering terms for translation into Spanish, but the key feature of the new Microsoft Translator Hub is its ability to tune and adapt the API.
“We made it really simple,” Dendi says. “This is customizable, collaborative translation. And just like we put translation control in the hands of businesses, we’ve done the same thing for language communities.”
Stories Untold, Songs Unsung
Fewer than 100 of the world’s presently spoken languages have automated translation systems – mainly because, until now, communities have not been able to build their own.
Phong Yang, a professor and Hmong community leader in California, says his language was dying with its elders. A first-generation Hmong in the United States, Yang worked with Microsoft to help preserve his language using Microsoft Translator.
Because the Microsoft Translator Hub is in the cloud, Hmong speakers all over the world – including the culture’s elders – needed nothing more than Internet access to collaborate on the project.
“Working with elders from the community was an experience of a lifetime,” Yang says.
When Yang showed Hmong elders the rough, early versions of the language engine, they were astonished that a machine was translating Hmong back to them. The first meetings were in November 2011, and by February 2012 the Hmong community had released its language on Microsoft Translator for the world to see – and use.
“Anyone anywhere throughout the world is able to access this tool and actually use it – people like my father-in-law, or elder people in the community, can actually go to any news site or read news in Hmong,” Yang says. “They don’t feel secluded as much. Imagine being able to read the news for the first time in 30 years.”
Chris Wendt, principal program manager for Microsoft Translator, says he’s tremendously proud to work on something that helps people communicate better, get to know each other better, and be exposed to a wider variety of cultural backgrounds.
“Take, for example, any publication in Greece, Russia or Japan,” Wendt says. “You can translate it and read how they talk about world events in those cultures, versus how they talk about things in other parts of the world, and you can see the variety of views that exist. I see that as an expansion of the mind, and I think it’s a valuable service.”
As technology helps the world become more connected, Microsoft Translator can help people talk to each other, Dendi says.
“No matter who you are, or what business you are, you will encounter different languages,” Dendi says. “There are going to be innumerous scenarios where this technology will change how people do business, how they interact with each other, and how they communicate with each other. We’re excited about helping to influence that future.”