Microsoft has been working on a translation tool designed to work similarly to the human brain, offering near-instantaneous translations while preserving the cadence and tone of the speaker. This would be a big leap over currently available translation tools, which are often inaccurate, slow, and fail to convey how things were said.
The problem, says Microsoft chief research officer Rick Rashid, is that translation programs still rely on a pattern matching system, a method he described as “extremely fragile.” Between 20 and 25 per cent of words are misheard, and features like sarcasm are almost impossible to recognise.
Going back to the drawing board, Microsoft developed a technique called Deep Neural Networks, which uses the behaviour of the human brain as a model for the software.
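To give a rough sense of the idea, the sketch below shows the basic building block of a neural network: layers of weighted sums passed through non-linearities, loosely inspired by neurons. All the weights and feature names here are made up for illustration; a real acoustic model learns millions of weights from thousands of hours of speech, and Microsoft's actual architecture is not public.

```python
import math

def layer(inputs, weights, biases):
    """One fully connected layer with a sigmoid non-linearity."""
    outputs = []
    for w_row, b in zip(weights, biases):
        total = sum(i * w for i, w in zip(inputs, w_row)) + b
        outputs.append(1.0 / (1.0 + math.exp(-total)))  # squash to (0, 1)
    return outputs

def forward(features):
    """Pass a tiny 'acoustic feature' vector through two layers,
    producing scores for two made-up phoneme classes."""
    hidden = layer(features, [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])
    return layer(hidden, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])

# Hypothetical features (e.g. energy, pitch, formant); real systems
# use far richer representations of the audio signal.
scores = forward([0.2, 0.9, -0.4])
print(scores)
```

The point is only the shape of the computation: stacking several such layers (hence "deep") lets the network learn much more robust representations of speech sounds than hand-built pattern matching.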
The results speak for themselves: http://www.youtube.com/watch?v=Nu-nlQqFCKg&t=7m30s
It’s not quite instant, but the delay between Mr Rashid speaking and his voice coming through very clearly in Chinese, with similar intonation, is only a second or two. Apparently Microsoft used a large pool of sample Chinese speakers to create a standardised audio model of the language, then used modulation recorded from Rashid’s own voice to adjust that template.
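The process described above can be sketched as three stages: recognise the speech, translate the text, then synthesise it using the standardised voice template adjusted by the speaker's own modulation. Every function and data structure below is a hypothetical stand-in for illustration; Microsoft's actual components are not public.

```python
def recognise(audio):
    """Stage 1: speech-to-text (stubbed with a lookup for this sketch)."""
    return audio["spoken_text"]

def translate(text, lexicon):
    """Stage 2: text-to-text translation via a toy word lexicon.
    Real systems translate whole phrases statistically, not word by word."""
    return " ".join(lexicon.get(word, word) for word in text.split())

def synthesise(text, voice_template, speaker_modulation):
    """Stage 3: text-to-speech, combining a standardised target-language
    voice with modulation measured from the original speaker."""
    return {"text": text,
            "voice": voice_template,
            "pitch_contour": speaker_modulation}

# Toy end-to-end run with a single made-up lexicon entry.
lexicon = {"hello": "nihao"}
audio = {"spoken_text": "hello"}
output = synthesise(translate(recognise(audio), lexicon),
                    voice_template="standard-mandarin",
                    speaker_modulation=[0.9, 1.1, 1.0])
print(output["text"])  # -> "nihao"
```

The interesting engineering is in keeping the third stage personal: the standardised template supplies the target-language sounds, while the speaker's recorded modulation supplies the intonation, which is why Rashid's Chinese output still sounds like him.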
Perfection has yet to be achieved, and there is still an error rate of around 15 per cent. However, this is a big improvement over other programs that perform a similar function.
KitGuru Says: Potentially this could help improve relations between countries by making everyone feel more comfortable hearing their native tongue. It could allow world leaders to converse directly without the need for translators, or at least spare them from wondering what is being said when they are not being directly addressed.
I’m sure some of you guys make use of services like Siri already. How do you think voice technologies like this could help change the world?