Why Machines Alone Cannot Solve the World’s Translation Problem
Sixty years ago this week, scientists at Georgetown and IBM lauded their machine translation “brain,” known as the 701 computer. The “brain” had successfully translated multiple sentences from Russian into English, leading the researchers to confidently claim that translation would be fully handled by machines in “the next few years.”
Fast forward six decades, and MIT Technology Review makes a remarkably similar proclamation: “To translate one language into another, find the linear transformation that maps one to the other. Simple, say a team of Google engineers.”
Simple? Not exactly.
Even in the 1950s, IBM acknowledged that to translate just one segment “necessitates two and a half times as many instructions to the computer as are required to simulate the flight of a guided missile.” It’s also highly doubtful that the scientists at Google see anything “simple” about their new method, which relies on vector space mathematics.
Granted, there is a beautiful simplicity in statistical machine translation, such as Google Translate. Essentially, the more data you have, the higher the probability of a high-quality translation. But what do you do when you don’t have enough data? Or in the case of Google, what do you do when the data might be out there somewhere, but it isn’t part of the free and public web that you’re designed to mine?
That’s when you come up with new techniques, just as Google has done. Their new method, which is meant to complement rather than replace their statistical approach, automatically creates dictionaries and phrase tables without help from humans. It uses data mining to compare the structure of one language with that of another, and builds those dictionaries and phrase tables from the comparison. This means that Google won’t have to rely exclusively on documents available in two languages to improve its translation quality. It will have other methods, such as this new one, to add to the mix.
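To make the “linear transformation” idea quoted earlier a little more concrete, here is a toy sketch in Python. It is not Google’s implementation. The two-dimensional word vectors, the tiny seed dictionary, and the function names (`fit_linear_map`, `translate`) are all invented for illustration, and real systems work with vectors of hundreds of dimensions learned from large text collections. The sketch only shows the general shape of the approach: fit a matrix on a few known word pairs, then use it to guess the translation of a word that was never paired.

```python
import math

# Toy 2-D word vectors, invented for illustration only.
EN = {"one": (1.0, 0.0), "two": (0.0, 1.0), "three": (1.0, 1.0), "four": (2.0, 1.0)}
# The "Spanish" space here is the English space rotated by 90 degrees.
ES = {"uno": (0.0, 1.0), "dos": (-1.0, 0.0), "tres": (-1.0, 1.0), "cuatro": (-1.0, 2.0)}
# Small seed dictionary of known pairs; "four" is deliberately left out.
SEED = [("one", "uno"), ("two", "dos"), ("three", "tres")]


def fit_linear_map(pairs):
    """Least-squares 2x2 matrix W so that x @ W is close to z for each seed pair."""
    a = [[0.0, 0.0], [0.0, 0.0]]  # accumulates X^T X
    b = [[0.0, 0.0], [0.0, 0.0]]  # accumulates X^T Z
    for src, tgt in pairs:
        x, z = EN[src], ES[tgt]
        for i in range(2):
            for j in range(2):
                a[i][j] += x[i] * x[j]
                b[i][j] += x[i] * z[j]
    # Invert the 2x2 matrix X^T X directly, then W = (X^T X)^-1 (X^T Z).
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]
    return [[sum(inv[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


def translate(word, w):
    """Map an English vector into the Spanish space and pick the nearest word."""
    x = EN[word]
    mapped = (x[0] * w[0][0] + x[1] * w[1][0], x[0] * w[0][1] + x[1] * w[1][1])
    return max(ES, key=lambda cand: cosine(mapped, ES[cand]))


W = fit_linear_map(SEED)
print(translate("four", W))  # cuatro
```

The toy works only because the two invented spaces really are related by a rotation. Whether real languages line up this neatly, and for which words, is exactly the kind of question that keeps the problem from being “simple.”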
What does this mean? Even Google isn’t satisfied that statistical machine translation will move things along quickly enough. That method has its limitations, just like all methods do.
What’s fascinating is that every few months, starry-eyed and often misinformed journalists herald a new era for language translation, announcing a “groundbreaking milestone” related to a technology that has been around for 60 years.
And their claim is always the same: “The translation problem is solved!”
Unfortunately, equating such minor machine translation accomplishments with “solving the translation problem” is like assuming that because we’ve walked on the moon, we can all just pack up and move there. We can’t, and we may never be able to. But that doesn’t stop us from trying.
Source: Huffington Post