As a lover of language and languages, I was intrigued but bothered by the opening lines of an article I read this week at The Hot Word (dictionary.com’s blog): “Back in the 1940s, mathematician Warren Weaver made an audacious suggestion: what if translation was not a feat of literary theory and linguistics, but one of cryptography?” The rest of the article indicates that Weaver was on the right track, as evidenced by both Google Translate and the recent success of some cryptographers in decoding the Copiale Cipher.
I think computers are great tools, and it wouldn’t surprise me if eventually they could be programmed to understand and use human languages fairly well. But to do it with the tools of mathematics rather than linguistics? Besides, even humans often do a poor job of translation (Charles Berlitz gives some very amusing examples in his book Native Tongues) – how could a computer possibly do better?
I decided to check out Google Translate. I took a sentence from the article I had just been reading and pasted it in. It didn’t matter much which language I translated it into, since my aim was to translate it back into English and see how the result compared with the original. I chose Russian. The result was not perfect, but better than I expected.
Original sentence: By making a machine-readable version of the text, a team of computational linguistics were able to run the characters through a software program that found patterns in the text, which were otherwise inscrutable.
Russian translation: Делая машиночитаемой версии текста, команда компьютерной лингвистики смогли запустить персонажей через программное обеспечение, которое обнаружили закономерности в тексте, которые в противном случае неисповедимы.
Back to English: Making the machine-readable version of the text, Computational Linguistics team were able to run characters through software, which found a pattern in the text, which otherwise inscrutable.
By way of comparison, Babel Fish produced this when translating the same sentence to Russian and then back to English: “With way to make machine-readable the version from the text, the command of computational linguistics could break into a run natures to the program of software which it found the pictures in the text, which were otherwise inscrutable.” Yes, definitely inscrutable.
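For readers who want to put a rough number on that comparison, here is a minimal sketch in Python. It is not part of the experiment described above. It assumes that the share of words surviving in the same order is a fair (if crude) stand-in for "how close is the round trip to the original", and the helper names `words` and `similarity` are made up for this illustration. It uses only the standard library's `difflib`, and the three sentences are copied from the text above.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into words, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def similarity(a, b):
    """Word-level similarity between 0 and 1 (1.0 means identical word sequences).

    Assumption: shared words in shared order are a crude proxy for translation
    quality. This measures surface overlap only, not meaning.
    """
    return SequenceMatcher(None, words(a), words(b)).ratio()


original = (
    "By making a machine-readable version of the text, a team of computational "
    "linguistics were able to run the characters through a software program that "
    "found patterns in the text, which were otherwise inscrutable."
)
google_round_trip = (
    "Making the machine-readable version of the text, Computational Linguistics "
    "team were able to run characters through software, which found a pattern in "
    "the text, which otherwise inscrutable."
)
babel_fish_round_trip = (
    "With way to make machine-readable the version from the text, the command of "
    "computational linguistics could break into a run natures to the program of "
    "software which it found the pictures in the text, which were otherwise "
    "inscrutable."
)

google_score = similarity(original, google_round_trip)
babel_fish_score = similarity(original, babel_fish_round_trip)

print(f"Google Translate round trip: {google_score:.2f}")
print(f"Babel Fish round trip:       {babel_fish_score:.2f}")
```

A higher score here only means that more of the original wording came back in the same order. A round trip could score well and still garble the meaning, so the number is best read alongside the sentences themselves.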