
While online translators are great to use as dictionaries or for giving users the gist of the meaning of a word or a phrase in another language, I'm afraid they don't get my vote as “translation” tools as such. I'd be rather downhearted to say the least if I'd spent 4 years of university studying for a job which an online tool can do in a matter of seconds. Google Translate, the company's own free translation tool, is said to be the best, most advanced online translator. However, that doesn't exactly say much if you consider previous attempts like Babelfish.
So, how does it work?
Essentially, Google Translate draws on a huge database of documents that have already been translated into various languages. It scans these documents looking for linguistic patterns, a process called “statistical machine translation”, and uses these patterns and rules to produce translations. In essence, the machine is built on translations that humans have spent thousands, if not millions, of hours producing.
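To make the idea a little more concrete, here is a toy sketch in Python of how translation candidates might be pulled out of already-translated text purely by counting. It is not how Google Translate is actually implemented: the four-sentence corpus, the function names and the choice of the Dice coefficient as a score are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy "database" of human-translated sentence pairs (English, German).
PARALLEL_CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
    ("a house", "ein haus"),
]


def train(corpus):
    """Count how often each source word, target word and word pair occurs."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in corpus:
        src_words = set(src_sentence.split())
        tgt_words = set(tgt_sentence.split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        # Every source word is a candidate partner of every target word
        # that appears in the same sentence pair.
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def best_translation(word, model):
    """Return the target word with the highest Dice score, or None if unseen."""
    src_counts, tgt_counts, pair_counts = model
    scores = {
        tgt: 2 * count / (src_counts[src] + tgt_counts[tgt])
        for (src, tgt), count in pair_counts.items()
        if src == word
    }
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    model = train(PARALLEL_CORPUS)
    for word in ("house", "book", "the", "a"):
        print(word, "->", best_translation(word, model))
```

Notice that nothing in this sketch knows any grammar or has any idea what a house is. It only knows which words tend to turn up together in sentences that people have already translated, and it has nothing to offer for a word it has never seen.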
Google itself says that “Google Translate can make intelligent guesses as to what an appropriate translation should be” and admits that “not all translations will be perfect”.
One of the key reasons why Google Translate, or any other machine translation tool for that matter, will never (in my opinion) be able to guarantee you natural and accurate translations is that it cannot determine the context within which words are used; it cannot infer. When explaining my reasoning behind this, I often use an example quoted by one of my lecturers in Germany. Take the following sentence and its translation as an example:
Original: Ordinai un caffè, lo buttai giù in un secondo ed uscii dal bar.
Translated into English as: I ordered a coffee, swilled it down in a second and went out of the bar.
Now, even if we didn't know what the source language was, it's highly likely that this sentence refers to an Italian coffee. The holistic perspective of translation draws on our cultural knowledge of coffee and how it is consumed in different environments. “Swilled it down” tells us something about the quantity of the coffee as well as its temperature, i.e. it would have to be warm, not hot, and in a very small quantity. We can infer that the author is talking about an espresso rather than, say, a standard American coffee.
Also, while machine-based translations may well be able to translate words or certain short phrases very well, they can't localise translations. I recently worked with one of our freelancers at QueryClick to translate a text into French. The text dealt with electricity and mentioned “power switches”. This term had to be removed from the French text as power switches are virtually non-existent in France and may even be hard for a French audience to understand. Would Google Translate have recognised this localisation problem? I think not.
Privacy Issues
On a different note, aside from the quality of the translations, Google Translate also presents a significant privacy problem for translators and their clients when used together with translation software. More recent versions of Trados allow you to activate Google Translate within the software. Again, this is useful as a quick online dictionary. However, many translators are unaware that any source segments they submit there are recorded, to be used by Google for future translation unit matches. This could pose huge problems given the level of confidentiality translators are bound to:
By submitting, posting or displaying the content you give Google a perpetual, irrevocable, worldwide, royalty-free, and non-exclusive license to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Content which you submit, post or display on or through, the Services.
If you ever use Google Translate, use it as a dictionary. And never use it to translate into a foreign language you know nothing about; only use it for a language in which you will be able to recognise nonsense!
I'll finish my Google Translate blog with a sentence I took at random from a German blog on the topic of machine translation.
German: über die qualität der übersetzungen lässt sich streiten, selten sind sie perfekt oder wenigstens korrekt
English: about the quality of the translations can be argued, they are rarely perfect, or at least properly
Says it all really.
Comments
This was a very interesting article! As a translator myself I know first-hand how important privacy issues are when it comes to translation and it is worrying to think that using such a translation “tool” could expose the translator to serious legal ramifications by granting Google access to what is often highly confidential material. I strongly agree with you, Google Translate will never be as good as the real thing!
Google Translate takes (English to Arabic) "Israel will vanish" and alters it to "Israel will not vanish." Substitute France or any other country for Israel, and the problem mysteriously goes away. http://www.youtube.com/watch?v=h6AwePEJRx4 Translations certainly are complicated.
"One of the key reasons why Google Translate, or any other machine translation tool for that matter, will never (in my opinion) be able to guarantee you natural and accurate translations is because it cannot determine the context within which words are used; it cannot infer." A few years ago, nobody imagined that even this level of quality could be achieved. Looking up a translation for a word in a dictionary is easy for a program, but what machine translation does is pick a translation based on the context. So, all machine translation does is basically determine the context. This context can be the last few words, the last paragraph or the whole document. All you need to do for the translation algorithm to produce better translations is to model the context better and feed the program more data, so that it can learn to extract the patterns that detect those contexts. The future of science is not predictable, so statements such as the quoted one are, to put it mildly, unscientific.
SMT systems like our SmartMATE technology do _not_ search for linguistic patterns. Some of the extracted phrase pairs may look like rules written by a linguist, but the vast majority do not conform to any linguistic norms at all. There are already some pretty good ways of integrating contextual models into SMT, so don't write off the whole paradigm just yet. Note also that there are many case studies which demonstrate clearly that MT + human postediting can be cheaper than and outperform human translation. Don't expect MT to be perfect: it won't be, ever. But for most tasks for which it can be used (and you need to know when _not_ to use MT), it's already useful enough ...