
Executive summary

Automated translation, and Google Translate in particular, offers great promise for reducing the time and resources needed to undertake translation. At present, however, published research indicates that its accuracy still falls short of the level required to translate clinical content with the certainty needed from a clinical risk perspective. Google Translate continues to develop its algorithms, which should improve its accuracy over time. However, current translation requirements cannot be met using automated translation alone.


Automated translation

Automated translation, or machine translation, can be defined as the study of designing systems that translate one human language into another. These systems take input in one natural language and convert it into another. The language given as input is called the source language, and the language of the output is called the target language.

With the increasing global use of SNOMED CT, there is a growing need to provide translations in languages other than English. This requirement applies to the content of SNOMED CT, its derivatives and other documentation. The resources required for these endeavours are high, so various approaches are being considered, driven by cost reduction and by the quality and accuracy of the translated artefacts. It is therefore natural that one such approach would be some form of automated translation, which has the advantage of speed and relatively low cost. This document explores the use of automated translation through published research, and through the experience of SNOMED International in translating a subset of SNOMED CT and a small number of documents.

The most readily available source of automated translation is Google Translate, and indeed a large number of organisations that provide translation services base their offering on Google Translate. In 2014, the British Medical Journal published the article “Use of Google Translate in medical communication: evaluation of accuracy” (BMJ 2014;349:g7392). The authors evaluated 10 commonly used medical terms, which were translated into 26 languages and then sent to native speakers, who translated the text back into English. The back-translations were then compared with the original text. The results were as follows:

  • 260 translated phrases
  • 150 (57.7%) - correct
  • 110 (42.3%) - wrong
  • African languages scored lowest (45% correct)
  • Asian languages (46% correct)
  • Eastern European (62% correct)
  • Western European languages (74% correct)

The authors of the study concluded that Google Translate has only 57.7% accuracy when used for medical phrase translations and should not be trusted for important medical communications.
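The headline figures reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts quoted in this document; the variable and function names are illustrative and do not come from the study.

```python
# Counts quoted above from the BMJ study (BMJ 2014;349:g7392).
terms = 10            # medical terms evaluated
languages = 26        # target languages
total_phrases = 260   # translated phrases reported
correct = 150
wrong = 110

# Sanity checks: the reported counts are internally consistent.
assert terms * languages == total_phrases
assert correct + wrong == total_phrases


def percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


correct_pct = percent(correct, total_phrases)
wrong_pct = percent(wrong, total_phrases)
print(f"correct: {correct_pct}%, wrong: {wrong_pct}%")
```

The two percentages agree with the 57.7% and 42.3% reported in the list above.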


Linguistically, automated translation is challenging because of the structure and content of the world's languages, and the difficulty a machine has in making sense of the differences between them.


  • Two languages may have completely different structures, e.g. English has an SVO (Subject-Verb-Object) structure and Tamil has an SOV (Subject-Object-Verb) structure. This difference can make the translation process tedious.
  • The way in which sentences are put together may differ between languages.
  • Not all words in one language have equivalents in another. In some cases a single word in one language is expressed as a group of words in another. Such words are difficult to translate.
  • Some words require transliteration before translation, e.g. when a word in a language is used as a noun.
  • Ambiguity can have an adverse effect on the translation process. Ambiguity means that a word or a whole sentence in one language has an entirely different meaning in another language.
  • Natural language is open and changes over time, so complete automatic simulation of natural language is almost impossible.

International Journal of Science, Engineering and Technology Research (IJSETR) Volume 2, Issue 3, March 2013


Specifically relating to Google Translate, “An Analysis of Google Translate Accuracy” by Milam Aiken and Shilpa Balan (http://translationjournal.net/journal/56google.htm) concluded that: “Although Google Translate provides translations among a large number of languages, the accuracies vary greatly. This study gives for the first time an estimate of how good a potential translation might be using the software. Our analysis shows that translations between European languages are usually good, while those involving Asian languages are often relatively poor.”

A further review in 2012, “Literature Review: Machine Translation” by Aurelia Drummer, concluded that:

There has been a vast progress in the field of machine translation over the last 60 years. Statistical systems are showing promise as the increasingly global and digital world provides copious amounts of readily available parallel text.

https://studylib.net/doc/8890466/literature-review--machine-translation

“Assessing the Accuracy of Google Translate to Allow Data Extraction From Trials Published in Non-English Languages” by Balk E.M. et al. (https://www.ncbi.nlm.nih.gov/pubmed/23427350) reviewed 10 randomised controlled trials, applying Google Translate to research articles published in 5 languages: French, German, Chinese, Japanese and Spanish. The study concluded that the most accurate translations were of the Spanish articles and the least accurate were of the Chinese articles.

SNOMED International has undertaken its own review of automated translation techniques by comparing the French translation of the SNOMED CT starter set with a translation of the same content created using Google Translate. The comparison found a 32.91% accuracy rate.
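This document does not state how the accuracy rate was calculated. As an illustration only, the sketch below shows one way such a rate could be computed, assuming that "accurate" means the machine translation exactly matches the reference translation after simple normalisation (case, whitespace and Unicode form). The function names, identifiers and sample terms are hypothetical and are not taken from the SNOMED CT starter set or from the review itself.

```python
import unicodedata


def normalise(term):
    """Apply Unicode NFC, lower-case the term, and trim/collapse whitespace."""
    return " ".join(unicodedata.normalize("NFC", term).lower().split())


def accuracy_rate(reference, candidate):
    """Percentage of reference terms whose candidate translation matches
    exactly after normalisation. Missing candidates count as wrong."""
    if not reference:
        raise ValueError("reference set is empty")
    matches = sum(
        1
        for key, ref_term in reference.items()
        if key in candidate and normalise(candidate[key]) == normalise(ref_term)
    )
    return round(100 * matches / len(reference), 2)


# Hypothetical sample data. The keys are placeholder identifiers,
# not real SNOMED CT concept ids, and the terms are invented examples.
reference = {
    "c1": "Fracture du fémur",
    "c2": "Asthme",
    "c3": "Douleur thoracique",
    "c4": "Fièvre",
}
machine = {
    "c1": "fracture du fémur",       # differs only in case: counted as a match
    "c2": "Asthme",
    "c3": "Douleur à la poitrine",   # different wording: counted as wrong
    "c4": "Fièvre ",                 # trailing space: counted as a match
}

rate = accuracy_rate(reference, machine)
print(f"accuracy: {rate}%")
```

Exact matching is a strict criterion: a clinically acceptable synonym is counted as wrong, so a rate computed this way may understate how usable a machine translation is as a first draft for human review.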


In practice, automated translation solutions such as Google Translate continue to improve as algorithms that learn from previous translation activity are introduced. The underlying natural language capabilities and artificial intelligence also continue to develop, so the use of automated translation may become more viable in the future. In the meantime, and for the purposes of the focused translation of SNOMED CT, Google Translate or a similar tool does not provide a viable solution on its own. However, it can be used as part of an approach in which it offers an initial translation that is then quality assured through human review. This approach does offer some time and resource savings, which should not be underestimated.


"Comparing different methods to obtain an initial translation" - Feikje Hielkema-Raadsveld, Senior Terminologist, Nictiz / SNOMED CT NRC Netherlands 

https://drive.google.com/a/ihtsdo.org/file/d/0B6uXPjNSaa82Um1Vcng1OXNNbzg/view?usp=sharing


SNOMED International's own experience of using different approaches to translation is similar to that found in research. A series of information leaflets was translated into Chinese, Japanese, French and German by a commercial translation company using automated translation techniques. The accuracy of the translation was 50-60% for French and German, and 30% for Chinese and Japanese. In comparison, the starter set of SNOMED CT contains 6,300 terms, which were translated into the same languages. However, the approach for the starter set translation used human review by native speakers who were clinically qualified but lacked SNOMED CT or terminology experience. The translation into Chinese proved to be 70% accurate. At the time of writing, the Japanese, German and French translations are still undergoing external quality review, but initial informal feedback suggests an accuracy rate of between 70% and 80%.


In terms of the use of Google Translate, it is perhaps best to leave the final words to Google themselves:

Google admits that its approach still has a ways to go. "GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms," Le and Schuster explain, "and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better." But soon, as Google’s products and services continue vacuuming up valuable corner 
