Machine Translation (MT)

Implementing machine translation (MT) can be a solution for companies that need a cost-effective way to translate a high volume of content that has low visibility or does not necessarily need to be of high quality. For instance, companies can consider applying MT on low-visibility content such as user-generated content on social media or product descriptions on e-commerce platforms.

MT Training

One way to implement MT is to connect your website, app, or software to an existing MT engine, such as Google MT or Microsoft Translator. If you have more specialized content and/or terminology, you could also use neural machine translation (NMT) to develop your own model. Most Language Service Providers have domain-specific MT models (e.g. finance, automotive, etc.) that can be further customized based on the clients’ needs.

The basic rundown of the MT training process is: 

  • The engine is usually trained, tuned, and tested on previously translated content (parallel texts) produced by human translators, which helps ensure accuracy. From these segment pairs, the engine learns to produce target language segments that match the source segments as closely as possible.
  • The more content is fed into the machine engine, the more it “learns” and improves
  • “Garbage in, garbage out”
    • The quality of the content fed to the engine is highly important for it to produce accurate and high quality translations
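
As a rough illustration of the steps above, the sketch below filters obviously bad segment pairs out of a set of parallel texts and then splits what remains into training, tuning, and test sets. The function names, the length-ratio threshold, and the 80/10/10 split are assumptions made for illustration only; they are not taken from any particular MT toolkit.

```python
import random

def clean_pairs(pairs, max_ratio=3.0):
    """Drop empty, duplicate, or badly length-mismatched (source, target) pairs."""
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # one side of the pair is missing
        if (source, target) in seen:
            continue  # exact duplicate of an earlier pair
        longer = max(len(source), len(target))
        shorter = min(len(source), len(target))
        if longer / shorter > max_ratio:
            continue  # lengths differ too much to be a plausible translation
        seen.add((source, target))
        kept.append((source, target))
    return kept

def split_pairs(pairs, train=0.8, tune=0.1, seed=0):
    """Shuffle the pairs and split them into train, tune, and test sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_tune = int(len(shuffled) * tune)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_tune],
            shuffled[n_train + n_tune:])

raw = [
    ("Hello", "Bonjour"),
    ("Hello", "Bonjour"),                                   # duplicate
    ("", "Vide"),                                           # empty source
    ("Yes", "Oui, bien sûr, absolument, sans aucun doute"), # implausible length
]
cleaned = clean_pairs(raw)
```

Filtering before splitting is one simple way to act on “garbage in, garbage out”: pairs that are clearly broken never reach the engine, and the held-out test set stays separate from the content used for training.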

The main advantages of training your own MT are:

  • The more you train the MT, the more accurate it will become. 
  • The more specialized content you use to train the MT, the faster the MT will be able to learn your preferred style and terminology. 

MTPE

You can achieve better results from the MT when there are human translators who can post-edit the machine translation output (this process is called MTPE).

This is a common alternative to traditional human translation. Rather than have the human translator translate all the content, the human reviewer only needs to edit the target content that the MT produces to make it sound more fluent and natural to the target audience. This process is usually faster than human translation, unless the MT generates mostly “garbage” (e.g. content that has a lot of spelling and grammar errors, omission errors, meaning errors, style errors, etc.). In this case, it may actually take more time for the human reviewer to re-translate the content.

Pros and Cons of using MT

If you’re considering translating content on a large scale, MT is a time-saving and cost-effective solution that’s worth looking into. However, the technology still has its limits, and MT is not perfect. The benefits and drawbacks of using MT are explored below.

The main advantages of using MT are that it:

  • Produces translations very quickly 
    • You can publish translated content into the market a lot faster than if you were using the human TEP process.
    • In short, you can have faster turnaround times when translating a large volume of content into many different languages.
  • Cuts down on translation cost 
    • Machine translation is usually followed by post-editing by human editors, which costs less than full human translation. However, as editors only check for major accuracy and fluency errors, the quality of the translated content might be lower than that of content translated entirely by human translators. 

However, MT is not an ideal solution for content that requires a higher quality output. For example, with marketing content, mistranslations can lead to ineffective marketing campaigns, misunderstandings regarding the product/service and/or company, and even a damaged brand reputation, all of which lead to a loss of profit. 

The main disadvantages of using MT are that it:

  • Can produce overly literal, “word-for-word” translations
    • Thus, it is not that great at picking up linguistic and cultural nuances in a text.
  • Does not always produce accurate and/or fluent translations
    • It sometimes cannot create proper sentence structures or pick up on the correct context (and thus generates a mistranslation)
  • Cannot do transcreation
    • Transcreation work still relies heavily on human translators.
  • Does not work well with long and/or complicated sentences

In short, MT is not a good option for content that has:

  • Complicated sentences or sentence structure
  • Subject matter/domain that is strictly regulated
    • E.g. life science, pharmaceutical or medical content 
  • Highly nuanced content (culturally, contextually, etc.)
    • E.g. marketing content, subtitles, literature, etc.
  • Local dialects
  • Slang
  • Profanity

Translation Management Systems (TMS)

A translation management system (TMS) is software that automates and streamlines the manual and/or repetitive parts of the translation process, making it much more efficient and cost-effective. A TMS usually contains process management, project tracking, and linguistic technology (explained further down), and is usually integrated with a content management system (CMS). It is a good investment for companies that not only handle a high volume of content that needs to be translated, but also run those translation projects in parallel.

Functions

Functions vary among TMS’s, and a few of them are listed below:

  • File storage/management
    • Files are stored within the system rather than in a third party cloud storage system or the like
    • Files can be sent through the system, and the vendors can access said files by logging into the system; or, the TMS generates an email template with the files automatically attached
    • Source files can also be analyzed within the TMS to determine word count, fuzzy matches, etc.
  • Creating project workflows
    • Workflows can be created and customized for specific clients and/or projects. 
  • Project Management
    • Project managers (PMs) can monitor project status and make sure the project is on track
    • PMs can monitor budgets to maintain a healthy gross margin
    • PMs can create email templates for different stakeholders
  • Quote Generation
    • As mentioned above, the source text can be analyzed within the TMS based on aspects such as word count and fuzzy matches, and the analysis can be used to create a quote
  • Reporting
    • Translation performance metrics and vendor performance metrics can be incorporated in the translation evaluation reports
    • Reports for stakeholders can be sent out automatically on a regular basis
  • Sending reminders
    • Automatic reminders about upcoming or urgent deadlines can be sent to project managers and/or vendors
  • Vendor management
    • May contain a vendor database, which may include information such as their roles, language pairs, rates, ratings/performance tracking, availability, etc.
  • Invoice creation and tracking
    • Alternatively, some TMS’s connect to an invoice management system that is used by the company’s accounting department
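
To make the quote generation function more concrete, here is a minimal sketch of how an analysis broken down by match category might be turned into a quote. The match bands, the share of the full rate charged for each band, and the per-word rate are all hypothetical; real values come from the vendor's rate card and the TMS's own analysis report.

```python
from decimal import Decimal

# Hypothetical share of the full per-word rate charged for each match band.
RATE_SHARE = {
    "new": Decimal("1.00"),    # no match in the translation memory
    "fuzzy": Decimal("0.60"),  # partial (fuzzy) match
    "exact": Decimal("0.30"),  # exact match that only needs review
}

def build_quote(word_counts, rate_per_word):
    """Return the total price for an analysis given as {band: word count}."""
    rate = Decimal(str(rate_per_word))
    total = Decimal("0")
    for band, words in word_counts.items():
        total += words * rate * RATE_SHARE[band]
    return total.quantize(Decimal("0.01"))  # round to two decimal places

analysis = {"new": 1000, "fuzzy": 500, "exact": 200}
quote = build_quote(analysis, "0.10")
```

Decimal is used instead of floating-point numbers so that the amounts in the quote are not affected by binary rounding errors.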

Advantages

The advantages of having a TMS include (but are not limited to):

  • Automation of much of the translation process 
    • E.g. file sending, etc. 
    • Automation saves time spent on projects and cuts down on management costs
  • Easy project tracking throughout each step of the translation workflow
  • Centralized linguistic assets
    • E.g. CAT tool, translation memories, glossaries, dictionaries, machine translation, references, context, etc.
  • Facilitation of collaboration among project members
    • E.g. translators, reviewers, project managers, accounting team (sometimes), etc.
    • Easy project assignment to translators and reviewers
  • Integration with other applications, such as:
    • CMS: The translation process can start immediately once the content is published or updated
    • API: TMS’s cannot do everything; thus, it is convenient to use APIs to add the desired functions

Disadvantages

However, there are a few disadvantages in incorporating a TMS into your workflow:

  • A TMS is not cheap and requires a large initial investment, especially for more advanced features
  • The UI of some TMS’s is not very intuitive, especially when there are many advanced features. Buyers might need to ask their engineers to help set up the system, and PMs and system admins need proper training to use it. As a result, some companies may resist using the TMS, since learning the system takes time that could be spent on other projects. 
  • In the event that you decide to switch TMS’s, migration between TMS’s can become quite complicated
    • Migration of content can take months
    • Some TMS companies may make the migration difficult, or may own some of the content (check the contract)

Considerations

Given the above discussion, when determining whether or not to use a TMS, it is best to first analyze your company’s localization needs. Some questions to ask are:

  • How much content do you need translated? 
  • How often are you expecting to translate new content? Is your content updated frequently?
  • What is your localization budget?
  • Do you plan to work with more than one translation vendor? 
  • Will having a TMS eventually increase your profit margin, or simply financially hurt you?

In the event you decide that a TMS would make localization much more efficient and manageable, then when comparing the different TMS’s on the market, it’d be best to consider what functions in a TMS (and possibly API’s) would add value to your localization process. What are the possible integrations you can have with a specific TMS? It goes without saying that a price comparison should also be done. Different TMS’s offer different types of plans to fit a wide variety of localization needs.

CAT tools

What is Computer-Assisted Translation (CAT)?

Computer-Assisted Translation (CAT) tools are among the most important pieces of infrastructure in the localization workflow. In short, they are a type of software that localization professionals (e.g. translators, editors, project managers, etc.) use to facilitate and support translation processes.

According to General Theory of the Translation Company by Renato Beninatto and Tucker Johnson, “CAT tools are able to take content from virtually any type of file and put them into a translation-friendly environment for translators to work on.” This usually saves file preparation time, as the project managers do not need to transfer the source text onto a Word or Excel file (for example) for the translator to access the text, and the translator can translate directly within the tool. When the translation is finalized, the CAT tool can export the translation in the same file format as the original source text.

Common features

Aside from file preparation, CAT tools usually have the following features:

  • Text alignment – Most CAT tools can align the source text and the target text automatically and compile the bilingual document into a translation memory file.
  • Translation Memory – This is a database containing source segments paired with their target language segments (i.e. the translations). It is best to set up the translation memory before the translation project starts, and to keep building it up as more and more translation projects are completed. The more robust the translation memory becomes, the more it can be leveraged in future projects for the same clients. Reusing confirmed translations cuts down on time and translation cost, as translators can work more efficiently and LSPs charge lower rates for fuzzy matches than for new words.
  • Term base – Term bases are essential for companies that have many set terminologies for their product, service, or industry. As translators translate in the CAT tool, the term base will suggest terms that the translators can reference. This ensures that translators use the terms they should be using, and thus maintains consistency throughout all produced work.  
  • Project management function – The project management function allows users to create workflows in advance of a project and allocate each project to translators. In addition, it helps to ensure the projects are on track to meet their respective deadlines. This function can sometimes even track the performance of specific vendors.
  • Automated Quality Assurance (QA) – This includes a spell checker, a grammar checker, and more. The built-in QA function streamlines QA checks, as it also catches errors in numbers (including decimal versus the thousands separator), mismatches with the translation memory and term base, inconsistencies between translations, and more. The tool can even be customized to check for certain aspects (e.g. using regex). However, automated QA is not a substitute for human reviewers. It cannot catch all error types, and it will sometimes generate false positives. 
  • Pseudolocalization – A testing method for the internationalization aspect of the project. It replaces the source strings with altered placeholder text (e.g. accented, lengthened, or bracketed characters) to simulate how the finished product would look in the target language. This tests for aspects such as string length, language direction, character display, and UI.
  • Machine translation – There is usually a feature that allows the user to connect to an existing machine translation engine such as Google MT.
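
Three of the features above (translation memory lookup, automated QA on numbers, and pseudolocalization) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the 0.75 similarity threshold, the accent mapping, and the 30% expansion are hypothetical, and Python's difflib similarity ratio stands in for whatever fuzzy-match scoring a real CAT tool uses.

```python
import re
from difflib import SequenceMatcher

def best_tm_match(segment, memory, threshold=0.75):
    """Return (score, source, target) for the closest TM entry, or None."""
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best

NUMBER = re.compile(r"\d+(?:[.,]\d+)*")

def numbers_match(source, target):
    """QA check: the same digit sequences should appear in both segments.

    Separators are ignored, so "1,000.50" and "1.000,50" count as equal.
    """
    def digits(text):
        return sorted(re.sub(r"[.,]", "", n) for n in NUMBER.findall(text))
    return digits(source) == digits(target)

ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")

def pseudolocalize(text, expansion_pct=30):
    """Accent the vowels, pad the string, and wrap it in brackets."""
    padding = "~" * (len(text) * expansion_pct // 100)
    return "[" + text.translate(ACCENTED) + padding + "]"

memory = {
    "Save the file": "Enregistrer le fichier",
    "Delete the file": "Supprimer le fichier",
}
match = best_tm_match("Save this file", memory)
pseudo = pseudolocalize("Save file")  # "[Sávé fílé~~]"
```

In this sketch, "Save this file" is close enough to the stored segment "Save the file" to be offered as a fuzzy match, while an unrelated sentence returns None. The number check deliberately compares digits only; flagging a wrong decimal or thousands separator would need locale-specific rules, and, as noted above, checks like these do not replace a human reviewer.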

Special features

Some CAT tools have specialized features that might come in handy for certain types of projects. These include, but are not limited to:

  • In-context translation: In the case of video or website localization, for example, some CAT tools allow users to preview the translated subtitles/translations in videos or localized web pages during the translation process. This is very helpful for the translators and editors as they can determine the context and the proper length of the strings early on based on the preview.
  • Supporting specific file formats: Some CAT tools are able to support more file formats than others, and other CAT tools are built specifically to support only certain types of file formats (e.g., a CAT tool that is built specifically to translate software rather than documents). Thus, these types of CAT tools may be more suitable for those types of translation projects.
  • Neural machine translation: In addition to, or instead of, connecting to external machine translation engines, some CAT tools have a neural machine translation engine built in. As the translator translates and confirms segments, the engine learns from the confirmed data and generates more accurate/appropriate translation suggestions.

The benefits of using CAT tools

CAT tools are practically essential for developing a mature localization model, as they make localization processes more streamlined, manageable, and effective. However, there is a wide variety of CAT tools on the market, so it pays to do some research first. Some are free and open-source, while others need to be purchased or require a paid subscription. Start by analyzing what type of localization projects your company will carry out, what requirements/functions are needed to smoothly carry out those projects, and how much budget will be needed to run them. Based on this analysis, a comparison of the CAT tools and their prices can then be carried out.
