Custom Translator

  1. Properly training an NMT engine requires ample data. The suggested minimums for training, tuning, and testing are shown in the chart below, taken from the Microsoft website.


    As we can see, the number of sentences required is quite large. The number of aligned sentences required is much smaller; however, this comes at a price, which brings me to my next point.

  2. Aligned sentences generally help improve the BLEU score of the NMT engine. However, as we all know, alignment takes time in translation work. The more documents we have to align, the faster the man-hours required for the task grow. If the team assembled for an NMT engine training project is not large enough, the project will most likely run over budget.

  3. The training of an NMT engine and its eventual deployment must both meet the same criterion: scale. For training to make sense from a financial standpoint, the team assigned must be large enough to handle the task. Once the engine reaches a point that the team deems ready for deployment (or perhaps even before undertaking the project), the company needs to decide whether its size and the scale of the NMT engine are compatible. Basically, if the company is not large enough to meet the demands of applying or maintaining NMT, then it’s probably best for the company to forgo using NMT.
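
Since point 2 leans on the BLEU score, a rough illustration of what that metric measures may help. The sketch below is a simplified, single-reference, sentence-level BLEU with no smoothing, written for illustration only. It is not how Custom Translator computes its reported score, and the function names `ngrams` and `bleu` are made up for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU (no smoothing).

    Illustrative assumption: whitespace tokenization and uniform weights
    over 1- to max_n-grams. Returns a value between 0.0 and 1.0.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, one empty precision zeroes the whole score.
            return 0.0
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(bleu("the cat sat on the mat", reference))
    print(bleu("the cat sat on the mat today", reference))
```

The closer the engine's output is to the human reference translation, the higher the score, which is why clean, well-aligned sentence pairs tend to pay off in training.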

For the finer details of this project, please refer to the links below.