SMT Training for Patent Laws (Using Microsoft Translator Hub)

This is one of the most adventurous projects I have done at MIIS! In the course Advanced Computer-Assisted Translation, taught by Professor Adam Wooten, we were asked to explore Microsoft Translator Hub and train a machine translation engine as a pilot project. The topic we chose was to train a translation engine customized for patent law, with Chinese as the source language and English as the target language. After the pilot project, we were asked to submit a proposal for the whole project: translating 40,000 characters.

Along the way, my groupmates and I ran into all kinds of frustrations and challenges in almost every aspect of this project. The chart below shows how the project went:

As you can see, there are four major stages in our project. Before actually doing the training, we needed to estimate the cost and time of the project and calculate how much faster and more cost-efficient PEMT (machine translation + post-editing) would be compared with human translation. We also needed to show the client how we could ensure the quality of PEMT, and what level of quality we could achieve.

However, we made a mistake at the very beginning while making these estimates. We assumed that PEMT must be faster and cheaper than human translation, which is not necessarily true. The speed and cost really depend on a variety of factors, including volume, text type, and the condition of the SMT engine. The most reliable way to find out is to test it. Based on our experiments with the engine, our final estimate was that PEMT, including machine training time, could be 5% to 30% faster than human translation. The chart below shows our metrics for the time and cost estimation of the whole project.
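To make that kind of comparison concrete, here is a minimal sketch of the arithmetic involved. The per-character rates, daily throughput, and training time below are hypothetical placeholders, not the actual figures from our metrics chart.

```python
# Rough PEMT vs. human translation comparison.
# All numbers below are hypothetical placeholders, not our project's real figures.

VOLUME = 40_000            # characters in the whole project

# Human translation assumptions
HT_SPEED = 2_500           # characters translated per day
HT_RATE = 0.06             # USD per character

# PEMT assumptions
ENGINE_TRAINING_DAYS = 3   # one-time effort to prepare data and train the engine
PE_SPEED = 5_000           # characters post-edited per day
PE_RATE = 0.03             # USD per character for post-editing

ht_days = VOLUME / HT_SPEED
pemt_days = ENGINE_TRAINING_DAYS + VOLUME / PE_SPEED

ht_cost = VOLUME * HT_RATE
pemt_cost = VOLUME * PE_RATE

print(f"Human translation: {ht_days:.1f} days, ${ht_cost:,.0f}")
print(f"PEMT:              {pemt_days:.1f} days, ${pemt_cost:,.0f}")
print(f"Time saved:        {(1 - pemt_days / ht_days):.0%}")
```

The point of running numbers like these, rather than assuming PEMT wins, is that the one-time engine training overhead and the real post-editing speed can easily erase the expected savings on a small volume.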

What I also learned is to never underestimate the influence of assumptions, because they can shape your mind and judgment! Because we assumed PEMT is faster, when we calculated post-editing speed we came up with a number that later proved to be nowhere near the real figure.

In terms of the quality of machine translation, we had human translators review the texts translated by the machine and deduct points based on selected MQM metrics. Our group members reviewed around 300 strings and concluded that a satisfactory translation should score at least 700 out of 1,000. We also timed ourselves while doing post-editing and found we were far slower than we expected: because the SMT engine was not fully trained yet, there were major adjustments that required an editor.
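For reference, here is a minimal sketch of how an MQM-style deduction score can be computed. The error categories, severity weights, and counts are made-up examples, not our actual review data; only the 1,000-point scale and the 700-point bar come from our project.

```python
# MQM-style quality score: start from a maximum and deduct points per error,
# weighted by severity. Categories, weights, and counts below are hypothetical.

MAX_SCORE = 1000
PASS_THRESHOLD = 700          # our group's bar for a satisfactory translation

SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

# (category, severity, count) found in a reviewed sample -- illustrative only
errors = [
    ("Accuracy/Mistranslation", "major", 8),
    ("Fluency/Grammar",         "minor", 25),
    ("Terminology",             "major", 4),
    ("Accuracy/Omission",       "critical", 1),
]

penalty = sum(SEVERITY_WEIGHTS[sev] * count for _, sev, count in errors)
score = max(MAX_SCORE - penalty, 0)

print(f"Score: {score}/{MAX_SCORE} -> {'pass' if score >= PASS_THRESHOLD else 'fail'}")
```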

Other Lessons Learned:

1. A low BLEU score does not equal bad quality. Some of the translations from low-scoring engines were pretty decent, as we observed (see the sketch after this list).

2. Garbage in, garbage out. Some of the training data we put in contained misspelled words and fragmented text, which resulted in similar issues in the machine translation output.

3. Microsoft Translator Hub is not that reliable, and the training results are not that predictable. During the process, the training platform malfunctioned for a day or two, so our trainings simply failed. Sometimes they failed for no apparent reason.

4. The data input must be large in volume. When we did not put in enough data, the system warned us about the minimum requirements for engine training. Our group got a comparatively very high BLEU score and a decent translation result, largely because our training data set was huge.
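On lesson 1: BLEU only measures n-gram overlap with a reference translation, so a valid but differently worded output can score low. Here is a minimal sketch of computing it with the sacrebleu library; the sentences are made-up illustrations, not our project data.

```python
# BLEU compares word n-grams between hypothesis and reference, so acceptable
# but freely reworded output can still score low. Example sentences are invented.
import sacrebleu

hypotheses = [
    "The applicant shall submit the request within three months.",
    "A patent right is granted for an invention that is novel.",
]
references = [
    "The applicant must file the request within three months.",
    "A patent right shall be granted for any invention that is new.",
]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.1f}")  # modest score despite usable translations
```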

Deliverables:

Proposal for Pilot Project

Proposal for Whole Project (based on lessons learned from pilot)

Presentation on Whole Project Proposal and Lessons Learned
