Collaborate with MT, don’t compete with it

Dr. Mike Dillinger’s lecture paints a big-picture view of machine translation (MT). With more than 20 years in the industry and as a past president of the Association for Machine Translation in the Americas (AMTA), he is well placed to explain what has happened and what will happen to the MT industry.

Machines won’t replace you

The term “machine translation” is so contentious that translators tend to resist and boycott it: taken literally, it seems to dismiss human judgment and the “traditional” translation profession. The concept becomes more acceptable when it is called “machine-assisted human translation” (MAHT), “human-assisted machine translation” (HAMT), or “MT+PE” (machine translation plus post-editing).

These terms look quite different, but the boundaries among them are vague. For example, MAHT and HAMT are mirror-image concepts, but in practice the difference between them can’t be quantified. As translation is a complicated process involving human effort, translation software, databases, translation memories, corpora, etc., it is becoming tricky to describe your practice with any single one of these terms.

So what the term MT really emphasizes is adding the machine component to translation and the trend of adopting translation tools to facilitate the process.

For example, a machine can suggest the closest translation options after searching its database, and it does so in a few seconds. A human doing the same search needs an exceptional memory and an unpredictable amount of time, from a few minutes to hours (if you like).
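
To make the idea of "suggesting the closest option" concrete, here is a minimal Python sketch of a fuzzy lookup in a translation memory. Everything in it is an assumption for illustration: the toy memory, the Spanish targets, the 0.75 cut-off, and the use of the standard library's `difflib` similarity ratio as the match score. Real CAT tools use their own matching algorithms.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Click the Cancel button.": "Haga clic en el botón Cancelar.",
    "Restart the computer.": "Reinicie el equipo.",
}


def suggest(segment, memory, threshold=0.75, limit=3):
    """Return (score, source, target) tuples at or above threshold, best first."""
    scored = []
    for source, target in memory.items():
        # ratio() is a character-level similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    for score, source, target in suggest("Click the Save button now.", TRANSLATION_MEMORY):
        print(f"{score:.0%}  {source}  ->  {target}")
```

The machine's job ends at ranking the candidates. Deciding whether a suggestion is actually right for the new sentence is still the translator's job.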

Therefore, translators don’t need to see MT as a competitor. Don’t ask “Will machines replace me?” or “Will I lose my job?” Machines can’t translate. It is not machines that will replace you, but translators who know machines.

How to use MT effectively?

MT is not omnipotent. It can help save time and energy, but it can’t compete with humans on quality. This is a fact that MT users and translators should bear in mind.

The success of using MT depends on the person who uses it. You should know the following things very well:

1. The type of source texts: unrestricted text, restricted text or in between. For restricted texts, MT is a good help. Dr. Dillinger describes an ideal workflow in which the MT finds solutions for words and phrases with a good match (75 percent) and makes its best guesses for the fuzzy matches. Translators then accept or tweak the good-match results, and either post-edit the guesses or translate those segments from scratch.

For unrestricted texts like literature, MT is incapable of processing them. But things get complicated with texts between the two extremes (where most texts are). Translators should be able to notice the variations and decide how much human effort, such as post-editing, should be involved.

2. The requirements of clients, especially about quality and time. Not all clients require high-quality translation. For some, time is more critical and quality can be sacrificed. Therefore, it is important to make clear what clients expect.

If they want high-quality deliverables, we’d better increase the human component; if they only want fast and understandable results, don’t bother investing too much human talent in it.

Dr. Dillinger shows a surprising statistic: most post-editing changes only 1-2 words, a result that is completely acceptable and much better than expected. I think this is partly because clients don’t require high quality for these texts.

3. The characteristics of tools: complexity, availability, collaboration. There is no best tool, only the right tool. Dr. Dillinger introduced two mainstream MT tools: Google Translator Toolkit and Microsoft Translator Hub. Google Toolkit is more accessible and easier to use, but it has no tailored TM. Microsoft Hub is the opposite: it takes some time to get used to, but you can have your own TM.

For small, collaborative projects, Google Toolkit might be good enough. But for people who want to build their own translation assets, Microsoft Hub is probably a better solution.

4. Know where errors come from so as to better steer your tool and the whole process. There are three major sources. The first is the source files. Frankly speaking, many source texts are poorly written, with ambiguous meanings, and some have formatting problems.

The second is mismatch. The TM doesn’t cover everything, so it is common that terms can’t be found by the MT or that some sentence types are not covered, and of course some words, like brand names, should not be translated.

The third is the MT itself. It might choose the wrong words from the database or analyze the sentence structure incorrectly.

It is common to have errors in a translation, especially in MT output without post-editing. Don’t waste time blaming your tool or complaining about it. The more efficient way is to post-edit and, if needed, review and proofread.

In a word, MT is only a tool and people are the decision makers. Of course, you need to know your tools and your work well to make a wise decision. MT is our friend, not our rival.

07. April 2015 by Phoebe
Categories: CAT

[Ted] The Power of Quietness

In an era when extroversion is greatly appreciated, the power of introversion has been ignored, intentionally or not. The word “introvert” is regarded as a synonym for solitude, aversion to group work and social failure. This trend echoes the hype around salesmanship, entrepreneurship and leadership.

But Susan Cain lists introverts who truly changed the world, like Theodore Roosevelt, Gandhi, Darwin and the founders of major religions, who brought deep thoughts out of their contemplation and revolutionized the world.

The three tips from Susan Cain are very interesting: stop the madness of constant group work, keep some time for yourself, and open up and try to speak softly to the rest of the world.

see the video here

06. January 2015 by Phoebe
Categories: Uncategorized

[Coursera] Xbench: QA Tool

When my CAT professor mentioned QA for the first time, I thought it meant “question and answer.” Then I found out it is one of the numerous L10n terms (quality assurance). Still, I supposed it referred to human proofreading, which is what I did in my group project.

At last I realized QA can be done by tools when I watched a Coursera video made by Peking University. The tutorial video demonstrated how to use Xbench to examine the quality of a translation file, step by step.

The key step is adding the translation file and a terminology file to the software. Of course, you should choose the right file type and the source and target languages in the settings and properties sections.

The software can analyze the translation and spot errors such as untranslated segments or key-term mismatches. Post-editors can export the QA report as an HTML file and send it to translators for reference.

The QA tool will be very efficient for processing a large quantity of translations.
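
To show what kind of checks are involved, here is a minimal Python sketch of the two error types mentioned above: untranslated segments and key-term mismatches. This is not how Xbench works internally; the function name `qa_check`, the sample segments and the one-entry glossary are all hypothetical, and the rules are deliberately naive.

```python
# Hypothetical sample data: (source, target) pairs and a one-term glossary.
SEGMENTS = [
    ("Open the translation memory.", "Abra la memoria de traducción."),
    ("Save the file.", ""),
    ("Import the translation memory.", "Importe la base de datos."),
    ("OK", "OK"),
]
GLOSSARY = {"translation memory": "memoria de traducción"}


def qa_check(segments, glossary):
    """Return a list of (segment_number, issue) tuples."""
    issues = []
    for number, (source, target) in enumerate(segments, start=1):
        if not target.strip():
            issues.append((number, "untranslated: target is empty"))
            continue
        if target.strip() == source.strip():
            # May be a false alarm, e.g. a brand name that must stay as-is.
            issues.append((number, "untranslated: target is identical to source"))
            continue
        for src_term, tgt_term in glossary.items():
            # Flag segments that use a glossary term but not its approved translation.
            if src_term.lower() in source.lower() and tgt_term.lower() not in target.lower():
                issues.append((number, f"key term mismatch: expected '{tgt_term}' for '{src_term}'"))
    return issues


if __name__ == "__main__":
    for number, issue in qa_check(SEGMENTS, GLOSSARY):
        print(f"Segment {number}: {issue}")
```

Even a check this simple runs over thousands of segments in a moment, which is exactly where a tool beats human proofreading. The flagged items still need a person to decide whether they are real errors.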

09. December 2014 by Phoebe
Categories: CAT

CAT Utility Review

I watched the tutorial video about Anycount made by Yanan Chen and Elizabeth Crowell (A09), and found that this small tool is very useful. As translation is usually charged by the word, accurate counting is important.

The Office word counter is easy to access, but it does not work well when a file contains a lot of comments, pictures and captions, footers and headers, or embedded documents. You also need to open each file to count it. Anycount supports a bunch of file formats, including PDF! I used to estimate the word count of PDF files, which usually had one of two outcomes: my income decreased, or I had to try very hard to convince my clients. Now I have a better solution.
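
The "count everything without opening each file" idea can be sketched in a few lines of Python. This is only an illustration for plain-text files in space-delimited languages. It does not read PDF or Office formats, the function names are made up, and the regular expression is just one possible definition of a "word," so its totals may differ from what Anycount or Office report.

```python
import re
import sys
from pathlib import Path

# One possible definition of a word: letters or digits, optionally joined
# by apostrophes or hyphens (so "don't" and "state-of-the-art" count once).
WORD_PATTERN = re.compile(r"\w+(?:['’-]\w+)*")


def count_words(text):
    """Count words in a string using WORD_PATTERN."""
    return len(WORD_PATTERN.findall(text))


def count_files(paths):
    """Return ({path: word_count}, total) for UTF-8 plain-text files."""
    counts = {str(p): count_words(Path(p).read_text(encoding="utf-8")) for p in paths}
    return counts, sum(counts.values())


if __name__ == "__main__":
    # Usage: pass the text files to count as command-line arguments.
    per_file, total = count_files(sys.argv[1:])
    for name, words in per_file.items():
        print(f"{words:>8}  {name}")
    print(f"{total:>8}  total")
```

Whatever tool does the counting, it helps to agree with the client in advance on how words are counted, since different counters define a word differently.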

09. December 2014 by Phoebe
Categories: CAT

CAT Individual Peer Review

The peer review always surprises me. The research my classmates did covers topics I am very interested in but dare not try, as they seem big and complicated to me.

For example, we usually say that it’s difficult to apply CAT tools to translation between English and Asian languages. But why? Helen Lee analyzed the differences between Korean and English, which gave me clues to the differences between Chinese and English and also helped me better understand when tools might not work and why.

Cira Ortiz researched MT from the perspective of cognitive science. Her acute observations on both humans and machines helped me see the direction: let the machine help but not control. For CAT beginners, it’s easy to go from one extreme (staying away from tools) to the other (constantly looking for new tools and trying to find an omnipotent one). But do remember, every tool has its limits, and the human is the core.

Another interesting project was done by Vanesa Cao. She listed useful tools for freelance translators. To some degree, every translator is a freelancer who can work independently and be responsible for their own work. So whether or not I work as a freelancer, it will be very helpful to learn some skills to prepare myself to “be a solution.”

Tingting Xu translated a political speech with Google Toolkit and the result was very satisfying. It made me think that machine translation should perform very well for “dead” language like political speeches, government documents, and laws and regulations. In a way, they are written in very controlled language in terms of emotion and grammar.

I also learned a lot from several other videos, like the Google Translate Community, Memsource and the post-editing of machine translation. Many thanks to my smart and diligent classmates:

A04 CAT Tools for Freelancers: Vanessa Cao
A19 MT Comparison Google SYSTRAN Microsoft Babylon: Helen Lee
A35 Google Translator Toolkit: Tingting Xu
A31 Post-Editing in Memsource vs Google Translator Toolkit: Jingxiao Wang
B02 Memsource Trial vs Wordfast Anywhere: Ryan Baker
B18 Understanding Google Translate MT: Xian Lu
B24 Cognitive Science & MT: Cira Ortiz

09. December 2014 by Phoebe
Categories: Uncategorized

CAT Groupwork Peer Review

As I didn’t have much experience with CAT tools, it was really helpful and eye-opening to watch my classmates’ presentations about their projects.

I watched seven presentations and the first lesson I learned is that a good name is half the success. Besides the assigned videos, I chose those with eye-catching names like Unamed Group, LIE and Team Awesome. I also chose those made by people whose names I knew, like Joanne Wang Tiantian Dai, and Wen Ou Jingxiao Wang. So you should choose a cool name for your company; otherwise, make a lot of friends.

Many teams used Memsource in their projects. I am not familiar with it, as I have only used Google Translator Toolkit and Wordfast Anywhere. I found the Memsource interface very user-friendly, and the collaboration function good enough for a small team. But the workflow is very strict, which means you can’t start the next step until you finish the current one. That is not very realistic.

The word I heard most frequently was “collaboration.” As this was group work, cooperation was the key to success. One team said that its two members trusted and helped each other a lot; several teams mentioned that team members had to edit and proofread each other’s translations, so they worked very closely and stayed well connected.

Keeping consistency is very important for teamwork. In my own group, we worked face to face so we could consult each other whenever necessary. But in real life, it is impossible to get all translators together, not to mention managers and terminologists. From my classmates’ projects, I found that writing a style guide may be a good solution. The glossary helps us keep terms consistent, and the style guide keeps word usage, formatting and many other things consistent.

This is a wrap-up reflection on the following videos; many thanks to my classmates:

Las Políglotas: Anelix Diaz, Virginia Ruiz Ugalde, Suhey Tapia
Les coquelicots: Elizabeth Crowell, Anna Bialostosky
LIE: Lora Qiao, Xinhui Du (Emily), Kexin Pei (Isabel)
Joanne Wang Tiantian Dai: Joanne Wang, Tiantian Dai
Team Awesome: Helen Lee, Youngmin Choi
Wen Ou Jingxiao Wang: Wen Ou, Jingxiao Wang
Unamed Group: Amber Li, Jessica Lu, Monica Liang

09. December 2014 by Phoebe
Categories: CAT

Obama Likes Old Interpreter while Xi Likes New

At the Beijing APEC summit last November, the interpreters for Obama and Xi attracted people’s attention because the two were in sharp contrast. Jim Brown, the gray-haired interpreter for Obama, interpreted for President Reagan in 1984. That year, Sun Ning, Xi’s interpreter, was only three years old.

According to a 2007 report by NBC News producer Adrienne Mong, Jim Brown, listed as James W. Brown on the diplomatic list, was called “only one Jim” by Brenda Sprague, the Director of the Office of Language Services at the Department of State.

So the question is: where are the young American interpreters, and where are the senior Chinese interpreters?

http://video.sina.com.cn/p/news/c/v/2014-11-13/090764244791.html

06. December 2014 by Phoebe
Categories: Translation

Translate Those That Fit Your Style

Admit it: you can’t translate every style of literary work.

I had a brief but interesting talk with Professor Balcom yesterday. He is one of the best literary translators from Chinese to English. But he chooses works for translation carefully. In other words, he translates those that fit his style.

We usually try hard to retain the authentic style of a work when translating it. In fact, however, if your writing style (though not as distinctive as a writer’s) is concise and straightforward, it is not wise to translate the works of Bernard Shaw. No matter how hard you try, the translation will turn out to be a strange mixture. But translating Hemingway might not be a bad idea.

If it’s hard to change your own style, then find works of your style instead.

06. December 2014 by Phoebe
Categories: Translation

[Ted] What Makes a Word Real?

As a translator, I worry about all mistakes (aka the words I can’t find in a dictionary) in a text. As a non-native English speaker, I always doubt whether I have used a word correctly in my translation. So I am picky with others and with myself as well, even a little paranoid.

Watching this video eased my nerves. A word is a “word” not because a dictionary includes it, but because people use it and keep using it. So if a “mistaken” word is really bad, it will be discarded soon, and I don’t need to worry about it. If I make a mistake, I will just keep using it till everyone loves it, including the dictionary editors! LOL

03. December 2014 by Phoebe
Categories: Translation

Computer/Human? Relax, translators!

I like translating.

I was thrilled when I found a perfect equivalent word. I enjoyed the tick-tack rhythm of typing fast. I felt a strong sense of achievement when I finished a long article or book.

That’s why I was so reluctant to learn machine translation or anything related to computers.

But after taking the Computer-Assisted Translation course this semester, I found that “computer or human” is not even a question, because we don’t have to choose one. We can have both! Like computers and mapmakers, computers and novelists, computers and musicians, we can have computers AND translators!

Translators, you will not lose your job. Just let the computer do the dirty work for you!

03. December 2014 by Phoebe
Categories: CAT
