Regular expressions can be used in the QA stage of a translation in Trados to catch or convert parts of the text that were not properly localized for the target audience, e.g., incorrect punctuation or date formats. We have come up with five pairs of regular expressions that may be of use when working between Russian and US English.
Placement of period with quotation marks
\"\.
\.\"
The first would be a rule for periods located outside of quotation marks:
“Hello there”.
which would be used for Russian (or non-US English). The second would be for periods inside quotation marks:
“Hello there.”
which we use for US English.
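As a rough illustration outside Trados, the first pattern can be used to find periods placed after a closing quotation mark and move them inside. This is a minimal sketch using Python's re module, whose behavior may differ from the engine Trados uses. The function name move_period_inside is made up for this example. Like the patterns above, it looks for straight quotation marks, so curly quotation marks would need their own pattern.

```python
import re

# First pattern from the text: a straight double quote followed by a period.
PERIOD_OUTSIDE = re.compile(r'"\.')

def move_period_inside(text: str) -> str:
    """Rewrite `".` as `."` (hypothetical helper, US English placement)."""
    return PERIOD_OUTSIDE.sub('."', text)

print(move_period_inside('She said "Hello there".'))  # She said "Hello there."
```

Swapping the pattern and the replacement string gives the reverse conversion, from US English placement to Russian.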
Quoted speech
\,\"\s\w
\,\s\—
Russian does not close the quotation marks until the end of the quote, even when the quote is interrupted by an attribution such as "he said," and instead marks the interruption with an em dash.
“I thought that was you,” he said, (first example, English: comma, quotation mark, white space, letter)
“I thought that was you, — he said, (second example, Russian: comma, white space, em dash)
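Because converting one style to the other also involves removing or adding a closing quotation mark later in the sentence, these two patterns are probably most useful as a check. The sketch below, written in Python as an assumption (Trados's own engine may behave differently), flags target segments that still follow the English convention. The function name flag_english_style is hypothetical, and straight quotation marks are used to match the patterns above.

```python
import re

# Patterns from the text, written with straight quotes.
ENGLISH_INTERRUPT = re.compile(r',"\s\w')   # comma, quote, white space, letter
RUSSIAN_INTERRUPT = re.compile(r',\s—')     # comma, white space, em dash

def flag_english_style(segments):
    """Return the segments that still use the English convention (hypothetical helper)."""
    return [s for s in segments if ENGLISH_INTERRUPT.search(s)]

segments = [
    '"I thought that was you," he said,',
    '"I thought that was you, — he said,',
]
for s in flag_english_style(segments):
    print("Check:", s)
```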
Date Punctuation
\d{2}\/\d{2}\/\d{4}
\d{2}\.\d{2}\.\d{4}
US English commonly separates the parts of a date with forward slashes, while Russian uses periods. If the date is already in the correct order (day, month, year) and has the full four-digit year, as is the norm in Russian, these expressions can be used to convert the forward slashes in a date to periods.
12/01/1999 (first example: two digits, followed by a slash, followed by another two digits, followed by a slash, followed by four digits)
12.01.1999 (second example: two digits, followed by a period, followed by another two digits, followed by a period, followed by four digits)
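To turn the first form into the second, the three groups of digits need to be captured and reused in the replacement. The following is a minimal sketch in Python (an assumption, since the syntax for replacement references differs between regex engines). The capture groups and the word boundaries (\b) are additions not present in the patterns above, and the name slashes_to_periods is made up for this example. It changes only the separators and does not reorder day and month.

```python
import re

# Two digits, slash, two digits, slash, four digits; groups capture the digits.
SLASH_DATE = re.compile(r'\b(\d{2})/(\d{2})/(\d{4})\b')

def slashes_to_periods(text: str) -> str:
    """Replace the slashes in dd/mm/yyyy dates with periods (hypothetical helper)."""
    return SLASH_DATE.sub(r'\1.\2.\3', text)

print(slashes_to_periods('Signed on 12/01/1999.'))  # Signed on 12.01.1999.
```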
Phone Numbers
\(\d{3}\)\-\d{3}\-\d{4}
\(\d{3}\)\s\d{3}\-\d{2}\-\d{2}
Russian phone numbers are written with a space after the area code and the last four digits are broken up into two groups. If you had an English text where the phone numbers were written as (555)-555-1234, as in the first example, you could then use the second to end up with a phone number that looks like (555) 555-12-34.
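The conversion described here can be sketched with capture groups that split the last four digits into two pairs. This is an illustrative Python version under the assumption that the source numbers always look like (555)-555-1234; the capture groups and the function name to_russian_phone are not part of the original patterns.

```python
import re

# (ddd)-ddd-dddd, with the last four digits captured as two pairs.
US_PHONE = re.compile(r'\((\d{3})\)-(\d{3})-(\d{2})(\d{2})')

def to_russian_phone(text: str) -> str:
    """Rewrite (555)-555-1234 as (555) 555-12-34 (hypothetical helper)."""
    return US_PHONE.sub(r'(\1) \2-\3-\4', text)

print(to_russian_phone('Call (555)-555-1234 today.'))  # Call (555) 555-12-34 today.
```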
Numbers of 1,000 or More
(\d{1,3})\,
(\d{1,3})\s
For numbers of 1,000 or more, Russian separates the groups of three digits with white space rather than a comma (commas are used to indicate decimals). The first regular expression here would match something like 100,000,000, and the second describes the converted form, 100 000 000.
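One caveat is that a pattern matching one to three digits followed by a comma will also match ordinary text such as "1, 2". The sketch below, in Python and offered only as one possible approach, swaps in lookarounds instead of the capture-group pattern above: a comma is replaced only when it sits between a digit and a group of exactly three digits. The name commas_to_spaces is hypothetical.

```python
import re

# A comma preceded by a digit and followed by exactly three digits.
THOUSANDS_COMMA = re.compile(r'(?<=\d),(?=\d{3}(?!\d))')

def commas_to_spaces(text: str) -> str:
    """Replace thousands-separator commas with spaces (hypothetical helper)."""
    return THOUSANDS_COMMA.sub(' ', text)

print(commas_to_spaces('100,000,000'))  # 100 000 000
print(commas_to_spaces('items 1, 2'))   # items 1, 2
```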