Founder at OCR Craft
Over 15 Years of Experience in the Translation Industry

Imagine this: 

Your client has sent you a translation project consisting of scanned documents that contain mathematical or chemical expressions, equations, or formulas. However, these equations contain not only numbers, operators, functions, and logarithms, but also words and phrases that require translation. What’s more, the target output must be delivered as a .docx file.

What should you do if you want to achieve both of the following?

  1. Translate the entire text (including any text within the equations) using your favorite CAT tool.
  2. Make sure the translated .docx document matches the layout of the original document 1-to-1.

There are two ways to prepare such documents for translation. Let’s discuss them and choose the one that works best.

Method 1: Processing the text in the equation as a textbox

What does it mean? 

The OCR specialist extracts the equation as an image, erases the text within the equation, and replaces it with a textbox that contains the typed-in text needing to be translated.

Therefore, even though the equation is formatted as an image, the text in the equation becomes “translatable,” as shown below.

Using this method to process an equation takes approximately 4 minutes.

Pros: The text will be available for CAT translation.

Cons: After translation, the word or character count may increase significantly (depending on the language), and the translated text may no longer fit inside the textbox, so the output may not display correctly.

To fix this, you will need to decrease the font size. And if there are any subscript or superscript characters, the output may become hard to read.

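Now, imagine you have some 30–40 more equations to go. At approximately 4 minutes each, that is 120–160 minutes of preparation alone, before any font fixes after translation. How much time would it take you to finalize the entire project?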

Method 2: Recreating the equation using the MS Equation Tool (Office Built-In Equation Editor)

The OCR specialist retypes the equations in MS Word using a special built-in editor, which means that the output will be 100% editable.

Timewise, this takes approximately the same 4 minutes per equation.

The Office Equation Editor has a library of built-in equations that will help you quickly recreate even the most complex multilevel formulas.
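
To check whether a prepared .docx really contains editable equations rather than images, you could count the Office Math elements in the file. The snippet below is a minimal sketch, not part of any tool mentioned here. It assumes the equations were typed with the built-in Equation Editor and are therefore stored as m:oMath elements in word/document.xml. The function names are hypothetical, and headers, footers, and footnotes are not inspected.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Office Math (OMML) elements inside a .docx file.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def count_editable_equations(document_xml: bytes) -> int:
    """Count the <m:oMath> elements in the XML of the document body."""
    root = ET.fromstring(document_xml)
    return sum(1 for _ in root.iter(f"{{{M_NS}}}oMath"))


def count_in_docx(path: str) -> int:
    """Open a .docx (a ZIP archive) and count equations in its main part."""
    with zipfile.ZipFile(path) as archive:
        return count_editable_equations(archive.read("word/document.xml"))
```

A count of zero for a file that visibly contains formulas suggests they were inserted as images (Method 1) rather than retyped (Method 2).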

Pros: 

1) The text will be available for translation in CAT. *

2) If the translation results in a higher or lower word or character count, the equations adjust automatically. You won’t need to do anything else!

* See our step-by-step instructions below for configuring the Project Settings in SDL Trados Studio to make the expressions in your file available for translation.

Cons: None!

The choice seems pretty clear 🙂

And now, as promised, here are our step-by-step instructions for configuring the Project Settings in SDL Trados Studio so that you can translate files with expressions and formulas prepared using Method 2 (see the screenshot below):

  1. To configure the Project Settings for your project, go to the Projects view and on the Home tab, select Project Settings.
  2. Expand the File Types list and select the Microsoft Word 2007–2016 option.
  3. Select the Common page.
  4. Then check the box next to the Extract mathematical expressions option to make mathematical expressions available for translation.
  5. Select OK.
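
Before enabling the option described in the steps above, it may help to see which equations in a file actually contain words. The sketch below is only a rough pre-check under stated assumptions: equations are stored as m:oMath elements with their text in m:t runs, a "word" is any run of three or more letters, and a small, hypothetical list of function names (MATH_NAMES) is ignored. It does not reproduce what Trados Studio extracts, and all function names are illustrative.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by Office Math (OMML) elements inside a .docx file.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"

# Heuristic: three or more consecutive letters (any script) count as a word.
WORD = re.compile(r"[^\W\d_]{3,}")

# Hypothetical ignore list of common function names that need no translation.
MATH_NAMES = {"sin", "cos", "tan", "log", "lim", "max", "min", "exp"}


def equation_texts(document_xml: bytes) -> List[str]:
    """Return the text of each equation, with its runs joined by spaces."""
    root = ET.fromstring(document_xml)
    texts = []
    for equation in root.iter(f"{{{M_NS}}}oMath"):
        runs = [t.text or "" for t in equation.iter(f"{{{M_NS}}}t")]
        texts.append(" ".join(runs))
    return texts


def equations_needing_translation(document_xml: bytes) -> List[str]:
    """Keep only equations that contain at least one non-function word."""
    flagged = []
    for text in equation_texts(document_xml):
        words = [w for w in WORD.findall(text) if w.lower() not in MATH_NAMES]
        if words:
            flagged.append(text)
    return flagged
```

Both functions take the raw bytes of word/document.xml, which can be read from the .docx with Python's zipfile module. Because the word test is a heuristic, treat the result as a list of candidates to review, not a definitive count.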

I would be very grateful if you could provide your feedback and tell us whether you or your colleagues have found this information useful.

If you receive non-editable PDFs from your clients for translation, we would gladly help you with any tasks involved in 1-to-1 PDF to DOC conversion, however difficult they may be!