A Proven Workflow for Technical PDFs Without Source Files
When a client submits a technical PDF for CAT translation (such as a patent with engineering drawings, a technical proposal with data reports, or an installation manual with specifications and SOPs) but provides no source files such as AutoCAD or InDesign, Project Managers face a complex challenge with several possible solutions.
Here is what they have to consider:
- How to prepare the file for CAT translation while maintaining its exact page layout?
- How to make sure that the entire file, including all the technical drawings, benefits from all the available TMs and other CAT features?
- How to help the DTP team avoid confusion when working with the translated text, so that they produce a perfectly recreated target file?
We could suggest at least five different ways to handle such a file. However, after extensive testing, one method has become our favorite. And this is what we’d love to share with you.
Here is our quick guide for handling such files:
First, we run the file through OCR, which we call pre-DTP text extraction in-house. The general text is extracted in this way, whereas for the drawings we do the following:
1. Our OCR specialist creates a two-column table. The text embedded in drawings is placed in the left column, with each phrase in a separate cell. The contents of the left column are then copied into the right column.

2. After the table is completed, the OCR specialist hides the left column with the source text using the MS Word Hidden option, as shown in the screenshot:

3. This technique lets a CAT tool translate the text in the right column while leaving the left column unchanged. After translation, the PM or DTP specialist generates the target translation and unhides the previously hidden text in the source column:

4. In the end, when recreating the original page layout, the DTP specialist can refer to the bilingual table, which keeps the likelihood of errors to a minimum. This greatly shortens the turnaround and improves the quality of the formatting.
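
For readers who think in code, the four steps above can be modeled as a small data sketch. This is only an illustration of the logic, not a script that drives MS Word or any CAT tool: the names `Row`, `build_table`, `set_source_hidden`, and `translate_visible` are hypothetical, the glossary stands in for a real translation memory, and the sketch assumes the CAT tool processes only text that is not marked as hidden.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Row:
    source: str                  # left column: phrase taken from a drawing
    target: str                  # right column: starts as a copy of the source
    source_hidden: bool = False  # stands in for Word's Hidden font option


def build_table(phrases: List[str]) -> List[Row]:
    """Step 1: one row per phrase, with the source copied into the right column."""
    return [Row(source=p, target=p) for p in phrases]


def set_source_hidden(rows: List[Row], hidden: bool) -> None:
    """Steps 2 and 3: hide the left column before translation, unhide it after."""
    for row in rows:
        row.source_hidden = hidden


def translate_visible(rows: List[Row], translate: Callable[[str], str]) -> None:
    """Stand-in for the CAT tool: only visible text is translated (assumption)."""
    for row in rows:
        if not row.source_hidden:
            # A source column left visible would be overwritten as well.
            row.source = translate(row.source)
        row.target = translate(row.target)


# Hypothetical glossary used in place of a real translation memory.
glossary = {"Inlet valve": "Einlassventil", "Max. pressure": "Max. Druck"}

rows = build_table(list(glossary))
set_source_hidden(rows, True)
translate_visible(rows, lambda text: glossary.get(text, text))
set_source_hidden(rows, False)

# Step 4: the bilingual pairs the DTP specialist refers to.
for row in rows:
    print(f"{row.source} | {row.target}")
```

The point of the sketch is the `source_hidden` flag: if the left column stayed visible during translation, it would be translated too, and the DTP specialist would lose the source reference.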
