The Price of “Fast” Conversion: Why Automatic PDF-to-Word Services Ruin Your Translation Projects
In the translation industry, time is the most valuable resource. When a client sends a complex PDF that was due yesterday, the temptation to use a free online converter or the built-in “save as MS Word” function is strong. Within seconds, it seems, you have a document ready to work with.
But is that really a saving? Using the example of a typical complex document, a medical research article, let us look at how automatic conversion turns into a “time bomb” that explodes during translation and final desktop publishing.
The anatomy of a perfect storm: What is inside a PDF?
Imagine a standard medical article: text in two columns, headers and footers, complex tables, footnotes, and images with captions. After automatic conversion, this file looks “decent” only visually. The problems begin the moment you upload it into a CAT tool such as Trados, memoQ, or Phrase.
1. Broken sentences and a parade of tags
Automatic converters often insert hard returns at the end of every line or column.
The consequence in a CAT tool: Instead of one complete sentence, the translator sees two or three separate segments. Translation memory becomes useless, and translation quality drops because the context is broken. On top of that, the converter generates hundreds of unnecessary tags, making it harder for the translator to even see the text.
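To make the segmentation problem concrete, here is a minimal Python sketch of the kind of clean-up a converted file needs before it reaches a CAT tool. It assumes the text has already been extracted as a list of lines, and it uses a deliberately simple heuristic: a line that does not end in sentence-final punctuation is treated as a hard-wrapped fragment and joined to the next one. The function name `join_hard_wrapped` is our own illustration, not part of any CAT tool, and the heuristic will misfire on lines that end in abbreviations such as "Fig."; real preparation still needs a human check.

```python
import re

# A line "looks finished" if it ends in sentence-final punctuation,
# optionally followed by a closing quote or bracket.
SENTENCE_END = re.compile(r'[.!?:]["”)\]]?$')


def join_hard_wrapped(lines):
    """Merge lines broken by hard returns back into complete sentences.

    Illustrative heuristic only: blank lines and sentence-final
    punctuation close the current unit; everything else is joined.
    """
    units = []
    buffer = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line always closes the current unit.
            if buffer:
                units.append(buffer)
                buffer = ""
            continue
        buffer = f"{buffer} {line}" if buffer else line
        if SENTENCE_END.search(line):
            units.append(buffer)
            buffer = ""
    if buffer:
        units.append(buffer)
    return units


converted = [
    "The patient cohort was followed",
    "for twelve months after",
    "the initial treatment.",
    "No adverse events were reported.",
]
for unit in join_hard_wrapped(converted):
    print(unit)
```

With the sample above, the three fragments of the first sentence come back as one unit, which is what the translator and the translation memory need to see.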
2. The illusion of tables
For a converter, a table is just a set of lines and text. Very often, instead of a real Word table, you get a construction of lines drawn as Shape objects, with the text aligned by tabs.
The problem: Even when the converter does produce a real table, its rows often have fixed heights. Translated text, for example from English into German or Ukrainian, is usually 20–30% longer, so it is cut off at the cell boundary, and you may not even notice the missing text after export.
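The arithmetic behind this is simple enough to sketch in Python. The example below assumes, purely for illustration, that a fixed-height cell can hold a known number of characters; in a real document the fit depends on font, cell width, and hyphenation, so `cell_capacity_chars` is a hypothetical stand-in. The 30% default comes from the upper end of the expansion range mentioned above.

```python
def expanded_length(source_chars: int, expansion_pct: int) -> int:
    """Estimated length of the translation in characters, rounded up.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    return -(-source_chars * (100 + expansion_pct) // 100)


def overflows(source_text: str, cell_capacity_chars: int,
              expansion_pct: int = 30) -> bool:
    """True if the expanded text would no longer fit the fixed-size cell.

    cell_capacity_chars is a hypothetical simplification of how much
    text a fixed-height cell can display.
    """
    return expanded_length(len(source_text), expansion_pct) > cell_capacity_chars


source = "x" * 100  # a 100-character source cell
print(expanded_length(len(source), 30))            # estimated target length
print(overflows(source, cell_capacity_chars=120))  # does it still fit?
```

A 100-character cell with room for 120 characters looks safe in the source file, yet at 30% expansion the translation needs 130 characters, and the last ten are hidden.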
3. Structural chaos: Headers, footers, and footnotes
Converters often fail to recognise header, footer, and footnote areas. They insert them directly into the main body text.
The consequence: The main text is interrupted by technical content. This not only distracts the translator but also prevents the text from reflowing automatically from page to page after editing.
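One way to spot such intruders is to look for lines that repeat on almost every page. The Python sketch below assumes the converted text is available as a list of pages, each a list of lines; digits are masked so that "Page 1" and "Page 2" count as the same running line. The function name, the 80% threshold, and the sample journal title are all hypothetical choices made for the illustration.

```python
import re
from collections import Counter


def find_running_lines(pages, min_share=0.8):
    """Return lines that appear on at least min_share of all pages.

    Digits are replaced with '#' so page numbers do not hide a repeat.
    Illustrative heuristic only; candidates still need a manual review.
    """
    if not pages:
        return set()
    counts = Counter()
    for page in pages:
        # Use a set so a line is counted at most once per page.
        seen = {re.sub(r"\d+", "#", line.strip()) for line in page if line.strip()}
        counts.update(seen)
    return {line for line, n in counts.items() if n / len(pages) >= min_share}


# Hypothetical sample data: three pages of a converted article.
pages = [
    ["Journal of Clinical Studies", "Intro text about the trial.", "Page 1"],
    ["Journal of Clinical Studies", "Methods text.", "Page 2"],
    ["Journal of Clinical Studies", "Results text.", "Page 3"],
]
print(sorted(find_running_lines(pages)))
```

Lines flagged this way are candidates for moving into the real header and footer areas of the Word file rather than staying in the body text.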
4. Graphic objects and text boxes
Text on medical diagrams or photographs, after auto-conversion, becomes either a non-editable image or a chaotic collection of scattered text boxes.
The problem: These boxes do not have auto-fit enabled. When the translated text becomes longer, it spills outside the box or overlaps with another element. They are also rarely grouped, so any formatting change can make them fly apart across the page.
5. Formatting junk
Double spaces, tabs used instead of indents, and spaces used to align text across the page are all things converters do in order to visually imitate the original. For a CAT tool, this turns into irrelevant tags and segmentation errors.
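A quick way to gauge how much of this junk a converted file contains is to count it. The Python sketch below works on plain text and uses two simple patterns, one for runs of two or more spaces and one for tab characters; the function names are our own. The clean-up function is intentionally blunt: it would also flatten tab-aligned pseudo-tables, so it is meant as a diagnostic aid, not a replacement for rebuilding the layout properly.

```python
import re

SPACE_RUN = re.compile(r" {2,}")  # two or more consecutive spaces
TAB = re.compile(r"\t")           # a single tab character


def count_formatting_junk(text):
    """Count space runs and tabs that were used to fake alignment."""
    return {
        "space_runs": len(SPACE_RUN.findall(text)),
        "tabs": len(TAB.findall(text)),
    }


def normalize_whitespace(text):
    """Collapse every run of spaces and tabs into a single space."""
    return re.sub(r"[ \t]+", " ", text).strip()


sample = "Dose\t\t10 mg    daily.  Repeat."
print(count_formatting_junk(sample))
print(normalize_whitespace(sample))
```

Each counted item is a likely extra tag or a broken segment once the file is imported into a CAT tool.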
What happens after export from the CAT tool?
The most painful stage for a project manager is Export / Clean up.
When you export the translated file, you get a mess. The text expands, the columns shift, the tables overlap the captions, and part of the translation disappears into invisible areas of text boxes.
The main risk: At this stage, errors appear that cannot be fixed without a linguist. For example, the text may have shifted so badly that it is no longer clear which caption belongs to which figure. To put this puzzle back together, you need a translator who also has professional DTP skills, a rare and expensive combination.
Professional DTP preparation: Why is it worth it?
Professional document preparation (OCR and formatting before translation) changes the whole process:
- Clean segments: The translator works with complete sentences without unnecessary tags.
- Real styles: We use proper heading and paragraph styles. The text flows freely while preserving the logic of the document.
- Dynamic tables and text boxes: Table cells expand automatically to fit the translated text. Text boxes are grouped and set to auto-fit.
- Proper headers, footers, and footnotes: They remain in the correct Word system areas and do not interfere with the translation of the main text.
- 100% correspondence to the original: You receive a document that matches the source layout 1:1 and is easy to edit.
Conclusion
Automatic conversion is a free service that you pay for with your project managers’ time, your translators’ nerves, and your clients’ loyalty.
We specialize in turning any complex PDF into a clean, professionally prepared Word file. You receive a document adapted for CAT tools, where every tag is in its proper place and final post-translation formatting takes minimal time.
Trust us with the technical side so you can focus on what matters most: translation quality.