Translation Memory Explained: What It Is, How It Works, and When You Need One

Translation memory stores previously translated segments for reuse. This guide explains how TM works, when it saves money, when it gets in the way, and how it fits into modern machine translation workflows.

Thomas van Leer · Content Manager, Langbly · February 18, 2026 · 9 min read

Translation memory (TM) is one of those concepts that sounds simple, works well in specific situations, and gets oversold as a universal solution. At its core, TM is a database that stores previously translated text segments so they can be reused later. Translate a sentence once, and the next time that sentence (or something similar) appears, the TM suggests the previous translation.

That's the whole concept. What matters is the detail: how matching works, when TM actually saves money, when it creates more problems than it solves, and how it fits alongside machine translation.

How translation memory works

A translation memory stores pairs: a source segment and its approved translation. When new content needs translating, the TM compares each source segment against its database and returns matches.

Match types

Exact match (100%): The source segment is identical to a stored segment. The TM returns the previous translation directly. For repetitive content like software UI with recurring phrases, exact matches save real time.

Context match (101% or ICE match): The segment matches, and the surrounding segments also match. This is even more reliable than a basic exact match because the context confirms the translation is appropriate. For example, the segment "Save" may need a different translation depending on whether it appears after "Would you like to save your changes?" or "Save to favorites."

Fuzzy match (70-99%): The source segment is similar but not identical to a stored segment. The TM returns the closest match with a percentage indicating how similar it is. A 95% match might differ by one word. A 75% match might need significant editing.

No match (below threshold): The segment is too different from anything in the TM. The translator works from scratch.
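
The four match types can be sketched in a few lines of Python. This is a minimal illustration, not how any particular TMS scores matches: it assumes a character-based similarity from the standard library's difflib, a hypothetical in-memory TM stored as a list of dicts, and made-up German targets. Commercial tools use their own (often word-based) metrics, so their percentages will differ.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.70  # the 70% floor used in the list above


def similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1] (an assumption, not a TMS metric)."""
    return SequenceMatcher(None, a, b).ratio()


def lookup(tm, source, previous=None):
    """Return (match_type, score, translation) for one source segment.

    tm is a list of dicts with 'source', 'target' and optional 'previous'
    (the segment that preceded the source when it was translated).
    """
    exact = None
    best = ("none", 0.0, None)
    for entry in tm:
        if entry["source"] == source:
            # Same text and same preceding segment: context match.
            if previous is not None and entry.get("previous") == previous:
                return ("context", 1.0, entry["target"])
            exact = exact or entry  # remember the first exact match
            continue
        score = similarity(entry["source"], source)
        if score >= FUZZY_THRESHOLD and score > best[1]:
            best = ("fuzzy", score, entry["target"])
    if exact is not None:
        return ("exact", 1.0, exact["target"])
    return best


# Hypothetical TM content, for illustration only.
tm = [
    {"source": "Save", "target": "Speichern",
     "previous": "Would you like to save your changes?"},
    {"source": "Save", "target": "Merken", "previous": "Save to favorites."},
    {"source": "Click the Save button.", "target": "Klicken Sie auf Speichern."},
]
```

With this data, lookup(tm, "Save", previous="Save to favorites.") returns the context match, lookup(tm, "Save") falls back to a plain exact match, and "Click the Save icon." comes back as a fuzzy match against the stored button sentence.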

Segmentation

TM splits content into segments, usually by sentence. This is where things get tricky. The quality of segmentation directly affects match quality.

Consider these two sentences:

  • "Click the Save button." (Segment 1)
  • "Click the Save button to save your changes." (Segment 2)

Segment 2 is only a fuzzy match for Segment 1, even though it contains Segment 1 entirely. A translator still needs to review the suggestion and may end up translating the sentence from scratch. Long sentences produce fewer matches than short ones because there's more room for variation.
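
To see why segmentation is tricky, here is a deliberately naive sentence splitter. It is a sketch under one assumption: a boundary is sentence-ending punctuation followed by whitespace and an uppercase letter. Real tools use configurable rule sets with exception lists for abbreviations; the function name segment is ours.

```python
import re

# Boundary: ., ! or ? followed by whitespace and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def segment(text: str) -> list[str]:
    """Split text into sentence-like segments using one naive rule."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


good = segment("Click the Save button. Your changes are stored.")
# The same rule breaks on abbreviations: "Dr." looks like a sentence end.
bad = segment("Contact Dr. Smith today.")
```

The first call yields two clean segments. The second splits after "Dr.", producing two fragments that will never match a correctly segmented "Contact Dr. Smith today." in the TM, which is exactly how poor segmentation lowers match rates.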

Some TM systems offer sub-segment matching, breaking sentences into smaller fragments. This increases match rates but decreases reliability. A matched phrase pulled from a different context might be grammatically correct but semantically wrong.

When translation memory saves money

TM works best when your content is repetitive and updates are incremental. The economics are straightforward: stored translations cost nothing to reuse, so the more repetition in your content, the more you save.

High-TM-value scenarios

  • Software UI updates: You change 50 strings in a 5,000-string app. The other 4,950 are exact matches from TM. You only pay for translating 50 new strings.
  • Technical documentation updates: A product manual is revised annually. 80% of the text stays the same. TM covers the unchanged portions.
  • Legal and regulatory content: Standard clauses reappear across documents. Consistent translation of legal language is both a cost saver and a compliance requirement.
  • Product variants: You have 10 similar products with 70% overlapping descriptions. TM reuses the shared content and translators focus on the unique parts.
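
The arithmetic behind the software UI example can be written out as a small function. The per-string rate is a made-up figure, and match_rate_factor is an assumption we add for cases where a vendor charges a reduced rate for reused segments instead of nothing.

```python
def tm_savings(total_units, new_units, rate_per_unit, match_rate_factor=0.0):
    """Compare the cost of translating everything with the cost when TM
    covers the unchanged units.

    match_rate_factor is the fraction of the full rate charged for reused
    units (0.0 means reuse is free, as in the example above).
    Returns (cost_without_tm, cost_with_tm, fraction_saved).
    """
    reused_units = total_units - new_units
    cost_without_tm = total_units * rate_per_unit
    cost_with_tm = (new_units * rate_per_unit
                    + reused_units * rate_per_unit * match_rate_factor)
    fraction_saved = 1 - cost_with_tm / cost_without_tm
    return cost_without_tm, cost_with_tm, fraction_saved


# 50 changed strings in a 5,000-string app, at a hypothetical 1.00 per string.
full, with_tm, saved = tm_savings(5000, 50, 1.00)
```

With free reuse, the update costs 50 instead of 5,000, a 99% saving. Raising match_rate_factor shows how quickly that shrinks when reused segments are billed at a discount rather than at zero.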

Low-TM-value scenarios

  • Marketing content: Creative copy changes significantly between campaigns. Match rates are low, and fuzzy matches often need complete rewriting anyway.
  • News and editorial: Each article is unique. TM catches standard phrases but the bulk of content is new.
  • First-time translation: If you've never translated your content before, the TM is empty. The value starts with the second round of translation.
  • Small projects: If you're translating 500 strings into 2 languages once, the overhead of setting up and maintaining a TM exceeds the savings.

Translation memory vs. machine translation

TM and machine translation (MT) are different tools that solve different problems. TM retrieves past human translations. MT generates new translations algorithmically. They can work together, but they're not interchangeable.

Where TM wins

Exact matches from TM are by definition pre-approved translations. If "Save changes" was reviewed and approved last release, the TM version for the next release can be reused with confidence, as long as the context hasn't changed. Machine translation gives you a new output every time, which requires review.

For consistency, TM is unbeatable. The same segment always gets the same translation. MT might translate the same phrase differently depending on surrounding context, which is sometimes better (contextual accuracy) and sometimes worse (inconsistent terminology).

Where MT wins

For new content with no TM matches, machine translation provides an instant first draft. Modern context-aware translation engines produce output that's often 85-95% correct for software content. A reviewer cleans up the remaining issues faster than translating from scratch.

MT also handles fuzzy match situations better than TM in many cases. A 75% TM match might need so much editing that starting from MT output is faster. The crossover point depends on content type and language pair, but generally, below 85% match quality, MT is more efficient.

The combined workflow

Most professional translation workflows use both:

  1. Check TM for exact and high-fuzzy matches (above 85%)
  2. Use MT for everything else
  3. Translator reviews all output, spending less time on TM matches and more on MT output
  4. Approved translations feed back into TM for future use
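
The four steps above can be sketched as a routing function. This is an illustration under stated assumptions: the TM is a plain dict, similarity comes from difflib, and machine_translate is a placeholder that stands in for whichever MT engine you call. It is not the API of any real service.

```python
from difflib import SequenceMatcher

TM_THRESHOLD = 0.85  # step 1: only exact and high-fuzzy matches come from TM


def machine_translate(text: str) -> str:
    """Placeholder for a real MT call (hypothetical)."""
    return f"[MT] {text}"


def best_tm_match(segment, tm):
    """Return (stored_source, score) for the closest TM entry."""
    best_source, best_score = None, 0.0
    for source in tm:
        score = SequenceMatcher(None, source, segment).ratio()
        if score > best_score:
            best_source, best_score = source, score
    return best_source, best_score


def pretranslate(segments, tm, mt=machine_translate):
    """Steps 1 and 2: draft each segment from TM if possible, else from MT."""
    drafts = []
    for segment in segments:
        source, score = best_tm_match(segment, tm)
        if source is not None and score >= TM_THRESHOLD:
            drafts.append({"source": segment, "draft": tm[source], "origin": "tm"})
        else:
            drafts.append({"source": segment, "draft": mt(segment), "origin": "mt"})
    return drafts  # step 3: a human reviews every draft


def approve(tm, source, final_translation):
    """Step 4: only reviewed translations are written back to the TM."""
    tm[source] = final_translation


tm = {"Save changes": "Änderungen speichern"}
first_pass = pretranslate(["Save changes", "Export report as PDF"], tm)
approve(tm, "Export report as PDF", "Bericht als PDF exportieren")
second_pass = pretranslate(["Export report as PDF"], tm)
```

On the first pass the new string is routed to MT. Once the reviewer approves it, the second pass serves the same string from TM.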

This is the workflow that TMS platforms like Crowdin and Phrase implement. The translation engine (whether it's Google Translate, DeepL, or Langbly) provides the MT component. The TMS manages the memory.

TM file formats

If you're working with translation memory, you'll encounter these formats:

  • TMX (Translation Memory eXchange): The standard interchange format. Virtually every TMS can import and export TMX. If you switch tools, export your TM as TMX.
  • XLIFF (XML Localisation Interchange File Format): Used for both TM and in-progress translations. Common in enterprise workflows.
  • TBX (TermBase eXchange): For terminology databases (glossaries), not translation memory. Often confused with TMX but serves a different purpose.
  • Proprietary formats: Each TMS has its own internal format. SDL Trados uses SDLTM, and memoQ uses its own format. These are typically not interchangeable, but the major tools support TMX export.

Keep your TM portable. Even if you're committed to one TMS, export a TMX backup regularly. TM data is an asset. If you've invested $100,000 in human translations over five years, that TM represents $100,000 of reusable work.
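
For a sense of what a TMX backup contains, here is a small round-trip sketch using Python's standard xml.etree.ElementTree. It writes and reads only the basic tmx/header/body/tu/tuv/seg structure. The header values are placeholders, the function names are ours, and real exports also carry inline formatting tags and metadata that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree represents the xml:lang attribute with this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def export_tmx(pairs, src_lang, tgt_lang) -> str:
    """Serialize (source, target) pairs as a minimal TMX 1.4 document."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",  # placeholder values
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "example",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def import_tmx(tmx_text, src_lang, tgt_lang):
    """Read (source, target) pairs back; inline tags inside seg are ignored."""
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {tuv.get(XML_LANG): (tuv.findtext("seg") or "")
                for tuv in tu.findall("tuv")}
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


pairs = [("Save changes", "Änderungen speichern")]
tmx_text = export_tmx(pairs, "en", "de")
restored = import_tmx(tmx_text, "en", "de")
```

Because the format is plain XML, a backup like this stays readable even if the tool that produced it is gone.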

TM maintenance

Translation memory degrades over time if you don't maintain it. Common problems:

  • Outdated entries: Product terms change, features get renamed, and old translations reference things that no longer exist. Regular cleanup matters.
  • Inconsistencies: Multiple translators over time introduce different translations for the same term. Periodic alignment against your glossary fixes this.
  • Pollution: Machine-translated content that was never reviewed gets stored in TM. Future matches retrieve unreviewed, potentially incorrect translations. Only store approved translations in your production TM.
  • Size bloat: Very large TMs slow down matching. Segments from deleted products or deprecated features add noise. Archive or remove irrelevant entries.

Schedule TM maintenance quarterly. Remove entries for discontinued products, realign terminology with your current glossary, and purge any entries that were stored without human approval.
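
A quarterly cleanup pass might look like the sketch below. The TMEntry fields (approved, product) are assumptions about what metadata your TM records; most tools store something comparable, but the names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TMEntry:
    source: str
    target: str
    approved: bool  # was this translation reviewed by a human?
    product: str    # which product the segment belongs to


def clean_tm(entries, discontinued_products):
    """Split entries into (kept, removed): drop unapproved entries and
    entries that belong to discontinued products."""
    kept, removed = [], []
    for entry in entries:
        if not entry.approved or entry.product in discontinued_products:
            removed.append(entry)
        else:
            kept.append(entry)
    return kept, removed


def find_inconsistencies(entries):
    """Return sources that have more than one distinct target."""
    targets_by_source = {}
    for entry in entries:
        targets_by_source.setdefault(entry.source, set()).add(entry.target)
    return {s: t for s, t in targets_by_source.items() if len(t) > 1}


# Hypothetical entries for illustration.
entries = [
    TMEntry("Settings", "Einstellungen", True, "app"),
    TMEntry("Settings", "Optionen", True, "app"),
    TMEntry("Sync now", "Jetzt synchronisieren", False, "app"),
    TMEntry("Legacy mode", "Legacy-Modus", True, "old-app"),
]
kept, removed = clean_tm(entries, discontinued_products={"old-app"})
conflicts = find_inconsistencies(kept)
```

Here the unreviewed entry and the one from the discontinued product are removed, and "Settings" is flagged because it has two competing translations to resolve against the glossary. Archiving the removed list instead of deleting it keeps the cleanup reversible.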

Do you need translation memory?

Honest answer: maybe. TM adds value if you're translating continuously updated content across multiple releases. It adds overhead if you're doing one-off translations or working with highly creative content.

You probably need TM if:

  • You update translations regularly (monthly or more often)
  • Your content has high repetition (software UI, documentation, product catalogs)
  • Consistency matters (same terms must be translated the same way everywhere)
  • You work with professional translators who use CAT tools
  • You support 5+ languages (TM savings multiply with each language)

You probably don't need TM if:

  • You're translating one-off content (a single marketing campaign, a one-time document)
  • Your content changes completely between versions
  • You're using a good machine translation API and reviewers for a small number of languages
  • You're in the early stages of localization and don't have existing translations to build on

If you're just starting out, focus on getting a reliable translation workflow in place first. Use a translation API (our quickstart guide gets you running in minutes) for the initial pass, have reviewers approve the output, and store those approved translations. That collection of approved translations becomes your TM naturally, without requiring a separate TM tool from day one.
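
Storing approved translations does not need special tooling at the start. As a rough sketch, a single SQLite table (via Python's built-in sqlite3 module) already gives you exact-match reuse. The table and function names are ours, and the schema is an assumption rather than a recommendation.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small store of approved translations."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS approved (
               source TEXT NOT NULL,
               target_lang TEXT NOT NULL,
               target TEXT NOT NULL,
               PRIMARY KEY (source, target_lang)
           )"""
    )
    return conn


def save_approved(conn, source, target_lang, target):
    """Record a reviewed translation, replacing any earlier version."""
    conn.execute(
        "INSERT OR REPLACE INTO approved VALUES (?, ?, ?)",
        (source, target_lang, target),
    )
    conn.commit()


def find_exact(conn, source, target_lang):
    """Return the approved translation for an identical source, or None."""
    row = conn.execute(
        "SELECT target FROM approved WHERE source = ? AND target_lang = ?",
        (source, target_lang),
    ).fetchone()
    return row[0] if row else None


conn = open_store()
save_approved(conn, "Save changes", "de", "Änderungen speichern")
hit = find_exact(conn, "Save changes", "de")
miss = find_exact(conn, "Save changes", "fr")
```

When the collection grows large enough to justify fuzzy matching or a TMS, the rows can be exported to TMX and imported there.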

The future of translation memory

Commercial TM tools arrived in the early 1990s, building on ideas proposed around 1980, at a time when machine translation quality was poor and human translation was the only reliable option. The concept made sense: storing expensive human work for reuse was economically obvious.

Now that machine translation quality approaches human level for many content types, the economics are shifting. If a translation API costs $1.99-$3.80 per million characters and the output needs minimal editing, the savings from TM are smaller than they were when the only alternative was human translation, which is typically billed per word and costs far more for the same volume.

TM still matters for consistency and for specialized domains where machine translation struggles. But for many teams, the combination of a good translation engine and a quality review process delivers results faster than maintaining a TM system.

Context-aware translation without the TM overhead

Langbly produces consistent translations using contextual understanding rather than segment matching. Try it with 500K free characters.