Mitigating Memorization in Language Models: The Goldfish Loss Approach

  • Researchers introduce the “goldfish loss” technique to mitigate memorization in large language models (LLMs).
  • This method excludes random tokens from loss computation during training, preventing exact data reproduction.
  • Benefits include reduced privacy risks from PII exposure and compliance with copyright restrictions.
  • Goldfish-trained models memorize far less with minimal impact on performance, though they may require slightly longer training.
  • Consistent, hash-based token masking drops the same tokens each time a passage recurs, preventing leakage of entire data passages.
  • Market implications involve improved data security and compliance, supporting responsible AI deployment.

Main AI News:

Researchers from the University of Maryland, the ELLIS Institute Tübingen, and the Max Planck Institute for Intelligent Systems have devised a novel approach to mitigate the memorization tendencies of large language models (LLMs). Known as the “goldfish loss” technique, this method excludes a random subset of tokens from the loss computation during model training. Because the model is never trained to predict the excluded tokens, it is far less able to memorize and reproduce exact sequences from its training data. This is particularly important in commercial settings where privacy and copyright risks abound, such as the inadvertent reuse of verbatim code snippets or the exposure of personally identifiable information (PII).
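To make the idea concrete, here is a minimal sketch of a next-token loss that skips a random subset of tokens. It is an illustration under stated assumptions, not the researchers' implementation: the function names `goldfish_mask` and `masked_next_token_loss`, the drop rate of 1 in `k`, and the use of precomputed per-token log-probabilities are all hypothetical choices made for this example.

```python
import math
import random


def goldfish_mask(num_tokens, k=4, seed=0):
    """Return one boolean per token; False means the token is left out of the loss.

    Assumption for this sketch: each token is dropped independently
    with probability 1/k.
    """
    rng = random.Random(seed)
    return [rng.random() >= 1.0 / k for _ in range(num_tokens)]


def masked_next_token_loss(token_log_probs, mask):
    """Average negative log-likelihood over the kept tokens only.

    token_log_probs[i] is the log-probability the model assigned to the
    true token at position i. Dropped positions contribute nothing, so
    the model receives no training signal for those tokens.
    """
    kept = [-lp for lp, keep in zip(token_log_probs, mask) if keep]
    if not kept:
        return 0.0
    return sum(kept) / len(kept)


# Usage: three tokens, with the middle one excluded from the loss.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.125)]
loss = masked_next_token_loss(log_probs, [True, False, True])
```

In a real training loop the same mask would typically be applied to the per-token cross-entropy before averaging; the forward pass still sees every token, and only the loss ignores the dropped ones.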

Extensive experiments using large Llama-2 models have demonstrated that the goldfish loss technique effectively reduces memorization without significantly compromising model performance. While models trained with this approach may require slightly longer training times, they exhibit resistance to verbatim reproduction and are less vulnerable to data extraction attacks. This method represents a proactive step towards addressing memorization issues directly during the initial training phase, rather than relying solely on post-training adjustments like “unlearning” or model editing.

Consistent token masking further enhances the efficacy of the goldfish loss technique. Rather than choosing masked tokens independently at random on every pass, the method decides whether to mask a token by hashing a localized context of preceding tokens, so a passage that recurs in the training data is masked in the same positions each time. This keeps the model from piecing together an entire passage across repeated exposures while still letting it learn essential language patterns. The approach is particularly beneficial for web documents whose content is duplicated with variations in attribution, headers, or other surrounding context.
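The hashed variant can be sketched as follows. This is a hedged illustration rather than the published method: the function name `hashed_goldfish_mask`, the choice of SHA-256, the default `context_width` of 4, and the rule "drop when the hash falls in bucket 0 of `k`" are all assumptions made for this example.

```python
import hashlib


def hashed_goldfish_mask(token_ids, k=4, context_width=4):
    """Return one boolean per token; False means the token is left out of the loss.

    The decision for position i depends only on the `context_width`
    token ids immediately before it, so identical local context always
    yields the same decision, whatever document it appears in.
    """
    mask = []
    for i in range(len(token_ids)):
        if i < context_width:
            # Not enough preceding context yet: keep the token (assumed policy).
            mask.append(True)
            continue
        context = token_ids[i - context_width:i]
        key = ",".join(str(t) for t in context).encode("utf-8")
        digest = hashlib.sha256(key).digest()
        bucket = int.from_bytes(digest[:8], "big") % k
        mask.append(bucket != 0)  # roughly one position in k is dropped
    return mask


# Usage: the same passage embedded under two different headers.
passage = list(range(100, 140))
doc_a = [1, 2, 3] + passage
doc_b = [9, 8, 7, 6, 5] + passage
mask_a = hashed_goldfish_mask(doc_a)
mask_b = hashed_goldfish_mask(doc_b)
```

Once the hashing window lies entirely inside the shared passage, both documents drop exactly the same positions, which is the property that makes the masking consistent across near-duplicate copies.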

Overall, the goldfish loss technique offers a promising solution to the challenge of memorization in LLMs, ensuring robust performance across various training conditions while mitigating the risks associated with unauthorized data reproduction. This method not only safeguards privacy and intellectual property but also supports the responsible development and deployment of AI technologies in commercial and sensitive data environments.


The introduction of the “goldfish loss” technique represents a significant advancement in addressing memorization risks in language models. This approach not only enhances data privacy and copyright compliance but also strengthens trust in AI technologies within commercial applications. By mitigating the risks of unauthorized data reproduction, businesses can confidently leverage advanced AI capabilities while safeguarding sensitive information and intellectual property.