How much do LLMs learn from negative examples?

Published in arXiv, 2025

Recommended citation: D. Yuret, S. Hamdan, "How much do LLMs learn from negative examples?," arXiv preprint, 2025. https://arxiv.org/abs/2503.14391

This work investigates how large language models learn from near-miss negative examples: incorrect responses that differ only subtly from correct ones.

We find that fine-tuning on negative examples alone improves the model’s task performance substantially more than fine-tuning on positive examples alone (standard SFT). Combining both yields only a slight further improvement, indicating that negative examples carry most of the useful training signal.
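The abstract does not specify the training objective, but a common way to learn from negative examples is an unlikelihood-style loss that pushes probability mass away from the tokens of an incorrect response. The sketch below is a minimal, framework-free illustration of that idea; the function names, the `alpha` weighting, and the choice of unlikelihood loss are assumptions for illustration, not the paper's actual method.

```python
import math

def nll_loss(probs):
    """Standard SFT signal: negative log-likelihood of a positive example,
    given the model's per-token probabilities for the target tokens."""
    return -sum(math.log(p) for p in probs) / len(probs)

def unlikelihood_loss(probs):
    """Negative-example signal: minimize -log(1 - p) for each token of an
    incorrect response, driving its probability toward zero."""
    return -sum(math.log(1.0 - p) for p in probs) / len(probs)

def combined_loss(pos_probs, neg_probs, alpha=1.0):
    """Hypothetical mixture of both signals; alpha trades off the two."""
    return nll_loss(pos_probs) + alpha * unlikelihood_loss(neg_probs)
```

Both losses are zero in the ideal case (probability 1 on positive tokens, probability 0 on negative tokens) and grow as the model's predictions move away from that ideal, so gradient descent on the combined objective uses both kinds of examples at once.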

Access

Access paper on arXiv