04/06/2026 – Understanding diffusion models requires rethinking (again) generalization (by Yu-Han Wu)

Date: June 4, 2026, 2:00 pm

Location: Amphi Copernic

Speaker: Yu-Han Wu, https://pojoowu.github.io/

Abstract: We argue that understanding generalization in diffusion models requires fundamentally new theoretical frameworks that go beyond both classical statistical learning theory and the benign overfitting paradigm developed for supervised learning. In diffusion models, unlike in supervised learning, memorization of training data and generalization to novel samples are incompatible: a model that has fully memorized its training set generates copies rather than novel data. Several theoretical explanations for why practical diffusion models nevertheless generalize have been proposed, based on capacity limitations, implicit regularization from optimization, or architectural inductive biases, but their interactions remain unclear. We argue that the field should pivot from explaining why diffusion models do not memorize to investigating what they actually learn during the pre-memorization phase.

https://arxiv.org/abs/2605.06077