Abstract
Diffusion models benefit from the instillation of task-specific information into the score function to steer sample generation towards desired properties. Such information is referred to as guidance. For example, in text-to-image synthesis, the text input is encoded as guidance to generate semantically aligned images. Proper guidance inputs are closely tied to the performance of diffusion models. A common observation is that strong guidance promotes tight alignment with the task-specific information while reducing the diversity of the generated samples. In this paper, we provide the first theoretical study of the influence of guidance on diffusion models in the context of Gaussian mixture models. Under mild conditions, we prove that incorporating diffusion guidance not only boosts classification confidence but also diminishes distribution diversity, leading to a reduction in the differential entropy of the output distribution. Our analysis covers widely adopted sampling schemes, including those based on the SDE and ODE reverse processes, and leverages comparison inequalities for differential equations as well as the Fokker-Planck equation, which characterizes the evolution of the probability density function; these techniques may be of independent theoretical interest.
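To make the confidence/diversity trade-off concrete, below is a minimal, self-contained sketch (not the paper's code) of classifier-guided reverse-SDE sampling for a symmetric two-component Gaussian mixture, where the guided score ∇ log p_t(x) + γ ∇ log p_t(y = +1 | x) is available in closed form. The forward process, the guidance strength γ, and all names (`guided_score`, `marginal_params`, `mu`, `gamma`) are illustrative assumptions, not the paper's notation.

```python
# Illustrative simulation: classifier-guided reverse-SDE sampling for a 1-D
# two-component Gaussian mixture p(x | y = ±1) = N(±mu, sigma^2), equal priors.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 1.0                   # mixture mean and within-component std
T, n_steps, n_samples = 5.0, 500, 20000
dt = T / n_steps

def marginal_params(t):
    """Mean scale and variance of p_t(x | y) under the OU forward process
    dX = -X dt + sqrt(2) dW, for which alpha_t = exp(-t)."""
    a = np.exp(-t)
    v = a**2 * sigma**2 + (1.0 - a**2)
    return a, v

def guided_score(x, t, gamma):
    """Closed-form guided score for the symmetric mixture:
    grad log p_t(x) + gamma * grad log p_t(y = +1 | x)."""
    a, v = marginal_params(t)
    m = a * mu
    base = (-x + m * np.tanh(m * x / v)) / v          # mixture score
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * m * x / v))   # p_t(y = +1 | x)
    guide = (1.0 - p_plus) * 2.0 * m / v              # grad log p_t(y = +1 | x)
    return base + gamma * guide

def sample(gamma):
    """Euler-Maruyama discretization of the reverse SDE
    dX = [X + 2 * guided_score] dtau + sqrt(2) dW, started from p_T ~ N(0, 1)."""
    x = rng.standard_normal(n_samples)
    for k in range(n_steps):
        t = T - k * dt                                # reverse time
        drift = x + 2.0 * guided_score(x, t, gamma)
        x = x + drift * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n_samples)
    return x

for gamma in [0.0, 1.0, 4.0]:
    x0 = sample(gamma)
    # Average posterior confidence p(y = +1 | x0) under the true mixture at t = 0.
    conf = np.mean(1.0 / (1.0 + np.exp(-2.0 * mu * x0 / sigma**2)))
    # Gaussian surrogate for differential entropy: 0.5 * log(2*pi*e*Var).
    ent = 0.5 * np.log(2 * np.pi * np.e * np.var(x0))
    print(f"gamma={gamma:.1f}  mean confidence={conf:.3f}  entropy proxy={ent:.3f}")
```

As `gamma` increases, the printed confidence should approach 1 while the entropy proxy decreases, mirroring the trade-off described in the abstract: stronger guidance concentrates samples on the conditioned component at the cost of diversity.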
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 53291-53327 |
| Number of pages | 37 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 235 |
| State | Published - 2024 |
| Event | 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria |
| Duration | Jul 21 2024 → Jul 27 2024 |
Funding
Mengdi Wang is partially supported by NSF IIS-2107304, NSF CPS-2312093, ONR 1006977, and Genmab. Yuting Wei is supported in part by the NSF grants DMS-2147546/2015447, CAREER award DMS-2143215, CCF-2106778, and the Google Research Scholar Award.
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability