Optimization over distributions is a class of infinite-dimensional optimization problems in which the optimization variable is a probability distribution. Many problems fall into this category, including Bayesian inference, where the distribution describes the belief, and distributionally robust machine learning, where the distribution captures the uncertainty in the underlying data. Moreover, any non-convex optimization problem in Euclidean space can be reformulated as a convex optimization problem over distributions. These instances have so far been studied intensively but separately, and little effort has been made to treat them in a unified framework. The goal of this research project is to systematically investigate optimization problems over distributions in a unified framework and to develop scalable algorithms suitable for large-scale applications in data science. The proposed framework is based on optimal transport theory, which endows the space of distributions with a natural geometry. The proposed algorithm follows the gradient flow of the objective with respect to this geometry. To achieve scalability, the optimization variable is approximated by a collection of particles; the algorithm essentially describes the collective dynamics of these particles. A novel variational approach will be developed to approximate the gradient-descent direction. In this project, the theoretical properties of this algorithm, including convergence rates and statistical properties, will be examined thoroughly. Case studies will be carried out by specializing the unified framework to particular settings such as Bayesian inference. Finally, extensions of the algorithm, such as acceleration and variance reduction, will also be investigated.
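To illustrate the particle approximation described above, the sketch below (a minimal example for intuition, not the project's actual algorithm) simulates the Wasserstein gradient flow of the linear functional F(μ) = E_μ[V] by moving each particle along −∇V with forward-Euler steps; the quadratic potential V and all function names here are assumptions.

```python
import numpy as np

def grad_V(x):
    # Assumed potential V(x) = 0.5 * ||x - 1||^2, so grad V(x) = x - 1.
    return x - 1.0

def particle_gradient_flow(particles, step=0.1, n_steps=200):
    """Approximate the Wasserstein gradient flow of F(mu) = E_mu[V]
    by transporting each particle along the velocity field -grad V."""
    for _ in range(n_steps):
        particles = particles - step * grad_V(particles)
    return particles

rng = np.random.default_rng(0)
x0 = rng.normal(size=(100, 2))        # 100 particles in R^2
x_final = particle_gradient_flow(x0)  # particles concentrate near the minimizer of V
```

For nonlinear functionals such as the KL divergence arising in Bayesian inference, the velocity field depends on the current distribution itself, which is where the variational approximation mentioned in the abstract would come in.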
Effective start/end date: 10/1/20 → 9/30/23
- National Science Foundation (CCF-2008827)