CMU-ML-19-114
Machine Learning Department
School of Computer Science, Carnegie Mellon University



CMU-ML-19-114

Estimating Probability Distributions and their Properties

Shashank Singh

August 2019

Ph.D. Thesis

CMU-ML-19-114.pdf


Keywords: Nonparametric Statistics, Density Estimation, Density Functional Estimation, Entropy Estimation, Information Estimation, Divergence Estimation, Smoothness Estimation, Kozachenko-Leonenko Estimator, Wasser-stein Distance, Integral Probability Metric, IPM, Holder Space, Sobolev Space,Besov Space, Nonparanormal Model, Gaussian Copula Model, Generative Adversarial Network, Implicit Generative Model, Explicit Generative Model, Statistical Minimax Theory


This thesis studies several theoretical problems in nonparametric statistics andmachine learning, mostly in the areas of nonparametric density functional estimation (estimating an integral functional of the population distribution from which the data are drawn) and nonparametric density estimation (estimating the entire population distribution from which the data are drawn). A consistent theme is that, although nonparametric density estimation is traditionally thought to be intractable in high-dimensions, several equally (or more) useful tasks are relatively more tractable, even with similar or weaker assumptions on the distribution.

Our work on density functional estimation focuses on several types of integral functionals, such as information theoretic quantities (entropies, mutual informations, and divergences), measures of smoothness, and measures of (dis)similarity between distributions, which play important roles as subroutines elsewhere in statistics, machine learning, and signal processing. For each of these quantities, under a variety of nonparametric models, we provide some combination of (a) new estimators, (b) upper bounds on convergence rates of these new estimators, (c) new upper bounds on the convergence rates of established estimators, (d) concentration bounds or asymptotic distributions for estimators, or (e) lower bounds on the minimax risk of estimation. We briefly discuss some applications of these density functional estimators to hypothesis testing problems such as two-sample (homogeneity) or (conditional) independence testing.

For density estimation, whereas the majority of prior work has focused on estimation under 𝔏 2 or other 𝔏 p losses, we consider minimax convergence rates under several new losses, including the whole spectrum of Wasserstein distances and a large class of metrics called integral probability metrics (IPMs) that includes, for example, 𝔏 p, total variation, Kolmogorov-Smirnov, earth-mover, Sobolev, Besov, and some RKHS distances. These losses open several new possibilities for nonparametric density estimation in certain cases; some examples include

  • convergence rates with no or reduced dependence on dimension
  • density-free distribution estimation, for data lying in general (e.g., non-Euclidean) metric spaces, or for data whose distribution may not be absolutely continuous with respect to Lebesgue measure
  • convergence rates depending only on intrinsic dimension of data
Our main results here are the derivation of minimax convergence rates. However, we also briefly discuss several consequences of our results. For example, we show that IPMs have close connections with generative adversarial networks (GANs), and we leverage our results to prove the first finite-sample guarantees for GANs, in an idealized model of GANs as density estimators. These results may help explain why these tools appear to perform well at problems that are intractable from traditional perspectives of nonparametric statistics. We also briefly discuss consequences for estimation of certain density functionals, Monte Carlo integration of smooth functions, and distributionally robust optimization.

193 pages

Thesis Committee:
Barnabás Póczos (Chair)
Ryan Tibshirani
Larry Wasserman
Bharath Sriperumbudur (Pennsylvania State University)

Roni Rosenfeld, Head, Machine Learning Department
Martial Hebert, Dean, School of Computer Science


SCS Technical Report Collection
School of Computer Science