A penalized nonparametric maximum likelihood approach to species richness estimation

Ji Ping Z. Wang*, Bruce G. Lindsay

*Corresponding author for this work

Research output: Contribution to journal › Review article

57 Scopus citations

Abstract

We propose a class of penalized nonparametric maximum likelihood estimators (NPMLEs) for the species richness problem. We penalize the likelihood because unpenalized likelihood estimators are extremely unstable. The estimators are constructed using a conditional likelihood that is simpler than the full likelihood. We show that the full-likelihood NPMLE solution given by Norris and Pollock can be found (with great accuracy) by using an appropriate penalty term on the conditional likelihood, so it is an element of our class of estimators. A simple and fast algorithm for the penalized NPMLE is developed; it can be used to greatly speed up computation of the unconditional NPMLE. It can also be used to find profile mixture likelihoods. Based on our goal of attaining high stability while retaining sensitivity, we propose an adaptive quadratic penalty function. A systematic simulation study, using a wide range of scenarios, establishes the success of this method relative to its competitors. Finally, we discuss an application to gene number estimation using expressed sequence tag (EST) data from genomics.
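
For orientation, the quantities named in the abstract can be sketched in notation as follows. This is a minimal sketch under an assumed Poisson mixture model, which is the usual setting for this problem but is not spelled out in the abstract. The symbols f_Q, n_x, D, \gamma and p are introduced here for illustration only; in particular, p is a generic placeholder and not the specific adaptive quadratic penalty the authors propose.

```latex
% Assumed setup (not stated in the abstract): species i is seen X_i times,
% X_i | \lambda_i ~ Poisson(\lambda_i), with \lambda_i drawn i.i.d. from Q.
f_Q(x) = \int \frac{e^{-\lambda}\lambda^{x}}{x!}\, dQ(\lambda), \qquad x = 0, 1, 2, \dots

% Conditional log-likelihood: n_x is the number of species seen exactly x times,
% and D = \sum_{x \ge 1} n_x is the number of observed species.
\ell_c(Q) = \sum_{x \ge 1} n_x \log \frac{f_Q(x)}{1 - f_Q(0)}

% Generic penalized criterion; the penalty p and tuning constant \gamma are placeholders.
\ell_p(Q) = \ell_c(Q) - \gamma\, p(Q), \qquad \gamma \ge 0

% Plug-in richness estimate from a fitted \hat{Q}.
\hat{N} = \frac{D}{1 - f_{\hat{Q}}(0)}
```

Under these assumptions, the richness estimate grows without bound as the fitted probability of a zero count approaches 1, which is one way to read the instability that the penalty term is meant to control.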

Original language: English (US)
Pages (from-to): 942-959
Number of pages: 18
Journal: Journal of the American Statistical Association
Volume: 100
Issue number: 471
State: Published - Sep 1 2005


Keywords

  • Mixture model
  • NPMLE computing
  • Number of classes
  • Penalized NPMLE
  • Species richness

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
