Estimating the species, phylogenetic, and functional diversity of a community is challenging because rare species are often undetected, even with intensive sampling. The Good-Turing frequency formula, originally developed for cryptography, estimates in an ecological context the true frequencies of rare species in a single assemblage based on an incomplete sample of individuals. Until now, this formula has never been used to estimate undetected species, phylogenetic, and functional diversity. Here, we first generalize the Good-Turing formula to incomplete sampling of two assemblages. The original formula and its two-assemblage generalization provide a novel and unified approach to notation, terminology, and estimation of undetected biological diversity. For species richness, the Good-Turing framework offers an intuitive way to derive the non-parametric estimators of the undetected species richness in a single assemblage, and of the undetected species shared between two assemblages. For phylogenetic diversity, the unified approach leads to an estimator of the undetected Faith's phylogenetic diversity (PD, the total length of undetected branches of a phylogenetic tree connecting all species), as well as a new estimator of undetected PD shared between two phylogenetic trees. For functional diversity based on species traits, the unified approach yields a new estimator of undetected Walker et al.'s functional attribute diversity (FAD, the total species-pairwise functional distance) in a single assemblage, as well as a new estimator of undetected FAD shared between two assemblages. Although some of the resulting estimators have been previously published (but derived with traditional mathematical inequalities), all taxonomic, phylogenetic, and functional diversity estimators are now derived under the same framework. All the derived estimators are theoretically lower bounds of the corresponding undetected diversities; our approach reveals the sufficient conditions under which the estimators are nearly unbiased, thus offering new insights. Simulation results are reported to numerically verify the performance of the derived estimators. We illustrate all estimators and assess their sampling uncertainty with an empirical dataset for Brazilian rain forest trees. These estimators should be widely applicable to many current problems in ecology, such as the effects of climate change on spatial and temporal beta diversity and the contribution of trait diversity to ecosystem multi-functionality.
© 2017 by the Ecological Society of America.
Chao A, Chiu CH, Colwell RK, Magnago LF, Chazdon RL, Gotelli NJ. Deciphering the enigma of undetected species, phylogenetic, and functional diversity based on Good‐Turing theory. Ecology. 2017 Nov;98(11):2914-29.