Catalyst Project: Modeling Count Data with the Conway-Maxwell-Poisson Distribution
North Carolina Central University, Durham NC
Investigators
Abstract
Catalyst Projects support Historically Black Colleges and Universities in building faculty research capacity to strengthen undergraduate education and research in science, technology, engineering, and mathematics. The award is expected to further the faculty member's research capability, improve research and teaching at the institution, and involve undergraduate students in research experiences. This project at North Carolina Central University will investigate mathematical modeling with probability distributions and will give undergraduate students an opportunity to enhance their statistical knowledge through research experiences. The researcher has established a strong collaboration with faculty at Georgetown University.

Count data arise in a variety of situations and are often modeled with a Poisson distribution. In many cases, however, the data do not satisfy the Poisson distribution's property of equal mean and variance. This project will examine the two-parameter Conway-Maxwell-Poisson (CMP) distribution, which accounts for both over-dispersion, where the variance is greater than the mean, and under-dispersion, where the variance is less than the mean. The CMP distribution captures the Poisson, Bernoulli, and geometric distributions as special cases. The first goal of this project is to conduct a simulation study that compares properties of the CMP distribution with those of mixed Poisson distributions, such as the Poisson-lognormal, Poisson-Lindley, and Poisson-inverse Gaussian, that have traditionally been used to model over-dispersed data. The second goal is to derive and develop inferential methods for a bivariate CMP distribution. To demonstrate the flexibility of the CMP distribution and its extensions, applications to real-world data will be considered. This research will expand the class of probability distributions available for modeling count data.
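The dispersion behavior described above can be illustrated with a minimal sketch. It assumes the standard CMP parameterization, in which P(X = x) is proportional to lam^x / (x!)^nu for x = 0, 1, 2, ..., with rate lam > 0 and dispersion parameter nu >= 0. The function names and the truncation length `terms` are hypothetical choices made for illustration only; they are not taken from the project.

```python
import math


def cmp_probs(lam, nu, terms=200):
    """Approximate CMP(lam, nu) probabilities for x = 0 .. terms - 1.

    Assumption: the infinite normalizing series is truncated at `terms`,
    which is adequate only when the tail mass beyond that point is negligible
    (for nu = 0 this requires lam < 1).
    """
    # Log of the unnormalized weight: x * log(lam) - nu * log(x!)
    log_w = [x * math.log(lam) - nu * math.lgamma(x + 1) for x in range(terms)]
    # Log-sum-exp for a numerically stable normalizing constant
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w))
    return [math.exp(v - log_z) for v in log_w]


def cmp_mean_var(lam, nu, terms=200):
    """Mean and variance computed from the truncated probabilities."""
    probs = cmp_probs(lam, nu, terms)
    mean = sum(x * p for x, p in enumerate(probs))
    second_moment = sum(x * x * p for x, p in enumerate(probs))
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for label, lam, nu in [
        ("nu = 1 (Poisson case)", 2.0, 1.0),
        ("nu = 0 (geometric case)", 0.5, 0.0),
        ("nu = 2 (under-dispersed)", 2.0, 2.0),
    ]:
        mean, var = cmp_mean_var(lam, nu)
        print(f"{label}: mean = {mean:.4f}, variance = {var:.4f}")
```

Under this parameterization, nu = 1 reproduces the Poisson distribution with equal mean and variance, nu = 0 with lam < 1 gives a geometric distribution whose variance exceeds its mean, and nu > 1 gives a variance below the mean. The Bernoulli case arises in the limit as nu grows large, so it is not shown with a finite nu here.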