Smoothing Grouped Bivariate Data to Obtain the Incubation Period Distribution of AIDS

2011 
STATISTICS IN MEDICINE, VOL. 13, 969-981 (1994)

SMOOTHING GROUPED BIVARIATE DATA TO OBTAIN THE INCUBATION PERIOD DISTRIBUTION OF AIDS

JEREMY M. G. TAYLOR AND YUN CHON
Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA 90024, U.S.A.

SUMMARY

We use a penalized likelihood approach to obtain a smooth estimate of a bivariate distribution from grouped data where each observation consists of a region in a plane. The purpose of the analysis is to estimate the incubation period distribution of AIDS from the Multicenter AIDS Cohort Study, a prevalent cohort of homosexual men. In this article we illustrate the usefulness of the penalized likelihood approach. We also discuss the use of a cross-validation and a Bayesian scheme to choose the smoothing parameters and bootstrap samples to assess uncertainty.

INTRODUCTION

This paper describes an analysis of a specific AIDS-related data set, and follows a similar analysis of an older published version of this data set.¹ We develop in more depth the statistical issues in the analysis. The scientific problem is estimation of the incubation period distribution of AIDS from a cohort study of homosexual men recruited in Los Angeles in 1984-5. The incubation period is the time interval from infection with the AIDS virus (HIV) to the onset of clinical symptoms (AIDS). Its distribution is important both as a summary of the natural history of the disease and for its utility in predicting the future course of the epidemic.

It has been shown¹ that to estimate the AIDS incubation period with data from a cohort study, one must model jointly both the incubation period and the date of HIV infection. Because of the nature of the study, however, for most subjects the exact values of these two variables are unknown but there is some information concerning their possible values.
In statistical terms, the problem is that of estimating the joint bivariate distribution of two random variables when the observed data are grouped, that is, each observation consists of a region in the plane. In the estimation scheme, we make minimal assumptions concerning the bivariate distribution and use a penalized likelihood approach to obtain smooth marginal distributions. The methods used also incorporate truncation in the sampling scheme, and we discuss how we can introduce covariates that influence the joint distribution. Previous work in this area using related methodology to estimate the incubation period distribution of AIDS has been performed by others, using both parametric models and semi-parametric and non-parametric approaches.

STATISTICAL DESCRIPTION OF THE PROBLEM

Because the methodology applies to situations other than AIDS, we describe it first in general terms. There is a sample of n subjects; the observation on subject i consists of a known region B_i in

CCC 0277-6715/94/090969-13 © 1994 by John Wiley & Sons, Ltd. Received August 1992. Revised June 1993.
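The grouped-data setup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the plane is discretized into a J x K grid of cells, that each region B_i is given as a boolean mask over that grid, and that roughness is penalized through squared second differences of the two marginals. The function name `penalized_loglik`, the softmax parametrization `theta`, and the penalty form are all hypothetical choices made for illustration, and truncation and covariates are omitted.

```python
import numpy as np

def penalized_loglik(theta, regions, lam_x=1.0, lam_y=1.0):
    """Penalized log-likelihood for grouped bivariate data on a grid.

    theta        : (J, K) array of unnormalized log cell masses (assumed parametrization)
    regions      : list of (J, K) boolean masks; regions[i] marks the cells in B_i
    lam_x, lam_y : smoothing parameters for the two marginal distributions
    """
    # Softmax: convert theta into cell probabilities that sum to one.
    p = np.exp(theta - theta.max())
    p /= p.sum()

    # Each subject contributes the log of the total probability mass in its region.
    loglik = sum(np.log(p[mask].sum()) for mask in regions)

    # Assumed roughness penalty: squared second differences of each marginal.
    px = p.sum(axis=1)
    py = p.sum(axis=0)
    penalty = (lam_x * np.sum(np.diff(px, n=2) ** 2)
               + lam_y * np.sum(np.diff(py, n=2) ** 2))
    return loglik - penalty

# Usage: a uniform 4 x 4 grid and one subject whose region covers 4 of the 16 cells.
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
value = penalized_loglik(np.zeros((4, 4)), [mask])
```

With a uniform grid both marginals are flat, so the penalty vanishes and `value` is just the log of the mass in the region. In practice one would maximize this function over `theta` for fixed smoothing parameters, which the summary says are chosen by cross-validation or a Bayesian scheme.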