A historical overview of multiple imputation as a method for dealing with missing data, with reflections from experience with its application in a large cohort study

The method of multiple imputation was first proposed by Donald Rubin in the late 1970s as an approach for dealing with nonresponse in large surveys.

He was particularly motivated by the requirement for U.S. federal agencies such as the Census Bureau to make national survey data available for public use. Use of the method was restricted to a small number of large-scale applications until software tools were developed in the mid to late 1990s. Since then there has been something of an explosion of applications, generally in a very different vein to those originally envisaged by Rubin.

I started to use the method in the early 2000s to solve problems in the analysis of a large repeated-measures cohort study of adolescent health.

This talk will review the history, underlying theory, and current state of play with respect to practical application of this powerful method, illustrating some of the issues with a case study from the Victorian Adolescent Health Cohort Study.

Professor John Carlin

John holds appointments with the Murdoch Children’s Research Institute and The University of Melbourne.

Since completing a PhD in Statistics at Harvard University he has been engaged as a collaborator in a wide range of medical and public health research, including clinical trials and large-scale epidemiological studies.

His biostatistical research interests have focussed recently on methods for dealing with missing data using multiple imputation.

