15 Feb 2021 01:30pm to 19 Feb 2021 05:00pm

Summer School 2021 (Week 1): Multiple Imputation

Cattram completed a PhD in biostatistics at the University of Melbourne in 2014. Her statistical areas of interest include the design and analysis of clinical trials and methods for missing...
Margarita completed a PhD in Biostatistics at Université Paris-Sud in 2014. She is currently an ARC DECRA Fellow and conducts methodological research in the areas of causal inference, missing data...
Katherine (Kate) is a biostatistician at the Murdoch Children’s Research Institute. She also holds an honorary appointment at the University of Melbourne and is the Associate Director (Biostatistics) of the...
Julie is Head of the Biostatistics Unit at the Melbourne School of Population and Global Health, The University of Melbourne and Director of the Melbourne Clinical and Translational Sciences platform...
Prof John Carlin
John holds appointments with the Murdoch Children’s Research Institute and The University of Melbourne. Since completing a PhD in Statistics at Harvard University he has been engaged as a collaborator...
Ghazaleh Dashti
Ghazaleh completed her PhD in Epidemiology at the University of Melbourne in 2020. Her current research focuses on methods for handling missing data in the context of causal inference methods...
Rheanna completed a PhD in statistics at La Trobe University in 2017, looking at the effect of preliminary data-based model selection on confidence intervals. She is currently researching multiple imputation...

Multiple imputation has become a de facto standard for handling missing data in epidemiological and clinical research. With a combination of lectures and computer practicals (Stata and R), this workshop will cover advanced topics in multiple imputation that are critical in modern research studies.

Please join us for this series of online, half-day workshops. For further information, please contact vicbiostat@mcri.edu.au or the series convenor, Dr Cattram Nguyen (cattram.nguyen@mcri.edu.au).




Introduction to multiple imputation for missing data (15 & 16 February)

This course introduces multiple imputation and the practical issues faced by researchers wishing to apply it. In particular, it focuses on understanding when multiple imputation is likely to produce substantial gains over a standard complete-case analysis, and on the decisions involved in developing an imputation model once multiple imputation has been judged appropriate.

We provide a detailed introduction, with practical computing exercises on how to perform analyses using multiple imputation in Stata and R. The application of multiple imputation is illustrated with two case studies, in which the decisions required for implementation of the method are examined, highlighting the potential benefits as well as limitations of multiple imputation.
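
As a rough illustration of what "performing an analysis using multiple imputation" involves at the final step, the sketch below pools results from several imputed datasets with Rubin's rules. It is written in Python only for compactness (the workshop itself uses Stata and R); the function name `pool_rubin` and all numbers are made up for illustration and are not taken from the course material.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations, with one variance per estimate")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, t

# Hypothetical numbers: a regression coefficient estimated in each of
# m = 5 imputed datasets, together with its squared standard error.
estimates = [0.52, 0.48, 0.55, 0.50, 0.45]
variances = [0.010, 0.012, 0.011, 0.009, 0.010]

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {total_var ** 0.5:.3f}")
```

The between-imputation term is what separates this from simply averaging the standard errors: it carries the extra uncertainty due to the missing values.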


Sensitivity analyses to departures from the ‘missing at random’ assumption (17 February)

Standard implementations of multiple imputation can be expected to give unbiased results only under the so-called “missing at random” (MAR) assumption. Roughly, this means that the chance of a value being missing does not depend on the value itself, given the other observed data. It is therefore important to assess the plausibility of this assumption and, because it cannot be tested from the observed data, to perform sensitivity analyses that consider scenarios where MAR does not hold (“missing not at random”, or MNAR, scenarios). This workshop discusses approaches to examining the plausibility of the MAR assumption, and describes an extended multiple imputation strategy that can be used to conduct such sensitivity analyses.
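
The small simulation below may help make the MAR idea concrete. It is a sketch under stated assumptions, not the workshop's method: the data are simulated, the missing values are filled with a single regression prediction (no random draws and no pooling, so this is not multiple imputation proper), and the `delta` shift is one commonly used way of encoding an MNAR departure. Whether it matches the "extended strategy" taught in the workshop is an assumption. All names (`adjusted_mean`, `delta`) are hypothetical.

```python
import random

random.seed(2021)
n = 5000

# Simulated cohort: x is always observed, y is sometimes missing.
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 + xi + random.gauss(0, 1) for xi in x]

# MAR mechanism: the chance that y is missing depends only on the observed x.
missing = [random.random() < (0.8 if xi > 0 else 0.1) for xi in x]

obs = [(xi, yi) for xi, yi, mi in zip(x, y, missing) if not mi]
x_obs = [p[0] for p in obs]
y_obs = [p[1] for p in obs]

# Complete-case mean of y: records with high x (and so high y) are under-represented.
cc_mean = sum(y_obs) / len(y_obs)

# Least-squares fit of y on x among the complete cases.
mx = sum(x_obs) / len(x_obs)
slope = sum((a - mx) * (b - cc_mean) for a, b in obs) / sum((a - mx) ** 2 for a in x_obs)
intercept = cc_mean - slope * mx

def adjusted_mean(delta):
    """Mean of y after filling each missing value with its regression prediction plus delta.

    delta = 0 corresponds to MAR; a non-zero delta encodes an assumed MNAR departure.
    """
    filled = [intercept + slope * xi + delta if mi else yi
              for xi, yi, mi in zip(x, y, missing)]
    return sum(filled) / n

full_mean = sum(y) / n  # known here only because the data are simulated
for delta in (0.0, -0.5, -1.0):
    print(f"delta = {delta:+.1f}: estimated mean of y = {adjusted_mean(delta):.2f}")
print(f"complete-case mean = {cc_mean:.2f}, full-data mean = {full_mean:.2f}")
```

Because missingness here depends only on the observed `x`, conditioning on `x` removes the bias seen in the complete-case mean. Varying `delta` then shows how far the conclusion would move if the unobserved values differed systematically from what MAR predicts.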


Multiple imputation for longitudinal data (18 & 19 February)

Longitudinal studies, which collect data from individuals over time, are central to modern health and medical research. However, the prolonged observation of individuals increases the risk of missing data. While multiple imputation methods for handling missing data in multiple variables are widely available in mainstream statistical software, there are important considerations, both computational and conceptual, regarding their use in the longitudinal setting. Furthermore, specialised approaches have recently been developed. Over two days, we will review the concepts and methods available for multiple imputation of longitudinal data and provide guidance on good practice.

Day 4 (18 February) will provide an overview of longitudinal data analysis and of methods for imputing longitudinal data in “wide” format (Stata and R). Day 5 (19 February) will focus on multiple imputation methods for longitudinal data in “long” format (available in R only).
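
For readers unfamiliar with the “wide” and “long” terminology, the sketch below shows the same hypothetical repeated measurements in both layouts. It is plain Python for illustration only; the variable names (`bmi_w1`, `wave`) and values are invented, and in practice the reshaping would be done with the tools provided by Stata or R.

```python
# Hypothetical wide-format records: one row per participant, one column per wave.
wide = [
    {"id": 1, "bmi_w1": 22.1, "bmi_w2": 22.8, "bmi_w3": None},
    {"id": 2, "bmi_w1": 27.4, "bmi_w2": None, "bmi_w3": 28.0},
]

def wide_to_long(rows, stem="bmi_w"):
    """Return one row per participant per wave, keeping missing values as None."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key.startswith(stem):
                wave = int(key[len(stem):])  # wave number taken from the column name
                long_rows.append({"id": row["id"], "wave": wave, "bmi": value})
    return long_rows

long_rows = wide_to_long(wide)
for r in long_rows:
    print(r)
```

In the wide layout each wave is a separate, possibly incomplete variable. In the long layout the repeated measurements are stacked as rows within each participant, so an imputation method has to account for rows from the same person being correlated.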



Participants will require a sound working familiarity with Stata or R, and with statistics to the level of multivariable logistic regression models.