In Progress


Addressing new challenges with missing data in complex epidemiological studies: methods, guidance and software

There are large and growing investments in life-course epidemiological studies, which are central to understanding disease aetiology and progression, and thus to developing interventions to improve population health. The broadening scope and complexity of such studies generate numerous statistical challenges.

Firstly, it is important that the analysis of data from these studies accounts for their complex design features, such as unequal-probability sampling and multilevel structures. Secondly, researchers are posing increasingly intricate research questions about the causal relations between exposures and outcomes, giving rise to a range of sophisticated new analytic methods. Thirdly, missing data are inevitable in all research studies, and are a particular concern in longitudinal studies, where there are multiple opportunities for drop-out and sporadic non-response. It is critical that missing data are handled appropriately in the analysis to minimise the risk of biased findings and maximise precision. Multiple imputation (MI) is now widely used, and called for by editors and reviewers, for dealing with missing data, but in the context of complex designs and modern causal methods it is not yet clear how to implement MI, or whether it is the best approach.

The aims of this research are to develop and evaluate novel approaches for the implementation of MI, in comparison with alternative approaches such as inverse probability weighting and direct Bayesian methods, and to provide software and guidance for handling missing data in the context of:

1) complex study designs, including weighted sampling and multilevel data, and
2) modern causal modelling approaches, namely mediation analysis, marginal structural models, principal stratification and instrumental variable methods.

Our research spans a comprehensive range of approaches, from examining the underlying mathematics and performing simulation experiments to empirical validation through application to case studies.


  • Prof Julie Simpson (Centre for Epidemiology & Biostatistics, Melbourne School of Population and Global Health)
  • Prof George Patton (Adolescent Health Research, MCRI)
  • Prof Melissa Wake (Centre for Community Child Health, MCRI)
  • Missing Data, Imputation & Analysis (MIDIA) group, UK (including researchers at the London School of Hygiene & Tropical Medicine, MRC Biostatistics Unit Cambridge, and University of Bristol)



    Related People

    Prof Katherine (Kate) Lee
    Lead Investigator
    Prof John Carlin
    Lead Investigator
    Ghazaleh Dashti
    Post-doctoral Biostatistician
    Anneke Grobler
    Affiliated Investigator