Seminar

greta: simple and scalable statistical modelling in R

Thursday, 23 May 2019
Time: 9.30am - 10.30am
Monash University, Dept of Epidemiology and Preventive Medicine
553 St Kilda Rd Conference Rooms 1 & 2, Ground Floor
Melbourne 3004

General-purpose MCMC software packages like WinBUGS, JAGS, and Stan enable users to define and fit almost any statistical model without having to worry about implementation details, and have enabled significant progress in applied Bayesian modelling. However, these existing tools are largely unable to make use of recent advances in hardware and software for high-performance computing, so they often scale very poorly to large datasets. In addition, the need to specify models using a compiled, domain-specific language is a significant hurdle to potential users and makes it hard for the wider community to extend and build upon these tools. greta is a new software package for flexible statistical modelling that aims to overcome these limitations. greta uses Google's TensorFlow high-performance automatic differentiation library, so it scales well to massive datasets (millions of observations) and can run across many CPUs or on GPUs. greta models can be fitted using efficient gradient-based MCMC samplers like Hamiltonian Monte Carlo, black-box variational Bayes methods, or maximum likelihood/empirical Bayes methods. greta models are written directly and interactively in R, so greta is easy to learn and straightforward to extend with new R packages or use as a backend for more specific software.

I will demonstrate greta and some extension packages for modelling with Gaussian processes, generalised additive models and dynamical systems. If you want to know more now, see the website: https://greta-dev.github.io/greta.

Dr Nick Golding

School of BioSciences
University of Melbourne

Nick is a DECRA fellow in the School of BioSciences. He develops statistical models and software to predict the distributions of species and human diseases. He’s particularly interested in improving these models with information about traits, mechanistic relationships and population dynamics.