23 May 2019 09:30am to 10:30am

greta: simple and scalable statistical modelling in R

Seminar
Event Location
Conference Room 1
553 St Kilda Road
Melbourne VIC 3004
Australia
Speakers
Dr Nick Golding
University of Melbourne, School of BioSciences

General-purpose MCMC software packages like WinBUGS, JAGS, and Stan enable users to define and fit almost any statistical model without having to worry about implementation details, and they have enabled significant progress in applied Bayesian modelling. However, these existing tools are largely unable to make use of recent advances in hardware and software for high-performance computing, so they often scale very poorly to large datasets. In addition, the need to specify models using a compiled, domain-specific language is a significant hurdle for potential users and makes it hard for the wider community to extend and build upon these tools. greta is a new software package for flexible statistical modelling that aims to overcome these limitations. greta uses Google's TensorFlow high-performance automatic differentiation library, so it scales well to massive datasets (millions of observations) and can run across many CPUs or on GPUs. greta models can be fitted using efficient gradient-based MCMC samplers like Hamiltonian Monte Carlo, black-box variational Bayes methods, or maximum likelihood/empirical Bayes methods. greta models are written directly and interactively in R, so greta is easy to learn and straightforward to extend with new R packages or to use as a backend for more specific software.

I will demonstrate greta and some extension packages for modelling with Gaussian processes, generalised additive models and dynamical systems. If you want to know more now, see the website: https://greta-dev.github.io/greta.
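
For readers unfamiliar with the gradient-based samplers mentioned in the abstract, the following is a minimal sketch of Hamiltonian Monte Carlo for a one-dimensional target. It is not greta's implementation and is not written in R: it is an illustrative, standard-library Python example in which the gradient is written by hand, whereas greta obtains gradients from TensorFlow's automatic differentiation. The function name hmc_sample, its parameters, and the Normal(3, 2) target are all assumptions made for this illustration.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density using basic Hamiltonian Monte Carlo.

    log_prob and grad_log_prob are the log density (up to a constant)
    and its derivative. All names here are illustrative.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Resample an auxiliary momentum from a standard normal.
        p0 = rng.gauss(0.0, 1.0)
        x_new, p = x, p0

        # Leapfrog integration: half step in momentum, alternating full
        # steps in position and momentum, then a final half step.
        p += 0.5 * step_size * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step_size * p
            if i < n_leapfrog - 1:
                p += step_size * grad_log_prob(x_new)
        p += 0.5 * step_size * grad_log_prob(x_new)

        # Metropolis correction using the change in total energy
        # (negative log density plus kinetic energy).
        h_old = -log_prob(x) + 0.5 * p0 * p0
        h_new = -log_prob(x_new) + 0.5 * p * p
        if math.log(1.0 - rng.random()) < h_old - h_new:
            x = x_new
        samples.append(x)
    return samples


# Hypothetical target: a normal distribution with mean 3 and sd 2.
def log_prob(x):
    return -0.5 * ((x - 3.0) / 2.0) ** 2


def grad_log_prob(x):
    return -(x - 3.0) / 4.0


draws = hmc_sample(log_prob, grad_log_prob, x0=0.0, n_samples=5000)
kept = draws[500:]  # discard the first 500 draws as warm-up
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((d - post_mean) ** 2 for d in kept) / (len(kept) - 1))
print(f"estimated mean: {post_mean:.2f}, estimated sd: {post_sd:.2f}")
```

The estimated mean and standard deviation should land close to the target's values of 3 and 2. The step size and number of leapfrog steps are fixed here for simplicity; practical samplers typically tune them during warm-up.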

 

Nick is a DECRA fellow in the School of BioSciences. He develops statistical models and software to predict the distributions of species and human diseases. He’s particularly interested in improving these models with information about traits, mechanistic relationships and population dynamics.