Department of Biomedical Engineering and Computational Science BECS

GPstuff - Gaussian process models for Bayesian analysis 4.5

Can be used with Matlab, Octave and R (see below)
Corresponding author: Aki Vehtari

Reference

If you use GPstuff, please cite the following reference (available online):

  • Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, and Aki Vehtari (2013). GPstuff: Bayesian Modeling with Gaussian Processes. Journal of Machine Learning Research, 14(Apr):1175-1179.

Latest release

Mailing lists

  • To get release announcements, you can subscribe to the GPstuff Announcement Mailing List.
  • Or subscribe to announcements at mloss.org by clicking the small letter symbol on the second line, which shows the date and time of the last update.

About

The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods.

The GPstuff toolbox works with Matlab r2009b (7.9) or newer. Older versions down to 7.7 should also work, but the code has not been tested with them. Most of the functionality also works with Octave (3.6.4 or newer; see the release notes for details). Most of the code is written in m-files, but some of the most computationally critical parts are coded in C.

The GPstuff toolbox has been developed by the BECS Bayes group at Aalto University. Coding of the toolbox started in 2006, based on the MCMCStuff toolbox (1998-2006), which was in turn based on the Netlab toolbox (1996-2001). The main authors of GPstuff have been Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen and Aki Vehtari, but the package contains code written by many more people. Contributors from the Department of Biomedical Engineering and Computational Science at Aalto University are (in alphabetical order): Toni Auranen, Pasi Jylänki, Jukka Koskenranta, Enrique Lelo de Larrea Andrade, Tuomas Nikoskinen, Tomi Peltola, Eero Pennala, Heikki Peura, Ville Pietiläinen, Markus Siivola, Arno Solin, Simo Särkkä and Ernesto Ulloa. Contributors from outside Aalto University are (in alphabetical order): Christopher M. Bishop, Timothy A. Davis, Matthew D. Hoffman, Kurt Hornik, Dirk-Jan Kroon, Iain Murray, Ian T. Nabney, Radford M. Neal and Carl E. Rasmussen. We want to thank them all for sharing their code under a free software license.

License

This software is distributed under the GNU General Public License (version 3 or later); please refer to the file License.txt, included with the software, for details.

Using GPstuff from R

Instructions for using GPstuff from R.

Features of the toolbox

User guide (version 4.4).

Covariance and mean functions

  • Several covariance functions (e.g. squared exponential, exponential, Matérn, periodic and a compactly supported piecewise polynomial function)
  • Sums, products and scaling of covariance functions
  • Euclidean and delta distance
  • Several mean functions with marginalized parameters
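As a rough illustration of what some of the covariance functions listed above compute, and of how sums, products and scaling combine them, here is a minimal sketch in plain Python. It is not GPstuff code and does not use the GPstuff API: the function names (sexp, exponential, matern32, cov_sum, cov_prod, cov_scale, cov_matrix) and the parameter names (magn_sigma2, length_scale) are hypothetical, and a single shared length scale with Euclidean distance is assumed.

```python
import math

def scaled_dist(x1, x2, length_scale):
    """Euclidean distance between two input vectors, divided by the length scale."""
    return math.sqrt(sum(((a - b) / length_scale) ** 2 for a, b in zip(x1, x2)))

def sexp(x1, x2, magn_sigma2=1.0, length_scale=1.0):
    """Squared exponential: sigma^2 * exp(-r^2 / 2)."""
    r = scaled_dist(x1, x2, length_scale)
    return magn_sigma2 * math.exp(-0.5 * r * r)

def exponential(x1, x2, magn_sigma2=1.0, length_scale=1.0):
    """Exponential: sigma^2 * exp(-r)."""
    return magn_sigma2 * math.exp(-scaled_dist(x1, x2, length_scale))

def matern32(x1, x2, magn_sigma2=1.0, length_scale=1.0):
    """Matern with nu = 3/2: sigma^2 * (1 + sqrt(3) r) * exp(-sqrt(3) r)."""
    s = math.sqrt(3.0) * scaled_dist(x1, x2, length_scale)
    return magn_sigma2 * (1.0 + s) * math.exp(-s)

def cov_sum(*kernels):
    """Sum of covariance functions (again a valid covariance function)."""
    return lambda x1, x2: sum(k(x1, x2) for k in kernels)

def cov_prod(*kernels):
    """Product of covariance functions (again a valid covariance function)."""
    def k(x1, x2):
        out = 1.0
        for kern in kernels:
            out *= kern(x1, x2)
        return out
    return k

def cov_scale(kernel, c):
    """Scale a covariance function by a non-negative constant c."""
    return lambda x1, x2: c * kernel(x1, x2)

def cov_matrix(kernel, xs):
    """Covariance matrix of a kernel evaluated at all pairs of inputs."""
    return [[kernel(a, b) for b in xs] for a in xs]

if __name__ == "__main__":
    # A smooth component plus a scaled, rougher component.
    kernel = cov_sum(sexp, cov_scale(cov_prod(sexp, exponential), 0.5))
    for row in cov_matrix(kernel, [[0.0], [0.5], [2.0]]):
        print(["%.4f" % v for v in row])
```

Composing kernels as plain functions keeps the sketch short; the toolbox itself represents covariance functions as structures with parameters and priors, which this example does not attempt to reproduce.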

Likelihood/observation models

  • Continuous observations: Gaussian, Gaussian scale mixture (MCMC only), Student-t, quantile regression
  • Classification: Logit, Probit, multinomial logit (softmax), multinomial probit
  • Count data: Binomial, Poisson, (Zero truncated) Negative-Binomial, Hurdle model, Zero-inflated Negative-Binomial, Multinomial
  • Survival: Cox-PH, Weibull, log-Gaussian, log-logistic
  • Point process: Log-Gaussian Cox process
  • Density estimation and regression: logistic GP
  • Monotonicity information (EP only)
  • Other: derivative observations (for sexp covariance function only)

Priors for parameters (theta)

  • Several priors, Hierarchical priors

Sparse models

  • Sparse matrix routines for compactly supported covariance functions
  • Fully and partially independent conditional (FIC, PIC)
  • Compactly supported plus FIC (CS+FIC)
  • Variational sparse (VAR), Deterministic training conditional (DTC), Subset of regressors (SOR) (Gaussian/EP only)
  • PASS-GP

Latent inference

  • Exact (Gaussian only)
  • Laplace, Expectation propagation (EP), Parallel EP, Robust-EP
  • Marginal posterior corrections (cm2 and fact)
  • Scaled Metropolis, Hamiltonian Monte Carlo (HMC), Scaled HMC, Elliptical slice sampling
  • State space inference (1D for some covariance functions)
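Exact latent inference is available with a Gaussian likelihood because the posterior of the latent function is then Gaussian in closed form. The following minimal sketch shows the standard calculation for one-dimensional inputs in plain Python: the predictive mean is k*' (K + s2 I)^-1 y and the predictive variance of the latent function is k(x*, x*) - k*' (K + s2 I)^-1 k*. It is not GPstuff code; the function names (sexp, cholesky, gp_predict) and the parameter noise_sigma2 are hypothetical, and a zero mean function with a squared exponential covariance is assumed.

```python
import math

def sexp(x1, x2, magn_sigma2=1.0, length_scale=1.0):
    """Squared exponential covariance for scalar inputs."""
    return magn_sigma2 * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def solve_lower(l, b):
    """Solve L x = b by forward substitution."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        x[i] = (b[i] - sum(l[i][k] * x[k] for k in range(i))) / l[i][i]
    return x

def solve_lower_t(l, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def gp_predict(x, y, x_star, noise_sigma2=0.01):
    """Posterior mean and variance of the latent function at x_star."""
    n = len(x)
    # K + s2 * I: prior covariance of the training latents plus Gaussian noise.
    k = [[sexp(x[i], x[j]) + (noise_sigma2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    l = cholesky(k)
    alpha = solve_lower_t(l, solve_lower(l, y))      # (K + s2 I)^-1 y
    k_star = [sexp(xi, x_star) for xi in x]
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    v = solve_lower(l, k_star)                       # L^-1 k*
    var = sexp(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, var

if __name__ == "__main__":
    x_train = [0.0, 1.0, 2.0]
    y_train = [0.0, 1.0, 0.5]
    for xs in (0.5, 1.0, 5.0):
        print(xs, gp_predict(x_train, y_train, xs))
```

With non-Gaussian likelihoods this closed form is not available, which is why the approximate and sampling-based methods in the list are needed.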

Hyperparameter inference

  • Type II ML/MAP
  • Leave-one-out cross-validation (LOO-CV), Laplace/EP LOO-CV
  • Metropolis, HMC, No-U-Turn-Sampler (NUTS), Slice Sampling (SLS), Surrogate SLS, Shrinking-rank SLS, Covariance-matching SLS
  • Grid, CCD, Importance sampling

Model assessment

  • LOO-CV, Laplace/EP LOO-CV, Integrated IS-LOO-CV, k-fold-CV
  • WAIC, DIC
  • Average predictive comparison

Contents of the toolbox

The contents of the toolbox can be examined here.

Demos

There are many demos in the toolbox. Here are a few of them:

  • demo_regression1: A regression demo for full GP, compact support GP, FIC and PIC.
  • demo_classific: A classification problem.
  • demo_spatial1: A disease mapping problem with FIC sparse GP approximation.
  • demo_births: Demonstration of analysis of birthday frequencies in the USA 1969-1988 using a Gaussian process with several components.
  • demo_lgcp: Demonstration of point process intensity estimation using a discretized nonhomogeneous Poisson process, also known as a log Gaussian Cox process.
  • demo_lgpdens: Demonstration of 1D and 2D density estimation and density regression using a logistic Gaussian process.
  • demo_monotonic2: Demonstration of the use of monotonicity information with Gaussian processes.

References

GPstuff has also been used, for example, in the following publications:

  1. Ville Tolvanen, Pasi Jylänki and Aki Vehtari (2014). Expectation propagation for nonstationary heteroscedastic Gaussian process regression. In Proceedings of IEEE International Workshop on Machine Learning for Signal Processing, accepted for publication. Preprint
  2. Jaakko Riihimäki and Aki Vehtari (2014). Laplace approximation for logistic Gaussian process density estimation and regression. Bayesian Analysis, 9(2):425-448. Online 3 February, 2014.
  3. Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari and Donald B. Rubin (2013). Bayesian Data Analysis, Third Edition. Chapman and Hall/CRC. Publisher's webpage for the book. Home page for the book.
  4. Arno Solin and Simo Särkkä (2014). Hilbert space methods for reduced-rank Gaussian process regression. arXiv:1401.5508.
  5. Heikki Joensuu, Peter Reichardt, Mikael Eriksson, Kirsten Sundby Hall and Aki Vehtari (2013). Gastrointestinal stromal tumor: A method for optimizing the timing of CT scans in the follow-up of cancer patients. Radiology, 271(1):96-106. Online 18 November, 2013. Preprint of the statistical appendix.
  6. Aki Vehtari and Heikki Joensuu (2013). A Gaussian processes model for survival analysis with time dependent covariates and interval censoring. Poster presented at The Third Workshop on Bayesian Inference for Latent Gaussian Models with Applications.
  7. Jorma Rantonen, Aki Vehtari, Jaro Karppinen, Satu Luoto, Eira Viikari-Juntura, Markku Hupli, Antti Malmivaara and Simo Taimela (2013). Face-to-face information in addition to a booklet versus a booklet alone for treating mild back pain, a randomized controlled trial. Scandinavian Journal of Work, Environment & Health. Online.
  8. Mari Myllymäki, Aila Särkkä and Aki Vehtari (2013). Hierarchical second-order analysis of replicated spatial point patterns with non-spatial covariates. Spatial Statistics, in press. Online 13 August, 2013. PDF.
  9. Aki Vehtari, Karita Reijonsaari, Olli-Pekka Kahilakoski, Markus V. Paananen, Willem van Mechelen, and Simo Taimela (2013). The Influence of Selective Participation in a Physical Activity Intervention on the Generalizability of Findings. Journal of Occupational and Environmental Medicine, 56(3):291-297. Online 13 January 2014.
  10. Jaakko Riihimäki, Pasi Jylänki and Aki Vehtari (2013). Nested Expectation Propagation for Gaussian Process Classification with a Multinomial Probit Likelihood. Journal of Machine Learning Research, 14(Jan):75-109. Available online. Part of GPstuff v4.1 and later.
  11. Lari Veneranta, Richard Hudd and Jarno Vanhatalo (2013). Reproduction areas of sea-spawning Coregonids reflect the environment in shallow coastal waters. Marine Ecology Progress Series, 477:231-250.
  12. Jarno Vanhatalo, Laura Tuomi, Arto Inkala, Inari Helle, and Heikki Pitkänen (2013). Probabilistic Ecosystem Model for Predicting the Nutrient Concentrations in the Gulf of Finland under Diverse Management Actions. Environmental Science & Technology, 47(1):334-341.
  13. Sourav Bhattacharya, Santi Phithakkitnukoon, Petteri Nurmi, Arto Klami, Marco Veloso, Carlos Bento (2013). Gaussian process-based predictive modeling for bus ridership. In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, 1189-1198.
  14. Ji-Eun Kang, Young-Jin Kim, Ki-Uhn Ahn and Cheol-Soo Park (2013). Gaussian process emulator for optimal operation of a high rise office building. In Proceedings of BS2013: 13th Conference of International Building Performance Simulation Association, Chambéry, France, 2225-2231.
  15. Young-Jin Kim, Ki-Uhn Ahn, Cheol-Soo Park and In-Han Kim (2013). Gaussian emulator for stochastic optimal design of a double glazing system. In Proceedings of BS2013: 13th Conference of International Building Performance Simulation Association, Chambéry, France, 2217-2224.
  16. Zhuang Tian, Dongdong Weng, Jianying Hao, Yupeng Zhang and Dandan Meng (2013). A data driven BRDF model based on Gaussian process regression. In Proc. SPIE 9042, 2013 International Conference on Optical Instruments and Technology: Optical Systems and Modern Optoelectronic Instruments, 904211.
  17. Mahdi Biparva (2013). Novel multistage probabilistic kernel modeling in handwriting recognition. Master's thesis, Concordia University, Canada.
  18. Jarno Vanhatalo, Lari Veneranta and Richard Hudd (2012). Species Distribution Modelling with Gaussian Processes: a Case Study with the Youngest Stages of Sea Spawning Whitefish (Coregonus lavaretus L. s.l.) Larvae. Ecological Modelling, 228:49-58.
  19. Teppo Juntunen, Jarno Vanhatalo, Heikki Peltonen and Samu Mäntyniemi (2012). Bayesian spatial multispecies modelling to assess pelagic fish stocks from acoustic- and trawl-survey data. ICES Journal of Marine Science, 69: 95-104.
  20. Perry Groot and Peter Lucas (2012). Gaussian Process Regression with Censored Data Using Expectation Propagation. In Sixth European Workshop on Probabilistic Graphical Models, Granada, Spain, 115-122.
  21. Posiva Oy (2012). Olkiluoto Site Description 2011, 1028 pages. ISBN 978-951-652-179-7. Online. (GPstuff was used to model the distribution of fracture groundwaters salinities at the Olkiluoto nuclear waste repository site.)
  22. Girma Kejela (2012). Short-term Forecasting of Electricity Consumption using Gaussian Processes. Master's thesis, University of Agder, Norway.
  23. Heikki Joensuu, Aki Vehtari, Jaakko Riihimäki, Toshirou Nishida, Sonja E Steigen, Peter Brabec, Lukas Plank, Bengt Nilsson, Claudia Cirilli, Chiara Braconi, Andrea Bordoni, Magnus K Magnusson, Zdenek Linke, Jozef Sufliarsky, Federico Massimo, Jon G Jonasson, Angelo Paolo Dei Tos and Piotr Rutkowski (2011). Risk of gastrointestinal stromal tumour recurrence after surgery: an analysis of pooled population-based cohorts. The Lancet Oncology, 13(3):265-274. Published online 7 December 2011.
  24. Pasi Jylänki, Jarno Vanhatalo and Aki Vehtari (2011). Robust Gaussian Process Regression with a Student-t Likelihood. Journal of Machine Learning Research, 12:3227-3257 (available online). The EP implementation described in the paper is included in the GPstuff toolbox. See also a short demo on the regression examples described in the paper.
  25. Jorma Rantonen, Satu Luoto, Aki Vehtari, Markku Hupli, Jaro Karppinen, Antti Malmivaara and Simo Taimela (2011). The effectiveness of two active interventions compared to self-care advice in employees with non-acute low back symptoms. A randomised, controlled trial with a 4-year follow-up in the occupational health setting. Occupational and Environmental Medicine, oem.2009.054312 (Available online 20 May 2011)
  26. Zhihua Zhang, Guang Dai and Michael I. Jordan (2011). Bayesian Generalized Kernel Mixed Models. Journal of Machine Learning Research 12:111-139.
  27. J. Zico Kolter and Joseph Ferreira Jr. (2011). A Large-Scale Study on Predicting and Contextualizing Building Energy Usage. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 1349-1356.
  28. Jarno Vanhatalo, Ville Pietiläinen and Aki Vehtari (2010). Approximate inference for disease mapping with sparse Gaussian processes. Statistics in Medicine, 29(15):1580-1607. (Available online)
  29. Jarno Vanhatalo, Pia Mäkelä and Aki Vehtari (2010). Alkoholikuolleisuuden alueelliset erot Suomessa 2000-luvun alussa. Yhteiskuntapolitiikka, 75(3):265-273 (Available online in Finnish) (English translation) (Online maps in Finnish)
  30. Jaakko Riihimäki and Aki Vehtari (2010). Gaussian processes with monotonicity information. Journal of Machine Learning Research: Workshop and Conference Proceedings, 9:645-652, AISTATS2010 special issue. (abstract, PDF)
  31. Jarno Vanhatalo and Aki Vehtari (2010). Speeding up the binary Gaussian process classification. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI 2010), AUAI Press. (Available online).
  32. Jarno Vanhatalo, Pasi Jylänki and Aki Vehtari (2009). Gaussian process regression with Student-t likelihood. In Bengio et al, editors, Advances in Neural Information Processing Systems 22, pp. 1910-1918, NIPS Foundation (Available online)
  33. Jarno Vanhatalo and Aki Vehtari (2009). Discussion to 'Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations' by Håvard Rue, Sara Martino and Nicolas Chopin. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 71(2):383 (Available online 6 April 2009)
  34. Jarno Vanhatalo and Aki Vehtari (2008). Modelling local and global phenomena with sparse Gaussian processes. Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence. (PDF)
  35. Jarno Vanhatalo and Aki Vehtari (2007). Sparse Log Gaussian Processes via MCMC for Spatial Epidemiology. JMLR Workshop and Conference Proceedings, 1:73-89. (Gaussian Processes in Practice) (PDF) (Slides related to the paper in PDF)