Overview
A hands-on workshop that teaches you to build hierarchical models for real sports analytics problems. Learn when partial pooling beats simple averages, how to model player-to-player and game-to-game variability, and how to communicate Bayesian results to coaches and stakeholders.
What you’ll build
Partial pooling that wins
- Player/team effects with domain-informed priors
- Centered vs. non-centered parameterization (when each shines)
- Reusable PyMC + ArviZ sports notebook
Diagnostics coaches trust
- Posterior predictive checks & calibration
- Coverage & model fit diagnostics that stand up in the room
- Stakeholder-ready visuals
Model comparison & ranking
- Predictive performance, LOO, PPCs
- Ranking under uncertainty for evaluation & forecasting
- Frameworks for competing hypotheses
Who is this for?
Data scientists, analysts, and quant-minded practitioners who want to apply Bayesian methods credibly on the job — especially in sports ops, performance, or strategy roles.
Python comfortable · Bayes basics known · PyMC/Bambi newbie OK
Prerequisites
- Required: Python (pandas, matplotlib, basic scripting) and the Bayes basics (prior, likelihood, posterior).
- Nice to have (not required): PyMC or Bambi experience; any sports analytics background.
- Setup: You’ll receive access to a pre-auth AthlyticZ VM on GCP with PyMC, Bambi, ArviZ, datasets, and notebooks preinstalled. Local Jupyter works too.
What’s included
Live sessions + recordings
Attend live, code along, and rewatch at your pace.
Cloud lab (1-click)
Pre-configured GCP VMs with PyMC, Bambi, ArviZ, datasets, and notebooks.
LBS Discord access
Join Alex’s Learning Bayesian Statistics community for ongoing Q&A.
Course notes & templates
Annotated notebooks, exercises with solutions, modeling and plotting templates.
Course syllabus
Two focused sessions that ladder from fundamentals to a real sports analysis, with exercises in each.
The challenge: Model pitcher performance variability using MLB data.
Central question: Is a pitcher’s performance on any given night drawn from a fixed true talent, or does their “true ability” vary appearance-to-appearance?
- Modeling overdispersion in sports outcomes (Beta-Binomial models)
- Competing generative models: fixed talent vs. random game-by-game variation
- Model comparison using predictive performance (LOO, posterior predictive checks)
- Sequential Bayesian inference: updating beliefs in real time during games
- Ranking players under uncertainty
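As a rough illustration of what "detecting overdispersion" can mean before any Bayesian model is fit, the sketch below compares game-to-game spread against what a single fixed rate would imply under a plain binomial. The per-appearance counts are made up for illustration and are not drawn from the workshop's MLB datasets; the Pearson dispersion statistic is one common quick check, not necessarily the one used in the sessions.

```python
# Hypothetical (swinging strikes, pitches) per appearance for one pitcher.
# Illustrative numbers only, not real MLB data.
games = [(5, 90), (18, 95), (7, 100), (20, 88), (4, 92), (15, 97)]

total_k = sum(k for k, n in games)
total_n = sum(n for k, n in games)
p_hat = total_k / total_n  # single pooled rate under the fixed-talent assumption

# Pearson chi-square: squared deviation from the binomial mean n * p,
# scaled by the binomial variance n * p * (1 - p), summed over games.
chi2 = sum(
    (k - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
    for k, n in games
)

# Dividing by degrees of freedom gives a dispersion ratio: values near 1 are
# consistent with a binomial, values well above 1 suggest overdispersion.
dispersion = chi2 / (len(games) - 1)
print(f"pooled rate = {p_hat:.3f}, dispersion = {dispersion:.2f}")
```

With these made-up counts the ratio comes out well above 1, which is the kind of signal that motivates a Beta-Binomial model over a plain binomial.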
Hands-on: detect & model overdispersion; build two competing models of pitcher swinging-strike rates; compare models for ranking/evaluation/forecasting; generate stakeholder-ready visuals (calibration plots, rootograms, forest plots); perform in-game updates for current outing performance.
Exercises: Build a variety of Bayesian models on provided MLB data.
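To make the in-game updating idea concrete, here is a minimal sketch of a conjugate Beta update applied inning by inning. It assumes the simpler fixed-talent model (one constant swinging-strike rate per outing); the prior parameters and inning counts are hypothetical, chosen only for illustration.

```python
# Hypothetical prior: Beta(11, 89), centered near an 11% swinging-strike rate.
# Assumed for illustration, not estimated from MLB data.
prior_a, prior_b = 11.0, 89.0

# Hypothetical (swinging strikes, pitches) for each inning of one outing.
innings = [(3, 15), (1, 12), (4, 18), (2, 14)]


def update(a, b, whiffs, pitches):
    """Conjugate update: successes are added to a, failures to b."""
    return a + whiffs, b + (pitches - whiffs)


a, b = prior_a, prior_b
posterior_means = []
for whiffs, pitches in innings:
    a, b = update(a, b, whiffs, pitches)
    posterior_means.append(a / (a + b))  # posterior mean after this inning

for i, m in enumerate(posterior_means, start=1):
    print(f"after inning {i}: posterior mean = {m:.3f}")
```

Because the update is conjugate, each inning's posterior becomes the next inning's prior, so the order of the updates does not change the final answer.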
The core problem: When should you pool data across groups vs. treat each group separately?
- Why simple averages fail: the pooling vs. unpooling tradeoff
- Partial pooling: borrowing strength across groups intelligently
- When hierarchical models shine: nested data, small samples, extreme observations
- Shrinkage toward the mean: extraordinary claims require extraordinary evidence
Hands-on: build your first hierarchical model from scratch in Bambi and PyMC; compare pooled, unpooled, and hierarchical approaches; choose centered vs. non-centered parameterization; posterior analysis and visualization with ArviZ; warm-up with biological data, then transition to sports thinking.
Exercises: Turn Session 1’s sports model into hierarchical variants.
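The pooled, unpooled, and partially pooled estimates can be contrasted in a few lines. This is a simplified sketch: the pitcher counts are hypothetical, and the prior strength is fixed by hand, whereas a full hierarchical model in PyMC or Bambi would learn the amount of pooling from the data.

```python
# Hypothetical (swinging strikes, pitches) for four pitchers; illustrative only.
data = {"A": (2, 10), "B": (30, 250), "C": (0, 8), "D": (120, 1000)}

total_k = sum(k for k, n in data.values())
total_n = sum(n for k, n in data.values())
pooled = total_k / total_n  # complete pooling: one rate for everyone

# Assumed prior strength in "pseudo-pitches". A hierarchical model would
# estimate this rather than take it as given.
prior_strength = 100.0

estimates = {}
for name, (k, n) in data.items():
    unpooled = k / n  # no pooling: each pitcher's raw rate
    # Partial pooling: a weighted average of the raw rate and the pooled
    # rate, with weight n / (n + prior_strength) on the pitcher's own data.
    partial = (k + prior_strength * pooled) / (n + prior_strength)
    estimates[name] = (unpooled, partial)

print(f"pooled rate = {pooled:.3f}")
for name, (unpooled, partial) in estimates.items():
    print(f"{name}: unpooled = {unpooled:.3f}, partially pooled = {partial:.3f}")
```

Pitchers with few pitches (A and C here) are pulled strongly toward the group rate, while the pitcher with 1,000 pitches barely moves.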
Materials provided: annotated Jupyter notebooks (both sessions), exercise notebooks with solutions, MLB datasets, reusable plotting & modeling templates, and recommended resources for further learning.
About the instructor
Alex Andorra — Bayesian modeler, PyMC core contributor, sports analytics consultant, and host of the Learning Bayesian Statistics podcast.
FAQ
Do I need a GPU or special setup?
No. Your AthlyticZ cloud lab is pre-built on GCP. Click to launch, and you’re coding.
What if I can’t attend live?
Recordings are included. You’ll still have access to Discord Q&A.
Will there be homework?
Optional exercises are included with solutions and guidance.