The challenge: Model pitcher performance variability using MLB data.
Central question: Is a pitcher’s performance on any given night a noisy draw around a fixed true talent,
or does their “true ability” itself vary from appearance to appearance?
What you’ll learn
- Modeling overdispersion in sports outcomes (Beta-Binomial models)
- Competing generative models: fixed talent vs. random game-by-game variation
- Model comparison using predictive performance (LOO, posterior predictive checks)
- Sequential Bayesian inference: updating beliefs in real time during games
- Ranking players under uncertainty
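To make the two competing generative stories concrete, here is a minimal simulation sketch. It is not the workshop's own code, and every number in it (pitch count, swinging-strike rate, concentration `kappa`, number of appearances) is a hypothetical value chosen for illustration. It draws swinging-strike counts under a fixed-talent Binomial model and under a Beta-Binomial model where the rate is redrawn each appearance, then compares the observed variance to the variance a plain Binomial would imply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical settings, chosen only for illustration.
n_games = 2000   # simulated appearances
pitches = 90     # pitches thrown per appearance
p_true = 0.12    # long-run swinging-strike rate
kappa = 50.0     # Beta concentration; smaller means more game-to-game variation

# Model A: fixed talent. Every appearance shares the same rate.
whiffs_fixed = rng.binomial(pitches, p_true, size=n_games)

# Model B: varying talent. Each appearance gets its own rate drawn from a
# Beta with mean p_true, which makes the counts Beta-Binomial.
p_game = rng.beta(p_true * kappa, (1 - p_true) * kappa, size=n_games)
whiffs_varying = rng.binomial(pitches, p_game)

def dispersion_ratio(counts, n):
    """Observed variance divided by the variance a Binomial(n, p_hat) implies."""
    p_hat = counts.mean() / n
    binomial_var = n * p_hat * (1 - p_hat)
    return counts.var(ddof=1) / binomial_var

ratio_fixed = dispersion_ratio(whiffs_fixed, pitches)
ratio_varying = dispersion_ratio(whiffs_varying, pitches)

print(f"fixed talent:   {ratio_fixed:.2f}")
print(f"varying talent: {ratio_varying:.2f}")
```

A ratio near 1 is what the fixed-talent model predicts; a ratio well above 1 is the overdispersion signature that the Beta-Binomial model is meant to capture. Under these assumed settings the theoretical ratio for Model B is 1 + (pitches - 1) / (kappa + 1).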
Hands-on: detect and model overdispersion; build two competing models of pitcher swinging-strike rates;
compare the models for ranking, evaluation, and forecasting; generate stakeholder-ready visuals (calibration plots,
rootograms, forest plots); update estimates of a pitcher’s current-outing performance in-game.
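The in-game updating step can be sketched with a conjugate Beta prior on the swinging-strike rate. This is only an illustration under the simpler fixed-rate-within-an-outing assumption; the prior parameters and the per-inning tallies below are made up, and the `update` helper is hypothetical rather than part of any library.

```python
def update(alpha, beta, whiffs, pitches):
    """Conjugate Beta update: add successes to alpha and failures to beta."""
    return alpha + whiffs, beta + (pitches - whiffs)

# Hypothetical prior: mean 0.12, carrying the weight of 100 earlier pitches.
alpha, beta = 12.0, 88.0

# Hypothetical (swinging strikes, pitches) tallies for three innings.
innings = [(3, 15), (1, 12), (4, 18)]

history = []  # posterior mean after each inning
for whiffs, pitches in innings:
    alpha, beta = update(alpha, beta, whiffs, pitches)
    history.append(alpha / (alpha + beta))

for inning, mean in enumerate(history, start=1):
    print(f"after inning {inning}: posterior mean = {mean:.3f}")
```

Because the Beta prior is conjugate to the Binomial likelihood, each inning's update is just two additions, which is what makes real-time use practical. A model with game-to-game variation would need a prior that reflects how far tonight's rate can drift from the long-run rate.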
Exercises: Build a variety of Bayesian models on provided MLB data.