Becoming a BayeZian I — Foundations of Bayesian Analysis

Build Bayesian intuition with Stan

Start from first principles—uncertainty, probability, and priors—and progress to GLMs and introductory hierarchical models. Learn simulation-first thinking in R and implement models in Stan with cmdstanr under the guidance of Dr. Scott Spencer (Columbia University).

Uncertainty & Variation · Probability & Distributions · Priors → Likelihoods → Posterior · Simulation in R · Stan Basics (cmdstanr) · GLMs & Intro Hierarchical
Enroll Now · View Syllabus

What You’ll Learn

A rigorous, practical introduction to Bayesian data analysis. Concepts map directly to real projects—so your models are calibrated, explainable, and decision-ready.

Uncertainty & Variation

Quantify uncertainty from the start and avoid overconfident point estimates.

Probability & Distributions

Bernoulli, Binomial, Poisson, Beta, Normal—learn when and why to use each.

Priors → Likelihoods → Posterior

Formulate priors, write likelihoods, and interpret posteriors with confidence.

Simulation in R

Prototype distributions and models to build intuition before fitting in Stan.

Stan Fundamentals

Code simple models, fit with cmdstanr, and check HMC diagnostics.

GLMs & Intro Hierarchical

Model counts and probabilities, extend to partial pooling with groups.

Model Evaluation

Use PPCs and LOOCV/ELPD to compare and validate your models.

Case Study: Soccer xG

Build and extend an expected-goals model with real soccer data.

What You’ll Gain

Go beyond plug-and-play ML. Learn how to quantify uncertainty, build interpretable models, and apply Bayesian reasoning with Stan—skills employers seek for high-stakes decision-making.

Foundations of Bayesian Thinking

  • Understand probability, priors, likelihoods, and posteriors.
  • Simulate distributions and run inference in R and Stan.
  • Develop intuition for variation, uncertainty, and shrinkage.
Impact: Stand out by explaining not just “what the model predicts” but “how sure we are.”

Stan Modeling Skills

  • Code regression models directly in cmdstanr.
  • Fit and diagnose models with HMC sampling.
  • Conduct posterior predictive checks and model comparisons (LOO/ELPD).
Impact: Employers value candidates who can evaluate reliability, not just accuracy.

Hierarchical & Generalized Models

  • Model binomial, Poisson, and categorical outcomes.
  • Extend to hierarchical structures with partial pooling.
  • Balance flexibility and interpretability with GLMs.
Impact: Bring scalable solutions to messy, multi-level real-world data.

Applied Case Study

  • Soccer expected goals (xG) with real play-by-play data.
  • Explore correlation between predictors & model extensions.
  • Use model estimates to drive strategic decision-making.
Impact: Show portfolio-ready work that mirrors sports teams, finance, and health analytics pipelines.

Feeling Overwhelmed by Bayesian Modeling?

You’re not alone. We turn equations into working models—so you gain intuition, write clean Stan, and make better decisions.

Common Roadblocks We’ll Solve Together

  • Translating probability theory into practical modeling choices.
  • Choosing sensible priors and interpreting posteriors with confidence.
  • Connecting GLMs to real sports problems (counts, probabilities, categorical outcomes).
  • Writing Stan code and debugging HMC diagnostics without getting stuck.
  • Evaluating models with PPCs and ELPD/LOOCV—so you trust what you ship.

Free Preview • Real Lesson

Example Four Using Soccer Data (Poisson Models)

What you’ll build

Model both Arsenal's and the opponent's goals as Poisson counts—then extend to more realistic game outcomes.

Files / Where to look

soccer_4.stan, lecture notes.r • Run lines 2–3 to set up; jump to line 993 to follow along.

Stan model (excerpt)
// soccer_4.stan
data {
  int<lower=0> N;              // number of matches observed
  array[N] int<lower=0> Y1;    // Arsenal goals per match
  array[N] int<lower=0> Y2;    // Opponent goals per match
}
parameters {
  array[2] real<lower=0> lambda;  // scoring rates: Arsenal, Opponent
}
model {
  lambda ~ exponential(1);        // weakly informative prior on the rates
  Y1 ~ poisson(lambda[1]);
  Y2 ~ poisson(lambda[2]);
}
R workflow (snippet)
library(cmdstanr)
data_list <- list(N = num_observations, Y1 = arsenal_goals, Y2 = opponent_goals)
soccer4_model <- cmdstan_model("soccer_4.stan")          # compile the Stan program
fit_soccer4   <- soccer4_model$sample(data = data_list)  # fit with HMC
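Checking the fit (sketch)
Once sampling finishes, summarize the posterior and review the HMC diagnostics before trusting the estimates. This is a minimal sketch using standard cmdstanr methods; the fit object name follows the snippet above, and everything else is illustrative.
# Posterior summaries for the two scoring rates
fit_soccer4$summary("lambda")
# Divergence, tree-depth, and E-BFMI warnings from the HMC run
fit_soccer4$diagnostic_summary()
# Draws as a data frame for plotting or posterior predictive simulation
draws <- fit_soccer4$draws(format = "df")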
Model extensions
  • Add team-level parameters to model dependence between opponents (sketched after this list).
  • Explore the Skellam distribution for score differences.
  • Promote to GLMs with covariates and partial pooling.
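Extension sketch (illustrative)
To make the first extension concrete, here is one possible shape for a log-link Poisson model with team-level attack and defence effects. The parameter names, priors, and data layout are illustrative assumptions, not the course's files.
// Illustrative only: goals depend on who plays whom via team effects
data {
  int<lower=0> N;                             // matches
  int<lower=1> n_teams;                       // teams
  array[N] int<lower=1, upper=n_teams> home;  // home team index per match
  array[N] int<lower=1, upper=n_teams> away;  // away team index per match
  array[N] int<lower=0> goals_home;
  array[N] int<lower=0> goals_away;
}
parameters {
  real mu;                  // baseline log scoring rate
  vector[n_teams] attack;   // team attacking strength
  vector[n_teams] defence;  // team defensive strength
}
model {
  mu ~ normal(0, 1);
  attack ~ normal(0, 0.5);   // zero-centered priors softly identify the team effects
  defence ~ normal(0, 0.5);
  goals_home ~ poisson_log(mu + attack[home] - defence[away]);
  goals_away ~ poisson_log(mu + attack[away] - defence[home]);
}
The same structure extends naturally to partial pooling by placing hierarchical priors on attack and defence, which is where the course's hierarchical modules pick up.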
Why it matters
  • Move beyond single-team goals to game-level outcomes.
  • Build intuition for priors/likelihoods on real sports data.
  • Learn patterns you’ll reuse in BayeZian II and in production.
“Scott Spencer brings Bayesian modeling alive. He not only explains the math, but shows how to implement models in Stan that are clear, scalable, and ready for research or production. If you want to truly master Bayesian analysis, Scott is the guide you need.”
Michael S. Czahor, PhD
President @ Athlyticz
Your Bayesian Analysis Coach for Becoming a BayeZian

MY PASSION LIES IN LEVERAGING DATA FOR GOOD CAUSES AND EXPLORING THE INTRICATE DYNAMICS OF PROFESSIONAL SPORTS THROUGH STATISTICAL MODELING.

SCOTT SPENCER

Dive into the world of Bayesian analysis and cutting-edge probabilistic programming alongside a Columbia professor and Stan language collaborator who crafts intricate generative models. Scott's expertise spans from decoding human behavior to forecasting sea-level rise impacts on coastal property values, all while dissecting the statistical DNA of professional sports.

Scott's influence extends beyond academia, shaping decisions for tech giants like Amazon, healthcare leaders such as Johnson & Johnson, and entertainment moguls like Vevo. His knack for transparent storytelling through R packages ensures that even the most complex insights are accessible and actionable, making him a sought-after guide for data enthusiasts across industries.

Join the BayeZian revolution and unlock the true potential of Bayesian methods with Scott, where every analysis tells a compelling story and uncertainty is the key to innovation!

Course Overview

Build a rigorous Bayesian toolkit with Stan—from probability foundations to hierarchical GLMs and an applied xG case study—so you can quantify uncertainty, defend results, and deliver stakeholder-ready insights.

1) Introducing Bayesian Analysis for Sports
Why Bayesian? Course goals, deliverables, and how this maps to business value.
scope · use-cases · expectations
  • Introduction
  • Course Topics
Outcome: align vocabulary and expectations; set a clear success path.
2) Exploring Uncertainty & Variation
Develop intuition for variation using sports examples and visuals.
variability · visual reasoning
  • Uncertainty & Variation
  • Example — 100 Meter Olympic Sprint
  • Visualizing the Example Data
  • Quiz: Exploring Uncertainty & Variation
Outcome: communicate uncertainty credibly to stakeholders.
3) Probability, Random Variables & Distributions
Bayesian building blocks with discrete & continuous families.
Bernoulli/Binomial · Poisson · Normal/Beta
  • Probability, Random Variables & Distributions
  • Random Variables; Bernoulli & Binomial; Poisson; Counts→Normal
  • Continuous Uniform; Beta; Normal; Summary Statistics
  • Joint, Marginal, Conditional; Independence; Getting to Bayes Rule
  • Quiz: Probability, Random Variables, and Distributions
Outcome: choose appropriate likelihoods & priors with confidence.
4) Priors, Likelihoods, and Posteriors
Connect priors to evidence; compute posteriors and use conjugacy where possible.
priors · likelihood · posterior
  • Priors, Likelihoods, and Posteriors; Likelihoods & Normalizing Constant
  • Conjugate Priors
  • Quiz: Priors, Likelihoods, and Posteriors
Outcome: defend modeling choices to technical & non-technical audiences.
5) Simulating Distributions in R
Gain intuition via simulation before touching Stan.
simulation · R
  • Simulating Distributions in R Intro
  • Transforming Random Numbers to Distributions; Discrete Distributions
  • Quiz: Simulating Distributions
Outcome: faster learning cycles and sanity checks.
6) Random Variable Code Objects
Represent distributions as reusable code objects.
abstraction · reusability
  • Representing Distributions with a Random Variable Code Object
Outcome: cleaner, testable modeling code.
7) Simulations & Models in Stan
Move into Stan: simulate values & fit a beta-binomial model.
cmdstanr · beta-binomial
  • Stan Documentation; Toy Example (simulate values)
  • Second Example: Beta Binomial
  • Quiz: Simulations and Models in Stan
Outcome: understand Stan program flow & results objects.
8) Posterior Simulation with Grid Approximation
Hand-build a posterior to cement intuition (see the short sketch below).
grid approx
  • Grid Approximation Example
Outcome: demystify Bayes before MCMC.
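A short sketch of the idea, using an illustrative shot-conversion example rather than the course's exact data:
# Grid approximation for a conversion probability theta after 13 goals in 100 shots
theta      <- seq(0, 1, length.out = 1000)          # grid over the parameter
prior      <- dbeta(theta, 2, 2)                    # weakly informative Beta(2, 2) prior
likelihood <- dbinom(13, size = 100, prob = theta)  # binomial likelihood at each grid point
unnorm     <- prior * likelihood                    # unnormalized posterior on the grid
posterior  <- unnorm / sum(unnorm)                  # normalize so the grid sums to one
draws <- sample(theta, 1e4, replace = TRUE, prob = posterior)  # resample to get posterior draws
quantile(draws, c(0.05, 0.5, 0.95))                 # posterior median and 90% interval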
9) Approximate Posteriors with MH & HMC
See how modern HMC powers Stan.
MCMC · HMC
  • Approximate Posteriors Intro; Hamiltonian Monte Carlo
Outcome: choose engines & read diagnostics confidently.
10) A Language for Describing Models
Create a consistent vocabulary for models.
model spec
  • A Language for Describing Models Intro
Outcome: smoother hand-offs & reviews.
11) Simple Normal Regression (Stan)
Code, fit, and diagnose a regression model in Stan.
compilation · diagnostics
  • Intro; Coding; Compiling & Fitting
  • Checking HMC Diagnostics; Reviewing Parameters
  • Quiz: Simple Normal Regression
Outcome: production habits for reliable inference.
12) cmdstanr Objects, Helper Functions & Evaluation
Posterior predictive checks + model comparison (ELPD / LOO); see the comparison sketch below.
PPC · ELPD/LOO
  • cmdstanr Objects & Helpers
  • Posterior Predictive Checks — three approaches
  • Model Comparison: ELPD & LOOCV
  • Quiz: Cmdstanr Model Objects
Outcome: defend models with calibrated, comparable metrics.
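A minimal comparison sketch, assuming two fitted cmdstanr models whose Stan programs compute log_lik in generated quantities (the object names here are placeholders):
# Compare two fitted models by expected log predictive density (PSIS-LOO)
library(loo)
loo_a <- fit_a$loo()        # requires log_lik in the Stan program's generated quantities
loo_b <- fit_b$loo()
loo_compare(loo_a, loo_b)   # ELPD differences with standard errors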
13) Extending Normal Regression
Add categorical & multiple predictors responsibly.
design matrices
  • Intro; Categorical Predictors; Multiple Predictors
  • Quiz: Extending Normal Regression
Outcome: scalable patterns for real datasets.
14) GLMs — Conceptual Introduction
Choose link functions that match outcomes.
logit · log
  • GLMs Intro; Logit Link; Log Link
  • Quiz: Generalized Linear Models
Outcome: interpretable probability modeling.
15) GLMs — Modeling Integer or Count Outcomes
Binomial & Poisson workflows with sports examples (basketball & soccer).
binomial · poisson
  • GLMs; Binomially-Distributed Counts — Basketball Examples 1–3
  • Poisson-Distributed Counts — Soccer Examples 2–4
  • Quiz: GLMs (Part 2)
Outcome: calibrated event probabilities for decision support.
16) More GLMs — Categorical Outcomes
Model multi-class outcomes; interpret coefficients & uncertainty.
multinomial
  • First, Second & Third Categorical Models
Outcome: richer labels without black-box confusion.
17) Hierarchical Models — An Introduction
Share strength across groups with partial pooling; reparameterize when needed.
partial pooling · diagnostics
  • Intro; Parameters Sharing Information
  • Diagnostics & Reparameterization
  • Quiz: Hierarchical Models
Outcome: robust estimates on messy, sparse, or imbalanced data.
18) Workflow Recap
A practical playbook: from problem framing → checks → reporting.
playbook
  • Workflow Recap
Outcome: repeatable, auditable analytics.
19) Case Study — Soccer & Expected Goals (xG)
End-to-end Bayesian modeling for xG with model extensions & decisions.
xG · feature effects · decisions
  • Exploring Pitch Data; Bernoulli goals → expanded model
  • Add angle & body-part predictors; model correlation
  • Add hierarchical info; reparameterize; decision use
Outcome: portfolio-ready artifact aligned with industry practice.
20) Next Steps
Where to go next (BayeZian II): survival, multilevel GLMs, diagnostics at scale.
roadmap
  • Next Steps
Outcome: clear path to advanced modeling.

How We Conduct Our Course

At AthlyticZ, we design for flexibility and impact—mixing clear instruction, continuous assessment, and applied projects so you can learn at your pace and show real results.

Prerecorded Video Lessons

Learn on your schedule with concise, high-quality videos you can pause, replay, and revisit anytime.

Assignments & Exercises

Reinforce concepts with practical exercises that build intuition and confidence in real workflows.

Continuous Assessment

Low-friction quizzes and checkpoints give you immediate feedback and keep you on track.

Hands-On Projects

Tackle real analytics problems—from quick wins to capstones—so you graduate with portfolio-ready work.

Supplementary Resources

Get curated readings, slides, and short references to deepen understanding and speed up execution.

Interactive Features

Use lesson-embedded notes, preloaded IDEs, and live widgets for an engaging, hands-on experience.

“Transformative Learning Experience”

AthlyticZ has completely transformed the learning approach to data science through the use of sports-based problems. The course structure is intuitive, the content is comprehensive, the instructors are the best of the best, and the practical projects have an immediate impact on students.

Bill Geivett, M.Ed.
President @ IMA Team • Author, Do You Want to Work in Baseball?
Sr. VP: Colorado Rockies (2001–2014) • Asst. GM: Los Angeles Dodgers (1998–2000)

Frequently Asked Questions

Becoming a BayeZian I is the on-ramp to Stan-powered Bayesian analysis. Teams come away calibrated, confident, and ready to extend models into production.

Why should employers invest in Part I for their teams?
  • Shared vocabulary: Everyone learns the same Bayesian language (priors, likelihoods, posteriors) to improve collaboration.
  • Smarter decisions: Calibrated probabilities replace overconfident point estimates—critical in risk, product, and strategy teams.
  • Reusable templates: cmdstanr pipelines, posterior checks, and model comparison patterns you can deploy in-house.
  • Versatile applications: From marketing to medicine to sports analytics, models are immediately transferable.
What do I need before starting?

We recommend comfort with basic R. For a quick ramp-up, start with BreeZing through the Tidyverse. No prior Bayesian experience required.

How long is the course and how is it structured?
  • Duration: ~14–18 hours of guided content + optional labs.
  • Format: Self-paced with quizzes and applied case studies (soccer, basketball, Olympic sprinting).
  • Certification: 80%+ quiz performance + full module completion earns a certificate.
What topics does Part I cover?

Foundations every applied Bayesian needs:

  • Uncertainty & variation, probability & distributions.
  • Priors, likelihoods, posteriors.
  • Simulation-first thinking in R.
  • Posterior simulation, grid approximation, MH & HMC.
  • Simple regression in Stan, diagnostics, model comparison.
  • GLMs for binary, count, and categorical outcomes.
  • Intro to hierarchical modeling & partial pooling.
  • Case study: soccer expected goals (xG).
How does this course fit into a professional workflow?
  • Risk modeling: Replace deterministic thresholds with calibrated probabilities.
  • Product analytics: Use binomial/Poisson GLMs to model user behaviors.
  • Sports & operations: Apply hierarchical models for scouting, forecasting, and performance analysis.

Employers benefit from reproducible, explainable decisions that scale.

Do I need commercial software or special data?
  • No commercial tools—just R and Stan (via cmdstanr); see the setup sketch below.
  • All examples use simulated or public sports datasets.
  • We provide Stan files, R scripts, and lab notebooks for every module.
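For reference, a minimal setup sketch following the cmdstanr installation docs (check the cmdstanr site if the repository URL has changed):
# Install the R interface from the Stan R-universe repository
install.packages("cmdstanr", repos = c("https://stan-dev.r-universe.dev", getOption("repos")))
library(cmdstanr)
# Verify the C++ toolchain, then download and build CmdStan itself
check_cmdstan_toolchain()
install_cmdstan()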
Is there a refund policy?
  • Full refund within 3 days of course start.
  • No refunds after 3 days or once 25%+ of material is completed.

© 2025 AthlyticZ

Designed With ❤️ By Jackson Yew

Privacy Policy · Terms of Use · Terms of Sale · Email questions to: [email protected]