Becoming a BayeZian I — Foundations of Bayesian Analysis

Build Bayesian intuition with Stan

Start from first principles—uncertainty, probability, and priors—and progress to GLMs and introductory hierarchical models. Learn simulation-first thinking in R and implement models in Stan with cmdstanr under the guidance of Dr. Scott Spencer (Columbia University).

Uncertainty & Variation · Probability & Distributions · Priors → Likelihoods → Posterior · Simulation in R · Stan Basics (cmdstanr) · GLMs & Intro Hierarchical
Enroll Now · View Syllabus

What You’ll Learn

A rigorous, practical introduction to Bayesian data analysis. Concepts map directly to real projects—so your models are calibrated, explainable, and decision-ready.

Uncertainty & Variation

Quantify uncertainty from the start and avoid overconfident point estimates.

Probability & Distributions

Bernoulli, Binomial, Poisson, Beta, Normal—learn when and why to use each.

Priors → Likelihoods → Posterior

Formulate priors, write likelihoods, and interpret posteriors with confidence.

Simulation in R

Prototype distributions and models to build intuition before fitting in Stan.

Stan Fundamentals

Code simple models, fit with cmdstanr, and check HMC diagnostics.

GLMs & Intro Hierarchical

Model counts and probabilities, extend to partial pooling with groups.

Model Evaluation

Use PPCs and LOOCV/ELPD to compare and validate your models.

Case Study: Soccer xG

Build and extend an expected-goals model with real soccer data.

What You’ll Gain

Go beyond plug-and-play ML. Learn how to quantify uncertainty, build interpretable models, and apply Bayesian reasoning with Stan—skills employers seek for high-stakes decision-making.

Foundations of Bayesian Thinking

  • Understand probability, priors, likelihoods, and posteriors.
  • Simulate distributions and run inference in R and Stan.
  • Develop intuition for variation, uncertainty, and shrinkage.
Impact: Stand out by explaining not just “what the model predicts” but “how sure we are.”

Stan Modeling Skills

  • Code regression models directly in cmdstanr.
  • Fit and diagnose models with HMC sampling.
  • Conduct posterior predictive checks and model comparisons (LOO/ELPD).
Impact: Employers value candidates who can evaluate reliability, not just accuracy.

Hierarchical & Generalized Models

  • Model binomial, Poisson, and categorical outcomes.
  • Extend to hierarchical structures with partial pooling.
  • Balance flexibility and interpretability with GLMs.
Impact: Bring scalable solutions to messy, multi-level real-world data.

Applied Case Study

  • Soccer expected goals (xG) with real play-by-play data.
  • Explore correlation between predictors & model extensions.
  • Use model estimates to drive strategic decision-making.
Impact: Show portfolio-ready work that mirrors sports teams, finance, and health analytics pipelines.

Feeling Overwhelmed by Bayesian Modeling?

You’re not alone. We turn equations into working models—so you gain intuition, write clean Stan, and make better decisions.

Common Roadblocks We’ll Solve Together

  • Translating probability theory into practical modeling choices.
  • Choosing sensible priors and interpreting posteriors with confidence.
  • Connecting GLMs to real sports problems (counts, probabilities, categorical outcomes).
  • Writing Stan code and debugging HMC diagnostics without getting stuck.
  • Evaluating models with PPCs and ELPD/LOOCV—so you trust what you ship.

Free Preview • Real Lesson

Example Four Using Soccer Data (Poisson Models)

What you’ll build

Model both Arsenal's and the opponent's goals as Poisson counts—then extend to more realistic game outcomes.

Files / Where to look

soccer_4.stan, lecture notes.r • Run lines 2–3 to set up; jump to line 993 to follow along.

Stan model (excerpt)
// soccer_4.stan
data {
  int<lower=0> N;              // number of matches observed
  array[N] int<lower=0> Y1;    // Arsenal goals per match
  array[N] int<lower=0> Y2;    // Opponent goals per match
}
parameters {
  array[2] real<lower=0> lambda;  // scoring rates: Arsenal, Opponent
}
model {
  lambda ~ exponential(1);        // weakly informative prior on the rates
  Y1 ~ poisson(lambda[1]);
  Y2 ~ poisson(lambda[2]);
}
R workflow (snippet)
library(cmdstanr)
data_list <- list(N = num_observations, Y1 = arsenal_goals, Y2 = opponent_goals)
soccer4_model <- cmdstan_model("soccer_4.stan")          # compile the Stan program
fit_soccer4   <- soccer4_model$sample(data = data_list)  # fit with HMC
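Checking the fit (sketch)
Once sampling finishes, summarize the posterior and review the HMC diagnostics before trusting the estimates. This is a minimal sketch using standard cmdstanr methods; the fit object name follows the snippet above, and everything else is illustrative.
# Posterior summaries for the two scoring rates
fit_soccer4$summary("lambda")
# Divergence, tree-depth, and E-BFMI warnings from the HMC run
fit_soccer4$diagnostic_summary()
# Draws as a data frame for plotting or posterior predictive simulation
draws <- fit_soccer4$draws(format = "df")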
Model extensions
  • Add team-level parameters to model dependence between opponents (sketched after this list).
  • Explore the Skellam distribution for score differences.
  • Promote to GLMs with covariates and partial pooling.
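Extension sketch (illustrative)
To make the first extension concrete, here is one possible shape for a log-link Poisson model with team-level attack and defence effects. The parameter names, priors, and data layout are illustrative assumptions, not the course's files.
// Illustrative only: goals depend on who plays whom via team effects
data {
  int<lower=0> N;                             // matches
  int<lower=1> n_teams;                       // teams
  array[N] int<lower=1, upper=n_teams> home;  // home team index per match
  array[N] int<lower=1, upper=n_teams> away;  // away team index per match
  array[N] int<lower=0> goals_home;
  array[N] int<lower=0> goals_away;
}
parameters {
  real mu;                  // baseline log scoring rate
  vector[n_teams] attack;   // team attacking strength
  vector[n_teams] defence;  // team defensive strength
}
model {
  mu ~ normal(0, 1);
  attack ~ normal(0, 0.5);   // zero-centered priors softly identify the team effects
  defence ~ normal(0, 0.5);
  goals_home ~ poisson_log(mu + attack[home] - defence[away]);
  goals_away ~ poisson_log(mu + attack[away] - defence[home]);
}
The same structure extends naturally to partial pooling by placing hierarchical priors on attack and defence, which is where the course's hierarchical modules pick up.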
Why it matters
  • Move beyond single-team goals to game-level outcomes.
  • Build intuition for priors/likelihoods on real sports data.
  • Learn patterns you’ll reuse in BayeZian II and in production.
“Scott Spencer brings Bayesian modeling alive. He not only explains the math, but shows how to implement models in Stan that are clear, scalable, and ready for research or production. If you want to truly master Bayesian analysis, Scott is the guide you need.”
Michael S. Czahor, PhD
President @ Athlyticz
Your Bayesian Analysis Coach for Becoming a BayeZian

MY PASSION LIES IN LEVERAGING DATA FOR GOOD CAUSES AND EXPLORING THE INTRICATE DYNAMICS OF PROFESSIONAL SPORTS THROUGH STATISTICAL MODELING.

SCOTT SPENCER

Dive into the world of Bayesian analysis and cutting-edge probabilistic programming alongside a Columbia professor and Stan language collaborator who crafts intricate generative models. Scott's expertise spans from decoding human behavior to forecasting sea-level rise impacts on coastal property values, all while dissecting the statistical DNA of professional sports.

Scott's influence extends beyond academia, shaping decisions for tech giants like Amazon, healthcare leaders such as Johnson & Johnson, and entertainment moguls like Vevo. His knack for transparent storytelling through R packages ensures that even the most complex insights are accessible and actionable, making him a sought-after guide for data enthusiasts across industries.

Join the BayeZian revolution and unlock the true potential of Bayesian methods with Scott, where every analysis tells a compelling story and uncertainty is the key to innovation!

Course Overview

Build a rigorous Bayesian toolkit with Stan—from probability foundations to hierarchical GLMs and an applied xG case study—so you can quantify uncertainty, defend results, and deliver stakeholder-ready insights.

1) Introducing Bayesian Analysis for Sports
Why Bayesian? Course goals, deliverables, and how this maps to business value.
scope · use-cases · expectations
  • Introduction
  • Course Topics
Outcome: align vocabulary and expectations; set a clear success path.
2) Exploring Uncertainty & Variation
Develop intuition for variation using sports examples and visuals.
variability · visual reasoning
  • Uncertainty & Variation
  • Example — 100 Meter Olympic Sprint
  • Visualizing the Example Data
  • Quiz: Exploring Uncertainty & Variation
Outcome: communicate uncertainty credibly to stakeholders.
3) Probability, Random Variables & Distributions
Bayesian building blocks with discrete & continuous families.
Bernoulli/Binomial · Poisson · Normal/Beta
  • Probability, Random Variables & Distributions
  • Random Variables; Bernoulli & Binomial; Poisson; Counts→Normal
  • Continuous Uniform; Beta; Normal; Summary Statistics
  • Joint, Marginal, Conditional; Independence; Getting to Bayes Rule
  • Quiz: Probability, Random Variables, and Distributions
Outcome: choose appropriate likelihoods & priors with confidence.
4) Priors, Likelihoods, and Posteriors
Connect priors to evidence; compute posteriors and use conjugacy where possible.
priors · likelihood · posterior
  • Priors, Likelihoods, and Posteriors; Likelihoods & Normalizing Constant
  • Conjugate Priors
  • Quiz: Priors, Likelihoods, and Posteriors
Outcome: defend modeling choices to technical & non-technical audiences.
5) Simulating Distributions in R
Gain intuition via simulation before touching Stan.
simulation · R
  • Simulating Distributions in R Intro
  • Transforming Random Numbers to Distributions; Discrete Distributions
  • Quiz: Simulating Distributions
Outcome: faster learning cycles and sanity checks.
6) Random Variable Code Objects
Represent distributions as reusable code objects.
abstraction · reusability
  • Representing Distributions with a Random Variable Code Object
Outcome: cleaner, testable modeling code.
7) Simulations & Models in Stan
Move into Stan: simulate values & fit a beta-binomial model.
cmdstanr · beta-binomial
  • Stan Documentation; Toy Example (simulate values)
  • Second Example: Beta Binomial
  • Quiz: Simulations and Models in Stan
Outcome: understand Stan program flow & results objects.
8) Posterior Simulation with Grid Approximation
Hand-build a posterior to cement intuition (see the short sketch below).
grid approx
  • Grid Approximation Example
Outcome: demystify Bayes before MCMC.
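A short sketch of the idea, using an illustrative shot-conversion example rather than the course's exact data:
# Grid approximation for a conversion probability theta after 13 goals in 100 shots
theta      <- seq(0, 1, length.out = 1000)          # grid over the parameter
prior      <- dbeta(theta, 2, 2)                    # weakly informative Beta(2, 2) prior
likelihood <- dbinom(13, size = 100, prob = theta)  # binomial likelihood at each grid point
unnorm     <- prior * likelihood                    # unnormalized posterior on the grid
posterior  <- unnorm / sum(unnorm)                  # normalize so the grid sums to one
draws <- sample(theta, 1e4, replace = TRUE, prob = posterior)  # resample to get posterior draws
quantile(draws, c(0.05, 0.5, 0.95))                 # posterior median and 90% interval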
9) Approximate Posteriors with MH & HMC
See how modern HMC powers Stan.
MCMC · HMC
  • Approximate Posteriors Intro; Hamiltonian Monte Carlo
Outcome: choose engines & read diagnostics confidently.
10) A Language for Describing Models
Create a consistent vocabulary for models.
model spec
  • A Language for Describing Models Intro
Outcome: smoother hand-offs & reviews.
11) Simple Normal Regression (Stan)
Code, fit, and diagnose a regression model in Stan.
compilation · diagnostics
  • Intro; Coding; Compiling & Fitting
  • Checking HMC Diagnostics; Reviewing Parameters
  • Quiz: Simple Normal Regression
Outcome: production habits for reliable inference.
12) cmdstanr Objects, Helper Functions & Evaluation
Posterior predictive checks + model comparison (ELPD / LOO); see the comparison sketch below.
PPC · ELPD/LOO
  • cmdstanr Objects & Helpers
  • Posterior Predictive Checks — three approaches
  • Model Comparison: ELPD & LOOCV
  • Quiz: Cmdstanr Model Objects
Outcome: defend models with calibrated, comparable metrics.
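A minimal comparison sketch, assuming two fitted cmdstanr models whose Stan programs compute log_lik in generated quantities (the object names here are placeholders):
# Compare two fitted models by expected log predictive density (PSIS-LOO)
library(loo)
loo_a <- fit_a$loo()        # requires log_lik in the Stan program's generated quantities
loo_b <- fit_b$loo()
loo_compare(loo_a, loo_b)   # ELPD differences with standard errors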
13) Extending Normal Regression
Add categorical & multiple predictors responsibly.
design matrices
  • Intro; Categorical Predictors; Multiple Predictors
  • Quiz: Extending Normal Regression
Outcome: scalable patterns for real datasets.
14) GLMs — Conceptual Introduction
Choose link functions that match outcomes.
logit · log
  • GLMs Intro; Logit Link; Log Link
  • Quiz: Generalized Linear Models
Outcome: interpretable probability modeling.
15) GLMs — Modeling Integer or Count Outcomes
Binomial & Poisson workflows with sports examples (basketball & soccer).
binomial · poisson
  • GLMs; Binomially-Distributed Counts — Basketball Examples 1–3
  • Poisson-Distributed Counts — Soccer Examples 2–4
  • Quiz: GLMs (Part 2)
Outcome: calibrated event probabilities for decision support.
16) More GLMs — Categorical Outcomes
Model multi-class outcomes; interpret coefficients & uncertainty.
multinomial
  • First, Second & Third Categorical Models
Outcome: richer labels without black-box confusion.
17) Hierarchical Models — An Introduction
Share strength across groups with partial pooling; reparameterize when needed.
partial pooling · diagnostics
  • Intro; Parameters Sharing Information
  • Diagnostics & Reparameterization
  • Quiz: Hierarchical Models
Outcome: robust estimates on messy, sparse, or imbalanced data.
18) Workflow Recap
A practical playbook: from problem framing → checks → reporting.
playbook
  • Workflow Recap
Outcome: repeatable, auditable analytics.
19) Case Study — Soccer & Expected Goals (xG)
End-to-end Bayesian modeling for xG with model extensions & decisions.
xG · feature effects · decisions
  • Exploring Pitch Data; Bernoulli goals → expanded model
  • Add angle & body-part predictors; model correlation
  • Add hierarchical info; reparameterize; decision use
Outcome: portfolio-ready artifact aligned with industry practice.
20) Next Steps
Where to go next (BayeZian II): survival, multilevel GLMs, diagnostics at scale.
roadmap
  • Next Steps
Outcome: clear path to advanced modeling.

How We Conduct Our Course

At AthlyticZ, we design for flexibility and impact—mixing clear instruction, continuous assessment, and applied projects so you can learn at your pace and show real results.

Prerecorded Video Lessons

Learn on your schedule with concise, high-quality videos you can pause, replay, and revisit anytime.

Assignments & Exercises

Reinforce concepts with practical exercises that build intuition and confidence in real workflows.

Continuous Assessment

Low-friction quizzes and checkpoints give you immediate feedback and keep you on track.

Hands-On Projects

Tackle real analytics problems—from quick wins to capstones—so you graduate with portfolio-ready work.

Supplementary Resources

Get curated readings, slides, and short references to deepen understanding and speed up execution.

Interactive Features

Use lesson-embedded notes, preloaded IDEs, and live widgets for an engaging, hands-on experience.

“Transformative Learning Experience”

AthlyticZ has completely transformed the learning approach to data science through the use of sports-based problems. The course structure is intuitive, the content is comprehensive, the instructors are the best of the best, and the practical projects have an immediate impact on students.

Bill Geivett, M.Ed.
President @ IMA Team • Author, Do You Want to Work in Baseball?
Sr. VP: Colorado Rockies (2001–2014) • Asst. GM: Los Angeles Dodgers (1998–2000)

Frequently Asked Questions

Becoming a BayeZian I is the on-ramp to Stan-powered Bayesian analysis. Teams come away calibrated, confident, and ready to extend models into production.

Why should employers invest in Part I for their teams?
  • Shared vocabulary: Everyone learns the same Bayesian language (priors, likelihoods, posteriors) to improve collaboration.
  • Smarter decisions: Calibrated probabilities replace overconfident point estimates—critical in risk, product, and strategy teams.
  • Reusable templates: cmdstanr pipelines, posterior checks, and model comparison patterns you can deploy in-house.
  • Versatile applications: From marketing to medicine to sports analytics, models are immediately transferable.
What do I need before starting?

We recommend comfort with basic R. For a quick ramp-up, start with BreeZing through the Tidyverse. No prior Bayesian experience required.

How long is the course and how is it structured?
  • Duration: ~14–18 hours of guided content + optional labs.
  • Format: Self-paced with quizzes and applied case studies (soccer, basketball, Olympic sprinting).
  • Certification: 80%+ quiz performance + full module completion earns a certificate.
What topics does Part I cover?

Foundations every applied Bayesian needs:

  • Uncertainty & variation, probability & distributions.
  • Priors, likelihoods, posteriors.
  • Simulation-first thinking in R.
  • Posterior simulation, grid approximation, MH & HMC.
  • Simple regression in Stan, diagnostics, model comparison.
  • GLMs for binary, count, and categorical outcomes.
  • Intro to hierarchical modeling & partial pooling.
  • Case study: soccer expected goals (xG).
How does this course fit into a professional workflow?
  • Risk modeling: Replace deterministic thresholds with calibrated probabilities.
  • Product analytics: Use binomial/Poisson GLMs to model user behaviors.
  • Sports & operations: Apply hierarchical models for scouting, forecasting, and performance analysis.

Employers benefit from reproducible, explainable decisions that scale.

Do I need commercial software or special data?
  • No commercial tools—just R and Stan (via cmdstanr); see the setup sketch below.
  • All examples use simulated or public sports datasets.
  • We provide Stan files, R scripts, and lab notebooks for every module.
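For reference, a minimal setup sketch following the cmdstanr installation docs (check the cmdstanr site if the repository URL has changed):
# Install the R interface from the Stan R-universe repository
install.packages("cmdstanr", repos = c("https://stan-dev.r-universe.dev", getOption("repos")))
library(cmdstanr)
# Verify the C++ toolchain, then download and build CmdStan itself
check_cmdstan_toolchain()
install_cmdstan()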
Is there a refund policy?
  • Full refund within 3 days of course start.
  • No refunds after 3 days or once 25%+ of material is completed.

© 2025 AthlyticZ

Designed With ❤️ By Jackson Yew

Privacy Policy · Terms of Use · Terms of Sale · Email questions to: [email protected]