Becoming a BayeZian II: Advanced Stan Programming

Level up with Advanced Stan

Deepen your Bayesian toolkit with mixture models, ranking models, advanced hierarchical structures, survival & time-series models, splines/GPs, HSGP, and performance tuning. Learn production-ready patterns in R + Stan (cmdstanr) guided by Dr. Scott Spencer (Columbia University).

  • Mixtures & Overdispersion
  • Rating/Ranking (Plackett–Luce)
  • Advanced Hierarchical
  • Survival & Time-to-Event
  • Splines & Gaussian Processes
  • Performance & Parallelism

What You’ll Learn

Practical, advanced Bayesian modeling you can ship: richer likelihoods, stronger diagnostics, faster sampling, and patterns that scale to real production projects.

Mixture & Zero-Inflated Models

Handle overdispersion and excess zeros with Poisson/NB mixtures and hurdle models.

Rating & Ranking

Pairwise, Plackett–Luce, and ordinal regression—code and compare in Stan.

Advanced Hierarchical

Build partial-pooling structures that propagate uncertainty correctly.

Time & Survival

AR processes (irregular gaps) and Weibull survival—simulate, fit, interpret.

Splines & GPs

B-splines, tensor products, and Gaussian Processes for flexible, explainable fits.

Speed & Stability

QR reparameterization, parallelization, and GPU/chain optimizations.

Physics-Constrained Bayesian Modeling — Golf Putting

From sailing to golf: see how geometry and physics sharpen our likelihoods and make posteriors more interpretable. This preview is pulled from Becoming a BayeZian II.

Common Roadblocks We’ll Solve Together

  • Bridging domain physics with Bayesian likelihoods (what actually goes into p(y|θ)?).
  • Working with grouped/binomial data via sufficient statistics without losing information.
  • Writing clean, stable Stan that samples fast and stays numerically sane.
  • Explaining parameters stakeholders understand (e.g., a golfer’s precision, σ).
  • Designing PPCs that actually validate model assumptions.

Free Preview • Physics-Constrained Demo

Golf Putting: Direction, Velocity, and Probability of a Make

What you’ll build

Binomial model for made/attempts with a physics-informed success probability: P(x) = 2·Φ(asin((R − r)/x) / σ) − 1, where σ encodes directional variability.
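
The formula above follows from a short geometric argument. The sketch below assumes that the putt's angular error θ is normally distributed around zero with standard deviation σ, and that the putt drops when the ball's center passes within R − r of the hole's center. The formula's form suggests both assumptions, but this preview does not state them.

```latex
\begin{aligned}
\theta_0(x) &= \arcsin\!\left(\frac{R - r}{x}\right)
  && \text{largest angular error that still holes a putt from distance } x \\
\theta &\sim \mathcal{N}(0, \sigma^2)
  && \text{assumed distribution of the angular error} \\
P(x) &= \Pr\bigl(|\theta| < \theta_0(x)\bigr)
  = \Phi\!\left(\frac{\theta_0(x)}{\sigma}\right) - \Phi\!\left(-\frac{\theta_0(x)}{\sigma}\right)
  = 2\,\Phi\!\left(\frac{\theta_0(x)}{\sigma}\right) - 1
\end{aligned}
```

The last step uses the symmetry of the standard normal CDF, Φ(−z) = 1 − Φ(z).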

Files / Where to look

golf_angle.stan, bayes2_code_notes.r • Load data (x, n, y; radii r, R) and run the Stan model.

Stan model (excerpt)
// golf_angle.stan
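data {
  int<lower=1> J;            // number of observed distances
  vector<lower=0>[J] x;      // distances (ft); each must be at least R - r
  array[J] int<lower=0> n;   // attempts at each distance
  array[J] int<lower=0> y;   // successes (made putts) at each distance
  real<lower=0> r;           // ball radius, in the same units as x
  real<lower=r> R;           // hole radius, in the same units as x
}
transformed data {
  // largest angular error (radians) that still holes the putt
  vector[J] angle_rad = asin( (R - r) ./ x );
}
parameters {
  real<lower=0> sigma_rad;   // sd of the golfer's angular error (radians)
}
model {
  // prior: half-normal, because sigma_rad is constrained to be positive
  sigma_rad ~ std_normal();

  // likelihood: probability that the angular error stays within angle_rad
  vector[J] p = 2 * Phi( angle_rad / sigma_rad ) - 1;
  y ~ binomial(n, p);
}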
R workflow (snippet)
data_list <- list(
  J = length(x),
  x = x,
  n = n_attempts,
  y = y_made,
  r = ball_radius,
  R = hole_radius
)

mod <- cmdstanr::cmdstan_model("golf_angle.stan")
fit <- mod$sample(data = data_list)

posterior::summarise_draws(fit$draws(c("sigma_rad")))
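
To see what a given σ implies, the success curve can be evaluated directly in base R. This is a minimal sketch, not part of the course files: `p_make()` is a hypothetical helper, the radii are assumed regulation sizes (a 4.25 in hole and a 1.68 in ball, converted to feet), and `sigma_rad = 0.03` is an arbitrary illustrative value rather than a fitted estimate.

```r
# Hypothetical helper: make probability from the geometric model,
# P(x) = 2 * Phi(asin((R - r) / x) / sigma) - 1.
p_make <- function(x, sigma_rad, r, R) {
  stopifnot(all(x >= R - r), sigma_rad > 0)  # asin() needs (R - r) / x <= 1
  2 * pnorm(asin((R - r) / x) / sigma_rad) - 1
}

# Assumed regulation sizes, converted to feet so they match x.
ball_radius <- (1.68 / 2) / 12
hole_radius <- (4.25 / 2) / 12

x_grid <- seq(1, 20, by = 1)   # distances (ft)
p_grid <- p_make(x_grid, sigma_rad = 0.03, r = ball_radius, R = hole_radius)

# The probability of a make falls as distance grows.
print(round(data.frame(x = x_grid, p = p_grid), 3))
```

Swapping the fixed `sigma_rad` for posterior draws of the same parameter would turn this curve into a band that can be compared against the observed make rates y / n.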
Why it matters
  • Physics + geometry → likelihoods that reflect the task, not just a curve fit.
  • σ is directly interpretable as golfer precision (radians).
  • Grouped/binomial sufficient statistics keep models fast without losing signal.
Next in the course
  • Base running (decision & physics).
  • Survival analysis for time-to-event models.
  • Performance tuning: QR reparam, parallelism, GPU.

Course Overview — Becoming a BayeZian II

Push beyond foundations into production-ready Bayesian modeling: mixtures and zero-inflation, ranking, advanced hierarchical structure, time-series & survival, splines & Gaussian Processes, physics-constrained models, and performance engineering for Stan.

1) Introduction
Orient to Part II and how it builds on Part I.
review · roadmap
  • Welcome to the Course
  • High-Level Review of Part I
  • Roadmap of this Course
Outcome: shared expectations for advanced topics and deliverables.
2) Workflow
A rigorous, repeatable end-to-end Bayesian workflow.
modeling playbook · comparison
  • Before Fitting
  • Fit & Simulate
  • Evaluate & Use
  • Compare Multiple Models
Outcome: faster iteration, fewer mistakes, clearer reporting.
3) Mixture Models
Handle overdispersion & excess zeros with principled mixtures.
NB · ZIP · ZINB
  • Overdispersion; Baseball scores as Poisson/mixtures
  • Negative Binomial as Poisson-Gamma
  • Zero-Inflation & Hurdle Models
  • 3PA (three-point attempts) modeling with Poisson/NB; ZIP & ZINB
Outcome: better fit ↔ better decisions on rare/overdispersed events.
4) Rating & Ranking Models
Compare competitors and items with pairwise & list models.
Plackett-Luce · ordinal
  • Pairwise & set comparisons; Extended ranking
  • Fit Plackett-Luce in Stan; Expectation of position
  • Ordinal regression: principles, R & Stan, scout-score simulation
Outcome: defensible leaderboards & scouting evaluations.
5) (A Bit More) Advanced Hierarchical Models
Propagate uncertainty with multi-level structure, avoid information loss.
non-centered · partial pooling
  • Common omissions; Multi-level structure & motivation
Outcome: stable estimates on sparse/imbalanced groups.
6) Sufficient Statistics
Compress data without losing information; faster fitting.
efficiency
  • Concepts, sports use, and practice
Outcome: scalable pipelines & simpler models.
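
The "without losing information" claim can be checked in a few lines of base R. This sketch uses simulated data, and its values and variable names are illustrative assumptions, not course material. For binary outcomes that share one success probability p, the number of successes is a sufficient statistic: the Bernoulli log-likelihood of the raw outcomes and the binomial log-likelihood of the grouped count differ only by a constant that does not depend on p, so both lead to the same posterior.

```r
set.seed(1)
n_trials <- 200
y_raw <- rbinom(n_trials, size = 1, prob = 0.35)  # simulated 0/1 outcomes
y_sum <- sum(y_raw)                               # sufficient statistic

p_grid <- seq(0.05, 0.95, by = 0.05)

# Log-likelihood from the raw outcomes (n_trials Bernoulli terms per p).
ll_raw <- sapply(
  p_grid,
  function(p) sum(dbinom(y_raw, size = 1, prob = p, log = TRUE))
)

# Log-likelihood from the grouped data (one binomial term per p).
ll_grouped <- dbinom(y_sum, size = n_trials, prob = p_grid, log = TRUE)

# The gap equals log(choose(n_trials, y_sum)) at every p,
# so it cancels when the posterior is normalized.
gap <- ll_grouped - ll_raw
print(range(gap))
print(lchoose(n_trials, y_sum))
```

The grouped version evaluates one likelihood term where the raw version evaluates 200, which is where the speed-up comes from.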
7) (More About) Correlation
Model dependence structures beyond simple correlations.
copulas
  • Trivariate reduction; Marginal + conditional; Copulas
Outcome: better joint predictions and scenario analysis.
8) QR Decomposition
Re-parameterize to fight collinearity and accelerate HMC.
QR · speed
  • Correlated covariates; Math & implementation in Stan/R
Outcome: faster, more reliable inferences.
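
The algebra behind the reparameterization can be sketched in base R with simulated covariates. All names and values below are illustrative assumptions. The thin QR decomposition X = QR replaces correlated columns with orthogonal ones. A model fit on Q has coefficients theta = R beta, and the original coefficients are recovered as beta = R^{-1} theta. Ordinary least squares is used here only to verify that identity. Stan provides `qr_thin_Q()` and `qr_thin_R()` for the same decomposition.

```r
set.seed(2)
n <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)       # nearly collinear with x1
X <- cbind(x1, x2)
X <- sweep(X, 2, colMeans(X))       # center the columns
y <- drop(X %*% c(1.5, -0.5)) + rnorm(n)

qr_X <- qr(X)
Q <- qr.Q(qr_X)                     # n x 2, orthonormal columns
R_mat <- qr.R(qr_X)                 # 2 x 2, upper triangular

cor_X <- cor(X)[1, 2]               # close to 1
cor_Q <- cor(Q)[1, 2]               # essentially 0

theta <- coef(lm(y ~ 0 + Q))        # coefficients on the orthogonal basis
beta_qr <- backsolve(R_mat, theta)  # beta = R^{-1} theta
beta_direct <- coef(lm(y ~ 0 + X))  # same fit on the original columns

print(c(cor_X = cor_X, cor_Q = cor_Q))
print(rbind(beta_qr = beta_qr, beta_direct = unname(beta_direct)))
```

Sampling theta rather than beta gives HMC a posterior with far less correlation between parameters, while beta stays available as a derived quantity.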
9) Autoregressive Processes
Model time-dependence with regular/irregular intervals and interactions.
AR
  • Equal-interval AR; Irregular-time AR; Stan coding & fitting
  • Multiple AR processes with interactions
Outcome: credible forecasts & effects over time.
10) Survival Analysis
Time-to-event modeling (Weibull & discrete-time) with real sports data.
hazard · Weibull
  • Power-law intuition; Simulate TTE data; Hazard/survivor functions
  • Weibull (w/ & w/o covariates), priors for baseball, fit & inference
  • Discrete hazard/log-hazard; Stan model & PPC
Outcome: retention/injury/tenure analyses stakeholders trust.
11) Differential Equations (ODEs)
Embed dynamics directly in the likelihood.
ODE · parallelization
  • Usain Bolt world record; Joint champions model; Using & refactoring for parallel likelihood
Outcome: physics-aware models for performance trajectories.
12) Difference Equations
Impulse-response style models for training and recovery.
state-space
  • Bannister impulse-response; Cycling power; Stan coding & use
Outcome: actionable training load insights.
13) Splines
Flexible structure with B-splines and tensor products (xG context).
B-spline · tensor
  • Simulate & construct B-splines; Stan regression; Counterfactuals
  • Speed-ups; Tensor products & 2-D splines; Kronecker; Prediction & comparison
Outcome: smooth effects without losing interpretability.
14) Gaussian Processes
Nonparametric modeling with full GP priors.
Cholesky · hyperparameters
  • Likelihood & GP prior; Cholesky use; Hyperpriors; Predictions at new x
  • Stan code for N-D GP; Tests in 1-D and 2-D
Outcome: calibrated uncertainty over complex functions.
15) Hilbert-Space Approximate GPs (HSGP)
Scale GP-like modeling with basis-function approximations.
spectral · HSGP
  • Fourier refresher; Basis functions & frequencies; Spectral densities
  • Likelihood & priors; Full model; N-D implementation in Stan; Visual walk-through (R)
Outcome: GP benefits at practical cost.
16) Physics-Constrained Models
Fuse mechanics with data for SailGP, golf, base running, umpire calls.
mechanistic · posterior checks
  • Load/explore; Encode physics in Stan; Check & review posterior
Outcome: more believable predictions in high-stakes settings.
17) Common Issues
Robustness, missing data, censoring/truncation, and transformations.
robust priors · imputation
  • Outliers; Missing data; Hit-tracking constraints & model
  • Censoring/truncation; Parameter-space transforms
Outcome: resilient models on imperfect data.
18) Computational Performance
Make Stan fly: code optimizations, parallelism, GPU, memory.
threads · vectorize
  • Optimizations; Within-chain parallel; GPU; Memory
Outcome: scale to bigger problems and tighter SLAs.
19) Next Steps
Where to specialize next (causal inference, state-space, advanced GP, etc.).
roadmap
  • Next Steps
Outcome: a plan to keep compounding capability.
Your Bayesian Analysis Coach for Becoming a BayeZian

MY PASSION LIES IN LEVERAGING DATA FOR GOOD CAUSES AND EXPLORING THE INTRICATE DYNAMICS OF PROFESSIONAL SPORTS THROUGH STATISTICAL MODELING.

SCOTT SPENCER

Dive into the world of Bayesian analysis and cutting-edge probabilistic programming alongside a Columbia professor and Stan language collaborator who specializes in crafting intricate generative models. Scott's work spans from decoding human behavior to forecasting sea-level rise impacts on coastal property values, all while dissecting the statistical DNA of professional sports.

Scott's influence extends beyond academia, shaping decisions for tech giants like Amazon, healthcare leaders such as Johnson & Johnson, and entertainment moguls like Vevo. His knack for transparent storytelling through R packages ensures that even the most complex insights are accessible and actionable, making him a sought-after guide for data enthusiasts across industries.

Join the BayeZian revolution and unlock the true potential of Bayesian methods with Scott, where every analysis tells a compelling story and uncertainty is the key to innovation!

How We Conduct Our Course

At AthlyticZ, we design for flexibility and impact—mixing clear instruction, continuous assessment, and applied projects so you can learn at your pace and show real results.

Prerecorded Video Lessons

Learn on your schedule with concise, high-quality videos you can pause, replay, and revisit anytime.

Assignments & Exercises

Reinforce concepts with practical exercises that build intuition and confidence in real workflows.

Continuous Assessment

Low-friction quizzes and checkpoints give you immediate feedback and keep you on track.

Hands-On Projects

Tackle real analytics problems—from quick wins to capstones—so you graduate with portfolio-ready work.

Supplementary Resources

Get curated readings, slides, and short references to deepen understanding and speed up execution.

Interactive Features

Use lesson-embedded notes, preloaded IDEs, and live widgets for an engaging, hands-on experience.

“Transformative Learning Experience”

AthlyticZ has completely transformed the learning approach to data science through the use of sports-based problems. The course structure is intuitive, the content is comprehensive, the instructors are the best of the best, and the practical projects have immediate impact to students.

Bill Geivett, M.Ed.
President @ IMA Team • Author, Do You Want to Work in Baseball?
Sr. VP: Colorado Rockies (2001–2014) • Asst. GM: Los Angeles Dodgers (1998–2000)

Frequently Asked Questions

Built for teams using Stan in production. Practical skills, employer-ready outcomes, and a workflow you can standardize across projects.

Why should employers sponsor this course?
  • Faster time-to-insight: Standardized Bayesian workflow (priors → diagnostics → PPCs → model comparison).
  • Production patterns: cmdstanr pipelines, reproducible environments, documented model cards.
  • Lower tooling cost: Open-source stack (Stan, R/Tidyverse, posterior/loo/bridgesampling).
  • Transferability: Methods apply across risk, bio/pharma, marketing, ops research, and sports analytics.
What are the prerequisites for Part II?

Two pathways keep learners successful:

If you’re experienced in R and basic Stan, you can jump straight into Part II.

How long does it take? Self-paced or cohort?
  • Time commitment: ~12–16 hours of core content + optional labs.
  • Format: Self-paced with checkpoints and quizzes; optional live Q&A windows during promo periods.
  • Certificate: Awarded for a quiz average of 80% or higher and completion of all required modules.
What advanced topics are included in Part II?
  • Mixture models & zero-inflation (Poisson/NB; hurdle models).
  • Rating & ranking (Plackett-Luce, ordinal regression) with Stan.
  • Advanced hierarchies, sufficient statistics, correlation structures & copulas.
  • QR reparameterization, AR processes, survival analysis (Weibull & discrete-time).
  • Splines, Gaussian Processes & HSGP approximations.
  • Physics-constrained models (sailing, golf putting, base running).
  • Performance: within-chain parallelism, GPU, memory/compute trade-offs.
Can we roll this out to a team? Do you support POs and invoices?
  • Team licenses: volume pricing and manager dashboards upon request.
  • Payment: Purchase orders & invoicing supported for approved organizations.
  • Scheduling: Staggered start dates and private office hours available for ≥ 10 seats.
What about data security and proprietary code?
  • All examples ship with public or simulated data.
  • Templates encourage redacted artifacts (model cards, diagnostics) so teams can mirror workflows without exposing IP.
  • Optional bring-your-own-dataset guidance focuses on patterns—not your private data.
What support do learners receive?
  • Discussion threads on each lesson plus instructor responses on common blockers.
  • Environment setup guides for macOS/Windows/Linux and containerized fallbacks.
  • Downloadable notebooks, Stan files, and solution keys for key labs.
What will my team be able to do after Part II?
  • Ship calibrated Bayesian models with explainable parameters and robust PPCs.
  • Use mixtures/zero-inflation, survival, AR, splines/GPs, and physics-aware likelihoods.
  • Harden Stan programs (non-centered params, QR, vectorization) and diagnose/fix sampling pathologies.
  • Document decisions with reproducible scripts and model cards that leadership can trust.
What is the refund policy?
  • Full refund within 3 days of course start.
  • No refunds after 3 days or once ≥ 25% of content is completed (whichever comes first).

© 2025 AthlyticZ
