Short Courses

Enhance your expertise with hands-on, expert-led short courses at SAE 2026.
These sessions combine cutting-edge methodology with real-world applications, providing participants with tools directly applicable in official statistics, research, and data science.

Entity Resolution

Ted Enamorado — Washington University in St. Louis

Monday, 15 June 2026

4 hours · Basic level · Hands-on

Course summary

This course focuses on the common task of identifying and merging records from diverse data sources that correspond to the same entities. Whether you are working with large-scale databases, healthcare records, or customer datasets, the ability to accurately link records is crucial for data integration, analysis, and decision-making. Our goal is to provide you with a comprehensive introduction to both the theoretical foundations and practical applications of record linkage, equipping you with the skills to address real-world data challenges effectively.

Learning outcomes to be covered

Throughout this course, we will explore various methodologies and algorithms that underpin entity resolution, including deterministic and probabilistic approaches, machine-learning techniques, and strategies for handling complex scenarios. You will gain a solid understanding of the principles behind these methods, as well as the strengths and limitations of different techniques.

A key element of this course is the development of code to facilitate entity resolution tasks. Practical sessions will guide you through the implementation of record linkage algorithms using the fastLink R package. You will learn to write efficient, scalable code that can handle large datasets, perform data cleaning and preprocessing, and accurately link records. By the end of this course, you will not only have a strong theoretical grounding but also hands-on experience in developing and deploying practical solutions, empowering you to drive data integration projects with confidence.
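As a rough illustration of the probabilistic approach mentioned above, the sketch below scores a candidate record pair in the Fellegi-Sunter style: each compared field contributes a log-likelihood ratio depending on whether it agrees, and the summed weight is converted to a match probability. It is written in plain Python for illustration only and is not the fastLink API. The function names, the m- and u-probabilities, and the prior are all hypothetical values chosen for the example, and fields are assumed to be conditionally independent given match status.

```python
import math


def match_weight(agreements, m_probs, u_probs):
    """Sum per-field log2 likelihood ratios for one record pair.

    agreements: list of booleans, True if the field agrees across the pair.
    m_probs: P(field agrees | records are a true match), one per field.
    u_probs: P(field agrees | records are not a match), one per field.
    """
    total = 0.0
    for agree, m, u in zip(agreements, m_probs, u_probs):
        if agree:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def match_probability(agreements, m_probs, u_probs, prior):
    """Posterior probability of a match, assuming conditionally independent fields.

    prior: assumed share of candidate pairs that are true matches.
    """
    weight = match_weight(agreements, m_probs, u_probs)
    posterior_odds = (prior / (1 - prior)) * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical parameters for two fields, e.g. surname and birth year.
    m = [0.9, 0.9]
    u = [0.1, 0.1]
    for pattern in ([True, True], [True, False], [False, False]):
        print(pattern, round(match_probability(pattern, m, u, prior=0.5), 4))
```

In practice the m- and u-probabilities are not fixed by hand but estimated from the data, typically with an EM algorithm, and a deterministic rule corresponds to the special case of requiring exact agreement on every field.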

Description of course materials for online teaching

Lecture notes and sample code will be distributed.

Target audience

Anyone interested in learning more about entity resolution.

Prerequisites

Familiarity with the R statistical computing environment.

Requirements

Participants should bring a laptop with R and the fastLink package installed.

Instructor biography

Ted Enamorado is an Associate Professor of Political Science at Washington University in St. Louis, affiliated with the Center for the Study of Race, Ethnicity & Equity, the Division of Computational & Data Sciences, and the Political Data Science Lab. He holds a Ph.D. in Politics from Princeton University, specializing in Political Economy and Political Methodology. His research focuses on improving probabilistic methods, particularly in record linkage and data integration, with applications in criminal justice, survey research, and political data. Before graduate study, he worked at the Inter-American Development Bank and the World Bank.

Bayesian Small Area Estimation, with an Emphasis on Low- and Middle-Income Countries

Jon Wakefield — Departments of Statistics and Biostatistics, University of Washington, Seattle, USA

Friday, 19 June 2026

4 hours · Advanced · Applied

Course summary

Small area estimation is of crucial importance in low- and middle-income countries (LMICs). A modern Bayesian treatment will be presented and illustrated using a range of examples. Both area-level (Fay–Herriot) and unit-level models will be presented, with the unit-level models covering linear and generalized linear specifications. Fast computation is carried out with the Integrated Nested Laplace Approximation (INLA) method, which is embedded within the SUMMER and surveyPrev R packages. Hyperpriors are specified using penalized complexity priors. Between-area variation will be modeled using independent and spatial random effects. For the latter, the Besag–York–Mollié (BYM) model will be described.
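For orientation, the basic area-level (Fay–Herriot) model referred to above can be written as follows. The notation is an assumption made here for illustration and may differ from the parameterization used in the course or in the software: $\hat{\theta}_i$ is the direct (design-based) estimate for area $i$, $V_i$ its design variance, treated as known, $x_i$ a vector of area-level covariates, and $u_i$ an independent area-level random effect.

```latex
\begin{align*}
  \hat{\theta}_i \mid \theta_i &\sim \mathcal{N}(\theta_i,\, V_i)
    && \text{(sampling model)} \\
  \theta_i &= x_i^{\top}\beta + u_i, \qquad
    u_i \overset{\text{iid}}{\sim} \mathcal{N}(0,\, \sigma_u^2)
    && \text{(linking model)} \\
  \mathrm{E}\!\left[\theta_i \mid \hat{\theta}_i, \beta, \sigma_u^2\right]
    &= \gamma_i\, \hat{\theta}_i + (1 - \gamma_i)\, x_i^{\top}\beta, \qquad
    \gamma_i = \frac{\sigma_u^2}{\sigma_u^2 + V_i}
    && \text{(shrinkage)}
\end{align*}
```

The last line shows the borrowing of strength: areas with a large design variance $V_i$ have a small $\gamma_i$, so their estimates are pulled more strongly toward the regression prediction. A spatial version adds a structured term to $u_i$, so that neighboring areas also inform one another.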

Learning outcomes to be covered

An introduction to Bayesian statistics; limitations of direct estimates; a comprehensive description of area-level and unit-level models, including advantages and disadvantages; borrowing of strength and the bias/variance trade-off; mixed effects models; spatial models; regression modeling; parameter interpretation; Bayesian computation; prior specification.

Proposed delivery structure, including elements of engagement

Methodological lectures and software demonstrations.

Target audience

Anyone with an interest in learning about modern Bayesian methods, with an emphasis on LMICs, though the methods are generally applicable.

Prerequisites

Some knowledge of basic probability and linear and logistic regression models will be useful.

Preparatory material, including software

The website sae4health.stat.uw.edu contains links to the relevant software resources. The course will discuss the surveyPrev and SUMMER R packages, which support spatial and spatio-temporal modeling, respectively, and provide high-quality mapping capabilities. Pre-modeled estimates are available at sae4lmic.stat.uw.edu, while users can analyze DHS data from multiple countries and surveys at rsc.stat.washington.edu/sae4health.

Instructor biography

Jon Wakefield (faculty.washington.edu/jonno) is a professor in the Departments of Statistics and Biostatistics at the University of Washington in Seattle. He is a statistician with interests in small area estimation, demography, spatial epidemiology, global health, and exploring the links between Bayesian and frequentist estimation procedures. He wrote the book Bayesian and Frequentist Regression Models, published in 2013 by Springer, and has published 187 peer-reviewed papers and 29 book chapters, with an h-index of 71 and over 20,000 citations. He has extensive experience in SAE in LMICs, working closely with the UN, WHO, and the Gates Foundation, and has given workshops in multiple countries including Nigeria, Rwanda, South Africa, Malawi, and Ecuador. Dr Wakefield’s work is focused on putting tools into the hands of local researchers. He has also taught multiple short courses on Spatial Epidemiology and SAE as part of the UW summer school program. He has graduated 31 PhD students and runs the Space Time Analysis Bayes (STAB) working group (alanamcgovern.github.io/stablab). Dr Wakefield received the Guy Medal in Bronze in 2000 from the Royal Statistical Society and was elected a Fellow of the American Statistical Association in 2007. Dr Wakefield’s group produced subnational estimates for under-5 mortality and neonatal mortality for the UN Inter-Agency Group for Mortality Estimation (IGME), for around 30 LMICs (childmortality.org).