# MEAFA Professional Development Workshop on Bayesian analysis using Stata, 5-9 February 2018

## Bayesian analysis

Bayesian analysis is a statistical paradigm that answers research questions about unknown parameters using probability statements. For example, what is the probability that a person accused of a crime is guilty? What is the probability that treatment A is more cost-effective than treatment B for a specific health care provider? What is the probability that the odds ratio is between 0.3 and 0.5? Such probabilistic statements are natural to Bayesian analysis because of the underlying assumption that all parameters are random quantities. In Bayesian analysis, a parameter is summarized by an entire distribution of values instead of one fixed value as in classical frequentist analysis. Estimating this distribution, called the posterior distribution of the parameter of interest, is at the heart of Bayesian analysis.
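
As a brief illustration of the posterior distribution described above, here is Bayes' rule in standard textbook notation (the symbols are our own choice and are not taken from the workshop materials). For a parameter $\theta$ with prior density $p(\theta)$ and observed data $y$ with likelihood $p(y \mid \theta)$:

$$
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}, \qquad p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta .
$$

If $\theta$ denotes an odds ratio, a probability statement such as the one in the example above is then an integral of the posterior density over the interval of interest:

$$
\Pr(0.3 < \theta < 0.5 \mid y) = \int_{0.3}^{0.5} p(\theta \mid y)\, d\theta .
$$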

This workshop will demonstrate the use of Bayesian analysis in various applications and will introduce Stata's suite of commands for conducting Bayesian analysis. You will learn the difference between Bayesian analysis and frequentist analysis, and the advantages and disadvantages of the Bayesian approach. You will learn how to work with Bayesian statistics: selecting priors, estimating posterior distributions, and making predictions. You will learn about Bayesian computation, including Markov chain Monte Carlo methods. You will learn how to compute effective sample sizes, judge model diagnostics, and perform Bayesian hypothesis testing. You will also learn how to fit Bayesian linear and nonlinear models to different types of outcomes. By the end of the workshop you should be comfortable performing Bayesian analysis in Stata.

The presenter for the two-day workshop on Bayesian Analysis Using Stata (Thursday-Friday, 8-9 Feb 2018) is Yulia Marchenko, Executive Director of Statistics, StataCorp LLC. Yulia oversaw and contributed to the development of the Bayesian suite of commands in Stata, and is also the past Director of Biostatistics at StataCorp. Her research interests include multilevel modeling, multiple imputation, survival analysis, skewed distributions, and other areas of statistics and biostatistics. She co-authored the Stata Press book An Introduction to Survival Analysis Using Stata, and was awarded a PhD in Statistics from Texas A\&M University.

## Working efficiently with Stata, Programming and Data management

The first three days of the workshop provide a thorough introduction to Stata, including Working Efficiently with Stata (Monday 5 Feb), Stata Programming (Tuesday 6 Feb), and Data Management (Wednesday 7 Feb). These days are delivered by Demetris Christodoulou, MEAFA General Convenor and architect of the MEAFA Professional Development Workshops. Demetris has extensive consulting experience on Stata and data analysis.

## Content description

You may attend any one day or any combination of the following days.

• Day 1 (Monday, 5 Feb 2018): Working Efficiently with Stata, by Demetris Christodoulou, MEAFA General Convenor. This day assumes no previous knowledge of Stata 15. It describes the environment of Stata, its limitations and strengths, and core syntactic features. It demonstrates ways of working efficiently with Stata, including the use of logs and do-files. It discusses key principles and presents tools for developing work that is reproducible and verifiable. The day is of interest to those who are new to Stata 15 or have limited experience with earlier versions of Stata. It is also useful to more experienced users who wish to attain a more structural understanding of Stata from first principles.
• Day 2 (Tuesday, 6 Feb 2018): Introduction to Programming, by Demetris Christodoulou, MEAFA General Convenor. This day assumes working knowledge of Stata 15 but no knowledge of programming with Stata or any other software. By the end of this day you will be able to produce efficient, tractable and automated routines for data management, statistical analysis and estimation, and the creation of tables and graphs. The day covers key programming tools (saved results, stored results, macros, scalars, loops), and the fundamentals of building your own commands in Stata (programs or ado-files). This day is appropriate for those who wish to attain a deeper knowledge of Stata and achieve the aforementioned attributes in their work. This day assumes knowledge of the material presented in Day 1.
• Day 3 (Wednesday, 7 Feb 2018): Management of Raw Data, by Demetris Christodoulou, MEAFA General Convenor. This day assumes working knowledge of Stata and basic programming skills but not of data management. The day demonstrates ways to import and export different data formats. It demonstrates the management of numerical variables, string variables and date/time variables, and the implications of missing values. It explores key data structures including cross-sectional, time-series and panel data in long and wide formats. It covers the management of data attributes, the organisation of data and the importance of metadata. It also demonstrates strategies for working efficiently with very large datasets. Dataset organisation, archiving, combinations, and transformations will also be discussed. If you have no or limited experience with Stata 15 then you are strongly advised to attend Day 1 first. Some programming tools will also be applied from Day 2 (stored results, macros, scalars and loops).
• Days 4-5 (Thursday-Friday, 8-9 Feb 2018): Bayesian Analysis, by Yulia Marchenko, Executive Director of Statistics, StataCorp LLC. Days 4-5 assume working knowledge of Stata and basic knowledge of statistics and regression analysis. During these two days, you will learn the difference between Bayesian analysis and frequentist analysis. You will learn about different Bayesian concepts and how to perform Bayesian inference. You will become familiar with several Markov chain Monte Carlo approaches and will learn how to check for their convergence. You will also learn how to fit linear models, nonlinear models and other advanced models such as mixed/multilevel models using the Bayesian approach. If you have no experience with Stata then you are required to attend at least Day 1.

## Enrolment and Fees

You may attend any day or any combination from Days 1 to 3, at the cost of \$600 per day. Days 4-5 are bundled together at \$1300. Prices include GST.

Fees include extensive course material, code, data sets, use of computing facilities, and full catering throughout the days. To express your interest in attending you must complete the online form:

Numbers are limited and places are reserved on a first-come, first-served basis upon submission of the online EOI form. Successful applicants will be notified shortly afterwards and invoices will be issued accordingly. Due to limited places, MEAFA maintains a no-refund policy. For more information on enrolment and fees, contact business.meafa@sydney.edu.au.

Net proceeds from the workshop go to funding MEAFA PhD scholarships.

## Discounts

You may qualify for one of the following discounts:

• 25% discount for a limited number of non-employed full-time research students.
• 10% discount for additional attendees from the same business organisation, governmental department or academic unit.

## Venue and computing facilities

The workshop takes place at New Law School Learning Studio 030. The New Law School is on the Camperdown Campus, bordered by Fisher Library, the Carslaw Building and Victoria Park. For directions, go to Campus Maps and search for New Law School.

Laptop PCs are provided onsite. You can also work on your own laptop, but you cannot access the web using the University of Sydney server. If you plan to work on your own laptop then make sure to install Stata 15 beforehand. If you do not have a Stata 15 license then we can provide a temporary license as part of your attendance at the workshop. No printing facilities are available.

## Accommodation

MEAFA does not engage in the administration of temporary accommodation. It is up to you to find suitable living arrangements.

## Timetable

All days have the same time schedule:

• 08:40-09:00 - Welcome tea and coffee
• 09:00-10:30 - Session 1
• 10:30-10:45 - Morning break
• 10:45-12:15 - Session 2
• 12:15-13:15 - Lunch
• 13:15-14:45 - Session 3
• 14:45-15:00 - Afternoon break
• 15:00-16:30 - Session 4
• 16:30-17:00 - Buffer-time and user-specific questions

The computer labs will be accessible from 8am to 8pm every day.

## Detailed Programme

Day 1 (Monday, 5 Feb 2018): Working efficiently with Stata

Session 1: The Stata environment
Stata interface; Stata limits; system constants and parameters; updates; profile; finding help and self-learning strategies; open-source programs.
Session 2: Syntactic features
General syntax; parsing; strings and double quotes; wildcards; operators; functions; qualifiers; missing values; Boolean evaluation; prefixes.
Session 3: Do-files and Logs
Do-files; comments; errors and prevention; troubleshooting; suppression of errors; recording selective output in logs; log archive; printing and translation.
Session 4: Data fundamentals
How Stata understands data; memory requirements and dataset size; data types; physical limitations; numerical precision; string precision; long strings.

Day 2 (Tuesday, 6 Feb 2018): Introduction to Programming

Session 1: Macros and scalars
Local and global macros; macro evaluation; macro expansion; prevent macro expansion; scalars; scalar evaluation; applications.
Session 2: Saved results
r-class values; e-class values; evaluating saved results; estimation and postestimation; storing estimates; results in tables; applications.
Session 3: Loops
Types of loops; initialising values; macro incrementation; nested macros; nested loops; rereferencing macros within loops; debugging loops; applications.
Session 4: Programs
Programming new commands; structure of an ado-file; storing and accessing programs; programs as 'wrappers'; softcoding.

Day 3 (Wednesday, 7 Feb 2018): Management of Raw Data

Session 1: Raw data fundamentals
The importance of raw data; decimal separator; data formats; inspecting data attributes; data elements; explicit subscripting.
Session 2: Data types and computational memory
Numerical variables; string variables; date/time variables; missing values; storage precision; working efficiently with large datasets.
Session 3: Data organisation and metadata
Naming rules for variables; dataset sorting; variable labels; notes; value labels; display format; data signature; archiving data.
Session 4: Data combinations and transformations
Dataset comparisons; appending datasets; one-to-one merging; match-merging; reshape long and wide; collapse to summaries; contract to frequencies.

Day 4 (Thursday, 8 Feb 2018): Bayesian analysis (Part A)

Session 1: Introduction to Bayesian analysis
What is Bayesian analysis?; Why Bayesian analysis?; advantages and disadvantages of Bayesian analysis; motivating example.
Session 2: Bayesian statistics
Prior and posterior distributions; point and interval estimation; model comparison; prior selection.
Session 3: Markov chain Monte Carlo (MCMC)
What is MCMC?; why MCMC?; adaptive Metropolis-Hastings and Gibbs sampling; burn-in period and MCMC sample size; convergence of MCMC.
Session 4: Bayesian analysis in Stata
Stata's Bayesian suite of commands; fitting Bayesian models using the bayesmh command; diagnostics; credible intervals.

Day 5 (Friday, 9 Feb 2018): Bayesian analysis (Part B)

Session 1: Bayesian regression
The bayes prefix; linear regression; autoregressive models; logistic regression; other regression models; postestimation and prediction.
Session 2: Panel-data and multilevel models
'Random' intercepts and coefficients; two-level and higher-level models; nested and crossed effects; posterior distributions of subject-specific effects.
Session 3: Other applications of Bayesian models
Random-effects meta-analysis; bioequivalence in crossover trials; change-point analysis; item response theory.
Session 4: Programming your own Bayesian models
Program evaluators; likelihood and posterior evaluators; example: a hurdle model.

N.B. The precise content per session is subject to reshuffling and fine-tuning.

## Expression of Interest

Numbers are limited and places are reserved on a first-come, first-served basis following the completion of the online form: