MEAFA workshop on Quantitative Analysis using Stata, 10-14 February 2014


18 October 2013: The workshop is now open for reservations. Places are limited and are reserved on a first-come first-served basis following the completion of the Expression of Interest form.

Presenter of Days 4-5 on Multilevel/Mixed models

Yulia Marchenko is Director of Biostatistics at StataCorp LP. Yulia leads the development of the xtmixed Stata suite for Multilevel/Mixed models. She is also the author of, and primary instructor for, Stata's training course on Multilevel/Mixed models.

Presenter of Days 1-3 on Working Efficiently with Stata 13, Programming and Monte Carlo Simulation

Demetris Christodoulou is General Convenor of the research group MEAFA. Demetris is the architect of the MEAFA Professional Development Workshops on Quantitative Analysis using Stata, which are widely recognised by industry, government and academia for their state-of-the-art content. Demetris has provided extensive consulting services to hundreds of executives and researchers on statistical computation and data analysis.

Brief workshop description

You may attend any one day or any combination of the following days. Days 4-5 on Multilevel/Mixed models are packaged together.

Day 1 (Monday, 10 Feb 2014): Working Efficiently with Stata by Demetris Christodoulou, MEAFA General Convenor

This day assumes no previous knowledge of Stata 13. It describes the Stata environment and its syntactic features. It demonstrates ways of working efficiently with Stata, including the use of logs and do-files. It discusses key principles and presents tools for developing work that is reproducible and verifiable. It also explains how Stata understands data, the associated numerical precision, and the physical limitations of working with large datasets. The day is of interest to those who are new to Stata or have limited experience with Stata 13. It is also useful to more experienced users who wish to attain a more structural understanding of Stata from first principles.

Day 2 (Tuesday, 11 Feb 2014): Programming by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata 13 and of all material presented in Day 1, but no knowledge of programming with Stata or any other software. By the end of the day you will be able to produce efficient, tractable and automated routines for data management, statistical analysis, econometric estimation, and the creation of tables and graphs. The day covers key programming tools (macros, scalars, loops, saved results) and the fundamentals of building your own commands in Stata. It is appropriate for those who wish to attain a deeper knowledge of Stata and to make their own work more efficient, tractable and automated.

Day 3 (Wednesday, 12 Feb 2014): Monte Carlo Simulation by Demetris Christodoulou, MEAFA General Convenor

This day assumes good knowledge of Stata 13 and reasonable knowledge of statistics. Monte Carlo (MC) simulation is the process of repeated random sampling to imitate real situations under reasonable probabilistic assumptions. It is appropriate for evaluating complex deterministic formulations that are characterised by significant uncertainty. The focus is on the application of MC simulation, and its principles will be demonstrated through a wide variety of applications. This day assumes knowledge of all material presented in Day 1, and of some programming tools from Day 2.
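
The idea of evaluating a deterministic formula under uncertain inputs can be sketched in a few lines. The workshop itself uses Stata; the sketch below uses Python purely for illustration and is not taken from the course material. The profit formula, the input distributions, their parameters and the function names are all hypothetical assumptions.

```python
import random
import statistics


def simulate_profit(rng):
    """One random draw of a deterministic profit formula (all inputs hypothetical)."""
    units = rng.triangular(800, 1200, 1000)  # low, high, mode
    price = rng.uniform(9.0, 11.0)           # uniform between 9 and 11
    unit_cost = rng.gauss(6.0, 0.5)          # Normal with mean 6, sd 0.5
    return units * (price - unit_cost)


def monte_carlo(n_reps, seed=12345):
    """Repeat the draw n_reps times; return the mean and its MC standard error."""
    rng = random.Random(seed)                # fixed seed makes the run reproducible
    draws = [simulate_profit(rng) for _ in range(n_reps)]
    mean = statistics.fmean(draws)
    std_error = statistics.stdev(draws) / n_reps ** 0.5
    return mean, std_error


mean, std_error = monte_carlo(10_000)
print(f"estimated mean profit: {mean:.1f} (MC standard error {std_error:.1f})")
```

Because the standard error shrinks with the square root of the number of replications, quadrupling the replications roughly halves the uncertainty of the estimate.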

Days 4-5 (Thursday-Friday, 13-14 Feb 2014): Multilevel/Mixed models with Stata 13 by Yulia Marchenko, Director of Biostatistics, StataCorp

These two days assume working experience with Stata and reasonable knowledge of statistics. Mixed models contain both fixed effects, analogous to the coefficients in standard regression models, and random effects, which are not directly estimated but are instead summarized through the unique elements of their variance-covariance matrix. Mixed models may contain more than one level of nested random effects; hence they are also referred to as multilevel or hierarchical models, particularly in the social sciences. Stata's approach to linear mixed models is to assign random effects to independent panels, where a hierarchy of nested panels can be defined for handling nested random effects. If you have little or no experience with Stata 13, you are advised to attend at least Day 1. The StataCorp website gives detailed coverage of Stata's Multilevel/Mixed capabilities.
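
For orientation, the description above corresponds to the linear mixed model as it is commonly written in textbooks. The notation below is an assumption for illustration and is not taken from the workshop material: y is the response vector, X and Z are the design matrices for the fixed and random effects, and the simple error structure shown is only one of several possibilities.

```latex
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon},
  \qquad
  \mathbf{u} \sim N(\mathbf{0}, \mathbf{G}),
  \qquad
  \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^{2}_{\varepsilon}\mathbf{I})
\]
```

Here the fixed effects in beta are estimated directly, while the random effects u, assumed independent of the errors, are summarized through the unique elements of their variance-covariance matrix G.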

Enrollment and Fees

You may attend any one day or any combination of days.

The cost for attending Days 1, 2 and 3 is $600 per day including GST.

Days 4-5 on Multilevel/Mixed models are packaged together at $1,400 for both days including GST.

Fees include extensive course material, do-files and data sets, use of computing facilities, temporary use of Stata 13 licenses and full catering. Numbers are limited and places are reserved on a first-come first-served basis following the completion of the online Expression of Interest form. Successful attendees will be notified shortly after their expression of interest. Due to limited places, MEAFA maintains a no refund policy following payment. For more information on enrollment and fees contact

N.B. Proceeds from the workshop go to funding MEAFA PhD scholarships.


You may qualify for one of the following discounts:

  • 30% discount for a restricted number of non-employed full-time PhD students.
  • 15% discount for additional attendees from the same business organisation, governmental department or academic unit.


Applicants whose Expression of Interest is successful will receive an email with a link to pay via the online system using a credit card. Alternative payment methods are also available, including bank transfers and cheques. All enquiries related to payment should be directed to Catherine Sumajit, MEAFA Treasurer, Business School Financial Services.

Venue and computing facilities

The workshop takes place in Computer Lab 5, ground level, Economics and Business Building H69, cnr Codrington & Rose streets, The University of Sydney Business School (see interactive map).

Desktop PCs with Stata 13 licenses for Microsoft Windows are provided onsite. You can also install a temporary one-month Stata 13 license on your own laptop and work from there; if you plan to do so, arrive early to install the software. Note that you cannot access the web via the university network using your laptop, and that no printing facilities are available.


All days have the same time schedule:

  • 08:40-09:00 - Welcome tea and coffee
  • 09:00-10:30 - Session 1
  • 10:30-10:45 - Morning break
  • 10:45-12:15 - Session 2
  • 12:15-13:15 - Lunch
  • 13:15-14:45 - Session 3
  • 14:45-15:00 - Afternoon break
  • 15:00-16:30 - Session 4
  • 16:30-17:00 - Buffer time and user-specific questions

The computer labs will be accessible from 8am to 8pm every day. Catering is provided at each break.

Detailed Programme

Day 1 (Monday, 10 Feb 2014): Working efficiently with Stata

Session 1: The Stata environment
Stata interface; configuration; limits; system constants and parameters; updates; profile and system directories; help files; manual entries; open-source programs; other help resources.
Session 2: Syntactic features
General syntax; parsing; strings and double quotes; wildcards; operators; functions; qualifiers; missing values; Boolean evaluation; prefixes.
Session 3: Do-files and Log-files
The Do-file Editor; key do-file commands; master do-files; comments; errors and prevention; troubleshooting; suppression of errors; recording output in logs; printing and translation.
Session 4: Data fundamentals
How Stata understands data; memory requirements and dataset size; data types; physical limitations; numerical precision; string precision; long strings.

Day 2 (Tuesday, 11 Feb 2014): Programming

Session 1: Macros and scalars
Local and global macros; macro evaluation; macro expansion; prevent macro expansion; scalars; scalar evaluation; applications.
Session 2: Saved results
r-class values; e-class values; evaluating saved results; estimation and postestimation; storing estimates; results in tables; applications.
Session 3: Loops
Types of loops; initialising values; macro incrementation; nested macros; nested loops; rereferencing macros within loops; debugging loops; applications.
Session 4: Programs
Programming new commands; structure of an ado-file; storing and accessing programs; programs as 'wrappers'; softcoding; the command syntax; applications.

Day 3 (Wednesday, 12 Feb 2014): Monte Carlo (MC) simulation

Session 1: MC Simulation fundamentals
What is MC simulation; Stata set-up parameters; initialisation seed; the Uniform distribution; first principle distributions (Triangular, Bernoulli, Binomial, Normal); stable distributions; the Law of Large Numbers; the Central Limit Theorem.
Session 2: Simulation tools
The command simulate; postfile vs. simulate; positional arguments; probabilistic statements; applications.
Session 3: Evaluation
Accuracy and precision; measures of precision; boosting precision; loss of confidence; choice of sample size; simulation efficiency; variance reduction techniques (antithetic variables, control variates, common random numbers).
Session 4: Simulation for statistics
Regression misspecification (endogeneity, omitted variables, non-Normal errors, outliers); size and power of tests; time series simulation; panel data simulation.

Day 4 (Thursday, 13 Feb 2014): Multilevel/Mixed models, Part 1

Session 1: Mixed linear models
Fixed effects; random effects; random intercepts and random slopes; xtmixed versus xtreg; using Stata's R. factor notation for mixed models.
Session 2: Multilevel models
Two-, three-, and higher-level models; nested (hierarchical) models; crossed-effects models; balanced and unbalanced designs.
Session 3: Estimators
Within estimator versus the generalized least squares (GLS) estimator; maximum likelihood and restricted maximum likelihood; Gauss-Hermite quadrature (adaptive and non-adaptive); Laplacian approximation; EM method starting values.
Session 4: Covariance structures
Covariance structures for random effects; growth curves; identity structure; independent structure; exchangeable structure; unstructured and compound (combination).

Day 5 (Friday, 14 Feb 2014): Multilevel/Mixed models, Part 2

Session 1: Constraints and error structures
Linear constraints on fixed parameters; linear constraints on variance components; independent error structure; exchangeable structure; autoregressive structure; moving average structure; banded structure; Toeplitz structure; unstructured.
Session 2: Postestimation and validation
Linear and nonlinear Wald tests; the Hausman test; heteroskedastic residual errors; likelihood-ratio (LR) tests; linear and nonlinear predictions; composition of nested groups; information criteria.
Session 3: Marginal analysis
Estimated marginal means; marginal and partial effects; least-squares means; predictive margins; adjusted predictions, means, and effects; contrasts of margins; pairwise comparisons of margins; profile plots; graphs of margins and marginal effects.
Session 4: Discrete choice models
Binary outcomes (logistic, probit); count outcomes (Poisson, negative binomial); categorical outcomes (multinomial logistic); ordered outcomes; generalized linear models.

N.B. The precise content per session is subject to reshuffling and fine-tuning.

Expression of Interest form

Numbers are limited and places are reserved on a first-come first-served basis following the completion of the Expression of Interest form.