MEAFA workshop on Quantitative Analysis using Stata, 24-28 June 2013


2 April 2013: The workshop is now open for reservations. Places are limited and are reserved on a first-come first-served basis following the completion of the online Reservation Form.

Presenter of Days 4-5 on Structural Equation Modelling

Kristin MacDonald is Senior Statistician at StataCorp LP. Kristin has played an instrumental role in designing the sem Stata suite for Structural Equation Modeling (SEM). Kristin is the principal author of Stata's [SEM] Structural Equation Modeling Reference Manual. She is also the author of, and primary instructor for, Stata's SEM training course. Kristin works frequently with Stata users implementing models using Stata's SEM.


Presenter of Days 1-3 on Working Efficiently with Stata, the Management of Raw Data, and Data Visualisation

Demetris Christodoulou is General Convenor of the research group MEAFA. Demetris is the architect of the MEAFA Professional Development Workshops on Quantitative Analysis using Stata, which are widely recognised by industry, government and academia for their state-of-the-art content. Demetris has provided extensive consulting services to hundreds of executives and researchers on statistical computation and data analysis.

Brief workshop description

You may attend any one day or any combination of the following days. Days 4-5 on SEM are packaged together.

Day 1 (Monday, 24 June 2013): Working Efficiently with Stata by Demetris Christodoulou, MEAFA General Convenor

This day assumes no previous knowledge of Stata 12. It describes the environment of Stata and its core syntactic features. It demonstrates ways of working efficiently with Stata, including the use of logs and do-files. It presents the key programming principles and tools for constructing code that is automated, reproducible, tractable and verifiable. It demonstrates how to access saved results and how to use macros and loops. The day is of interest to those who are new to Stata or have limited experience with Stata 12. It is also useful to more experienced users who wish to attain a more structural understanding of Stata from first principles. The material has been revamped from previous years.

Day 2 (Tuesday, 25 June 2013): Management of Raw Data by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata and basic programming skills. If you have no or limited experience with Stata 12 then you are strongly advised to attend Day 1 first. The day shows how to import and export different data formats. It demonstrates the management of various types of data including numerical variables, string variables and date/time variables, and the implications of missing values. It explores key data structures including cross-sectional, time-series and panel data in long and wide formats. It covers the management of data attributes, the organisation of data and the importance of metadata. It also demonstrates strategies for working efficiently with very large datasets. Dataset organisation, archiving, combinations, and transformations will also be discussed.

Day 3 (Wednesday, 26 June 2013): Data Visualisation by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata but no knowledge of data visualisation with Stata or any other software. The day provides an in-depth analysis of Stata's graphing logic and how to make sense of its vast graph syntax. Graphing examples will be demonstrated for a variety of data structures, using real data or simulated data. Demonstrations include the contrast of theoretical to empirical probability densities, y-x relationships, parametric and non-parametric fits, diagnostic plots, advanced bar charts and box plots, and more. By the end of this day you should be able to produce informative, robust, flexible and beautiful graphs using reproducible and adaptable routines. If you have no or limited experience with Stata then you are strongly advised to attend Day 1 first. Data management elements from Day 2 will also be used.

Days 4-5 (Thursday-Friday, 27-28 June 2013): Structural Equation Modelling (SEM) by Kristin MacDonald, Senior Statistician, StataCorp

These two days assume working experience with Stata and reasonable knowledge of statistics. Structural equation modeling (SEM) is a statistical methodology for formulating and estimating causal relationships of all sorts. SEM is an alternative way of thinking about, formulating and estimating simple and complex cause-and-effect models, from simple linear regressions and instrumental variable models to measurement models, systems of simultaneous equations, confirmatory factor analysis, correlated uniqueness models, latent growth models, and much more. SEM will be demonstrated using a variety of applications across disciplines. If you have little or no experience with Stata 12 then you are advised to attend at least Day 1. See the StataCorp website for a detailed description of SEM, for an application, and the complete list of SEM capabilities.

Enrollment and Fees

You may attend any one day or any combination of days. The cost for attending Days 1, 2 and 3 is $600 per day. Days 4-5 on SEM are packaged together at $1,200 for both days.

Fees include extensive course material, do-files and data sets, use of computing facilities, temporary use of Stata 12 licenses and full catering. Numbers are limited and places are reserved on a first-come first-served basis following the completion of the online Reservation Form. Successful attendees will be notified shortly after reservation and invoices will be issued accordingly. Due to limited places, MEAFA maintains a no refund policy following payment. For more information on enrollment and fees contact

N.B. Proceeds from the workshop go to funding MEAFA PhD scholarships.


You may qualify for one of the following discounts:

  • 30% discount for a restricted number of non-employed full-time PhD students.
  • 15% discount for additional attendees from the same business organisation, governmental department or academic unit.

Venue and computing facilities

The workshop takes place in Computer Lab 5, ground level, Economics and Business Building H69, cnr Codrington & Rose streets, The University of Sydney Business School (see interactive map). Only Session 2 on Friday 28 June will take place in the Experimental Lab 190, Merewether Building H04.

Desktop PCs with Stata 12 licenses for Microsoft Windows are provided onsite. You can also install a temporary one-month Stata 12 license on your own laptop and work from there; if you plan to do so, arrive early to install the license. You cannot access the web using your laptop. No printing facilities are available.


Monday to Thursday follow the same schedule. Due to a clash, Session 1 on Friday begins earlier, at 08:30, and is followed by a longer morning break.

                          Monday - Thursday    Friday
Welcome tea and coffee    08:30-09:00          08:00-08:30
Session 1                 09:00-10:30          08:30-10:00
Morning break             10:30-10:45          10:00-10:30
Session 2                 10:45-12:15          10:30-12:15
Lunch break               12:15-13:15          12:15-13:15
Session 3                 13:15-14:45          13:15-14:45
Afternoon break           14:45-15:00          14:45-15:00
Session 4                 15:00-16:30          15:00-16:30

The computer labs will be accessible from 8am to 8pm every day. Catering is provided at each break.

Detailed Programme

Day 1 (Monday, 24 June 2013): Working efficiently with Stata

Session 1: The Stata environment
Stata interface; configuration; limits; system constants and parameters; updates; profile and system directories; help files; manual entries; open-source programs; other help resources.
Session 2: Syntactic features
General syntax; parsing; strings and double quotes; wildcards; operators; functions; qualifiers; missing values; Boolean evaluation; prefixes.
Session 3: Do-files and Logs
The Do-file Editor; key do-file commands; master do-files; comments; errors and prevention; troubleshooting; suppression of errors; recording selective output in logs; command logs; common headings; log archive; printing and translation.
Session 4: Key programming tools
Macros; macro evaluation; macro extended functions; scalars; scalar evaluation; loops; nested macros and nested loops; macro incrementation; saved results.

Day 2 (Tuesday, 25 June 2013): Management of Raw Data

Session 1: Raw data fundamentals
The importance of raw data; use, save and describe data; decimal separator; data formats; inspecting data attributes; data elements; explicit subscripting; record data subsets in logs; preserve, destroy and restore.
Session 2: Data types and computational memory
Numerical variables; string variables; date/time variables; missing values; memory requirements and physical limitations; settings and Stata limitations; numerical and string precision; working efficiently with large datasets.
Session 3: Data organisation and metadata
Naming rules for variables; order of variables; sorting of observations; data and variable labels; notes; value labels; labels in graphs; display format; data signature; archiving data and metadata.
Session 4: Data combinations and transformations
Dataset comparisons; appending datasets; one-to-one merging; match-merging; reshape long and wide; stack datasets; transposition; dataset expansion; collapse and contract.

Day 3 (Wednesday, 26 June 2013): Data Visualisation

Session 1: Data visualisation fundamentals
Graphics settings; general graph syntax; graph types; inspecting the data prior to graphing; the histogram; graph areas; title, subtitle and note; axes values; bar look; saving and exporting graphs.
Session 2: Multiple graphs
The twoway suite and overlaying graphs; theoretical parametric densities; non-parametric density estimators; scatter plots; line graphs; graphing by groups; overall vs by-group options; by-group graph organisation; legend presentation; graph dimensions.
Session 3: Graph elements, properties and values
Textboxes; relative sizes; relative colours; line patterns; SMCL and graph text syntax; formatting numbers and text; graphing capabilities; immediate graphs; separate and quantile plots.
Session 4: Graph reproduction and adaptability
Flexible and transferable graph syntax; graph settings; graph schemes; the Graph Editor; basic programming tools with graphs; ensuring exact reproduction; freestanding graph do-files; box-plots; bar graphs.

Day 4 (Thursday, 27 June 2013): Structural Equation Modelling (SEM) Part 1

Session 1: SEM fundamentals
What is SEM?; the -sem- suite; capabilities and resources; sem language; path diagrams and commands; GUI specification vs command specification; causality specification; observed and unobserved effects; correlation structure.
Session 2: Specification
Assumptions and choice of estimation method; joint normality and conditional normality; variable types: observed, latent, endogenous, exogenous, error; constraints on slopes and intercepts; constraints on variances and covariances.
Session 3: Identification
Identification vs non-identification; singularity; normalization constraints (anchoring); SEM solution and overriding the SEM solution; starting values; poor starting values vs. lack of identification.
Session 4: Estimation models
Single-factor measurement models; multiple-factor measurement models; confirmatory factor analysis (CFA) models; linear regression; dependencies between endogenous variables; unobserved inputs, outputs, or both; multiple indicators and multiple causes (MIMIC).

Day 5 (Friday, 28 June 2013): Structural Equation Modelling (SEM) Part 2

Session 1: Advanced estimation models
Simultaneous equations; seemingly unrelated regression (SUR); multivariate regression; higher-order CFA models; correlated uniqueness model; latent growth models; models with reliability.
Session 2: Comparing groups
The generic SEM model; estimation by group; group-specific parameters; constrained parameters across groups; constrained statistics across groups; group-specific constraints and paths.
Session 3: Postestimation tests and predictions
Goodness-of-fit statistics; tests for including omitted paths and relaxing constraints; tests of model simplification; predicted values; saved results.
Session 4: Standard errors and fitting summary statistics data
Robust and clustered standard errors; bootstrap and jackknife; fitting models using summary statistics data (SSD); SSD for multiple groups.

N.B. The precise content per session is subject to reshuffling and fine-tuning.

Reservation Form

Numbers are limited and places are reserved on a first-come first-served basis following the completion of the Reservation Form.