MEAFA Professional Development Workshop in Quantitative Analysis Using Stata, 21-25 June 2010
Announcements
20 April 2010: following popular demand, we are very glad to announce a 2-day workshop on Multiple Imputation (MI) by Yulia Marchenko (the developer of the -mi- suite in Stata). The MI workshop has been integrated into MEAFA's annual 5-day professional development workshop in quant analysis. This gives Stata novices the opportunity to gain exposure to Stata and to quant analysis using Stata prior to attending the MI workshop. Nonetheless, the first three days of the workshop are still open to those not interested in MI.
Brief description of Multiple Imputation (MI)
Many real-world datasets are incomplete, in the sense that some observations may not have data for all the variables you wish to analyse. A popular way to handle these situations is to use Donald Rubin's multiple imputation (MI). Like listwise deletion, which is another common way of handling missing data, MI is applicable to a wide variety of analyses. Unlike listwise deletion, which results in a loss of data, MI preserves all available information in the dataset and can therefore lead to more efficient estimation. MI is a simulation-based method consisting of three steps: (1) imputation creates multiple imputed (completed) datasets according to a chosen imputation model, (2) complete-data analysis performs the primary analysis of interest on each of the imputed datasets, and (3) pooling consolidates the results from step 2 into one MI inference using Rubin's combination rules. In Stata 11, you can use the -mi- suite of commands to perform multiple imputation; for an example see The Stata News, 2010, vol 25, No 1 (PDF).
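As a rough illustration of the pooling step (3), the sketch below applies Rubin's combination rules to a single coefficient. It is written in Python rather than Stata, it is not how the -mi- suite is implemented, and the function name `pool_rubin` and the input numbers are made up for the example. It assumes each imputed dataset has already produced one point estimate and its squared standard error, and it leaves out the degrees-of-freedom adjustment used for tests and confidence intervals.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M complete-data results for one parameter (hypothetical helper).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching inputs")

    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between  # total variance of the pooled estimate

    return q_bar, within, between, total, math.sqrt(total)


if __name__ == "__main__":
    # Made-up results from three imputed datasets, for illustration only.
    q_bar, within, between, total, se = pool_rubin(
        [1.0, 1.2, 0.8], [0.04, 0.05, 0.03]
    )
    print(f"pooled estimate = {q_bar:.3f}, pooled std. err. = {se:.3f}")
```

The pooled standard error is larger than a single-dataset standard error would suggest whenever the estimates differ across imputations, which is how MI carries the uncertainty due to missing data into the final inference.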
Workshop description
You may attend any one or any combination of the following days:
Day 1 (Monday, June 21): Introduction to Stata 11 and data management by Demetris Christodoulou, MEAFA General Convener
This day assumes no previous knowledge of Stata. An overall introduction to Stata 11 will be provided and ways to customise/personalise the software will be discussed. Basic data structures, the analysis of different types of variables and various data management techniques will be discussed. Some examples of graphing, tables and the management of output will be presented. This day is of interest to those who are new to Stata or have limited experience with it.
Day 2 (Tuesday, June 22): Two parallel sessions - you can choose only one to attend
- Proficient Programming with Stata 11 by Demetris Christodoulou. This day assumes working knowledge of Stata but no knowledge of programming with Stata or with any other software. By the end of this day you will be able to produce tractable, reproducible and automated routines for data management, statistical analysis, econometric estimation, creation of tables, graphing etc. This day is appropriate for those who wish to become more efficient in working with Stata and have an appreciation for programming.
- Econometric Modelling and Statistical Testing by Andrey Vasnev. This day assumes familiarity with Stata 11 and a basic understanding of quantitative methods. It uses applications to demonstrate the use of statistical analysis, hypothesis testing and basic econometric modelling for validating assumptions and expectations. This day is of interest to those who wish to know how to apply various quantitative methods using Stata. Detailed notes on theory will be provided as background reading.

N.B.: MEAFA reserves the right to cancel a parallel session in case of low demand.
Day 3 (Wednesday, June 23): Two parallel sessions - you can choose only one to attend
- Panel Data Analysis by Vasilis Sarafidis. This day assumes working knowledge of Stata 11 and basic knowledge of econometrics. It explains the rationale of panel data methods and demonstrates the use of static, dynamic and nonlinear models of panel data. This day is of interest to those who wish to learn how to carry out panel data analysis using Stata. Detailed notes on theory will be provided as background reading.
- Time Series Analysis by Richard Gerlach. This day assumes working knowledge of Stata 10 and basic knowledge of econometrics. It details the theory for modelling univariate time series, forecasting and bivariate causality relationships, and offers extensive applications using Stata. This day is of interest to those who wish to learn how to carry out time series analysis using Stata. Detailed notes on theory will be provided as background reading.

N.B.: MEAFA reserves the right to cancel a parallel session in case of low demand.
Days 4-5 (Thursday-Friday, June 24-25): Multiple Imputation Using Stata by Yulia Marchenko, Senior Statistician at StataCorp LP
These two days assume working knowledge of Stata and of standard statistical techniques such as linear/logistic regression. The course provides a brief introduction to multiple imputation (MI) analysis and a detailed description of the three stages of MI (imputation, complete-data analysis, pooling) with applications in Stata 11. Various imputation techniques will be discussed with the main focus on multivariate normal imputation. A number of examples demonstrating how to safely and efficiently manage multiply-imputed data will be provided. Linear and logistic regression analysis of multiply-imputed data as well as several post-estimation features will be presented. Detailed notes will be provided outlining all theory and applications. The presenter, Yulia Marchenko, is the chief developer of Stata's routines for multiple imputation -mi- and co-author of An Introduction to Survival Analysis using Stata.
Enrollment and Fees
You may attend any one or any combination of days. See the description of each day to determine which days are of most interest to you. Fees vary with the days attended. The following combinations are possible (prices exclude GST):
- Attend one of Days 1, 2 or 3: $500
- Attend two of Days 1, 2 or 3: $900
- Attend Days 1, 2 and 3: $1300
- Attend Days 4 & 5 on Multiple Imputation: $1300
- Attend Days 4 & 5 and one of Days 1, 2 or 3: $1700
- Attend Days 4 & 5 and two of Days 1, 2 or 3: $2100
- Attend all five days: $2500
Fees include extensive course material, a detailed guide to Stata 11, data sets, lectures, use of computing facilities, temporary use of Stata 11 licenses and full catering. Numbers are limited and places are reserved on a first-come first-served basis following the completion of the online Reservation Form. Successful attendees will be notified shortly after reservation and invoices will be issued accordingly. Due to the limited places, MEAFA maintains a no refund policy following payment. For more information on enrollment and fees contact meafa@econ.usyd.edu.au.
Discounts
You may qualify for one of the following discounts:
- 35% discount for a restricted number of non-employed full-time PhD students.
- 15% discount for additional attendees from the same organisation or academic unit.
Venue
The workshop will take place at Sydney University at the Faculty of Economics and Business computer labs, ground level of Building H69, cnr Codrington & Rose streets (see interactive map). You do not need to bring your own laptop. PCs and Stata 11 licenses for Microsoft Windows will be provided.
Timetable
All days have the following schedule:
08:40-09:00 - Welcome tea and coffee
09:00-10:30 - Session 1
10:30-10:45 - Morning break
10:45-12:15 - Session 2
12:15-13:15 - Lunch
13:15-14:45 - Session 3
14:45-15:00 - Afternoon break
15:00-16:30 - Session 4
16:30-17:00 - Buffer-time and user-specific questions
Detailed Programme
| Day 1 (Monday, 21 June): Introduction to Stata 11 and Data Management |
|---|
| Session 1: Introduction to Stata 11 environment. The Stata environment; configuration; special features; updates; personalised system; obtain help and perform search; Stata syntax |
| Session 2: Data formats and data handling. Data formats; import, export, load and save datasets; simulated datasets; document the dataset; sorting and ordering; display formatting; append and merge |
| Session 3: Data structures and types of variables. Categorical vs. continuous data; numerical, string and date/time variables; missing data; generate variables; dummy variables; special purpose variables |
| Session 4: Data management and output management. Logs for output; prefixes; tables and graphs; export output; stored and saved results |

| Day 2 (Tuesday, 22 June): Parallel sessions | |
|---|---|
| Proficient Programming with Stata 11 | Econometric Modelling and Statistical Testing |
| Session 1: Basics of Stata programming. Executing commands using do-files; proper structure of do-files; using comments; writing long commands; do vs. run; combination of preserve and restore; the command display; accessing Stata parameters and Stata constants | Session 1: Statistical description and linear regression analysis. Means, variances and higher order moments; medians and modes; confidence intervals; ordinary least squares; predicted values and residuals; correlation and standardized regression coefficients; hypothesis testing; problems with regression |
| Session 2: It's all about Macros! What is a Stata macro; local macros; global macros; numerical macros; string macros; compound punctuation; macro evaluation; formatting macro output; nested macros | Session 2: Multiple regression analysis. Multiple regression models; partial effects; variable selection; t-tests and confidence intervals for individual coefficients; F-tests for sets of coefficients; multicollinearity; interaction effects; intercept and slope dummy variables; logarithmic regression |
| Session 3: Special features of macros and loops. Incrementing/decrementing macros; combining incrementation with evaluation; macro expansion; function keys and global macros; foreach loop; forvalues loop; nested loops; using _rc (return codes) | Session 3: Statistical description and nonlinear regression functions. Graphing the data; a general strategy for modelling nonlinear regression functions; transformations; polynomials and logarithms; interactions (incl. continuous and dummy variables); internal and external validity |
| Session 4: Automating routines and other special features | Session 4: Regression with a binary dependent variable |

| Day 3 (Wednesday, 23 June): Parallel sessions | |
|---|---|
| Panel Data Analysis | Time Series Analysis |
| Session 1: Introduction to panel data analysis. Advantages of panel data analysis; panel data sets; balanced and unbalanced panels; panel data dimensions and frequencies; properties of estimators; unbiasedness; efficiency; consistency; describing panel data; graphing panel data | Session 1: Introduction to forecasting and time series. Qualitative and quantitative forecasts; data structure; describing time series; graphing time series; smoothing; time series components; data transformations; basic trend modelling and forecasting; forecast accuracy |
| Session 2: Static linear models. Specification and estimation; one-way and two-way error components; fixed and random effects; the Least Squares Dummy Variable model; the Within, Between and GLS estimators; the Hausman test; variance decomposition | Session 2: Stationarity and time series models. Time series decomposition; stationarity; auto-correlation; time series regression; non-seasonal exponential forecasting; seasonal Holt-Winters methods |
| Session 3: Dynamic linear models. Nickell biases; Anderson-Hsiao IV estimation; the problem and tests of weak instruments; the Generalised Method of Moments; testing for overidentifying restrictions; cross sectional dependence | Session 3: Non-seasonal Box-Jenkins models. AR, MA and ARMA processes; ARIMA and further trend modelling; detecting trends and/or mean non-stationarity; Box-Jenkins model forecast behaviour |
| Session 4: Nonlinear panel data models. Poisson Regression Model; Probit and Logit: latent variable representations; marginal effects; model diagnostics | Session 4: Seasonal Box-Jenkins and intervention models. ARIMA; Seasonal ARIMA models; pure additive and factored models; models for outliers, level shifts and other interventions |

| Day 4 (Thursday, 24 June): Multiple Imputation Using Stata, Part A |
|---|
| Session 1: Statistical overview of multiple imputation. Introduction to MI; MI as a statistical procedure; stages of MI: imputation, complete-data analysis, pooling |
| Session 2: Multiple imputation using Stata. MI in Stata; overview of the mi suite of commands |
| Session 3: Methods and applications of imputation. Imputation techniques; univariate imputation; multivariate imputation |
| Session 4: Advanced imputation. Imputing complex data: survival, panel; checking sensibility of imputations |

| Day 5 (Friday, 25 June): Multiple Imputation Using Stata, Part B |
|---|
| Session 1: Basic management of imputed data. Storing multiply-imputed data; importing existing multiply-imputed data; verification of multiply-imputed data |
| Session 2: Advanced management of imputed data. Variable management (passive variables); merging, appending, and reshaping multiply-imputed data; exporting multiply-imputed data to a non-Stata application |
| Session 3: Basic estimation of multiply-imputed data. Analysis and pooling stages of MI in one easy step; overview and applications of mi estimate |
| Session 4: Advanced estimation of multiply-imputed data. Estimating linear and nonlinear functions of coefficients; testing linear and nonlinear hypotheses |

N.B. The precise content may be subject to minor changes.
Reservation Form
Numbers are limited and places are reserved on a first-come first-served basis following the completion of the Reservation Form.