5-day MEAFA Professional Development Workshop on Quantitative Analysis Using Stata, 13 - 17 July 2009

Examples of data analysis using Stata 10

Description

The professional development workshop is aimed primarily at social science researchers who wish to develop quantitative analysis skills using Stata 10. The 2009 workshop spans five days and offers a variety of topics.

  • Day 1 (July 13): Introduction to Stata 10 and the Management of Data
    This day assumes no previous knowledge of Stata. It begins by introducing the Stata 10 environment and unlocks some of the software's most subtle aspects, followed by a demonstration of various data structures and an eclectic selection of data management techniques.
  • Day 2 (July 14): Econometric Modelling and Statistical Testing
    This day assumes elementary knowledge of Stata 10 and a basic appreciation of quantitative analysis methods. It uses applications to demonstrate the theory of basic econometric modelling, including time-series forecasting, and the use of statistical testing for validating assumptions and expectations.
  • Day 3 (July 15): Stata Programming and Mata Secrets
    This day assumes good knowledge of Stata 10. It demonstrates key Stata programming skills for building a more structured and methodical approach to quantitative analysis, and also shows how to use Mata (Stata's matrix programming language) to enhance programming efficiency.
  • Day 4 (July 16): Survey Data Analysis
    This day assumes good knowledge of Stata 10 and a basic understanding of quantitative analysis. It introduces the concept of survey data analysis and explains how it differs from other types of quantitative analysis. It uses the European Social Survey to apply the principles and concepts of empirical survey data analysis.
  • Day 5 (July 17): Panel Data Analysis
    This day assumes good knowledge of Stata 10 and a good understanding of econometric modelling. It explains the rationale of panel data methods of analysis and then demonstrates static and dynamic models of panel data. It also covers the special class of mixed panel data structures that combine the dimensions of random and fixed effects.

This is a hands-on workshop, and all days have a strictly applied focus using real data or simulated datasets. Nonetheless, detailed notes will be provided outlining the theoretical foundations of all types of analysis demonstrated. You may attend any one day or any combination of days, and fees vary with the number of days attended (depending on availability).

If you have little or no experience with Stata, you should attend Day 1 before progressing to the rest of the days. Similarly, if you do not feel comfortable with statistical or econometric modelling, you should first attend Day 2. For more information on content, see the detailed programme or contact meafa@econ.usyd.edu.au.

Stata 10

Stata 10 is a complete, integrated statistical package for data management, statistical analysis, graphing and econometric estimation. Stata is fast, accurate and easy to use. For more information visit StataCorp's website.

Computing Facilities and Venue

The workshop will take place at the Faculty of Economics and Business computer labs, Building H69 ground floor, cnr Codrington & Rose streets, the University of Sydney (see interactive map). You do not need to bring your own laptop. PCs and Stata 10 licenses for Microsoft Windows will be provided.

Enrolment

Numbers are limited and places are reserved on a first-come, first-served basis. Successful attendees will be notified shortly after reservation and invoices will be issued accordingly. MEAFA maintains a no-refund policy following payment. For more information on enrolment and fees, contact meafa@econ.usyd.edu.au.

Fees

You may attend one or more days, and fees vary with the number of days attended (prices exclude GST):

  • Any one day: $550
  • Any two days: $1000
  • Any three days: $1450
  • Any four days: $1900
  • All five days: $2350

Fees include extensive course material, data sets, lectures, use of computing facilities, temporary Stata 10 licenses, full catering and the opportunity to network with fellow researchers.

Discounts

You may qualify for one of the following discounts:

  • 50% discount for a restricted number of non-employed full-time PhD students
  • 15% discount for additional attendees from the same private/public organisation
  • 25% discount for additional attendees from the same academic institution

Presenters

Programme

Day 1: Monday 13 July 2009, Introduction to Stata 10 and the Management of Data

08:40

Welcome tea and coffee

09:00-10:30

Introduction to Stata 10 Environment

Stata environment; configuration and special features; updates; personalised system; directory management; obtain help and perform search; online sources; Stata syntax

10:30-10:45

Morning break

10:45-12:15

Data Formats and Data Handling

Types of data formats; import, export, load and save datasets; create pseudorandom datasets; review and document the dataset; ordering of dataset; display format

12:15-13:15

Lunch

13:15-14:45

Data Structures and Types of Variables

Categorical data; continuous data; append and merge other datasets; reorganise datasets; numerical, string and date/time variables; manage missing data; generate variables; dummy variables; generate other special purpose variables

14:45-15:00

Afternoon break

15:00-16:30

Data Management and Output Management

Usable dataset and benefits of filtering; validate claims on data structure; identify duplicate observations; logs for output; copy & paste from Stata to text editors and spreadsheets; stored results

16:30-17:00

Questions and User-Specific Issues


Day 2: Tuesday 14 July 2009, Econometric Modelling and Statistical Testing

08:40

Welcome tea and coffee

09:00-10:30

Statistical Description and Linear Regression Analysis

Means, variances and higher order moments; medians and modes; confidence intervals; simple / multiple regressions with continuous and dummy variables; estimation; hypothesis testing; internal and external validity

10:30-10:45

Morning break

10:45-12:15

Statistical Description and Nonlinear Regression Functions

General strategy for modelling nonlinear regression functions; polynomials / logarithms in regression; interactions between independent variables (including continuous and dummy variables); internal and external validity

12:15-13:15

Lunch

13:15-14:45

Introduction to Time Series Regressions and Forecasting

Introduction to time series data; serial correlation; random walks; autoregressions; moving averages; regressions with additional predictors; autoregressive distributed lag model

14:45-15:00

Afternoon break

15:00-16:30

Testing for Time Series Validity

Stochastic vs. deterministic trend; stationarity; testing for a unit root; testing for breaks; testing for trends; information criteria and lag length selection; ARMA models; forecasting

16:30-17:00

Questions and User-Specific Issues


Day 3: Wednesday 15 July 2009, Stata Programming and Mata Secrets

08:40

Welcome tea and coffee

09:00-10:30

Introduction to do-files

Storing and executing commands in do-files; writing long commands in do-files; difference between do and run; calling other do-files from a do-file; comments in do-files; passing arguments to the do-file; edit saved reviews from the results window

10:30-10:45

Morning break

10:45-12:15

Creating Programs

Creating a program from a do-file; when to use a program; including a program in a do-file; the command clear and its use in programming; creating an ado file from a program

12:15-13:15

Lunch

13:15-14:45

Elements in a Program

Local macros; global macros; scalars; if statements; loops; the combination of preserve and restore; parsing elements; introduction to the syntax command

14:45-15:00

Afternoon break

15:00-16:30

Mata Secrets

When to use Mata and when not to; getting data into Mata; looping; if statements; subscripting matrices; string and numerical matrices; getting a Mata matrix into Stata; Mata functions; Mata optimize; Mata matrix maths; solving simultaneous equations using Mata

16:30-17:00

Questions and User-Specific Issues


Day 4: Thursday 16 July 2009, Survey Data Analysis

08:40

Welcome tea and coffee

09:00-10:30

Introduction to Sampling and Survey Design

Variable types; exploratory data analysis; rating scales: choices and pros/cons; question wording; simple random sampling; sampling with and without replacement; sampling distributions for means and proportions; European Social Survey

10:30-10:45

Morning break

10:45-12:15

Sampling Weights and Estimation

Complex sampling methods; stratification; sampling weights; unequal selection probabilities; sample size determination; optimal sample size given design; adapting statistical analysis

12:15-13:15

Lunch

13:15-14:45

Advanced Sampling and Estimation

Multi-stage sampling; cluster sampling; mixed sampling approaches; Horvitz-Thompson estimation; variance estimation; design effects; analysis of tables and testing; ratios; linear regression

14:45-15:00

Afternoon break

15:00-16:30

Advanced Survey Estimation

Multiple linear regression; logistic regression; ordinal regression; diagnostics; graphing; examples from the European Social Survey

16:30-17:00

Questions and User-Specific Issues


Day 5: Friday 17 July 2009, Panel Data Analysis

08:40

Welcome tea and coffee

09:00-10:30

Introduction to Panel Data Analysis

Advantages of panel data analysis; panel data sets; balanced and unbalanced panels; panel data dimensions and frequencies; properties of estimators; unbiasedness; efficiency; consistency; describing panel data; graphing panel data

10:30-10:45

Morning break

10:45-12:15

Static Linear Models

Specification and estimation; one-way and two-way error components; fixed and random effects; the Least Squares Dummy Variable model; the Within, Between and GLS estimators; the Hausman test; variance decomposition

12:15-13:15

Lunch

13:15-14:45

Dynamic Linear Models

Nickell biases; Anderson-Hsiao IV estimation; the problem and tests of weak instruments; the Generalised Method of Moments; testing for overidentifying restrictions; cross sectional dependence

14:45-15:00

Afternoon break

15:00-16:30

Mixed Models of Panel Data

Panel data models with both fixed and random effects; syntax for the -xtmixed- command; random intercepts; random slopes; estimation of variances and covariances; multiple levels and nested effects; error distributions; postestimation

16:30-17:00

Questions and User-Specific Issues