MEAFA workshop on Quantitative Analysis using Stata, 23-27 June 2014

Announcements

7 May 2014: The workshop is now open for reservations. Places are limited and are reserved on a first-come first-served basis following the completion of the online Expression of Interest Form.

Presenter of Day 5 on Time series analysis

Richard Gerlach is Professor of Statistics and Chair of Business Analytics at the University of Sydney Business School. Richard is a past President of the NSW Branch of the Statistical Society of Australia Inc and is the current co-convenor of the MEAFA and Risk Analytics research groups. Richard is a world-renowned expert on time-series analysis, forecasting and risk analysis, with extensive consulting experience.

Presenter of Days 1-4 on Working Efficiently with Stata, Programming, Management of Raw Data and Advanced Data Visualisation

Demetris Christodoulou is General Convenor of the research group MEAFA. Demetris is the architect of the MEAFA Professional Development Workshops on Quantitative Analysis using Stata, which are widely recognised by industry, government and academia for their state-of-the-art content. Demetris has provided extensive consulting services to hundreds of executives and researchers on statistical computation and data analysis.

Examples of Advanced Data Visualisation that may be discussed are presented below.

Cumulative FTSE Value function

Empirical Rule Colors USA vs China

Questions VolumePrice

Brief workshop description

You may attend any one day or any combination of the following days.

Day 1 (Monday, 23 June 2014): Working Efficiently with Stata by Demetris Christodoulou, MEAFA General Convenor

This day assumes no previous knowledge of Stata 13. It describes the environment of Stata and its core syntactic features. It demonstrates ways of working efficiently with Stata, including the use of logs and do-files. It discusses key principles and presents tools for developing work that is reproducible and verifiable. It explains how Stata understands data and related precision and the limitations of working with large data. The day is of interest to those who are new to Stata or have limited experience with Stata 13. It is also useful to more experienced users who wish to attain a more structural understanding of Stata from first principles.

Day 2 (Tuesday, 24 June 2014): Programming by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata 13 but no knowledge of programming with Stata or any other software. By the end of this day you will be able to produce efficient, tractable and automated routines for data management, statistical analysis and estimation, and the creation of tables and graphs. The day covers key programming tools (saved results, stored results, macros, scalars, loops), and the fundamentals of building your own commands in Stata (programs or ado-files). This day is appropriate for those who wish to attain a deeper knowledge of Stata and to make their own work efficient, tractable and automated. This day assumes knowledge of the material presented in Day 1.

Day 3 (Wednesday, 25 June 2014): Management of Raw Data by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata and basic programming skills but not of data management. The day demonstrates ways to import and export different data formats. It demonstrates the management of numerical variables, string variables and date/time variables, and the implications of missing values. It explores key data structures including cross-sectional, time-series and panel data in long and wide formats. It covers the management of data attributes, the organisation of data and the importance of metadata. It also demonstrates strategies for working efficiently with very large datasets. Dataset organisation, archiving, combinations, and transformations will also be discussed. If you have no or limited experience with Stata 13 then you are strongly advised to attend Day 1 first. Some programming tools will also be applied from Day 2 (saved results, stored results, macros, scalars, basic loops).

Day 4 (Thursday, 26 June 2014): Data Visualisation by Demetris Christodoulou, MEAFA General Convenor

This day assumes good knowledge of Stata but no knowledge of data visualisation with Stata or any other software. The day provides a conceptual workflow for, and the key principles of, data visualisation. The day also presents an in-depth breakdown of Stata's graphing logic and how to make sense of its vast graph syntax. The material is inspired by Jacques Bertin's classification of visual objects and encoding tools (retinal variables) and how these map onto Stata's architecture of graphing capabilities. Graphing examples are demonstrated for a variety of data structures, using real data or simulated data. Demonstrations include the contrast of theoretical to empirical probability densities, y-x relationships, bar charts, box plots and more. By the end of this day you should be able to produce informative, robust, flexible and beautiful graphs using reproducible and adaptable routines. Examples of graphs that may be discussed are provided just above. This day assumes knowledge of the material presented in Days 1-3.

Day 5 (Friday, 27 June 2014): Time-series analysis and forecasting by Richard Gerlach, Prof of Statistics and Chair of Business Analytics, The University of Sydney

This day assumes working knowledge of Stata and basic knowledge of statistics and econometrics, but no knowledge of time-series analysis. This is an application-driven day that details the advantages and limitations of univariate time-series analysis and how it leads to forecasting. This day is of interest to those who wish to learn how to model, analyse and test univariate time-series structures using Stata. Detailed notes on theory will be provided as background reading. This day assumes knowledge of the material presented in Day 1.

Enrollment and Fees

You may attend any one day or any combination of days. The cost is $600 per day, including GST.

Fees include extensive course material, do-files and data sets, use of computing facilities, temporary use of Stata 13 licenses and full catering. Numbers are limited and places are reserved on a first-come first-served basis following the completion of the online Expression of Interest Form. Successful attendees will be notified shortly after reservation and invoices will be issued accordingly. Due to limited places, MEAFA maintains a no refund policy. For more information on enrollment and fees contact business.meafa@sydney.edu.au.

N.B. Proceeds from the workshop go towards funding MEAFA PhD scholarships.

Discounts

You may qualify for one of the following discounts:

  • 30% discount for a restricted number of non-employed full-time PhD students.
  • 15% discount for additional attendees from the same business organisation, governmental department or academic unit.
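As a rough guide to the arithmetic, the fee for a booking can be sketched as below. This is only an illustration under stated assumptions: that a discount applies to the full $600 per-day fee, and that at most one discount applies to a booking. The function and key names are hypothetical and are not part of any MEAFA system; the invoice issued by MEAFA is authoritative.

```python
# Illustrative fee arithmetic only; names and discount rules are assumptions.
DAILY_FEE = 600  # AUD per day, GST included (from the fees section)

# Discount percentages from the list above. Assumed: one discount per booking.
DISCOUNT_PERCENT = {
    "none": 0,
    "phd_student": 30,          # non-employed full-time PhD students
    "additional_attendee": 15,  # additional attendees from the same unit
}


def workshop_fee(days, discount="none"):
    """Return the total fee in whole dollars for a number of workshop days."""
    if not 1 <= days <= 5:
        raise ValueError("the workshop runs for 5 days; choose 1 to 5 days")
    percent = DISCOUNT_PERCENT[discount]
    # 600 is a multiple of 100, so the integer division is exact.
    return DAILY_FEE * days * (100 - percent) // 100


if __name__ == "__main__":
    print(workshop_fee(5))                         # all five days, no discount
    print(workshop_fee(3, "phd_student"))          # three days, PhD discount
    print(workshop_fee(2, "additional_attendee"))  # two days, group discount
```

Under these assumptions, three days at the PhD student rate comes to $1,260 and all five days at the full rate comes to $3,000.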

Venue and computing facilities

The workshop takes place in Computer Lab 1, ground level, Economics and Business Building H69, cnr Codrington & Rose streets, The University of Sydney Business School (see interactive map).

Desktop PCs with Stata 13 licenses for Microsoft Windows are provided onsite. You can also work on your own laptop but you will not be able to access the web using the University of Sydney server. Allow at least 15 minutes to install the temporary one-month Stata 13 license on your own laptop. No printing facilities are available.

Timetable

All days have the same time schedule:

  • 08:40-09:00 - Welcome tea and coffee
  • 09:00-10:30 - Session 1
  • 10:30-10:45 - Morning break
  • 10:45-12:15 - Session 2
  • 12:15-13:15 - Lunch
  • 13:15-14:45 - Session 3
  • 14:45-15:00 - Afternoon break
  • 15:00-16:30 - Session 4
  • 16:30-17:00 - Buffer-time and user-specific questions

The computer labs will be accessible from 8am to 8pm every day. Catering is provided at each break.

Detailed Programme

Day 1 (Monday, 23 June 2014): Working efficiently with Stata

Session 1: The Stata environment
Stata interface; configuration; limits; system constants and parameters; updates; profile and system directories; help files; manual entries; open-source programs; other help resources.
Session 2: Syntactic features
General syntax; parsing; strings and double quotes; wildcards; operators; functions; qualifiers; missing values; Boolean evaluation; prefixes.
Session 3: Do-files and Logs
The Do-file Editor; key do-file commands; master do-files; comments; errors and prevention; troubleshooting; suppression of errors; recording selective output in logs; command logs; common headings; log archive; printing and translation.
Session 4: Data fundamentals
How Stata understands data; memory requirements and dataset size; data types; physical limitations; numerical precision; string precision; long strings.

Day 2 (Tuesday, 24 June 2014): Programming

Session 1: Macros and scalars
Local and global macros; macro evaluation; macro expansion; prevent macro expansion; scalars; scalar evaluation; applications.
Session 2: Saved results
r-class values; e-class values; evaluating saved results; estimation and postestimation; storing estimates; results in tables; applications.
Session 3: Loops
Types of loops; initialising values; macro incrementation; nested macros; nested loops; rereferencing macros within loops; debugging loops; applications.
Session 4: Programs
Programming new commands; structure of an ado-file; storing and accessing programs; programs as 'wrappers'; softcoding; the command syntax; applications.

Day 3 (Wednesday, 25 June 2014): Management of Raw Data

Session 1: Raw data fundamentals
The importance of raw data; use, save and describe data; decimal separator; data formats; inspecting data attributes; data elements; explicit subscripting; record data subsets in logs; preserve, destroy and restore.
Session 2: Data types and computational memory
Numerical variables; string variables; date/time variables; missing values; memory requirements and physical limitations; settings and Stata limitations; numerical and string precision; working efficiently with large datasets.
Session 3: Data organisation and metadata
Naming rules for variables; order of variables; sorting of observations; data and variable labels; notes; value labels; labels in graphs; display format; data signature; archiving data and metadata.
Session 4: Data combinations and transformations
Dataset comparisons; appending datasets; one-to-one merging; match-merging; reshape long and wide; collapse and contract.

Day 4 (Thursday, 26 June 2014): Advanced Data Visualisation

Session 1: Graphing environment
Graph window; graph editor; schemes; efficient graphing; graph combinations; export; print and copy; graph do-files.
Session 2: The twoway suite of graphs
Overlaying graphs; density estimators; bivariate relationships; bar charts; immediate charts; graphing by groups; recasting.
Session 3: Visual objects and encoding tools
Visual objects (points, lines, areas); retinal variables of size, location, shape and texture; encoding multivariate data.
Session 4: Graph enhancement
Regions; textboxes; axes scale; axes labels; reference lines; reference areas; legends.

Day 5 (Friday, 27 June 2014): Time-series analysis and forecasting

Session 1: Introduction to forecasting and time series
Why forecast?; Stata time series structure; describing and graphing time series; smoothing and time series components; data transformations; exponential smoothing and forecasting; forecast accuracy; stationarity; auto-correlation and ACF plots.
Session 2: Time series modelling and forecasting
The autoregressive process (AR); the moving average process (MA); ARMA processes; time series regression; Holt-Winters for trends; seasonal Holt-Winters.
Session 3: Integrated and seasonal Box-Jenkins models
Trends and integration; ARIMA processes; detecting trends and/or mean non-stationarity; ARIMA model forecast behaviour; Seasonal ARIMA models; pure additive and factored models; models for outliers, level shifts and other interventions.
Session 4: Time series regression and volatility modelling
Advanced time series regression; distributed lag models; conditional heteroskedasticity (CH); ARCH and Generalised ARCH processes (GARCH); Value-at-Risk.

N.B. The precise content per session is subject to reshuffling and fine-tuning.

Expression of Interest form

Numbers are limited and places are reserved on a first-come first-served basis following the completion of the Expression of Interest form.