MEAFA Professional Development Workshop on Stochastic Frontier Analysis using Stata, 13-17 February 2017

Stochastic Frontier Analysis

This workshop features two days on Stochastic Frontier Analysis (Thursday-Friday 16-17 Feb), delivered by Artem Prokhorov, Discipline of Business Analytics at The University of Sydney. Artem is an expert in econometrics and actively works in the field of stochastic frontier modelling. He has extensive experience as a consultant, assessor and expert witness, including work with the Australian Securities and Investments Commission and the Australian Research Council. He is on the editorial boards of three influential journals in business, economics and statistics.

Working efficiently with Stata, Programming and Data management

The first three days of the workshop provide a thorough introduction to Stata, including Working Efficiently with Stata (Monday 13 Feb), Stata Programming (Tuesday 14 Feb), and Data Management (Wednesday 15 Feb). These days are delivered by Demetris Christodoulou, MEAFA General Convenor and architect of the MEAFA Professional Development Workshops. Demetris has extensive consulting experience on Stata and data analysis.

Workshop description

You may attend any one day or any combination of the following days.

Day 1 (Monday, 13 Feb 2017): Working Efficiently with Stata by Demetris Christodoulou, MEAFA General Convenor

This day assumes no previous knowledge of Stata 14. It describes the environment of Stata, its limitations and strengths, and core syntactic features. It demonstrates ways of working efficiently with Stata, including the use of logs and do-files. It discusses key principles and presents tools for developing work that is reproducible and verifiable. The day is of interest to those who are new to Stata 14 or have limited experience with earlier versions of Stata. It is also useful to more experienced users who wish to attain a more structural understanding of Stata from first principles.

Day 2 (Tuesday, 14 Feb 2017): Introduction to Programming by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata 14 but no knowledge of programming with Stata or any other software. By the end of this day you will be able to produce efficient, tractable and automated routines for data management, statistical analysis and estimation, and the creation of tables and graphs. The day covers key programming tools (saved results, stored results, macros, scalars, loops), and the fundamentals of building your own commands in Stata (programs or ado-files). This day is appropriate for those who wish to attain a deeper knowledge of Stata and achieve the aforementioned attributes in their work. This day assumes knowledge of the material presented in Day 1.

Day 3 (Wednesday, 15 Feb 2017): Management of Raw Data by Demetris Christodoulou, MEAFA General Convenor

This day assumes working knowledge of Stata and basic programming skills but no knowledge of data management. The day demonstrates ways to import and export different data formats. It demonstrates the management of numerical variables, string variables and date/time variables, and the implications of missing values. It explores key data structures including cross-sectional, time-series and panel data in long and wide formats. It covers the management of data attributes, the organisation of data and the importance of metadata. It also demonstrates strategies for working efficiently with very large datasets. Dataset organisation, archiving, combinations and transformations will also be discussed. If you have no or limited experience with Stata 14 then you are strongly advised to attend Day 1 first. Some programming tools from Day 2 will also be applied (stored results, macros, scalars and loops).

Days 4-5 (Thursday-Friday, 16-17 Feb 2017): Stochastic Frontier Analysis by Artem Prokhorov, Business Analytics and MEAFA

These two days assume working knowledge of Stata and basic knowledge of econometrics. We start with the basic stochastic frontier models of production with a single output and multiple inputs, and discuss estimation using cross-sectional data and the calculation of technical efficiency scores. We then extend the model to include environmental variables that affect inefficiency. Next, we allow for time-varying technical inefficiency and for unobserved effects, and consider estimation of stochastic frontier models using panel data. Finally, we consider important extensions such as estimation of stochastic frontier models when production inputs are endogenous, when there are multiple outputs, and when a cost frontier is considered instead of a production frontier. The applications we cover include dairy and rice farm production, power generation, efficiency of airlines, mine production and banks. By the end of the two days, you will be able to estimate stochastic frontier models, test hypotheses about them, interpret the estimates and obtain efficiency scores. If you have no experience with Stata then you are required to attend at least Day 1.
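
For orientation, the basic single-output frontier described above is commonly written as follows. This is a sketch of the standard textbook specification, not the workshop's own notation: the symbols y_i (output), x_i (logged inputs), w_i (cost frontier regressors), v_i (noise), u_i (inefficiency) and the half-normal assumption for u_i are illustrative assumptions and do not come from the course material.

```latex
% Assumed notation for illustration only (not taken from the workshop material).
\begin{align*}
\ln y_i &= x_i'\beta + v_i - u_i,
  \qquad v_i \sim N(0,\sigma_v^2), \quad u_i \sim N^{+}(0,\sigma_u^2),\ u_i \ge 0
  && \text{(production frontier)} \\
TE_i &= \exp(-u_i) \in (0,1]
  && \text{(technical efficiency)} \\
\ln c_i &= w_i'\gamma + v_i + u_i
  && \text{(cost frontier)}
\end{align*}
```

In this form inefficiency u_i pulls output below the production frontier and pushes cost above the cost frontier, which is why its sign flips between the two equations. Since u_i is not observed, efficiency scores are in practice typically estimates of TE_i conditional on the composed residual v_i - u_i.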

Enrolment and Fees

You may attend any one day or any combination of days. The cost is $600 per day. Days 4-5 are bundled together at $1200. Prices include GST. If you are paying with a University of Sydney card then deduct the cost of GST.

Fees include extensive course material, do-files and data sets, use of computing facilities, temporary use of Stata 14 licenses and full catering throughout the days. To express your interest in attending you must complete the online form:

Expression of Interest

Numbers are limited and places are reserved on a first-come first-served basis upon the submission of the online EOI form. Successful attendees will be notified shortly after and invoices will be issued accordingly. Due to limited places, MEAFA maintains a no refund policy. For more information on enrolment and fees contact

Net proceeds from the workshop go to funding MEAFA PhD scholarships.


You may qualify for one of the following discounts:

  • 30% discount for a limited number of non-employed full-time research students.
  • 15% discount for additional attendees from the same business organisation, governmental department or academic unit.

Venue and computing facilities

The workshop takes place in the computer labs of The University of Sydney Business School, Building H69.

Desktop PCs with Stata 14 licenses for Microsoft Windows are provided onsite. You can also work on your own laptop, but you will not be able to access the web using the University of Sydney server. Allow at least 15 minutes to install the temporary one-month Stata 14 license on your own laptop (Mac or PC). No printing facilities are available.


MEAFA does not engage in the administration of temporary accommodation. It is up to you to find suitable living arrangements.


All days have the same time schedule:

  • 08:40-09:00 - Welcome tea and coffee
  • 09:00-10:30 - Session 1
  • 10:30-10:45 - Morning break
  • 10:45-12:15 - Session 2
  • 12:15-13:15 - Lunch
  • 13:15-14:45 - Session 3
  • 14:45-15:00 - Afternoon break
  • 15:00-16:30 - Session 4
  • 16:30-17:00 - Buffer time and user-specific questions

The computer labs will be accessible from 8am to 8pm every day. Catering is provided at each break.

Detailed Programme

Day 1 (Monday, 13 Feb 2017): Working efficiently with Stata

Session 1: The Stata environment
Stata interface; configuration; limits; system constants and parameters; updates; profile and system directories; help files; manual entries; open-source programs; other help resources.
Session 2: Syntactic features
General syntax; parsing; strings and double quotes; wildcards; operators; functions; qualifiers; missing values; Boolean evaluation; prefixes.
Session 3: Do-files and Logs
The Do-file Editor; key do-file commands; master do-files; comments; errors and prevention; troubleshooting; suppression of errors; recording selective output in logs; command logs; common headings; log archive; printing and translation.
Session 4: Data fundamentals
How Stata understands data; memory requirements and dataset size; data types; physical limitations; numerical precision; string precision; long strings.

Day 2 (Tuesday, 14 Feb 2017): Introduction to Programming

Session 1: Macros and scalars
Local and global macros; macro evaluation; macro expansion; prevent macro expansion; scalars; scalar evaluation; applications.
Session 2: Saved results
r-class values; e-class values; evaluating saved results; estimation and postestimation; storing estimates; results in tables; applications.
Session 3: Loops
Types of loops; initialising values; macro incrementation; nested macros; nested loops; rereferencing macros within loops; debugging loops; applications.
Session 4: Programs
Programming new commands; structure of an ado-file; storing and accessing programs; programs as 'wrappers'; softcoding.

Day 3 (Wednesday, 15 Feb 2017): Management of Raw Data

Session 1: Raw data fundamentals
The importance of raw data; use, save and describe data; decimal separator; data formats; inspecting data attributes; data elements; explicit subscripting; record data subsets in logs; preserve, destroy and restore.
Session 2: Data types and computational memory
Numerical variables; string variables; date/time variables; missing values; memory requirements and physical limitations; settings and Stata limitations; numerical and string precision; working efficiently with large datasets.
Session 3: Data organisation and metadata
Naming rules for variables; order of variables; sorting of observations; data and variable labels; notes; value labels; labels in graphs; display format; data signature; archiving data and metadata.
Session 4: Data combinations and transformations
Dataset comparisons; appending datasets; one-to-one merging; match-merging; reshape long and wide; collapse to summaries; contract to frequencies.

Day 4 (Thursday, 16 Feb 2017): Stochastic Frontier Analysis (Part A)

Session 1: SFA fundamentals
Introduction to stochastic frontier analysis (SFA); single-output production frontier and technical inefficiency; the cost frontier.
Session 2: Cross-sectional models
COLS and MLE estimation of basic stochastic frontier models using cross-sectional data; calculation of efficiency scores; prediction using SF models.
Session 3: Constraints and distributions
Constrained estimation and testing hypotheses about production; alternative distributional assumptions and tests; model diagnostics.
Session 4: Alternative models
Estimation of cost and profit frontiers; estimation of multiple-output models; applications.

Day 5 (Friday, 17 Feb 2017): Stochastic Frontier Analysis (Part B)

Session 1: Panel data models
Panel data SFA; time-varying and time-constant inefficiency and unobserved heterogeneity; trend estimation.
Session 2: Fixed and random effects
Fixed effects specifications; random effects specifications; heterogeneity vs inefficiency.
Session 3: Endogeneity
Endogeneity in SFA models; testing for endogeneity; reduced forms and instrument based estimation.
Session 4: Special case models
Zero-inefficiency SF models; multiple production units; good and bad output; semi-parametric SFA models.

N.B. The precise content per session is subject to reshuffling and fine-tuning.
