Introductory Analysis of Linked Data (PUBH5215) - Professional Development Course


This 5-day short course introduces the analysis of linked data at an introductory to intermediate level. It equips health services researchers, clinical practitioners and managers with the theory and skills needed to analyse linked health data. The course is modular: each day, participants receive a theoretical grounding in a theme, followed by a hands-on practical exercise in our computer lab using de-identified linked NSW data files.

Who should attend?
The course is suitable for people with no previous experience in the analysis of linked health data. However, it does assume familiarity with introductory statistical and epidemiological methods, as taught, for example, in a Master of Public Health degree course. The computing component of the course also assumes a basic familiarity with SAS syntax and with basic statistical analysis of fixed-format data files. Participants must have this assumed knowledge.

Topics covered:

  • Introduction to data linkage and its history
  • Description of CHeReL and how record linkage works
  • Quality of data linkage
  • Ethics, data security, applying to CHeReL for data
  • Types of population health databases
  • ICD coding
  • Overview of linked data studies
  • Constructing study populations
  • SAS commands for arrays, merging datasets, tagging records, creating sequence variables
  • Measures of health care utilisation; health care episodes
  • Prevalent pool effect
  • Inter-hospital transfers
  • Data quality I: Preparing data for analysis
  • Data quality II: Accuracy and reliability of data sources
  • Measures of health care outcomes: treatment outcomes and adverse events
  • Introduction to survival analysis and Cox regression
  • Available covariates: sociodemographic, illness severity, comorbidity
  • Methods of risk adjustment

Course aims

On completion of this short course participants will be able to:

  • understand the theory of data linkage methods and features of comprehensive data linkage systems, sufficient to know the sources and limitations of linked health data sets, and in particular those for NSW;
  • apply epidemiological principles to the design of studies using linked data;
  • construct numerators and denominators for the analysis of disease trends and health care utilisation and outcomes;
  • assess the accuracy and reliability of data sources;
  • check data linkages and assure the quality of the study process, e.g. consistency of definitions, missing data;
  • list the issues to be considered when analysing large linked data files;
  • write syntax to prepare linked data files for analysis, derive exposure and outcome variables, relate numerators and denominators and produce results from statistical procedures.


Ms Kerry Lewis, Health Information Manager
Dr Timothy Dobbins, Cancer Epidemiology and Services Research, University of Sydney
Dr Jane Ford, Perinatal Health Research Group, University of Sydney
Ms Katie Irvine, Centre for Health Record Linkage (CHeReL)
Ms Sanja Lujic, University of Western Sydney
Dr Kathleen Falster, The Sax Institute
Assoc Professor Christine Roberts, Perinatal Health Research Group, University of Sydney
Professor Judy Simpson, Sydney School of Public Health, University of Sydney


Semester 1: Monday 20 June – Friday 24 June 2016
Semester 2: Monday 14 November – Friday 18 November 2016


Edward Ford Building (A27)
The University of Sydney


Applications for the June short course must be submitted by Friday 3 June 2016, and for the November short course by Friday 28 October 2016.

Places are strictly limited so book early to avoid disappointment.

Application Form

The application process

  1. Please email the application form to: . DO NOT enter your credit card details on the form. Your application will then be assessed;
  2. Once approved, you will be emailed a link to make an online payment;
  3. Once payment has been received, your place in the course is secure.

Course Fee

The 2016 fee for the short course is $3,300 including GST.

Discounts are available for a group of participants from the same institution or organisation, as follows:
3-4 participants: 10% discount;
5 or more participants: 20% discount.
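
As a rough guide to how the group discount works out, the short Python sketch below computes the total fee for a group. It assumes the discount applies to every participant's fee in the group and that the fee is in Australian dollars; neither point is stated above, so please confirm with the Biostatistics Program before budgeting. The names `COURSE_FEE`, `discount_percent` and `group_total` are illustrative only.

```python
COURSE_FEE = 3300  # 2016 fee per participant, including GST (currency assumed to be AUD)


def discount_percent(participants: int) -> int:
    """Return the group discount (in percent) for participants from one organisation."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    if participants >= 5:
        return 20
    if participants >= 3:
        return 10
    return 0


def group_total(participants: int) -> int:
    """Total fee in whole dollars, assuming the discount applies to every participant."""
    percent = discount_percent(participants)
    # Integer arithmetic avoids floating-point rounding on currency amounts.
    return participants * COURSE_FEE * (100 - percent) // 100


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "participant(s):", group_total(n))
```

Under these assumptions, a group of three would pay $8,910 in total and a group of five would pay $13,200.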

Full course fees for the June short course are to be paid by Friday 3 June 2016 and for the November short course by Friday 28 October 2016. Cancellation after these dates will incur a fee of $100 per participant for administration.

More information

Biostatistics Program
Room 301, Level 3
Edward Ford Building (A27)
The University of Sydney
NSW 2006 Australia

Phone: +61 2 9351 5994