Introductory analysis of linked data

Understand the theory and analysis of linked health datasets
This five-day short course is designed for health services researchers, clinical practitioners and managers. It covers linked data analysis at an introductory to intermediate level.

Gain an understanding of the theory and skills needed to analyse linked health data. The modular structure of the course provides participants with a theoretical grounding on each theme, followed by a hands-on practical exercise in our computer lab each day, using de-identified linked NSW data files.

Course details

  • Introduction to data linkage and its history
  • Description of CHeReL and how record linkage works
  • Quality of data linkage
  • Ethics, data security, applying to CHeReL for data
  • Types of population health databases
  • ICD coding
  • Overview of linked data studies
  • Constructing study populations
  • SAS commands for arrays, merging datasets, tagging records, creating sequence variables
  • Measures of health care utilisation; health care episodes
  • Prevalent pool effect
  • Inter-hospital transfers
  • Data quality I: Preparing data for analysis
  • Data quality II: Accuracy and reliability of data sources
  • Measures of health care outcomes: treatment outcomes and adverse events
  • Introduction to survival analysis and Cox regression
  • Available covariates: sociodemographic, illness severity, comorbidity
  • Methods of risk adjustment
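
As a rough illustration of what "tagging records" and "creating sequence variables" mean in the outline above, the sketch below numbers each person's hospital records in date order and flags the first one. It is written in Python purely for illustration; the course itself teaches these steps in SAS, and the record layout, field names and `add_sequence` helper are hypothetical rather than taken from the course materials.

```python
from collections import defaultdict
from datetime import date

# Hypothetical de-identified records: (person_id, admission_date).
records = [
    ("P2", date(2018, 3, 1)),
    ("P1", date(2018, 5, 20)),
    ("P1", date(2017, 1, 15)),
    ("P2", date(2018, 3, 9)),
]


def add_sequence(records):
    """Sort records by person and date, then attach a per-person
    sequence number and a flag tagging each person's first record."""
    counts = defaultdict(int)
    out = []
    for person_id, admitted in sorted(records):
        counts[person_id] += 1
        seq = counts[person_id]
        out.append(
            {
                "person_id": person_id,
                "admitted": admitted,
                "seq": seq,
                "first_record": seq == 1,
            }
        )
    return out


sequenced = add_sequence(records)
for row in sequenced:
    print(row["person_id"], row["admitted"], row["seq"], row["first_record"])
```

A sequence variable of this kind is typically the starting point for identifying a person's first admission or counting repeat episodes.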

On completion of this short course participants will be able to:

  • understand the theory of data linkage methods and features of comprehensive data linkage systems, sufficient to know the sources and limitations of linked health data sets, and in particular those for NSW;
  • apply epidemiological principles to the design of studies using linked data;
  • construct numerators and denominators for the analysis of disease trends and health care utilisation and outcomes;
  • assess the accuracy and reliability of data sources;
  • check data linkages and assure the quality of the study process, e.g. consistency of definitions, missing data;
  • list the issues to be considered when analysing large linked data files;
  • write syntax to prepare linked data files for analysis, derive exposure and outcome variables, relate numerators and denominators and produce results from statistical procedures.

The course is suitable for people with no previous experience in the analysis of linked health data. However, it does assume familiarity with introductory statistical and epidemiological methods, as taught, for example, in a Master of Public Health degree course. The computing component of the unit also assumes a basic familiarity with computing syntax used in SAS and methods of basic statistical analysis of fixed-format data files. Participants must have this assumed knowledge.

Associate Professor Timothy Dobbins
National Drug and Alcohol Research Centre, University of NSW

Associate Professor Jane Ford
Perinatal Health Research Group, University of Sydney

Ms Katie Irvine
Centre for Health Record Linkage (CHeReL)

Ms Sanja Lujic
Centre for Big Data Research in Health, University of NSW

Miss Filippa Pretty
Health Information Manager, University of Sydney

Dr Deborah Randall
Perinatal Health Research Group, University of Sydney

Dr Erin Cvejic
Sydney School of Public Health, University of Sydney

Associate Professor Siranda Torvaldsen
Perinatal Health Research Group, University of Sydney

Professor Andrew Hayen
Public Health, University of Technology, Sydney

Michael Smith
Department of Health

Victoria Pye
Department of Health

Dr Ibinabo Ibiebele
Sydney School of Public Health, University of Sydney

Important: This short course is a variant of the unit of study Introductory Analysis of Linked Data (PUBH5215). It enables you to complete the unit without formal university enrolment. You will receive a certificate of completion; however, you will not receive credit points towards a University of Sydney degree.

To receive credit points and an academic transcript, please see the Medicine Postgraduate Non Award.

Key information
Course fees

Full price
$3,300 incl. GST

Group bookings:
10% discount for small groups (3 - 4)
20% discount for larger groups (5+)

Delivery/location
Block/Intensive mode (5 days, 9am - 5pm)
No formal assessments or examinations required

Camperdown campus
Room 301a, Level 3, Edward Ford Building (A27), Fisher Road, The University of Sydney

Dates
18-22 November 2019 (Applications close 25 October)