Introductory Analysis of Linked Data (PUBH5215)


This unit introduces the analysis of linked health data. It usually runs in late June and late November. The topic is highly specialised and will not be relevant to most MPH students. The unit is modular: each topic is given a theoretical grounding in the classroom, followed by hands-on practical exercises in the computing lab using de-identified linked NSW data files. The computing component assumes basic familiarity with SAS syntax and with methods of basic statistical analysis of fixed-format data files. Contents include: an overview of the theory of data linkage methods and the features of comprehensive data linkage systems, sufficient to understand the sources and limitations of linked health data sets; design of linked data studies using epidemiological principles; construction of numerators and denominators used for the analysis of disease trends and of health care utilisation and outcomes; assessment of the accuracy and reliability of data sources; data linkage checking and quality assurance of the study process; basic statistical analyses of linked longitudinal health data; manipulation of large linked data files; and writing syntax to prepare linked data files for analysis, derive exposure and outcome variables, relate numerators and denominators, and produce results from statistical procedures at an introductory to intermediate level. The main assignment, an analysis of NSW linked data, can be completed only in the School of Public Health Computer Lab and is due 10 days after the end of the unit.

Block/intensive mode: 5 days, 9am-5pm


Workbook exercises (30%) and 1 x assignment (70%)


Notes will be distributed in class.

(PUBH5010 or BSTA5011 or CEPI5100) and (PUBH5211 or BSTA5004)

