Unit of study

OLET5606: Data Wrangling

2024 unit information

Data comes in many and varied formats: it can be tall or wide, big or small, structured or unstructured. Regardless of where you get your data, it will almost always require some wrangling. Data wrangling is the convolution, alignment and preparation of data before use. This unit provides an overview of best practices in organising your research data from the point of discovery through to its use in scientific applications. You will learn the principles of data handling and how to maintain the rigour and integrity of your data throughout your research, including documenting data provenance, accessing major databases, and understanding data licensing. After calculating summary statistics to aid in identifying outliers and missing values, you will learn how to clean and wrangle data reproducibly in R, at a variety of scales. You will "wrangle" your research data using R, identifying outliers and missing values and ensuring provenance.

Unit details and rules

Managing faculty or University school: Science

Study level: Postgraduate
Academic unit: Mathematics and Statistics Academic Operations
Credit points: 2
Prerequisites: None
Corequisites: None
Prohibitions: None
Assumed knowledge: Basic exploratory data analysis, basic coding in R

At the completion of this unit, you should be able to:

  • LO1. Describe the importance of data provenance, and major databases that can be used to mine data.
  • LO2. Define data licensing.
  • LO3. Calculate summary statistics to identify outliers and missing values.
  • LO4. Clean and wrangle data in a reproducible manner in R, at a variety of scales.

Unit availability

This section lists the session, attendance modes and locations the unit is available in. There is a unit outline for each of the unit availabilities, which gives you information about the unit including assessment details and a schedule of weekly activities.

The outline is published 2 weeks before the first day of teaching. You can look at previous outlines for a guide to the details of a unit.

Session              MoA         Location
Intensive July 2024  Block mode  Camperdown/Darlington, Sydney

Session              MoA         Location
Intensive July 2020  Block mode  Camperdown/Darlington, Sydney
Intensive July 2021  Block mode  Camperdown/Darlington, Sydney
Intensive July 2021  Block mode  Remote
Intensive July 2022  Block mode  Camperdown/Darlington, Sydney
Intensive July 2022  Block mode  Remote
Intensive July 2023  Block mode  Camperdown/Darlington, Sydney

Find your current year census dates

Modes of attendance (MoA)

This refers to the Mode of attendance (MoA) for the unit as it appears when you’re selecting your units in Sydney Student. Find more information about modes of attendance on our website.