Introduction to 'big data' databases and programming for transport analytics

2018 (dates to be determined)

Institute of Transport and Logistics Studies
University of Sydney Business School

Course Overview

Transport researchers and practitioners are increasingly faced with performing analytics on 'big data' containing millions of observations from sources such as ticketing systems, smartphones, GPS and sensors. To make effective use of these datasets it is necessary to process and analyse the data at a disaggregate level. This requires knowledge of databases (to store, manage and access data efficiently) and programming skills (to implement the appropriate logic for data cleaning and processing). Even those who have access to programmers and database analysts to perform these tasks need to understand the capabilities and issues involved in managing and processing the data, since how this is done has a direct effect on any results.

This short course will provide transport researchers and practitioners with the knowledge and tools to manage and process these large datasets. Attendees will first be introduced to relational databases, enabling them to store, manage and retrieve data. An introduction to programming will then give attendees the tools to create algorithms that process raw data and merge datasets, making them usable for a variety of transport analyses including statistical modelling and spatial data analytics. The course will then cover issues in extracting meaningful analytics from these large datasets and visualising them in an engaging and convincing way. The course will be taught as a mixture of lectures and practical tutorials across all four days and will teach R, scripting languages, PostgreSQL and the graph database Neo4j.

By the end of the course you will be able to:

  • Work with raw data from a variety of sources
  • Store, edit, retrieve and manage related large datasets within a database
  • Combine data from several sources at the same time
  • Retrieve data to perform statistical analyses
  • Create powerful visualisations from raw and processed data
  • Work with geographic and non-geographic data

Course includes:

  • Full course notes
  • All software in a pre-configured package
  • Morning tea, lunch and afternoon tea

Participant Feedback

This short course has been attended by participants from government (state and federal), industry and academia. Below is a selection of feedback received in 2015 and 2016.

  • Teaching from easy to complex make it easy to understand
  • It is an intensive course but quite inspiring
  • The guys really know their stuff and present well, very polished and comprehensive
  • Adrian and Richard are very knowledgeable and were able to answer all my questions!
  • Excellent compilation of notes and resources
  • Very relevant knowledge for my work and generally a great intro to big data programming and visualisation
  • Detailed guidance, hands-on use and practice on computer. Seeing real data and analysis
  • A great course; a credit to Richard and Adrian
  • Interesting and resourceful training course
  • Thanks for organising the big databases training. By far the best training I have attended in years.


Dr Adrian Ellison and Dr Richard Ellison


For any further information please email