
Research projects

Our diverse projects are changing the way we do research

New techniques in data science allow us to challenge traditional scientific processes. Our projects span four key areas:

Natural resources

Research vision

Globally, coral reef systems are under threat from major changes in environmental parameters (e.g. sea level, sea surface temperature, pH and water quality). Forward Stratigraphic Modelling (FSM) is a powerful tool for modelling the past and future evolution of coral reef systems. This research project will apply optimisation and Bayesian inference methods to established coral reef models. It will use machine learning methods to investigate the contribution of specific environmental parameters to reef model evolution, allowing more accurate prediction of fossil reef cores.

Research impact

The research will help determine which features (e.g. hydrodynamic flow, Malthusian (competition) and sediment flux parameters) are the major contributing factors in the model, improving the accuracy of reef core prediction.

Project lead

Collaboration team

The past 10 years have seen significant advances in the use of Forward Stratigraphic Modelling (FSM) to model carbonate sedimentary systems. However, most efforts have focused on large-scale simulations of carbonate platform systems over long time scales (millions of years). To successfully model coral reef systems, significant challenges need to be overcome, including: (i) overly coarse spatial and temporal scales; (ii) poorly represented biological processes (e.g. spawning, settlement, growth and competition); and (iii) the oversimplification of key physical–chemical–biological processes involved in sediment production, transport and deposition.

PyReef is a complex reef modelling software tool that requires constrained parameter optimisation to accurately predict reef cores representing reef evolution over thousands of years. Bayesian inference methods have been used in the past for uncertainty quantification of the free parameters in these models. However, a number of limitations arise as model complexity increases with additional parameters, and the absence of prior knowledge about the parameters makes inference and optimisation a major challenge. The first part of the project will explore hybrid methods that combine Bayesian inference with evolutionary algorithms for optimisation and uncertainty quantification, including multi-threaded implementations of the hybrid algorithms for timely convergence.
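A minimal sketch of such a hybrid is differential-evolution MCMC, where Metropolis proposals are built from the difference of two other population members, combining an evolutionary population step with Bayesian acceptance. The toy forward model, prior and data below are illustrative only, not PyReef's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "forward model": two free parameters predict a synthetic core profile.
def forward_model(theta, depths):
    growth, decay = theta
    return growth * np.exp(-decay * depths)

depths = np.linspace(0.0, 5.0, 40)
true_theta = np.array([2.0, 0.7])
observed = forward_model(true_theta, depths) + rng.normal(0, 0.05, depths.size)

def log_posterior(theta):
    if np.any(theta <= 0) or np.any(theta > 10):  # flat prior on (0, 10]^2
        return -np.inf
    resid = observed - forward_model(theta, depths)
    return -0.5 * np.sum(resid ** 2) / 0.05 ** 2

# Differential-evolution MCMC: a population of chains, with proposals built
# from the difference of two other chains (the evolutionary ingredient),
# accepted or rejected by the usual Metropolis rule.
n_chains, n_iter = 8, 3000
pop = rng.uniform(0.1, 5.0, size=(n_chains, 2))
logp = np.array([log_posterior(t) for t in pop])
gamma = 2.38 / np.sqrt(2 * 2)  # standard DE-MC scaling for 2 parameters

for _ in range(n_iter):
    for i in range(n_chains):
        a, b = rng.choice([j for j in range(n_chains) if j != i], 2, replace=False)
        proposal = pop[i] + gamma * (pop[a] - pop[b]) + rng.normal(0, 1e-3, 2)
        lp = log_posterior(proposal)
        if np.log(rng.uniform()) < lp - logp[i]:
            pop[i], logp[i] = proposal, lp

posterior_mean = pop.mean(axis=0)
```

The population doubles as a crude posterior sample once converged; multi-threading the per-chain loop is the natural parallelisation.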

Research focus

  • Data Fusion (Probabilistic Causal Models for Observational Data) 
  • Scientific Discovery (Planetary Science)

Research techniques

  • Bayesreef
  • Bayesian inference
  • Great Barrier Reef
  • Coral reef systems

Outputs

Research vision

Bayesian inference has been a popular methodology for the estimation and uncertainty quantification of parameters in geological and geophysical forward models. Badlands is a basin and landscape evolution forward model for simulating topography evolution over a large range of spatial and temporal scales. Our solid Earth evolution projects consider Bayesian inference for parameter estimation and uncertainty quantification in landscape dynamics models (Bayeslands).

Research impact

The Bayeslands framework will form the foundation for more complex models of landscape and basin evolution. In the future, we envision extending Bayeslands with many additional parameters, including the uncertain initial model topography, global sea level fluctuations, tectonic and dynamic topography evolution, spatially varying lithospheric flexural rigidity, and spatio-temporal variations in mountain uplift rates and precipitation.

Project lead

Dr Rohitash Chandra (USyd/CTDS)

Collaboration team

The challenge lies in parameter estimation for computationally expensive models, which is being addressed through high-performance computing and surrogate-assisted Bayesian inversion.
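The surrogate idea can be pictured with an illustrative one-parameter stand-in for the expensive model: run the true model on a small design of parameter values, fit a cheap surrogate, then sample the posterior against the surrogate alone. The model, noise level and polynomial surrogate below are assumptions for the sketch, not Badlands itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an expensive forward model: one free parameter (say, an
# erodibility coefficient) mapped to a scalar summary of the output.
def expensive_model(k):
    return k + 0.3 * np.sin(3.0 * k)  # pretend each call takes hours

k_true = 0.8
observed = expensive_model(k_true)

# 1) Run the true model on a small design of parameter values...
design = np.linspace(0.0, 2.0, 15)
runs = np.array([expensive_model(k) for k in design])

# 2) ...fit a cheap surrogate (here a degree-5 polynomial)...
coeffs = np.polyfit(design, runs, 5)
surrogate = lambda k: np.polyval(coeffs, k)

# 3) ...and run Metropolis-Hastings against the surrogate only.
def log_post(k):
    if not 0.0 <= k <= 2.0:  # flat prior on [0, 2]
        return -np.inf
    return -0.5 * (observed - surrogate(k)) ** 2 / 0.02 ** 2

samples, k, lp = [], 1.0, log_post(1.0)
for _ in range(5000):
    cand = k + rng.normal(0.0, 0.1)
    lpc = log_post(cand)
    if np.log(rng.uniform()) < lpc - lp:
        k, lp = cand, lpc
    samples.append(k)

k_hat = float(np.mean(samples[1000:]))
```

The thousands of sampler evaluations hit only the surrogate; the expensive model is run just 15 times.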

Research focus

  • Data Fusion (Probabilistic Causal Models)
  • Scientific Discovery (Planetary Science) 

Research techniques

  • Landscape evolution models
  • Bayesian inference
  • Optimisation

Outputs

Research papers:

Chandra R, Azam D, Müller RD, Salles T, Cripps S. "BayesLands: A Bayesian inference approach for parameter uncertainty quantification in Badlands." Computers & Geosciences, October 2019.

Chandra R, Müller RD, Azam D, Deo R, Butterworth N, Salles T, Cripps S. "Multi-core parallel tempering Bayeslands for basin and landscape evolution." Geochemistry, Geophysics, Geosystems, August 2019.

Workshop resources: https://www.earthbyte.org/bayeslands-resources/

Software: https://github.com/intelligentEarth/pt-Bayeslands

Better Ways to Sample from Complex Probability Distributions

Research Vision

Many processes in the natural sciences are too complex to solve explicitly, but can be simulated numerically. Fusing, and/or making decisions with, data generated by such processes is often done using Markov chain Monte Carlo (MCMC), but no convenient library exists for building new MCMC algorithms from component parts.

Research impact

The new sampling schemes will be of great theoretical interest, but should also apply to many areas, including geology and geophysics problems that involve sampling over numerically simulated histories of geological areas. Solving such problems is crucial to reconstructing the Earth's evolution over the last two billion years.

Project lead

Collaboration team

  • Professor Sally Cripps (CTDS/USyd)
  • Prof. Mark Girolami (Alan Turing Institute)
  • Dr. Hadi Afshar (USyd/CTDS)
  • Dr. Roman Marchant (USyd/CTDS)
  • Dr. Rohitash Chandra (USyd/CTDS)
  • Dr Rafael dos Santos de Oliveira (USyd/CTDS)

We are building an extensible library for customised Metropolis-Hastings proposals as we work on more applied research projects. 

Its strengths will include: sampling for complex hierarchical models, sampling for solutions of differential equations, and transdimensional sampling for models with an unknown or random number of parameters.
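A hedged sketch of what "MCMC from component parts" might look like: proposals as interchangeable objects sharing a small interface, driven by one generic Metropolis-Hastings loop. The class names and target below are illustrative, not the library's actual API:

```python
import numpy as np

rng = np.random.default_rng(2)

# A proposal is any object with propose() and log_ratio(); the driver below
# never needs to know which one it is given.
class RandomWalk:
    def __init__(self, scale):
        self.scale = scale
    def propose(self, x):
        return x + rng.normal(0.0, self.scale, size=np.shape(x))
    def log_ratio(self, x, y):  # symmetric proposal: correction is zero
        return 0.0

class Mixture:
    """Pick one of several component kernels at random each step.

    Mixing valid MH kernels (chosen independently of the state) preserves
    the target distribution."""
    def __init__(self, components):
        self.components = components
        self._last = None
    def propose(self, x):
        self._last = self.components[rng.integers(len(self.components))]
        return self._last.propose(x)
    def log_ratio(self, x, y):
        return self._last.log_ratio(x, y)

def metropolis_hastings(log_target, proposal, x0, n):
    """Generic MH driver: works with any proposal object."""
    x, lp = x0, log_target(x0)
    out = []
    for _ in range(n):
        y = proposal.propose(x)
        lq = log_target(y)
        if np.log(rng.uniform()) < lq - lp + proposal.log_ratio(x, y):
            x, lp = y, lq
        out.append(x)
    return np.array(out)

# Smoke test on a standard normal target, mixing small and large steps.
chain = metropolis_hastings(lambda x: -0.5 * x ** 2,
                            Mixture([RandomWalk(0.5), RandomWalk(5.0)]),
                            0.0, 20000)
```

New samplers then reduce to writing a new proposal class (e.g. a transdimensional or gradient-based one) rather than a new loop.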

Research focus

  • Scientific Discovery
  • Data Fusion
  • Understanding the Earth’s evolution

Research techniques

  • Statistical modeling 
  • Dimension reduction

Outputs

In progress: GitHub repository (link)

Research vision

The goal of the project is to leverage machine learning data-fusion techniques to produce a subsurface geology model that predicts available resources with quantified uncertainties. This will enable improvements in asset development, planning and risk assessment, while expanding the capabilities of geologists and geotechnical engineers.

Research impact

This project will demonstrate the value that machine learning can bring to established fields by leveraging data in novel ways, providing additional insight and improved processes.

Project lead

  • Fabio Ramos

Collaboration team

  • Lionel Ott
  • Philippe Morere 

The project will employ Gaussian process methods to fuse the various data sources and provide predictions, including their uncertainties. To handle the large amount of data and the large spatial scales of the problem, various recent advances in approximate methods will be used.
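A minimal Gaussian process regression sketch of the idea, with made-up one-dimensional "measurements" standing in for the real data sources; the kernel and noise settings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Squared-exponential kernel between two sets of 1-D locations.
def rbf(a, b, length=1.0, var=1.0):
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

# Hypothetical sparse, noisy measurements (e.g. borehole observations).
x_obs = rng.uniform(0, 10, 25)
y_obs = np.sin(x_obs) + rng.normal(0, 0.1, 25)

noise = 0.1 ** 2
K = rbf(x_obs, x_obs) + noise * np.eye(25)

# Predict on a dense grid, with predictive uncertainty.
x_new = np.linspace(0, 10, 100)
Ks = rbf(x_new, x_obs)
alpha = np.linalg.solve(K, y_obs)
mean = Ks @ alpha                                 # posterior mean
cov = rbf(x_new, x_new) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))     # predictive std. dev.
```

The predictive standard deviation grows away from the data, which is exactly the uncertainty signal needed for risk assessment; the approximate methods mentioned above replace the exact solves when the data grow large.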

Research focus

  • Infrastructure
  • Data Fusion

Research techniques

  • Natural resources
  • Optimization
  • Origin Energy
  • Subsurface geology

Using gravity to see what lies under the ground

Research vision

Recent advances in quantum sensing technologies now allow incredibly small variations in gravity to be measured with long-term reliability. As a force originating from mass, precise gravity sensing should allow us to non-invasively infer what lies under the ground. In this context, we seek to develop methods for subsurface modelling that properly handle uncertainty.

Research impact

This research has many application areas, including mining exploration, hydrology, agriculture, and maintenance of subsurface infrastructure, such as water main networks. For example, we are looking at monitoring underground water and detecting leakage in pipes before it reaches the surface.

Research lead

Collaboration team

Inferring what we cannot directly observe involves many sources of uncertainty, which are traditionally neglected or not properly accounted for. In this work, we derive probabilistic inference methods that model and quantify uncertainty so that better decisions can be made.

Research focus

  • Foundations
  • Natural Resources 

Research techniques

  • Bayesian inference
  • Statistical modelling
  • Gravity gradiometry
  • Geophysical inversions

Active monitoring of greenhouse emissions

Research vision

Greenhouse gases are emitted from many sources. Identifying a source is critical to minimising emissions; however, this requires finding the source properties, such as its location, emission rate and any time variations. This is relatively simple with a single source, but when there are many sources, such as landfills distributed over a large area, the problem is no longer simple.

This project will develop a methodology for deploying sensors, ground and airborne, that will learn to map these emissions over time and space. This will enable government and companies to monitor and reduce greenhouse gas emission and thus tackle climate change. 

Research impact

Tools for combating climate change: providing government and industry with the methodology to assess and monitor greenhouse gas emissions, which until now could only be estimated.

Project lead

  • Prof Fabio Ramos
  • Dr Gilad Francis

Collaboration team

  • Draco Analytics
  • Melbourne Water
  • UC Berkeley

An integrated system for measuring greenhouse emissions, suited to complex sites such as waste-water treatment plants, landfills, gas fields and mining sites, learns the properties of a mix of diffuse emission sources as well as hot spots and “hot times”. This combination of features and sources makes the measurement of gas emissions particularly challenging. However, identifying the sources is critical to understanding the total flux, which in turn allows mitigating action to be taken to minimise source emissions.

This project will develop the tools that will determine flux and map sources and hot spots from sensor measurements. In doing so, it will make a tangible contribution to reducing greenhouse gas emission, and combatting climate change.

Research focus

  • Environmental monitoring
  • Bayesian Inference Methods
  • Bayesian optimisation

Research techniques

  • Environmental monitoring
  • Bayesian Inference Methods
  • Bayesian optimisation

Outputs

A system and methodology to monitor unspecified greenhouse gas emission sources such as landfills, water treatment plants and mine sites.

Human condition

Research vision

There is a global need to build causal models of crime and societal dimensions that use historical data to unravel complex structures in the data. The vision for this project is to improve on existing models by capturing the dynamics of population demographics over time and coupling them with changing criminal activity over the last 15 years.

Research impact

These explanatory tools can be used by quantitative social scientists and policy makers to refine societal theories and make informed decisions. The impact of this model is highly relevant: understanding these complex patterns can validate existing theories and uncover new causal relationships between demographics and crime. For example, this project will help inform government decision making and resource allocation to minimise the occurrence of crime, reduce unemployment, or address rising population density and house pricing.

Project lead

Collaboration team

The work incorporates time into current spatial-demographic crime models (see the previous publication by Marchant et al. (2017), “Applying Machine Learning to Criminology: Semi-Parametric Spatial-Demographic Bayesian Regression”, Security Informatics). By building the time-dependence structure directly into the model, we can capture more complex patterns and extrapolate into the future with more confidence. The work involves implementing space-time and demographic Gaussian process regression. By incorporating time into the models, we can uncover causal structure in the longitudinal data using vector autoregressive models.
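A toy illustration of the vector autoregressive ingredient, with synthetic series standing in for the real crime and demographic data (the transition matrix and noise level are assumptions for the sketch): fit a VAR(1) transition matrix by least squares and forecast one step ahead.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate two coupled longitudinal series from a known VAR(1) process,
# e.g. a crime rate and a demographic indicator (illustrative only).
A_true = np.array([[0.6, 0.2],
                   [0.1, 0.7]])
T = 400
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(0, 0.1, 2)

# Estimate the transition matrix by regressing y_t on y_{t-1}.
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T

# One-step-ahead forecast from the last observation.
forecast = A_hat @ y[-1]
```

The off-diagonal entries of the estimated transition matrix are what carry the lead-lag (Granger-style) structure between the series.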

Research focus

  • Data Fusion  
  • The Science of Decision Making  
  • Human Behaviour  

Research techniques

  • Vector Auto Regressive Model
  • Causal Inference
  • Demographics
  • Quantitative Social Science
  • Crime

Research vision

Improve police response time to criminal events by optimising patrolling routes based on spatial-temporal patterns of criminal activity. 

Navigating a complex street network in order to maximise a reward function is a difficult problem. Existing algorithms allow approximate decision making under uncertainty in a Euclidean space; this project aims to provide better-informed decision making by planning directly over street networks.

Research impact

This project is highly relevant because it has impact across both the data science domain and criminology. As part of this project, we develop Bayesian optimisation for generalised linear models and apply it to patrolling.

Project lead

Collaboration team

The work involves a decision-making algorithm under uncertainty that allows patrolling units to maximise the chance of catching criminals.

The application of this novel framework generalises existing techniques for decision making to street networks, and introduces a new family of kernels for non-parametric regression models that can improve accuracy over Euclidean-space kernels.

The goals of this project are:

  • Conduct decision making over a street network using OpenStreetMap data.
  • Apply the decision-making algorithm to police patrolling scenarios and car sharing/pooling optimisation.
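A toy sketch of the first goal, assuming a hand-coded four-node street graph and hypothetical per-node crime intensities (in practice these would come from the fitted spatio-temporal model): Dijkstra gives travel times over the network, and a greedy one-step policy trades predicted intensity against travel cost.

```python
import heapq

# Toy street network as an adjacency list: node -> [(neighbour, travel_time)].
streets = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("A", 2.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("D", 2.0)],
    "D": [("B", 4.0), ("C", 2.0)],
}

# Hypothetical predicted crime intensity per node (e.g. from a log Gaussian
# Cox process fitted to historical incidents).
intensity = {"A": 0.2, "B": 0.5, "C": 1.0, "D": 3.0}

def shortest_times(graph, source):
    """Dijkstra: travel time from source to every node."""
    dist, heap = {source: 0.0}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def next_patrol_node(graph, here, rate, discount=0.2):
    """Greedy one-step policy: intensity discounted by travel time."""
    times = shortest_times(graph, here)
    return max((n for n in graph if n != here),
               key=lambda n: rate[n] - discount * times[n])

choice = next_patrol_node(streets, "A", intensity)
```

The Bayesian optimisation in the project replaces this greedy score with an acquisition function that also accounts for uncertainty in the intensity estimates.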

Research focus

  • Data Fusion  
  • The Science of Decision Making  
  • Human Behaviour  

Research techniques

  • Predictive Policing
  • Bayesian Optimisation
  • Crime
  • Gaussian Process
  • Log Gaussian Cox Process

Outputs

Marchant R, Lu D, Cripps S (2018) Cox Bayesian optimization for police patrolling. In: 32nd Annual Conference on Neural Information Processing Systems, Workshop on Modelling and Decision-Making in the Spatiotemporal Domain, Montreal, Canada.

Research vision

Improve existing models of criminal activity by adding features of the environment. The new features include green space quantification and the density of premises of different types. In the long term, it may be possible to use convolutional neural networks to extract the characteristics of the environment that contribute to criminal activity.

Research impact

The impact of this project on criminology is high, mainly because it uncovers patterns in the relationships between characteristics of the environment and criminal activity.

Project lead

Collaboration team

The work involves building data-handling modules that extract different kinds of information from the environment. Once extracted, this information is fused with other sources using the semi-parametric spatial model of crime.

Research focus

  • Data Fusion  
  • Human Behaviour  

Health

Research vision

The aim of this study is to demonstrate the feasibility of accurately identifying more than 90% of patients with acute coronary syndrome from existing electronic medical records integrated across the Sydney, Western and Northern Local Health Districts (equivalent to 10% of the Australian population and 50% of the NSW population), in order to characterise the management and outcomes of patients presenting with chest pain.

Research impact

The governance and technological processes established by this research project will lay the necessary foundation for generating actionable insights from electronic medical records, with untold potential to improve patient care, save lives and lower health care costs. This approach can be scaled up across multiple jurisdictions and states, and throughout Australia.

Project leads

Collaboration team

Research focus

  • Personalised medicine

Research techniques

  • Electronic medical records
  • Health informatics
  • Cardiology

Intelligent systems

Research vision

Due to modular knowledge representation in biological neural systems, the absence of certain sensory inputs does not halt decision-making processes. For instance, damage to one eye does not result in the loss of one's entire vision.

Incomplete information in problems and datasets is a growing challenge for machine learning. It can arise from: (1) limited data availability; (2) datasets where certain features are missing in various places; or (3) problems whose nature changes dynamically, requiring decisions to be made in the absence of certain groups of features in the input space.

Research impact

In many cases it is desirable to produce uncertainty estimates for decision making, which can be addressed through Bayesian methods. It is important to develop robust learning algorithms and network architectures that can adapt to dynamic problems and environments, and to inconsistent features in datasets.

Project lead

Dr Rohitash Chandra (USyd/CTDS)

Collaboration team

In order to develop robust learning algorithms, it is important to take modular learning into account so that knowledge can be used as building blocks. Traditional learning algorithms such as stochastic gradient descent provide point estimates, i.e. a single solution for the weights that represent the knowledge learnt in deep neural networks. As a result, these networks make predictions that do not account for uncertainty in the parameters.

We have used coevolutionary algorithms and Bayesian inference for modular knowledge representation in neural networks via multi-task learning. Ongoing work features reversible-jump MCMC methods to infer network parameters (weights) under a changing environment, in terms of both input space and network topology.

Research focus

  • Data Fusion 
  • Scientific Discovery (Planetary Science) 

Research techniques

  • Multi-task learning
  • Neuroevolution
  • Bayesian inference

Outputs

Research vision

Bayesian inference provides a rigorous approach to neural learning, with knowledge represented via a posterior distribution that accounts for uncertainty. Markov Chain Monte Carlo (MCMC) methods typically implement Bayesian inference by sampling from the posterior distribution. This not only provides point estimates of the weights, but also the ability to propagate and quantify uncertainty in decision making. However, these techniques face challenges in convergence and scalability, particularly with large datasets and large neural network architectures.

Research impact

We demonstrate the techniques using time series prediction and pattern classification applications. The results show that the method not only improves the computational time, but provides better decision making capabilities when compared to related methods.

Project lead

Dr Rohitash Chandra (USyd/CTDS)

Collaboration team

  • Professor Sally Cripps (CTDS/USyd)
  • Ratneel Deo (University of the South Pacific)
  • Ashray Aman (Indian Institute of Technology)
  • Rishab Gupta  (Indian Institute of Technology)
  • Harrison Nguyen (PhD student, USyd)

We developed Bayesian neural networks that feature parallel tempering and parallel computing in order to address computationally expensive problems. The challenge lies in inference for deep learning architectures that feature millions of parameters. First, a parallel tempering MCMC sampling method is used to explore multiple modes of the posterior distribution, implemented on a multi-core computing architecture. Second, we make the within-chain sampling scheme more efficient by using Langevin gradient information to create Metropolis–Hastings proposal distributions.

We apply parallel tempering MCMC for recurrent neural networks (RNNs, LSTMs, GRUs), generative adversarial networks (GANs) and for reinforcement learning tasks.
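The Langevin-gradient proposal can be sketched on a tiny linear model standing in for a neural network (the data, noise level and step size below are illustrative assumptions): the proposal drifts along the posterior gradient, and the asymmetric proposal density enters the Metropolis–Hastings ratio.

```python
import numpy as np

rng = np.random.default_rng(5)

# Tiny linear "network": two weights, synthetic data.
X = rng.normal(0, 1, (100, 2))
w_true = np.array([1.5, -0.8])
y = X @ w_true + rng.normal(0, 0.1, 100)

def log_post(w):  # Gaussian likelihood (sd 0.1) + standard normal prior
    return -0.5 * np.sum((y - X @ w) ** 2) / 0.1 ** 2 - 0.5 * np.sum(w ** 2)

def grad_log_post(w):
    return X.T @ (y - X @ w) / 0.1 ** 2 - w

step = 5e-5
w, lp = np.zeros(2), log_post(np.zeros(2))
samples = []
for _ in range(3000):
    # Langevin drift towards higher posterior density, plus noise.
    mu = w + step * grad_log_post(w)
    prop = mu + np.sqrt(2 * step) * rng.normal(0, 1, 2)
    mu_back = prop + step * grad_log_post(prop)
    lq = log_post(prop)
    # Hastings correction for the asymmetric Langevin proposal.
    log_alpha = (lq - lp
                 - np.sum((w - mu_back) ** 2) / (4 * step)
                 + np.sum((prop - mu) ** 2) / (4 * step))
    if np.log(rng.uniform()) < log_alpha:
        w, lp = prop, lq
    samples.append(w.copy())

w_hat = np.mean(samples[500:], axis=0)
```

In the parallel tempering setting, one such chain runs per temperature on its own core, with occasional state swaps between neighbouring temperatures.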

Research techniques

  • Bayesian neural networks
  • MCMC
  • High performance computing 

Outputs

Chandra R, Jain K, Deo RV, Cripps S. "Langevin-gradient parallel tempering for Bayesian neural learning." Neurocomputing, September 2019.

Software: https://github.com/sydney-machine-learning/parallel-tempering-neural-net

Smart communication for adverse environments

Research vision

This project aims to develop a communication system that optimally controls radio communications in adverse and contested scenarios, which is reactive and can adapt to counter-offensives from external parties.

Research impact

Improve tactical communication capabilities to work in adverse and contested environments.

Project leads

  • Prof Fabio Ramos (CTDS/USyd)
  • Dr Gilad Francis (CTDS/USyd)

Collaboration team

Defence Science and Technology (DST).

We will develop an adaptive learning methodology for radio frequency (RF) communication systems based on reinforcement learning (RL). This allows policies to be learned from past experience, enabling an instrument to autonomously make optimal decisions for a given task and environment.

A software-defined radio (SDR) based system that can adapt to its environment and produce communication policies to counteract an external party's actions.
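A minimal reinforcement-learning sketch of the idea, with hypothetical channel success probabilities standing in for a real RF environment: an epsilon-greedy agent learns which channel to transmit on from observed packet successes.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical per-channel packet success probabilities (the environment).
# In practice these would be unknown and possibly shaped by an interferer.
success_prob = np.array([0.9, 0.2, 0.6, 0.3])

q = np.zeros(4)        # estimated value of transmitting on each channel
counts = np.zeros(4)
epsilon = 0.1          # exploration rate

for t in range(5000):
    if rng.uniform() < epsilon:
        a = int(rng.integers(4))        # explore a random channel
    else:
        a = int(np.argmax(q))           # exploit the best known channel
    reward = float(rng.uniform() < success_prob[a])  # 1 if packet got through
    counts[a] += 1
    q[a] += (reward - q[a]) / counts[a]  # incremental mean update

best_channel = int(np.argmax(q))
```

A reactive adversary makes the reward non-stationary, which is where the full RL formulation (states, discounting, function approximation) takes over from this bandit-style sketch.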

Research focus

  • Reinforcement learning (RL)

Research techniques

  • Reinforcement learning
  • Software defined radio (SDR)
  • Communication

Learning reactive radio systems for countering external parties

Research vision

Learning the behaviour of an external party from limited observations, and using the learned model to develop a method to disable its operation.

Research impact

Improving the safety of Australians through the ability to counter the activities of external party radio signals.

Project lead

  • Prof Fabio Ramos (CTDS/USyd)
  • Dr Gilad Francis (CTDS/USyd)

Collaboration team

  • DST

Developing methodologies for learning an external party's behaviour from limited observations (data fusion), and methodologies to counteract its operation.

Research focus

  • Reinforcement learning (RL)

Research techniques

  • Reinforcement learning
  • Optimisation
  • Cyber warfare
  • Machine learning

Research vision

Reinforcement learning (RL) provides a framework for learning behavioural patterns to interact with systems that are hard to model. However, RL methods still struggle to handle uncertainty about the environment and may require large amounts of data to succeed. Our focus is to overcome these drawbacks by deriving robust and adaptable RL methods for problems involving physical systems.

Research impact

The methods developed by this project should impact the fields of reinforcement learning, approximate inference and robust control.

Project lead

  • Prof. Fabio Ramos (NVIDIA/CTDS)

Collaboration team

  • Dr Rafael dos Santos de Oliveira (USyd/CTDS)
  • Lucas Barcelos de Oliveira (USyd)
  • Rafael Carvalhaes Possas (USyd/NVIDIA)

We derive approximate inference methods to learn probability distributions over simulators based on data from a target physical system. Adaptively randomising the simulation environment should then allow us to learn control policies that are robust to uncertainty in physical environments, while minimising the amount of interaction with them.
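A simple rejection-style sketch of the idea, with a toy friction simulator standing in for a physics engine (the simulator, parameter range and tolerance are assumptions for the sketch): prior draws of the simulator parameter are kept when their simulated outcome lands close to the real observation, yielding a distribution over simulators consistent with the target system.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate(friction, n_steps=50, v0=10.0):
    """Toy simulator: velocity decaying under friction."""
    v = v0
    for _ in range(n_steps):
        v -= friction * v * 0.1
    return v

# One noisy observation from the "real" physical system.
real_friction = 0.3
observed = simulate(real_friction) + rng.normal(0, 0.05)

# Rejection-style approximate inference: keep prior draws whose simulated
# outcome is close to the real observation.
prior_draws = rng.uniform(0.0, 1.0, 20000)
outcomes = np.array([simulate(f) for f in prior_draws])
kept = prior_draws[np.abs(outcomes - observed) < 0.05]

friction_mean = float(kept.mean())
friction_std = float(kept.std())
```

Training a policy under parameters sampled from this posterior, rather than a single point estimate, is what makes the resulting behaviour robust to the sim-to-real gap.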

Research focus

  • Foundations
  • Intelligent systems

Research techniques

  • Reinforcement learning
  • Domain adaptation
  • Sim-to-real transfer
  • Approximate inference

Untangling the 3-D Structure of Galaxies

Project vision

Integral field units (IFUs) provide detailed maps of the chemical components and dynamics of galaxies in three-dimensional “datacubes”.  Traditional cubing methods (weighted averages) often produce spurious structures caused by the atmosphere or instrumental signatures.  Our new analysis method removes these effects, maps to standard coordinates, and provides reliable uncertainties.

Project impact

We believe our method has the potential to become widely used for IFU data analysis, which will dramatically improve the accuracy and precision of observations in many upcoming next-generation surveys of thousands of galaxies.

Project lead

Collaboration team

We used IFU data from the SAMI Galaxy Survey as a test case. Our datacubes are modelled as a Gaussian process, with covariance transformed by the instrument response (which can be numerically calculated and specified as an input, enabling use of the method on data from different instruments).
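A one-dimensional toy of the approach (the response matrix, kernel and noise level are illustrative assumptions, not SAMI's actual response): model the true signal as a Gaussian process, push its covariance through the instrument response, and read off the posterior mean and covariance of the signal.

```python
import numpy as np

rng = np.random.default_rng(8)

n = 60
x = np.linspace(0, 1, n)
truth = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)  # a narrow emission feature

# Instrument response: Gaussian blur rows, normalised to unit sum.
R = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.03) ** 2)
R /= R.sum(axis=1, keepdims=True)

noise_sd = 0.01
data = R @ truth + rng.normal(0, noise_sd, n)   # what the instrument records

# GP prior on the true signal; its covariance transformed by the response
# gives the covariance of the data, and standard Gaussian conditioning
# gives the reconstruction with uncertainties.
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.05) ** 2)
S = R @ K @ R.T + noise_sd ** 2 * np.eye(n)     # data covariance
gain = K @ R.T @ np.linalg.solve(S, np.eye(n))
recon = gain @ data                              # posterior mean of the signal
recon_cov = K - gain @ R @ K                     # posterior covariance
```

Because the response enters only through the matrix R, swapping instruments amounts to swapping that input, which is the portability claim above.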

Research focus

  • Scientific Discovery
  • Data Fusion

Research techniques

  • Bayesian non-parametrics
  • Statistical modeling
  • Missingness

Finished outputs

[Internal] Outputs in progress

  • Code for general IFUs
  • Methods paper (led by RS; ETA April 2019)
  • Possibly separate SAMI data release paper applying new algorithm (led by TBD)