# Table 1: Statistics

| Unit of study | Credit points | A: Assumed knowledge P: Prerequisites C: Corequisites N: Prohibition | Session |
|---|---|---|---|
| **Statistics** | | | |
| For a major in Statistics, the minimum requirement is 24 credit points from senior units of study listed below. | | | |
| **Junior units of study** | | | |
| DATA1001 Foundations of Data Science | 6 | **N** DATA1901 or MATH1005 or MATH1905 or MATH1015 or MATH1115 or ENVX1001 or ENVX1002 or ECMT1010 or BUSS1020 or STAT1021 or STAT1022 | Semester 1, Semester 2 |
| DATA1901 Foundations of Data Science (Adv) | 6 | **A** An ATAR of 95 or more **N** MATH1905 or ECMT1010 or ENVX2001 or BUSS1020 or DATA1001 or MATH1115 | Semester 1, Semester 2 |
| **Intermediate units of study** | | | |
| DATA2002 Data Analytics: Learning from Data | 6 | **A** Basic Linear Algebra and some coding **P** [DATA1001 or ENVX1001 or ENVX1002] or [MATH10X5 and MATH1115] or [MATH10X5 and STAT2011] or [MATH1905 and MATH1XXX (except MATH1XX5)] or [BUSS1020 or ECMT1010 or STAT1021] **N** STAT2012 or STAT2912 or DATA2902 | Semester 2 |
| DATA2902 Data Analytics: Learning from Data (Adv) | 6 | **A** Basic linear algebra and some coding for example MATH1014 or MATH1002 or MATH1902 and DATA1001 or DATA1901 **P** A mark of 65 or above in any of the following (DATA1001 or DATA1901 or ENVX1001 or ENVX1002) or (MATH10X5 and MATH1115) or (MATH10X5 and STAT2011) or (MATH1905 and MATH1XXX [except MATH1XX5]) or (QBUS1020 or ECMT1020 or STAT1021) **N** STAT2012 or STAT2912 or DATA2002 | Semester 2 |
| STAT2011 Probability and Estimation Theory | 6 | **P** (MATH1X21 or MATH1931 or MATH1X01 or MATH1906 or MATH1011) and (DATA1X01 or MATH10X5 or MATH1905 or STAT1021 or ECMT1010 or BUSS1020) **N** STAT2911 | Semester 1 |
| STAT2911 Probability and Statistical Models (Adv) | 6 | **P** (MATH1X21 or MATH1931 or MATH1X01 or MATH1906 or MATH1011) and a mark of 65 or greater in (DATA1X01 or MATH10X5 or MATH1905 or STAT1021 or ECMT1010 or BUSS1020) **N** STAT2011 | Semester 1 |
| **Senior units of study** | | | |
| STAT3021 Stochastic Processes | 6 | **P** STAT2X11 and (MATH1003 or MATH1903 or MATH1907 or MATH1023 or MATH1923 or MATH1933) **N** STAT3911 or STAT3011 | Semester 1 |
| STAT3022 Applied Linear Models | 6 | **P** STAT2X11 and (DATA2X02 or STAT2X12) **N** STAT3912 or STAT3012 or STAT3922 | Semester 1 |
| STAT3922 Applied Linear Models (Advanced) | 6 | **P** STAT2X11 and [a mark of 65 or greater in (STAT2X12 or DATA2X02)] **N** STAT3912 or STAT3012 or STAT3022 | Semester 1 |
| STAT3023 Statistical Inference | 6 | **A** DATA2X02 or STAT2X12 **P** STAT2X11 **N** STAT3913 or STAT3013 or STAT3923 | Semester 2 |
| STAT3923 Statistical Inference (Advanced) | 6 | **P** STAT2X11 and a mark of 65 or greater in (DATA2X02 or STAT2X12) **N** STAT3913 or STAT3013 or STAT3023 | Semester 2 |
| STAT3888 Statistical Machine Learning | 6 | **A** STAT3012 or STAT3912 or STAT3022 or STAT3922 **P** STAT2X11 and (DATA2X02 or STAT2X12) **N** STAT3914 or STAT3014 | Semester 2 |
| STAT3911 Stochastic Processes and Time Series Adv | 6 | **P** (STAT2911 or a mark of 65 or above in STAT2011) and (MATH1X03 or MATH1907 or MATH1X23 or MATH1933) **N** STAT3011 or STAT3905 or STAT3005 or STAT3003 or STAT3903 | Semester 1 |
| STAT3914 Applied Statistics Advanced | 6 | **A** STAT3012 or STAT3912 or STAT3022 or STAT3922 **P** STAT2912 or (a mark of 65 or above in STAT2012 or DATA2002) **N** STAT3014 or STAT3907 or STAT3902 or STAT3006 or STAT3002 | Semester 2 |
| ENVX3002 Statistics in the Natural Sciences | 6 | **P** ENVX2001 or BIOM2001 or STAT2X12 or BIOL2X22 or DATA2002 or QBIO2001. Note: Interdisciplinary Unit | Semester 1 |

### Statistics

For a major in Statistics, the minimum requirement is 24 credit points from senior units of study listed below.

##### Junior units of study

**DATA1001 Foundations of Data Science**

Credit points: 6 Teacher/Coordinator: A/Prof Qiying Wang Session: Semester 1, Semester 2 Classes: 3x1-hr lectures; 1x2-hr lab/wk Prohibitions: DATA1901 or MATH1005 or MATH1905 or MATH1015 or MATH1115 or ENVX1001 or ENVX1002 or ECMT1010 or BUSS1020 or STAT1021 or STAT1022 Assessment: RQuizzes (10%); 3 x projects (30%); final exam (60%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

DATA1001 is a foundational unit in the Data Science major. The unit focuses on developing critical and statistical thinking skills for all students. Does mobile phone usage increase the incidence of brain tumours? What is the public's attitude to shark baiting following a fatal attack? Statistics is the science of decision making: it is essential in every industry and undergirds all research that relies on data. Students will use problems and data from the physical, health, life and social sciences to develop adaptive problem-solving skills in a team setting. Taught interactively with embedded technology, DATA1001 develops critical thinking and skills to problem-solve with data. It is the prerequisite for DATA2002.

Textbooks

Statistics (4th Edition), Freedman, Pisani and Purves (2007)

**DATA1901 Foundations of Data Science (Adv)**

Credit points: 6 Teacher/Coordinator: A/Prof Qiying Wang Session: Semester 1, Semester 2 Classes: Lecture 3 hrs/week + Computer lab 2 hr/week Prohibitions: MATH1905 or ECMT1010 or ENVX2001 or BUSS1020 or DATA1001 or MATH1115 Assumed knowledge: An ATAR of 95 or more Assessment: RQuizzes (10%), Projects (30%), Final Exam (60%). Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

DATA1901 is an advanced-level unit (matching DATA1001) that is foundational to the new major in Data Science. The unit focuses on developing critical and statistical thinking skills for all students. Does mobile phone usage increase the incidence of brain tumours? What is the public's attitude to shark baiting following a fatal attack? Statistics is the science of decision making: it is essential in every industry and undergirds all research that relies on data. Students will use problems and data from the physical, health, life and social sciences to develop adaptive problem-solving skills in a team setting. Taught interactively with embedded technology and masterclasses, DATA1901 develops critical thinking and skills to problem-solve with data at an advanced level. By completing this unit you will have an excellent foundation for pursuing data science, whether directly through the data science major, or indirectly in whatever field you major in. The advanced unit covers the same overall concepts as the regular unit, but the material is discussed in a manner that offers a greater level of challenge and academic rigour.

Textbooks

All learning materials will be on Canvas. In addition, the textbook is Statistics (4th Edition), Freedman, Pisani, and Purves (2007), which is available in 3 forms: 1) E-text $65 (www.wileydirect.com.au/buy/statistics-4th-international-student-edition/), 2) hard copy (Co-op Bookshop), and 3) the Library.

##### Intermediate units of study

**DATA2002 Data Analytics: Learning from Data**

Credit points: 6 Teacher/Coordinator: A/Prof Jennifer Chan Session: Semester 2 Classes: 3x1-hr lecture; 1x2-hr computer laboratory/wk Prerequisites: [DATA1001 or ENVX1001 or ENVX1002] or [MATH10X5 and MATH1115] or [MATH10X5 and STAT2011] or [MATH1905 and MATH1XXX (except MATH1XX5)] or [BUSS1020 or ECMT1010 or STAT1021] Prohibitions: STAT2012 or STAT2912 or DATA2902 Assumed knowledge: Basic Linear Algebra and some coding Assessment: Computer practicals (10%), online quizzes (15%), group work assignment and presentation (15%), and final exam (60%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

Technological advances in science, business and engineering have given rise to a proliferation of data from all aspects of our life. Understanding the information presented in these data is critical as it enables informed decision making in many areas including market intelligence and science. DATA2002 is an intermediate unit in statistics and data sciences, focusing on learning data analytic skills for a wide range of problems and data. How should the Australian government measure and report employment and unemployment? Can we tell the difference between decaffeinated and regular coffee? In this unit, you will learn how to ingest, combine and summarise data from a variety of data models which are typically encountered in data science projects, as well as reinforcing your programming skills through experience with a statistical programming language. You will also be exposed to the concept of statistical machine learning and develop the skill to analyse various types of data in order to answer a scientific question. From this unit, you will develop knowledge and skills that will enable you to embrace data analytic challenges stemming from everyday problems.

**DATA2902 Data Analytics: Learning from Data (Adv)**

Credit points: 6 Teacher/Coordinator: A/Prof Jennifer Chan Session: Semester 2 Classes: Lecture 3 hrs/week + computer tutorial 2 hr/week Prerequisites: A mark of 65 or above in any of the following (DATA1001 or DATA1901 or ENVX1001 or ENVX1002) or (MATH10X5 and MATH1115) or (MATH10X5 and STAT2011) or (MATH1905 and MATH1XXX [except MATH1XX5]) or (QBUS1020 or ECMT1020 or STAT1021) Prohibitions: STAT2012 or STAT2912 or DATA2002 Assumed knowledge: Basic linear algebra and some coding for example MATH1014 or MATH1002 or MATH1902 and DATA1001 or DATA1901 Assessment: Computer practicals in-class (10%), Online quizzes (15%), Project group work assignment (5%), Project group work presentation (10%), Final exam (60%). Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

Technological advances in science, business, and engineering have given rise to a proliferation of data from all aspects of our life. Understanding the information presented in these data is critical as it enables informed decision making in many areas including market intelligence and science. DATA2902 is an intermediate unit in statistics and data sciences, focusing on learning advanced data analytic skills for a wide range of problems and data. How should the Australian government measure and report employment and unemployment? Can we tell the difference between decaffeinated and regular coffee? In this unit, you will learn how to ingest, combine and summarise data from a variety of data models which are typically encountered in data science projects, as well as reinforcing your programming skills through experience with a statistical programming language. You will also be exposed to the concept of statistical machine learning and develop the skill to analyse various types of data in order to answer a scientific question. From this unit, you will develop knowledge and skills that will enable you to embrace data analytic challenges stemming from everyday problems.

**STAT2011 Probability and Estimation Theory**

Credit points: 6 Teacher/Coordinator: A/Prof Jennifer Chan Session: Semester 1 Classes: 3x1-hr lectures; 1x1-hr tutorial; and 1x1-hr computer lab/wk Prerequisites: (MATH1X21 or MATH1931 or MATH1X01 or MATH1906 or MATH1011) and (DATA1X01 or MATH10X5 or MATH1905 or STAT1021 or ECMT1010 or BUSS1020) Prohibitions: STAT2911 Assessment: 2 x quizzes (25%); weekly computer practical reports (10%); a 1-hr computer exam in week 13 (15%); and a final 2-hr exam (50%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

This unit provides an introduction to probability, the concept of random variables, special distributions (including the Binomial, Hypergeometric, Poisson, Normal, Geometric and Gamma), and statistical estimation. This unit will investigate univariate techniques in data analysis and the most common statistical distributions that are used to model patterns of variability. You will learn the method of moments and maximum likelihood techniques for fitting statistical distributions to data. The unit will have weekly computer classes where you will learn to use a statistical computing package to perform simulations and carry out computer intensive estimation techniques like the bootstrap method. By doing this unit you will develop your statistical modeling skills and prepare yourself to learn more complicated statistical models.

Textbooks

An Introduction to Mathematical Statistics and Its Applications (5th edition), Chapters 1-5, Larsen and Marx (2012)

**STAT2911 Probability and Statistical Models (Adv)**

Credit points: 6 Teacher/Coordinator: A/Prof Jennifer Chan Session: Semester 1 Classes: 3x1-hr lectures; 1x1-hr tutorial; and 1x1-hr computer lab/wk Prerequisites: (MATH1X21 or MATH1931 or MATH1X01 or MATH1906 or MATH1011) and a mark of 65 or greater in (DATA1X01 or MATH10X5 or MATH1905 or STAT1021 or ECMT1010 or BUSS1020) Prohibitions: STAT2011 Assessment: 2 x quizzes (10%); 2 x assignments (5%); computer work (5%); weekly computer lab reports (5%); a computer lab exam (10%) and a final 2-hr exam (70%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

This unit is essentially an advanced version of STAT2011, with an emphasis on the mathematical techniques used to manipulate random variables and probability models. Common distributions including the Poisson, normal, beta and gamma families as well as the bivariate normal are introduced. Moment generating functions and convolution methods are used to understand the behaviour of sums of random variables. The method of moments and maximum likelihood techniques for fitting statistical distributions to data will be explored. The notions of conditional expectation and prediction will be covered, as will distributions related to the normal: χ², t and F. The unit will have weekly computer classes where candidates will learn to use a statistical computing package to perform simulations and carry out computer intensive estimation techniques like the bootstrap method.

Textbooks

Mathematical Statistics and Data Analysis (3rd edition), J A Rice

##### Senior units of study

**STAT3021 Stochastic Processes**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 1 Classes: 3 lectures per week, tutorial 1hr per week. Prerequisites: STAT2X11 and (MATH1003 or MATH1903 or MATH1907 or MATH1023 or MATH1923 or MATH1933) Prohibitions: STAT3911 or STAT3011 Assessment: 2 x Quiz (2 x 15%), 2 x Assignment (2 x 5%), Final Exam (60%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

A stochastic process is a mathematical model of time-dependent random phenomena and is employed in numerous fields of application, including economics, finance, insurance, physics, biology, chemistry and computer science. After setting up basic elements of stochastic processes, such as time, state, increments, stationarity and the Markovian property, this unit develops important properties and limit theorems of discrete-time Markov chains and branching processes. You will then establish key results for the Poisson process and continuous-time Markov chains, such as the memoryless property, superposition, thinning, Kolmogorov's equations and limiting probabilities. Various illustrative examples are provided throughout the unit to demonstrate how stochastic processes can be applied in modeling and analyzing problems of practical interest. By completing this unit, you will develop the essential basis for further studies, such as stochastic calculus, stochastic differential equations, stochastic control and financial mathematics.

**STAT3022 Applied Linear Models**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 1 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: STAT2X11 and (DATA2X02 or STAT2X12) Prohibitions: STAT3912 or STAT3012 or STAT3922 Assessment: 2 x assignment (15%), 3 x quizzes (30%), final exam (55%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

In today's data-rich world more and more people from diverse fields are needing to perform statistical analyses and indeed more and more tools for doing so are becoming available; it is relatively easy to point and click and obtain some statistical analysis of your data. But how do you know if any particular analysis is indeed appropriate? Is there another procedure or workflow which would be more suitable? Is there such a thing as a best possible approach in a given situation? All of these questions (and more) are addressed in this unit. You will study the foundational core of modern statistical inference, including classical and cutting-edge theory and methods of mathematical statistics with a particular focus on various notions of optimality. The first part of the unit covers various aspects of distribution theory which are necessary for the second part which deals with optimal procedures in estimation and testing. The framework of statistical decision theory is used to unify many of the concepts. You will apply the theory to various real-world problems using statistical software in laboratory sessions. By completing this unit you will develop the necessary skills to confidently choose the best statistical analysis to use in many situations.

**STAT3922 Applied Linear Models (Advanced)**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 1 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: STAT2X11 and [a mark of 65 or greater in (STAT2X12 or DATA2X02)] Prohibitions: STAT3912 or STAT3012 or STAT3022 Assessment: 2 x assignment (10%), 3 x quizzes (35%), final exam (55%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

This unit will introduce the fundamental concepts of analysis of data from both observational studies and experimental designs using classical linear methods, together with concepts of collection of data and design of experiments. You will first consider linear models and regression methods with diagnostics for checking appropriateness of models, looking briefly at robust regression methods. Then you will consider the design and analysis of experiments considering notions of replication, randomization and ideas of factorial designs. Throughout the course you will use the R statistical package to give analyses and graphical displays. This unit is essentially an Advanced version of STAT3012, with additional emphasis on the mathematical techniques underlying applied linear models together with proofs of distribution theory based on vector space methods.

**STAT3023 Statistical Inference**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 2 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: STAT2X11 Prohibitions: STAT3913 or STAT3013 or STAT3923 Assumed knowledge: DATA2X02 or STAT2X12 Assessment: 2 x Quizzes (25%), Computer Lab Report (10%), Computer Exam (10%), Final Exam (55%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

In today's data-rich world more and more people from diverse fields are needing to perform statistical analyses and indeed more and more tools for doing so are becoming available; it is relatively easy to point and click and obtain some statistical analysis of your data. But how do you know if any particular analysis is indeed appropriate? Is there another procedure or workflow which would be more suitable? Is there such a thing as the best possible approach in a given situation? All of these questions (and more) are addressed in this unit. You will study the foundational core of modern statistical inference, including classical and cutting-edge theory and methods of mathematical statistics with a particular focus on various notions of optimality. The first part of the unit covers various aspects of distribution theory which are necessary for the second part which deals with optimal procedures in estimation and testing. The framework of statistical decision theory is used to unify many of the concepts. You will apply the methods learnt to real-world problems in laboratory sessions. By completing this unit you will develop the necessary skills to confidently choose the best statistical analysis to use in many situations.

**STAT3923 Statistical Inference (Advanced)**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 2 Classes: Three 1 hour lectures, one 1 hour tutorial and one 2 hour advanced workshop. Prerequisites: STAT2X11 and a mark of 65 or greater in (DATA2X02 or STAT2X12) Prohibitions: STAT3913 or STAT3013 or STAT3023 Assessment: 2 x Quizzes (20%), weekly homework (5%), Computer Lab Reports (10%), Computer Exam (10%), Final Exam (55%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

In today's data-rich world more and more people from diverse fields are needing to perform statistical analyses and indeed more and more tools for doing so are becoming available; it is relatively easy to point and click and obtain some statistical analysis of your data. But how do you know if any particular analysis is indeed appropriate? Is there another procedure or workflow which would be more suitable? Is there such a thing as a best possible approach in a given situation? All of these questions (and more) are addressed in this unit. You will study the foundational core of modern statistical inference, including classical and cutting-edge theory and methods of mathematical statistics with a particular focus on various notions of optimality. The first part of the unit covers various aspects of distribution theory which are necessary for the second part which deals with optimal procedures in estimation and testing. The framework of statistical decision theory is used to unify many of the concepts. You will rigorously prove key results and apply these to real-world problems in laboratory sessions. By completing this unit you will develop the necessary skills to confidently choose the best statistical analysis to use in many situations.

**STAT3888 Statistical Machine Learning**

Credit points: 6 Teacher/Coordinator: Dr John Ormerod Session: Semester 2 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: STAT2X11 and (DATA2X02 or STAT2X12) Prohibitions: STAT3914 or STAT3014 Assumed knowledge: STAT3012 or STAT3912 or STAT3022 or STAT3922 Assessment: Written exam (40%), major project (50%), computer labs (10%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

Data Science is an emerging and inherently interdisciplinary field. A key set of skills in this area falls under the umbrella of Statistical Machine Learning methods. This unit presents the opportunity to bring together the concepts and skills you have learnt from a Statistics or Data Science major, and apply them to a joint project with NUTM3888 where Statistics and Data Science students will form teams with Nutrition students to solve a real-world problem using Statistical Machine Learning methods. The unit will cover a wide breadth of cutting-edge supervised and unsupervised learning methods, including principal component analysis, multivariate tests, discriminant analysis, Gaussian graphical models, log-linear models, classification trees, k-nearest neighbors, k-means clustering, hierarchical clustering, and logistic regression. In this unit, you will continue to understand and explore disciplinary knowledge, while also meeting and collaborating through project-based learning; identifying and solving problems, analysing data and communicating your findings to a diverse audience. All such skills are highly valued by employers. This unit will foster the ability to work in an interdisciplinary team, which is essential for both professional and research pathways in the future.

**STAT3911 Stochastic Processes and Time Series Adv**

Credit points: 6 Session: Semester 1 Classes: Three 1 hour lectures, one 1 hour tutorial per week, plus an extra 1 hour lecture per week on advanced material in the first half of the semester. Seven 1 hour computer laboratories (on time series) in the second half of the semester (one 1 hour class per week). Prerequisites: (STAT2911 or a mark of 65 or above in STAT2011) and (MATH1X03 or MATH1907 or MATH1X23 or MATH1933) Prohibitions: STAT3011 or STAT3905 or STAT3005 or STAT3003 or STAT3903 Assessment: One 2 hour exam, assignments and/or quizzes, and computer practical reports (100%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

This is an Advanced version of STAT3011. There will be 3 lectures in common with STAT3011. In addition to STAT3011 material, theory on branching processes and Brownian bridges will be covered.

**STAT3914 Applied Statistics Advanced**

Credit points: 6 Session: Semester 2 Classes: Three 1 hour lectures and one 1 hour computer laboratory per week plus an extra hour each week which will alternate between lectures and tutorials. Prerequisites: STAT2912 or (a mark of 65 or above in STAT2012 or DATA2002) Prohibitions: STAT3014 or STAT3907 or STAT3902 or STAT3006 or STAT3002 Assumed knowledge: STAT3012 or STAT3912 or STAT3022 or STAT3922 Assessment: Written exam (40%), major project (50%), computer labs (10%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

This unit is an Advanced version of STAT3014. There will be 3 lectures per week in common with STAT3014. The unit will have extra lectures focusing on multivariate distribution theory developing results for the multivariate normal, partial correlation, the Wishart distribution and Hotelling's T^2. There will also be more advanced tutorial and assessment work associated with this unit.

**ENVX3002 Statistics in the Natural Sciences**

Credit points: 6 Teacher/Coordinator: A/Prof Peter Thomson Session: Semester 1 Classes: One 2-hour workshop per week, one 3-hour computer practical per week Prerequisites: ENVX2001 or BIOM2001 or STAT2X12 or BIOL2X22 or DATA2002 or QBIO2001 Assessment: One computer-based exam during the exam period (50%), assessment tasks focusing on analysing and interpreting real datasets (50%) Campus: Camperdown/Darlington, Sydney Mode of delivery: Normal (lecture/lab/tutorial) day

Note: Interdisciplinary Unit

This unit of study is designed to introduce students to the analysis of data they may face in their future careers, in particular data that are not well behaved. The data may be non-normal, there may be missing observations, they may be correlated in space and time or too numerous to analyse with standard models. The unit is presented in an applied context with an emphasis on correctly analysing authentic datasets, and interpreting the output. It begins with the analysis and design of experiments based on the general linear model. In the second part, students will learn about the generalisation of the general linear model to accommodate non-normal data with a particular emphasis on the binomial and Poisson distributions. In the third part, linear mixed models will be introduced, which provide the means to analyse datasets that do not meet the assumptions of independent and equal errors, for example data that are correlated in space and time. The unit ends with an introduction to machine learning and predictive modelling. A key feature of the unit is using R to develop coding skills that are becoming essential in science for processing and analysing datasets of ever increasing size.