Bio2E Workshops

• The Bio2E facility offers training and advice in Biostatistics, Bioinformatics and Experimental Design.
• Workshops are free with Bio2E facility membership; however, registration is still required.
• Non-members can also register for individual workshops.

Workshops History (2019)

This series of four workshops provides research students with a theoretical and practical introduction to statistical analysis using GraphPad Prism software. No previous experience with the software is assumed.

Workshop 1: Variables and Distributions

• Understand the role of statistics in empirical research
• Understand the key concepts of variables and distributions
• Be able to create data sets in Prism format
• Be able to conduct exploratory data analysis in Prism
Some tools and concepts covered include:
• Population and Sample statistics
• The Central Limit Theorem
• Roles and Types of variables
• Scatterplot and boxplot
• Tests for normality
• Outlier analysis
• Lognormal distribution
• Binomial and Poisson distributions

Workshop 2: Statistical Tests for Two-Variable Experiments

• Understand what questions statistical analysis can answer
• Explore how to analyse a wide variety of two-variable experiments
Some tools and concepts covered include:
• Hypothesis testing and P values
• Type I and Type II errors
• Parametric vs non-parametric tests
• Binomial and Poisson tests
• Chi-square and Fisher’s exact test
• McNemar’s matched pairs test
• Wilcoxon matched pairs signed rank test

Workshop 3: Regression and ANOVA

• Correlation and Regression
• Correlation and Causation
• Non-linear regression
• Experimental Design
• One-way ANOVA
• Pairwise comparisons after ANOVA
• Correcting for multiple comparisons
• Non-parametric ANOVA
• Repeated Measures ANOVA
• Two-way ANOVA

Workshop 4: Significance vs Importance

• Significance vs Importance
• Point Estimation
• Interval estimation for parameters
• Interval estimation for Effect Sizes
• Standardised effect sizes
• Presenting and reporting your results
• Sample size and power calculations with G*Power
• Two-way ANOVA in SPSS
• Resources

Workshops History (2018)

Experimental Design and Introduction to Power Calculation

This workshop provides research students with a theoretical and practical introduction to experimental design concepts. The implications for choice of statistical software will be explained so that researchers can obtain and learn how to use the appropriate software tools.

Power and sample size calculations will also be introduced. Some simple experimental scenarios will be examined using the free G*Power software.

No experience with the software is assumed.

Meta-analysis for Systematic Reviews using MIX 2.0

A systematic review answers a defined research question by examining all the available evidence that fits specified criteria. A meta-analysis is a rigorous statistical approach to combining and analysing the chosen evidence.

In this workshop we will be examining the process of performing a meta-analysis, focusing in particular on key statistical concepts such as Fixed and Random effects modelling.

Example data will be explored and analysed using the MIX 2.0 Lite program within Microsoft Excel. MIX 2.0 runs on Windows OS (but not Mac OS).
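As a rough illustration of the difference between fixed and random effects modelling, the sketch below pools study results by inverse-variance weighting. It is written in Python rather than MIX 2.0, the function names are hypothetical, and the between-study variance is estimated with the DerSimonian-Laird method, which is an assumption made for this example and not necessarily the method used in the workshop.

```python
def pool_fixed(effects, variances):
    """Fixed-effect pooled estimate and its variance (inverse-variance weights)."""
    weights = [1 / v for v in variances]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return estimate, 1 / sum(weights)


def pool_random(effects, variances):
    """Random-effects pooled estimate, its variance and tau^2.

    tau^2 (between-study variance) uses the DerSimonian-Laird estimator,
    chosen here only for illustration.
    """
    k = len(effects)
    weights = [1 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Each study's variance is inflated by tau^2 before weighting
    star = [1 / (v + tau2) for v in variances]
    estimate = sum(w * y for w, y in zip(star, effects)) / sum(star)
    return estimate, 1 / sum(star), tau2
```

When the studies disagree more than their sampling error would explain, tau^2 is positive and the random-effects estimate has a larger variance than the fixed-effect estimate.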

Linear Models and Generalised Linear Models using SPSS

Linear models may be known by many names: linear regression, multivariate analysis, general linear model, etc. Whatever the name, they can often provide powerful statistical models in an experimental context.

This workshop will explain some of the features and differences of different linear and generalised linear models by using worked examples in SPSS.

Power and Sample Size calculation using G*Power

A power and sample size calculation should be one of the first steps in any experimental plan. It acts as an important check on whether the experiment can realistically achieve its stated aims. It is often a mandatory component of grant applications and ethics approvals.

This workshop will use examples in G*Power to cover a wide range of common power calculation scenarios.
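To give a feel for what such a calculation involves, here is a minimal sketch in Python (not G*Power) of the per-group sample size for comparing two means. It assumes a two-sided test, equal group sizes and a normal approximation. The function name and default values are chosen for illustration only, and exact t-based calculations typically give slightly larger values.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, given a standardised effect size (Cohen's d)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# A medium effect (d = 0.5) needs far more subjects than a large one (d = 1.0)
print(n_per_group(0.5), n_per_group(1.0))
```

Halving the effect size roughly quadruples the required sample size, which is why an honest estimate of the expected effect matters so much at the planning stage.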

Experimental Design

Good statistical analysis of experimental data starts with a good experimental design. A properly designed experiment will save time and effort later, potentially allowing for simpler statistical analysis, a more coherent story, and more powerful inferences. This workshop will look at how to prepare, including: what type of experimental design to use; how to perform a power analysis; how to organise your data; and what type of software and statistical tests might be appropriate later on.

This series of four workshops provides research students with a theoretical and practical introduction to statistical analysis using GraphPad Prism software. No previous experience with the software is assumed.

Workshop 1: Types of Variables and Distributions

Understanding the type of variable and its distribution is fundamental in choosing the correct statistical test. In this workshop, we will examine the different types of variables contained in an experimental data set. We will look at plotting/determining the distribution of continuous variables (normal, lognormal and skewed), outlier analysis and paired data. Other common distributions, binomial and Poisson, will be discussed as well as the application of the central limit theorem. The appropriate descriptive statistics for particular types of variables and distributions will be outlined. The aim of the workshop is also for the attendee to become proficient at formatting data tables of different types, performing analyses and creating graphs and layouts.

Workshop 2: Choosing a Statistical Test

This workshop covers the underlying concepts involved in the application of statistical tests including hypotheses, significance, Type I/II errors and P-values. It will focus on experimental designs involving a categorical variable with two groups where the outcome is either numeric or categorical. The aim is to understand the steps in deciding upon an appropriate statistical test and develop strategies for when the choice isn’t clear (e.g. unable to test assumptions because of small sample size). Topics covered include: t test, Mann-Whitney test, parametric vs non-parametric tests, paired data, Fisher’s exact test, the chi-square test, the chi-square test for trend, and applying the central limit theorem when choosing a statistical test. We will also look at analyses involving two numeric variables: regression and correlation.

Workshop 3: Sources of Variation

We will look at the sources of variation in laboratory and clinical research. Many studies are designed to examine the variation in response to an intervention or treatment; however, variation from other sources can obscure the effect or produce a misleading result. Experimental designs and considerations used to improve the power and validity of a study will be discussed. Comparing the variation within groups to the variation between groups is the basis of ANOVA, a commonly used statistical test for identifying differences between more than two groups. This workshop also looks at the different approaches for identifying differences between more than two groups and the challenges of multiple comparisons. We will explore more complex types of analyses, e.g. multivariate analysis, and the software needed to perform them.
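The comparison of between-group and within-group variation can be sketched in a few lines of Python. This is an illustration of the one-way ANOVA F statistic only, with a hypothetical function name and made-up data. It does not compute a P value, which Prism or another statistics package would report.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group mean square
    divided by within-group mean square."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up example data: three groups of three measurements
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)
```

A large F value means the group means differ by more than the scatter within the groups would suggest.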

Workshop 4: Significance vs Importance

Hypothesis testing is an important part of statistical analysis; however, estimation can provide valuable information about the size of an effect. This workshop will focus on the estimation of population parameters from a sample and the calculation and interpretation of confidence intervals. Understanding effect sizes is also useful for determining sample sizes, and we will look at the theory and practice of calculating the power of a study. Presentation of quantitative data and statistical analysis in a thesis or paper will also be discussed.
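As a small illustration of estimation, the Python sketch below computes a confidence interval for a mean and a standardised effect size (Cohen's d with a pooled standard deviation). The function names are hypothetical. The interval uses a normal approximation, which is an assumption that suits larger samples, and small samples would normally use a t quantile instead.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(sample, level=0.95):
    """Approximate confidence interval for a population mean
    (normal approximation, so best suited to larger samples)."""
    m = mean(sample)
    se = stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m - z * se, m + z * se


def cohens_d(a, b):
    """Standardised difference between two group means (pooled SD)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)
```

The interval conveys how precisely the mean has been estimated, while Cohen's d expresses the size of a difference in units of standard deviation, independent of the sample size.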

Workshops from previous years

Bioinformatics week

The programming language R is a powerful and open-source tool for Bioinformatics analysis. This series of workshops provides an introduction to R and its use in analysis of gene expression in Bioconductor packages such as limma and DESeq2.

Introduction to R

The statistical and graphical programming language, R, is widely used because of its power, versatility and free access. This workshop provides a practical introduction to using R/RStudio. Using the base packages, we will import data sets, perform statistical analyses and write functions. Another aim is to understand the basic data structures in R and how they relate to some of the more specialized data structures in genomics analysis. One of the advantages of R is that it is easily extensible through the installation of additional packages, providing a huge range of statistics and bioinformatics analysis and graphing options. We will use one of the most popular packages, markdown, to document your analysis.

Microarray Analysis with Open Source Tools

This workshop focuses on the analysis of oligonucleotide array data, from quality-control metrics to identifying differentially-expressed genes.
We will be using some tools from Bioconductor and some previous experience with R is assumed. The workshop will cover:
• Generation and interpretation of QC metrics
• Normalisation and summarisation of expression data
• Methods and statistical challenges for identifying differentially-expressed genes
• Design of microarray experiments

Gene Profiling and Discovery with RNA-seq

Gene expression analysis with RNA-seq is sensitive, accurate and versatile, but it can also seem daunting to those accustomed to working with RT-qPCR or oligonucleotide arrays. This workshop provides an overview of the technology and the considerations in designing a study, with a particular focus on using RNA-seq to study differential gene expression. The steps in the data analysis and how to perform them with open source tools will be covered. Some statistical approaches for identifying the differentially-expressed genes will be outlined. The workshop includes some data analysis using the DESeq2, edgeR and limma packages in Bioconductor/R. The aim is that participants will obtain an overview of performing an RNA-seq analysis, from selecting an appropriate sample preparation method and sequencing protocol, to understanding the statistical approaches for identifying differentially-expressed genes.

From Gene Lists to Pathways

A guide to understanding the biological context of differentially-expressed genes or proteins, with the use of open source tools and Ingenuity Pathways Analysis.

This workshop covers annotation of gene lists with up-to-date functional information. The statistical approaches used to understand whether Gene Ontology (GO) terms or metabolic pathways are overrepresented in data sets will be discussed. We will use open source tools to annotate genes and investigate the involvement of GO terms and pathways. The features of a commercial program available to Sydney University researchers, Ingenuity Pathways Analysis, will also be outlined and compared to the open source tools. Gene Set Enrichment Analysis, as an alternative to working with lists of genes obtained with an arbitrary threshold, will also be discussed.
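One common way to test whether a GO term is overrepresented in a gene list is a one-sided hypergeometric test, sketched below in Python. The function and parameter names are hypothetical, and this is an illustration of the general idea, not the specific method used by any of the tools mentioned above.

```python
from math import comb


def enrichment_p_value(n_universe, n_annotated, n_selected, n_overlap):
    """Probability of seeing at least n_overlap annotated genes in a list of
    n_selected genes drawn at random from n_universe genes, of which
    n_annotated carry the GO term (upper tail of the hypergeometric)."""
    total = comb(n_universe, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes in total, 5 carry the term, and all 5
# genes in the selected list carry it
print(enrichment_p_value(10, 5, 5, 5))
```

Because many terms are usually tested at once, the resulting P values would normally be corrected for multiple comparisons before interpretation.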