6  Non-parametric Tests

Week 6 - Non-parametric tests and an introduction to the analysis of two-way tables of counts.

In this workshop, you will learn about non-parametric tests and some techniques for analysing a two-way table of counts. Workshop materials are available in the GitHub repository ECS200.

We start this week with the analysis of count data.

Analysis of count data focuses on situations where observations are recorded as frequencies in categories, rather than as continuous measurements. In ecology, this often occurs when researchers count the number of individuals, events, or occurrences within defined groups, such as the number of insects on different plant species or the number of animals observed in different habitats.

Our second focus this week is non-parametric tests.

Non-parametric tests are statistical methods that make fewer assumptions about the underlying distribution of the data than traditional parametric tests. Parametric methods, such as the t-test, analysis of variance (ANOVA), or Pearson correlation, typically assume that the response variable follows a normal distribution (or that the errors follow a normal distribution) and that variances are similar among groups (the equal variances assumption). In contrast, non-parametric methods do not require these assumptions and instead typically focus on other approaches such as analysing the ranks of the data rather than the raw values.
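To make the idea of analysing ranks concrete, here is a minimal sketch in base R. The biomass values are made up for illustration and are not workshop data. It shows that an extreme value becomes simply the largest rank, so it no longer dominates the summary.

```r
# Hypothetical biomass values (g) with one extreme outlier; made up for illustration
biomass <- c(2.1, 3.4, 2.8, 3.0, 45.0)

# rank() replaces each value with its position in the sorted data
biomass_ranks <- rank(biomass)
print(biomass_ranks)

# The outlier pulls the mean of the raw values upwards,
# but in the ranks it is just the largest rank (5)
print(mean(biomass))
print(mean(biomass_ranks))
```

Rank-based tests work on values like `biomass_ranks`, which is why they are less sensitive to skew and outliers than tests based on the raw measurements.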

Non-parametric methods are particularly useful when sample sizes are small, when data are strongly skewed, when outliers are present, or when measurements are ordinal rather than continuous. The table below lists four non-parametric approaches to research questions covered in our first-year statistics refresher.

| Method | Data type | Parametric equivalent | Ecology example |
|---|---|---|---|
| Mann–Whitney U Test (Wilcoxon Rank-Sum) | Continuous or ordinal response with two independent groups | Two-sample t-test | Compare plant biomass between grazed and ungrazed grassland plots |
| Kruskal–Wallis Test | Continuous or ordinal response with more than two independent groups | One-way ANOVA | Compare species richness across low, mid, and high elevation sites |
| Wilcoxon Signed-Rank Test | Continuous or ordinal paired observations | Paired t-test | Compare bird abundance at the same wetlands before and after restoration |
| Spearman Rank Correlation | Two continuous or ordinal variables | Pearson correlation | Examine the relationship between elevation and species richness |
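Each of the four tests in the table above is available in base R (the `stats` package). The sketch below shows the corresponding calls on simulated data. All object names and values are hypothetical, chosen only to mirror the ecology examples, and the results carry no biological meaning.

```r
set.seed(42)  # reproducible simulated data (hypothetical, not workshop data)

# Mann-Whitney U / Wilcoxon rank-sum: two independent groups
grazed   <- rexp(10, rate = 1 / 5)   # skewed biomass values
ungrazed <- rexp(10, rate = 1 / 8)
mw <- wilcox.test(grazed, ungrazed)

# Kruskal-Wallis: more than two independent groups
richness  <- c(rpois(8, 12), rpois(8, 9), rpois(8, 5))
elevation <- factor(rep(c("low", "mid", "high"), each = 8))
kw <- kruskal.test(richness ~ elevation)

# Wilcoxon signed-rank: paired observations on the same units
before <- rexp(10, rate = 1 / 20)
after  <- before + rnorm(10, mean = 3, sd = 2)
sr <- wilcox.test(before, after, paired = TRUE)

# Spearman rank correlation: two continuous or ordinal variables
elevation_m  <- runif(15, min = 100, max = 2000)
richness_obs <- 30 - 0.01 * elevation_m + rnorm(15, sd = 3)
sp <- cor.test(elevation_m, richness_obs, method = "spearman")

print(mw)
print(kw)
print(sr)
print(sp)
```

Note that the same function, `wilcox.test()`, performs both the rank-sum and the signed-rank test; the `paired = TRUE` argument switches between them.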

Background

Assessment

For this simulated dataset, we use the green tree frog (Litoria caerulea), a common frog in northern and eastern Australia, as an example. Your study site includes three habitat types:

  • Trees in urban areas
  • Native forests
  • Rocky areas with crevices

Each frog has been sexed.

A green tree frog (Litoria caerulea) at Mt Glorious, Brisbane. Photo by N. Wu.

Research Question: Is there an association between habitat type and sex for green tree frogs?

| Habitat | Male | Female |
|---|---|---|
| Urban trees | 40 | 45 |
| Native forests | 20 | 18 |
| Rocky areas | 12 | 16 |

Your tasks are to:

1. Enter the data into R

  • Create a matrix of the table data from above.
  • Convert the matrix to a data.frame, then reshape it to long format.

2. Visualise the data

  • Hint: Count by habitat type.

3. Assess the assumptions

  • Check ‘Count data: Analysis of two-way tables’ for assessing assumptions.

4. Provide a summary of the model and write an interpretation
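
As a starting point for tasks 1 to 3, here is one possible base R sketch. It is not necessarily the approach used in the workshop materials. The object names (`frog_counts`, `frog_long`, `bar_mids`, `expected`) are hypothetical. The assumption check relies on two assumptions of this sketch: that the referenced section uses a chi-squared test of independence, and that it applies the common rule of thumb that every expected count should be at least 5. Confirm both against 'Count data: Analysis of two-way tables'.

```r
# 1. Enter the counts from the table: rows are habitats, columns are sexes
frog_counts <- matrix(
  c(40, 45,
    20, 18,
    12, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    habitat = c("Urban trees", "Native forests", "Rocky areas"),
    sex     = c("Male", "Female")
  )
)

# Long format: one row per habitat-by-sex combination
frog_long <- as.data.frame(as.table(frog_counts), responseName = "count")
print(frog_long)

# 2. Grouped bar chart of counts by habitat type, one bar per sex
bar_mids <- barplot(
  t(frog_counts), beside = TRUE, legend.text = TRUE,
  xlab = "Habitat type", ylab = "Number of frogs"
)

# 3. Expected counts under independence (assumed rule of thumb: all >= 5)
expected <- chisq.test(frog_counts)$expected
print(round(expected, 1))
print(all(expected >= 5))
```

Task 4 is left to you: run the test on `frog_counts`, then report the test statistic, degrees of freedom, and p-value alongside a plain-language interpretation of the research question.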