To compare the means of a quantitative variable across more than two groups, the ANOVA test is the way to go. Here, I discuss what you should consider when performing an ANOVA in R.
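As a quick sketch of what this looks like in R (the data and group labels below are simulated for illustration):

```r
# Simulated data: a quantitative response measured in three groups
set.seed(1)
df <- data.frame(
  value = c(rnorm(20, mean = 0), rnorm(20, mean = 0), rnorm(20, mean = 2)),
  group = rep(c("A", "B", "C"), each = 20)
)
# One-way ANOVA: does the mean of 'value' differ between the groups?
fit <- aov(value ~ group, data = df)
summary(fit)  # reports the F statistic and its p-value
```

A significant F-test indicates that at least one group mean differs from the others.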
Using statistical tests, it is possible to make a statement about the significance of a set of measurements by calculating a test statistic. If, under the null hypothesis, it is unlikely to obtain a test statistic at least as extreme as the observed value, then the result is significant. For example, at a significance level of 5%, the probability of a false positive test result is bounded by 5%.
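A one-sample t-test in R makes these quantities explicit (the measurements are simulated):

```r
set.seed(2)
x <- rnorm(30, mean = 0.8)  # simulated measurements
res <- t.test(x, mu = 0)    # test whether the mean differs from 0
res$statistic               # the observed test statistic
res$p.value                 # probability of a statistic at least as extreme under H0
res$p.value < 0.05          # significant at the 5% level?
```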
Parametric vs non-parametric tests
There is a multitude of tests for determining statistical significance. These tests fall into two categories: parametric and non-parametric tests. While parametric tests make assumptions about the distribution of the data, non-parametric tests do not rely on such assumptions. For example, the parametric t-test compares the means of two groups and assumes that the data are normally distributed.
The non-parametric Wilcoxon rank-sum test (Mann-Whitney U test), on the other hand, considers the medians of the groups instead. If the assumptions of parametric tests are met, they generally have greater power to detect an effect than non-parametric tests. If the assumptions are violated, however, non-parametric tests should be preferred.
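The two tests can be contrasted directly in R; with simulated normal data, both should detect the (made-up) group difference:

```r
set.seed(3)
g1 <- rnorm(30, mean = 0)  # first group
g2 <- rnorm(30, mean = 2)  # second group, shifted upwards
# Parametric: compares means, assumes normality
t.test(g1, g2)$p.value
# Non-parametric: rank-based, makes no normality assumption
wilcox.test(g1, g2)$p.value
```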
Choosing an appropriate significance test
To find an appropriate statistical test, the structure of the data should be considered. Before starting an analysis, one should ask the following questions:
- How many dependent/independent variables are there?
- What are the types of the variables?
- Are the measurements in some way associated (i.e. matched)?
What is there besides significance?
Once you have found an appropriate test, you may want to look into topics that go beyond mere significance, such as:
- How can I use effect sizes to describe the extent of an effect?
- How can I use power analysis to identify the likelihood that a test detects an effect if it exists?
- How can I interpret measurements using other quantities such as confidence intervals?
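A sketch of these three ideas in base R; note that the effect-size helper `cohens_d` is my own minimal implementation, not a built-in function:

```r
# Power analysis: samples per group needed to detect a difference of
# 0.5 standard deviations with 80% power at the 5% significance level
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.8)

# Effect size: Cohen's d based on a pooled standard deviation
cohens_d <- function(x, y) {
  pooled_var <- ((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
    (length(x) + length(y) - 2)
  (mean(x) - mean(y)) / sqrt(pooled_var)
}

# Confidence interval: t.test reports a 95% CI for the mean by default
set.seed(4)
x <- rnorm(40, mean = 1)
t.test(x)$conf.int
```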
Posts on statistical testing
You can find answers to these questions (and more) in the following posts on statistical testing.
When designing and performing statistical tests it is important to think about type 1 and type 2 errors. In this post, I investigate the impact of the two error types on significance and power, respectively.
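A small simulation illustrates the connection: when the null hypothesis is true, the rate of (false positive) rejections should match the significance level. The simulation setup below is my own illustration:

```r
# Simulate the type 1 error rate: repeatedly test data for which H0 is true
set.seed(7)
pvals <- replicate(2000, t.test(rnorm(20), mu = 0)$p.value)
mean(pvals < 0.05)  # empirical false positive rate, close to 5%
```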
Effect sizes are often overlooked in favor of significance. Here, you will learn why effect sizes are important and how they can be computed using R.
McNemar's test is a simple test for checking whether pairwise measurements from two categories are independent. Here, I investigate the properties of the test and how it is used in R.
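In R, the test takes a 2x2 table of paired outcomes; the counts below are made up for illustration:

```r
# Paired binary outcomes, e.g. a test result before and after a treatment
tab <- matrix(c(30, 12, 4, 54), nrow = 2,
              dimnames = list(before = c("pos", "neg"),
                              after  = c("pos", "neg")))
mcnemar.test(tab)  # considers only the discordant pairs (here: 4 and 12)
```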
Measurements often come in pairs. Here, I discuss what can go wrong when performing statistical tests that do not take this structure into account.
Parametric tests require that data are normally distributed. Here, you will learn how many samples are necessary to satisfy the assumptions of parametric tests.
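The Shapiro-Wilk test illustrates why sample size matters here: with skewed (exponential) data, a deviation from normality that a small sample cannot reveal becomes obvious in a larger one. The sample sizes below are chosen for illustration:

```r
set.seed(6)
small_sample <- rexp(10)    # non-normal population, few observations
large_sample <- rexp(100)   # same population, more observations
shapiro.test(small_sample)$p.value  # non-normality is often missed at small n
shapiro.test(large_sample)$p.value  # clearly detected at larger n
```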
Testing whether two groups are independent of each other is a common use case for the Chi-squared and Fisher's exact test. But, under which conditions are these tests appropriate?
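Both tests operate on a contingency table in R; the counts below are made up:

```r
# 2x2 contingency table of two categorical variables
tab <- matrix(c(20, 10, 5, 25), nrow = 2)
chisq.test(tab)   # asymptotic; relies on sufficiently large expected counts
fisher.test(tab)  # exact; also appropriate for small counts
```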
If you want to compare the means or medians of paired measurements, you can use a paired Student's t-test or a Wilcoxon signed rank test, respectively. This post explores the properties of these two tests and contrasts them.
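Both tests are available in R via the `paired = TRUE` argument; the before/after data below are simulated:

```r
set.seed(5)
before <- rnorm(20, mean = 10)
after  <- before + rnorm(20, mean = 1, sd = 0.5)  # consistent shift upwards
# Paired t-test: tests whether the mean difference is zero
t.test(after, before, paired = TRUE)$p.value
# Wilcoxon signed-rank test: the non-parametric counterpart
wilcox.test(after, before, paired = TRUE)$p.value
```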