Posts about R
All posts with the R tag deal with applications of the statistical programming language R in the data science setting.
If your data follow a normal distribution, using the mean is fine. But what should you do in other cases? Here, I explore the implications of using one or the other measure.
To compare the means of a quantitative variable across more than two groups, the ANOVA test is the way to go. Here, I discuss what you should consider when performing an ANOVA in R.
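As a minimal sketch of a one-way ANOVA in R, using the built-in PlantGrowth data set (an assumption for illustration; any data frame with a numeric response and a grouping factor works the same way):

```r
# One-way ANOVA: does mean plant weight differ between treatment groups?
data(PlantGrowth)
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)  # F-test for differences in group means
```

The `summary()` output reports the F statistic and its p-value; a significant result only tells you that at least one group mean differs, not which one.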
When designing and performing statistical tests, it is important to think about type I and type II errors. In this post, I investigate the impact of the two error types on significance and power, respectively.
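The trade-off between the significance level and power can be sketched with `power.t.test` from base R; the effect size of 0.5 and sample size of 30 are assumptions for illustration:

```r
# Lowering the significance level (fewer type I errors)
# reduces power (more type II errors), all else being equal
power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)$power
power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.01)$power
```

The second call returns a lower power than the first, illustrating the trade-off.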
Effect sizes are often overlooked in favor of significance. Here, you will learn why effect sizes are important and how they can be computed using R.
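As a minimal sketch, Cohen's d for two independent samples can be computed by hand from the group means and the pooled standard deviation; the simulated data here are an assumption for illustration:

```r
# Cohen's d: standardized difference between two group means
set.seed(1)
x <- rnorm(50, mean = 0)
y <- rnorm(50, mean = 0.6)
pooled_sd <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
                  (length(x) + length(y) - 2))
d <- (mean(y) - mean(x)) / pooled_sd
d  # roughly 0.5 corresponds to a medium effect by Cohen's convention
```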
McNemar's test is a simple test for checking whether paired measurements from two categories have the same marginal frequencies. Here, I investigate the properties of the test and how it is used in R.
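A minimal sketch of McNemar's test in R, using a hypothetical 2x2 table of paired yes/no outcomes (the counts are made up for illustration):

```r
# Paired binary outcomes, e.g. the same subjects measured before and after
tbl <- matrix(c(20, 5, 12, 18), nrow = 2,
              dimnames = list(before = c("yes", "no"),
                              after  = c("yes", "no")))
mcnemar.test(tbl)  # only the discordant cells (5 and 12) drive the test
```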
Measurements often come in pairs. Here I discuss what can go wrong when performing statistical tests that do not take this structure into account.
Many parametric tests assume that the data are normally distributed. Here, you will learn how many samples are necessary to satisfy the assumptions of parametric tests.
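One way to probe normality at a given sample size is the Shapiro-Wilk test; the simulated samples below are assumptions for illustration:

```r
# Shapiro-Wilk normality test at different sample sizes
set.seed(3)
shapiro.test(rnorm(10))    # normal data: should not be rejected
shapiro.test(runif(500))   # non-normal data: large samples detect the deviation
```

Note that with small samples the test has little power, so a non-significant result does not prove normality.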
Testing whether two groups are independent of each other is a common use case for the chi-squared test and Fisher's exact test. But under which conditions are these tests appropriate?
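A minimal sketch of both tests on a hypothetical 2x2 contingency table (the counts are made up for illustration):

```r
# Contingency table of group membership vs outcome
tbl <- matrix(c(12, 5, 3, 9), nrow = 2)
chisq.test(tbl)   # approximation; appropriate when expected counts are large enough
fisher.test(tbl)  # exact test; preferred when expected counts are small
```

A common rule of thumb is to prefer Fisher's exact test when any expected cell count falls below 5.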
If you want to compare the means or medians of paired measurements, you can use a paired Student's t-test or a Wilcoxon signed rank test, respectively. This post explores the properties of these two tests and contrasts them.
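Both tests can be sketched on hypothetical paired measurements, e.g. before and after an intervention (the simulated shift of 0.5 is an assumption for illustration):

```r
# Paired measurements on the same subjects
set.seed(2)
before <- rnorm(20, mean = 10)
after  <- before + rnorm(20, mean = 0.5)
t.test(after, before, paired = TRUE)       # paired t-test on the mean difference
wilcox.test(after, before, paired = TRUE)  # signed-rank test on the differences
```

The t-test assumes the paired differences are roughly normal; the Wilcoxon signed rank test only requires them to be symmetric around the median.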