Linear discriminant analysis (LDA) is a classification and dimensionality reduction technique that is particularly useful for multi-class prediction problems. In this post I investigate the properties of LDA and the related methods of quadratic discriminant analysis and regularized discriminant analysis.
Humans are visual creatures. Thus, visualization is one of the most important tools for conveying information, and data scientists should be adept at selecting appropriate visualizations.
Which plot is appropriate?
Choosing an appropriate plot for a given data set can be hard because there are so many types of plots, such as scatter plots, box plots, and histograms. Fortunately, I have created an overview of the most important plots, when they are appropriate, and how they can be used in R.
Posts about data visualization
The following posts deal with topics from data visualization.
Radar plots are exceptional for visualizing the properties of individual objects. Here, I demonstrate how to draw radar plots in R by plotting the properties of whiskeys from several distilleries.
People without technical backgrounds can have a hard time understanding plots. A less formal means for conveying information is provided by infographics, which are easily understandable. This post compares several free tools for creating engaging infographics.
Box plots are limited since they summarize the data only through its quartiles (Q1, Q2, and Q3). Box plot alternatives such as the beeswarm and violin plot, however, provide more information about the overall distribution of the data.
Line plots are ideally suited for visualizing time series data. Using some stock market data, I demonstrate how line plots can be generated using native R, the MTS package, and ggplot.
Bar plots are frequently used due to their simplicity. However, they convey little information. Here, I discuss how error bars can be used to visualize variance and under which circumstances bar charts should not be used.
Box plots are ideal for showing the variation of measurements because they visualize not only the first, second, and third quartiles but also outliers.
Scatter plots are a great tool for learning about individual data points. Here, I demonstrate the use of scatter plots for visualizing the correlation between two variables.
Histograms are an ideal tool for visualizing the distribution of a variable and are frequently used for data exploration. Here, I show how a histogram can aid in differentiating two distributions.