Forecasting is a powerful technique for time-series data. Here, I investigate the most common variants of forecasting algorithms: ARMA, ARIMA, SARIMA, and ARIMAX, which combine autoregressive and moving-average terms.
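To make the autoregressive idea concrete, here is a minimal Python sketch (the posts themselves use R). It simulates an AR(1) process and computes its sample autocorrelation at a few lags. The function names, the coefficient `phi = 0.8`, and the series length are illustrative assumptions, not values taken from the post.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise with standard normal noise."""
    rng = random.Random(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return num / denom


# Hypothetical settings chosen only for illustration.
series = simulate_ar1(phi=0.8, n=2000)
acf = [autocorrelation(series, lag) for lag in range(4)]
print([round(value, 2) for value in acf])
```

For an AR(1) process the autocorrelation decays roughly geometrically with the lag, which is the pattern that ARMA-type models are designed to capture.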
Supervised learning is concerned with models for predicting the outcome for new data points.
Models for supervised learning
The following supervised learning models are important:
- Linear models: models that assume the existence of a linear relationship between the independent variables and the outcome.
- Support vector machines: models that deal with non-linear associations by transforming the data to another space via kernel functions.
- Neural networks: models that emulate the interaction of neurons in the nervous system.
- \(k\)-nearest neighbors: a model that classifies a new data point according to the majority label among its \(k\) nearest neighbors in the training data.
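As a small illustration of the last model in the list, the following Python sketch (the posts themselves use R) classifies a query point by majority vote among its \(k\) nearest training points. The function name `knn_classify`, the Euclidean distance, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair every training point's Euclidean distance to the query with its label.
    neighbors = sorted(
        (
            (math.dist(point, query), label)
            for point, label in zip(train_points, train_labels)
        ),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(points, labels, (1.1, 1.0)))
print(knn_classify(points, labels, (4.0, 4.0)))
```

The model has no training step beyond storing the data; all the work happens at prediction time, when the distances are computed.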
Posts on supervised learning
The following posts discuss the use of supervised learning in R.
Prediction and forecasting are similar, yet distinct areas for which machine learning techniques can be used. Here, I differentiate the two approaches using weather forecasting as an example.
Inference is concerned with learning about the data generation process, while prediction is concerned with estimating the outcome for new observations. These contrasting principles are associated with the generative modeling and machine learning communities. Here, I showcase the differences and similarities between the two concepts and offer insights about what the practitioners from both fields can learn from each other.
Linear discriminant analysis (LDA) is a classification and dimensionality reduction technique that is particularly useful for multi-class prediction problems. In this post I investigate the properties of LDA and the related methods of quadratic discriminant analysis and regularized discriminant analysis.
Dimensionality reduction is primarily used for exploring data and for reducing the feature space in machine learning applications. In this post, I investigate techniques such as PCA to obtain insights from a whiskey data set and show how PCA can be used to improve supervised approaches. Finally, I introduce the notion of the whiskey twilight zone.
Generalized linear models (GLMs) are related to conventional linear models but there are some important differences. For example, GLMs are based on the deviance rather than the conventional residuals and they enable the use of different distributions and link functions. This post investigates how these aspects influence the interpretation of GLMs.
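To show how the deviance differs from a residual sum of squares, here is a minimal Python sketch (the posts themselves use R) for a Poisson model with a log link. The counts and the linear predictor values are hypothetical and are not fitted to any real data.

```python
import math


def log_link(mu):
    """Log link: maps a positive mean to the linear predictor scale."""
    return math.log(mu)


def inverse_log_link(eta):
    """Inverse of the log link: maps the linear predictor back to a mean."""
    return math.exp(eta)


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        # By convention, y * log(y / mu) is taken to be 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total


def residual_sum_of_squares(observed, fitted):
    """Sum of squared raw residuals, as used in ordinary least squares."""
    return sum((y - mu) ** 2 for y, mu in zip(observed, fitted))


# Hypothetical counts and linear predictor values, for illustration only.
counts = [0, 1, 3, 7]
linear_predictor = [-0.5, 0.2, 1.0, 1.9]
fitted = [inverse_log_link(eta) for eta in linear_predictor]

print(round(poisson_deviance(counts, fitted), 3))
print(round(residual_sum_of_squares(counts, fitted), 3))
```

Both quantities are zero for a perfect fit, but the deviance weights each discrepancy according to the assumed Poisson distribution rather than treating all squared errors equally.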
Although ordinary least-squares regression is often used, it is not appropriate for all types of data. Using the airquality data set, I try to find a generalized linear model that fits the data better. For this purpose, I use the following methods: weighted regression, Poisson regression, and imputation.
Linear machine learning models are very convenient for interpretation. This post discusses the following aspects: residuals, coefficients, standard errors, p-values, the F-statistic, and much more.
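The quantities named above can be computed by hand for a simple one-predictor regression. The following Python sketch (the posts themselves use R) does so on hypothetical data. The function name `simple_linear_regression` is an assumption, and p-values are left out because they require the t distribution, which the standard library does not provide.

```python
import math


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    rss = sum(r ** 2 for r in residuals)         # residual sum of squares
    tss = sum((yi - y_mean) ** 2 for yi in y)    # total sum of squares
    sigma2 = rss / (n - 2)                       # residual variance estimate
    slope_se = math.sqrt(sigma2 / sxx)           # standard error of the slope

    return {
        "intercept": intercept,
        "slope": slope,
        "residuals": residuals,
        "slope_se": slope_se,
        "t_statistic": slope / slope_se,
        "f_statistic": (tss - rss) / sigma2,     # one predictor: 1 model df
    }


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(x, y)
print(round(fit["slope"], 3), round(fit["intercept"], 3))
```

With a single predictor, the F-statistic equals the square of the slope's t-statistic, so both tests lead to the same conclusion in this case.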