This blog is intended for everyone who is interested in working with data, particularly those who use R to do so. The idea of this project is to give an overview of the most relevant methods for practicing data science, spanning areas such as statistical testing, machine learning, and data visualization. Since the site focuses on applications, every method is applied to an appropriate data set and the results are critically discussed. Besides covering basic methods that are explained using small data sets, I also plan to do more comprehensive analyses of larger data sets, in which several methods are applied one after another.
The motivation for this blog is that descriptions in textbooks and manuals are often too obscure to be understood without investing considerable time. That is probably one of the reasons why many people turn pale and start to sweat at the mere mention of the word statistics. With this blog, I’d like to change that. Having worked with data and R for more than 6 years, I feel that now is the right time to give something back to the community. Moreover, far too much time has passed since I last blogged, which was back in 2009.