# Visualizing Time-Series Data with Line Plots

The line plot is the go-to plot for visualizing time-series data (i.e., measurements taken at several points in time) because it shows trends over time. Here, we’ll use stock market data to show how line plots can be created using native R, the plot method for `mts` objects, and ggplot.

## The EuStockMarkets data set

The EuStockMarkets data set contains the daily closing prices (except for weekends/holidays) of four European stock exchanges: the DAX (Germany), the SMI (Switzerland), the CAC (France), and the FTSE (UK). An important characteristic of these data is that they represent stock market points, which have different interpretations depending on the exchange. Thus, one should not compare points between different exchanges.

```
data(EuStockMarkets)
summary(EuStockMarkets)
```

```
## DAX SMI CAC FTSE
## Min. :1402 Min. :1587 Min. :1611 Min. :2281
## 1st Qu.:1744 1st Qu.:2166 1st Qu.:1875 1st Qu.:2843
## Median :2141 Median :2796 Median :1992 Median :3247
## Mean :2531 Mean :3376 Mean :2228 Mean :3566
## 3rd Qu.:2722 3rd Qu.:3812 3rd Qu.:2274 3rd Qu.:3994
## Max. :6186 Max. :8412 Max. :4388 Max. :6179
```

`class(EuStockMarkets)`

`## [1] "mts" "ts" "matrix"`

What is interesting is that the data set is not only a matrix but also an *mts* and *ts* object, which indicates that it is a (multivariate) time-series object.
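The time-series attributes can be inspected with a few base R helpers. The following is a minimal sketch; all functions used (`is.ts`, `start`, `end`, `frequency`, `time`) come from the `stats` package that ships with R.

```r
data(EuStockMarkets)

# TRUE for any object inheriting from "ts"
is.ts(EuStockMarkets)
# first and last observation as (year, position within the year)
start(EuStockMarkets)
end(EuStockMarkets)
# number of observations per unit of time (here: per year)
frequency(EuStockMarkets)
# time stamps of the first observations as decimal years
head(time(EuStockMarkets))
```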

In the following, I will show how these data can be plotted with native R, the plot method for `mts` objects, and, finally, ggplot.

## Creating a line plot in native R

Creating line plots in native R is a bit messy because the `lines` function does not create a new plot by itself.

```
# create a plot with 4 rows and 1 column
par(mfrow=c(4,1))
# set x-axis to number of measurements
x <- seq_len(nrow(EuStockMarkets))
# years of the measurements, used for the x tick labels
years <- as.integer(time(EuStockMarkets))
tick.posis <- seq(10, length(years), by = 100)
for (i in seq_len(ncol(EuStockMarkets))) {
  # stock exchange points
  y <- EuStockMarkets[,i]
  # show stock exchange name as heading
  heading <- colnames(EuStockMarkets)[i]
  # create empty plot as template, don't show x-axis
  plot(x, y, type="n", main = heading, xaxt = "n")
  # add actual data to the plot
  lines(x, y)
  # adjust x tick labels to years
  axis(1, at = tick.posis, las = 2, labels = years[tick.posis])
}
```

The plot shows us that all of the European stock exchanges are highly correlated and we could use the plot to explain the stock market variation based on past economic events.
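To back the visual impression with numbers, one could compute the pairwise Pearson correlations of the four series. This is only a rough sketch: correlations between trending price levels tend to be inflated, so the second matrix uses daily log returns, which is an assumed (though common) choice rather than something the plot itself shows.

```r
data(EuStockMarkets)

# pairwise correlation of the index levels
level.cor <- cor(EuStockMarkets)
print(round(level.cor, 2))

# pairwise correlation of daily log returns
returns <- diff(log(EuStockMarkets))
return.cor <- cor(returns)
print(round(return.cor, 2))
```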

Note that this is a quick and dirty way of creating the plot because it assumes that the time between all measurements is identical. This approximation is acceptable for this data set because there are (nearly) daily measurements. However, if there were time periods with a lower sampling frequency, this should be shown by scaling the axis according to the dates of the measurements (see the ggplot example below).
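One way to avoid the equal-spacing assumption in base R is to put the time stamps themselves on the x-axis. The sketch below plots a single exchange against `time(EuStockMarkets)`; the choice of the DAX column is arbitrary.

```r
data(EuStockMarkets)

# decimal-year time stamps, e.g. 1991.496
t <- as.numeric(time(EuStockMarkets))
# spacing between consecutive measurements (constant for a regular ts object)
spacing <- diff(t)
range(spacing)

# x positions now reflect the time stamps rather than the row number
plot(t, as.numeric(EuStockMarkets[, "DAX"]), type = "l",
     xlab = "Year", ylab = "Points", main = "DAX")
```

For a regular `ts` object such as this one, the result looks the same as before, but the same call would keep gaps visible if the time stamps were irregular.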

## Creating a line plot of an MTS object

If you have an object of type `mts`, it is much easier to use the plot method associated with it, `plot.ts`, which is provided by the `stats` package that is included with every R distribution. This plotting function gives a similar but admittedly better plot than the one I manually created above.

`plot(EuStockMarkets)`
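By default, the method draws one panel per series. As a small variation, the `plot.type = "single"` argument of `plot.ts` draws all series in one panel; the colors and the legend position below are arbitrary choices.

```r
data(EuStockMarkets)

exchanges <- colnames(EuStockMarkets)
cols <- seq_along(exchanges)
# all four series in one panel, one color per exchange
plot(EuStockMarkets, plot.type = "single", col = cols, ylab = "Points")
legend("topleft", legend = exchanges, col = cols, lty = 1)
```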

## Creating a line plot with ggplot

Creating a ggplot version of the line plot can either be done by hand, which is quite cumbersome, or via the `zoo` package, which is much more convenient.

### The manual approach

To create the same plot with ggplot, we need to construct a data frame first. In this example, we want to consider the dates at which the measurements were taken when scaling the x-axis.

The problem here is that the mts object doesn’t store the years as dates but as floating point numbers. For example, a value of 1998.0 indicates a day at the beginning of 1998, while 1998.9 indicates a day near the end of 1998. Since I could not find a function that transforms such representations, we will create a function that transforms this numeric representation to dates.
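Before writing the conversion, it can help to look at the raw values. In a `ts` object, the k-th observation within a year is stored as year + (k - 1) / frequency; the sketch below checks this for the first observation.

```r
data(EuStockMarkets)

t <- as.numeric(time(EuStockMarkets))
head(t, 3)

# first observation: year and position within the year
first <- start(EuStockMarkets)
# reconstruct the decimal time stamp from year, position, and frequency
reconstructed <- first[1] + (first[2] - 1) / frequency(EuStockMarkets)
all.equal(t[1], reconstructed)
```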

```
scale.value.range <- function(x, old, new) {
  # linearly rescale 'x' from the interval 'old' (min/max) to 'new'
  scale <- (x - old[1]) / (old[2] - old[1])
  newscale <- new[2] - new[1]
  res <- scale * newscale + new[1]
  return(res)
}
float.to.date <- function(x) {
  # convert a float 'x' (e.g. 1998.1) to its Date representation
  year <- as.integer(x)
  # obtaining the month: consider decimals
  float.val <- x - year
  # months: transform from [0,1) to [1,13) so that truncation yields 1..12
  mon.float <- scale.value.range(float.val, c(0,1), c(1,13))
  mon <- as.integer(mon.float)
  date <- get.date(year, mon.float, mon)
  return(date)
}
days.in.month <- function(year, mon) {
  # number of days in the given month (accounts for leap years)
  date1 <- as.Date(paste(year, mon, 1, sep = "-"))
  # first day of the following month; December rolls over to January
  next.year <- ifelse(mon == 12, year + 1, year)
  next.mon <- ifelse(mon == 12, 1, mon + 1)
  date2 <- as.Date(paste(next.year, next.mon, 1, sep = "-"))
  days <- difftime(date2, date1, units = "days")
  return(as.numeric(days))
}
get.date <- function(year, mon.float, mon) {
  max.nbr.days <- days.in.month(year, mon)
  # days: transform the within-month fraction from [0,1) to [1, days + 1)
  day.float <- sapply(seq_along(year), function(x)
    scale.value.range(mon.float[x] - mon[x], c(0,1), c(1,max.nbr.days[x] + 1)))
  day <- as.integer(day.float)
  date.rep <- paste(as.character(year), as.character(mon),
                    as.character(day), sep = "-")
  date <- as.Date(date.rep, format = "%Y-%m-%d")
  return(date)
}
mts.to.df <- function(obj) {
  # data frame with a Date column followed by one column per series
  date <- float.to.date(as.numeric(time(obj)))
  df <- cbind("Date" = date, as.data.frame(obj))
  return(df)
}
library(ggplot2)
df <- mts.to.df(EuStockMarkets)
# go from wide to long format
library(reshape2)
dff <- melt(df, "Date", variable.name = "Exchange", value.name = "Points")
# load scales to format dates on x-axis
library(scales)
ggplot(dff, aes(x = Date, y = Points)) +
  geom_line(aes(color = Exchange), size = 1) +
  # use date_breaks to have more frequent labels
  scale_x_date(labels = date_format("%m-%Y"), date_breaks = "4 months") +
  # rotate x-axis labels
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```

Creating the ggplot visualization for this example involved more work because I wanted a more accurate representation of the dates than in the other two approaches. For a faster, yet less accurate representation, the plot could also have been created by ignoring the months and just using the years, as in the first example.
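The faster variant could look roughly like this. The sketch keeps the decimal-year values returned by `time` as a continuous x-axis instead of converting them to dates; the long-format data frame is built with base R here, and the names `long.df`, `Time`, `Exchange`, and `Points` are chosen for illustration.

```r
library(ggplot2)
data(EuStockMarkets)

n <- nrow(EuStockMarkets)
exchanges <- colnames(EuStockMarkets)
# long format: one row per (time stamp, exchange) pair;
# as.vector() unrolls the matrix column by column
long.df <- data.frame(
  Time = rep(as.numeric(time(EuStockMarkets)), times = length(exchanges)),
  Exchange = rep(exchanges, each = n),
  Points = as.vector(EuStockMarkets)
)

# numeric x-axis in decimal years, one labeled break per year
p <- ggplot(long.df, aes(x = Time, y = Points, color = Exchange)) +
  geom_line() +
  scale_x_continuous(breaks = 1991:1999)
print(p)
```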

### Creating the plot with the zoo package

To create a ggplot version of the plot, we can use the `autoplot` function from `ggplot2`, after transforming the `mts` object to a `zoo` object via `as.zoo`:

```
library(zoo)
zooMarkets <- as.zoo(EuStockMarkets)
# autoplot(zooMarkets) # plot with facets (one panel per series)
autoplot(zooMarkets, facets = NULL) # plot without facets
```

Rather than using the custom `mts.to.df` function, we could have also used ggplot2’s `fortify` function on the zoo object in order to convert it to a data frame:

```
market.df <- fortify(zooMarkets)
print(head(market.df))
```

```
## Index DAX SMI CAC FTSE
## 1 1991.496 1628.75 1678.1 1772.8 2443.6
## 2 1991.500 1613.63 1688.5 1750.5 2460.2
## 3 1991.504 1606.51 1678.6 1718.0 2448.2
## 4 1991.508 1621.04 1684.1 1708.1 2470.4
## 5 1991.512 1618.16 1686.6 1723.1 2484.7
## 6 1991.515 1610.61 1671.6 1714.3 2466.8
```

Note, however, that the *Index* column provides the date as a floating point number rather than as a `Date`, as the `mts.to.df` function does.
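If dates are needed, the decimal years can be converted after the fact. The following is a simplified sketch, not the `mts.to.df` approach from above: the hypothetical helper `decimal.to.date` treats the fractional part as a fraction of the calendar year, so the resulting days are approximations.

```r
data(EuStockMarkets)

decimal.to.date <- function(x) {
  # hypothetical helper: decimal year (e.g. 1998.5) to an approximate Date
  year <- floor(x)
  year.start <- as.Date(paste0(year, "-01-01"))
  next.start <- as.Date(paste0(year + 1, "-01-01"))
  # length of the year in days (365 or 366)
  days <- as.numeric(next.start - year.start)
  year.start + floor((x - year) * days)
}

approx.dates <- decimal.to.date(as.numeric(time(EuStockMarkets)))
head(approx.dates)
```

The same helper could be applied to a column of decimal years such as `market.df$Index`.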

## R Packages for time-series data

Additional functions for multivariate time-series data are available via the MTS package. For irregular time-series data, the xts and zoo packages are useful.
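As a brief illustration of the irregular case, a `zoo` series can be indexed by arbitrary dates. The sketch below uses made-up values and dates purely for demonstration; only `zoo()` with its `order.by` argument and the package's plot method are relied upon.

```r
library(zoo)

# made-up, irregularly spaced observations (illustrative values only)
dates <- as.Date(c("2020-01-01", "2020-01-02", "2020-01-10", "2020-03-01"))
values <- c(10, 12, 11, 15)
z <- zoo(values, order.by = dates)

# the x-axis is scaled by the actual dates, so gaps remain visible
plot(z, type = "o", xlab = "Date", ylab = "Value")
```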