library(lessR)
##
## lessR 3.4.9 feedback: gerbing@pdx.edu web: lessRstats.com
## -------------------------------------------------------------------
## 1. Read a text, Excel, SPSS, SAS or R data file from your computer
## into the mydata data table: mydata <- Read()
## 2. For a list of help topics and functions, enter: Help()
## 3. Use theme function for global settings, Ex: theme(colors="gray")
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following object is masked from 'package:lessR':
##
## theme
goal: plot a time series of values over a range of dates
ggplot wants the date to be its own variable in the data table, so the data table contains a column of dates and at least one column of values that are to be plotted against the dates, one value for each date
date fields in the data file with values such as 2016-05-02 are initially read as character strings, so they need to be converted to class Date, instead of being left as character strings, integers, etc.
data file for this example: http://web.pdx.edu/~gerbing/data/PPStech.csv
in this example, the name of the variable that contains the dates is date
read the dates in the data file as character strings, then convert
mydata <- Read("http://web.pdx.edu/~gerbing/data/PPStech.csv")
the R as.Date function converts from class character to class Date
then save the result into the data table with the lessR Transform function or the standard R transform function
mydata <- Transform(date=as.Date(date))
as always, lessR functions default to data=mydata
just to illustrate, the following is the same as the above
mydata <- Transform(date=as.Date(date), data=mydata)
or
use the R colClasses option to instruct R how to read the data values under the date column
also, for large data sets, the colClasses option can drastically speed up reading a data table: R does not have to examine all the data values to infer the class of each variable, instead you tell R how to read the data values for each column, as integer, numeric, factor, Date, etc.
mydata <- Read("http://web.pdx.edu/~gerbing/data/PPStech.csv",
colClasses=list(date="Date"))
## [more information with details() for mydata, or details(name)]
##
## Data Types
## ------------------------------------------------------------
## integer: Numeric data values, but integers only
## numeric: Numeric data values with decimal digits
## ------------------------------------------------------------
##
## Variable Missing Unique
## Name Type Values Values Values First and last values
## ----------------------------------------------------------------------------------------
## date Date 426 0 426 2016-05-02 ... 1980-12-01
## Apple numeric 426 0 388 90.52 93.173 ... 0.424 0.512
## IBM numeric 426 0 423 147.72 144.545 ... 6.916 7.292
## Intel numeric 426 0 403 29.91 30.021 ... 0.269 0.291
## ----------------------------------------------------------------------------------------
##
## No variable labels
##
## No variable units
after reading the data into the R data table mydata, the data table begins with the following
head(mydata, n=3)
## date Apple IBM Intel
## 1 2016-05-02 90.520 147.720 29.910
## 2 2016-04-01 93.173 144.545 30.021
## 3 2016-03-01 108.330 150.002 32.073
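The two routes to a Date column described above can be sketched in base R. This is only a minimal illustration: it uses read.csv on a small inline CSV (two rows copied from the listing above) in place of the lessR Read function, and the names csv_text, d1 and d2 are hypothetical.

```r
# small inline stand-in for the data file; values copied from head(mydata)
csv_text <- "date,Apple,IBM,Intel
2016-05-02,90.520,147.720,29.910
2016-04-01,93.173,144.545,30.021"

# route 1: read the dates as character strings, then convert with as.Date
d1 <- read.csv(text=csv_text, stringsAsFactors=FALSE)
class_before <- class(d1$date)          # "character"
d1 <- transform(d1, date=as.Date(date))

# route 2: declare the class of the date column while reading
d2 <- read.csv(text=csv_text, colClasses=c(date="Date"))

class(d1$date)                          # "Date"
identical(d1$date, d2$date)             # both routes give the same dates
```

Route 2 skips the separate conversion step, which is why it is the form used for the rest of this example.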
this data table is in what is called the wide format, as it has multiple (three) measurements per row
however, to plot a single time series, just need two columns of the data table: the column of dates and one column of measurements
the time series can be plotted as a line and/or an area, for which there are specific geoms
ggplot time series of single variable
plot line only
ggplot(mydata, aes(date, Apple)) + geom_line()
plot area only
ggplot(mydata, aes(date, Apple)) + geom_area()
plot customized area
ggplot(mydata, aes(date, Apple)) + geom_area(fill="green", alpha=.2)
plot customized line and area
ggplot(mydata, aes(date, Apple)) +
geom_area(fill=rgb(0,1,0), alpha=.2) +
geom_line(color="darkgreen")
can modify the format of the displayed dates
to do so, use the ggplot scale_x_date function
the scales package is needed for the date_format function
library(scales)
see ?strptime for date formats, e.g., %b is the abbreviated month name
ggplot(mydata, aes(date, Apple)) + geom_line() +
scale_x_date(labels=date_format("%b-%Y"))
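The same strptime format codes can be tried directly on a single date with the base R format function, which is a quick way to preview a label format before putting it in a plot. A small sketch; the variable names are hypothetical, and the month abbreviation produced by %b depends on the locale.

```r
d <- as.Date("2016-05-02")

# year and two-digit month
ym <- format(d, "%Y-%m")

# abbreviated month name and year, the format used in the plot above
# (the month name depends on the locale, e.g. May in an English locale)
by <- format(d, "%b-%Y")

# day/month/two-digit year
dmy <- format(d, "%d/%m/%y")

c(ym, by, dmy)
```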
lessR time series of single variable
plot the time series with the lessR function LineChart or lc
LineChart follows the standard R practice of adding the dates to the analysis instead of using any dates that may exist in the data table, so here the dates in the data are ignored
specify the dates with parameters time.start and time.by
for LineChart, dates stored in the data table should begin with the first time period; if not, as in this situation, set the time.reverse parameter
LineChart(Apple, time.reverse=TRUE, time.start="1980/12/01", time.by="1 month")
## [To view the individual runs: show.runs=TRUE]
customized, lc abbreviation for LineChart
lc(Apple, time.reverse=TRUE, time.start="1980/12/01", time.by="1 month",
color.area=rgb(0,1,0,.2), color.stroke="darkgreen")
## [To view the individual runs: show.runs=TRUE]
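What supplying a start date and an increment amounts to can be sketched with the base R seq function. This is an illustration of the idea only, not the internal code of LineChart; the variable names are hypothetical, and the three values are the first rows of the data table.

```r
n <- 426  # number of monthly values in the data table

# generate the dates from a start date and an increment,
# instead of reading them from the data
dates <- seq(as.Date("1980-12-01"), by="1 month", length.out=n)
range(dates)

# the data table lists the latest value first, so the values are
# reversed to line up with the ascending generated dates
latest_first <- c(90.520, 93.173, 108.330)
earliest_first <- rev(latest_first)
```

Note that the generated dates fall on the first of each month, whereas the dates in the data file are the actual trading days.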
either each series by itself, or faceted or stacked
data table is in wide format, multiple measurements per row
mydata <- Read("http://web.pdx.edu/~gerbing/data/PPStech.csv",
colClasses=list(date="Date"))
## [more information with details() for mydata, or details(name)]
##
## Data Types
## ------------------------------------------------------------
## integer: Numeric data values, but integers only
## numeric: Numeric data values with decimal digits
## ------------------------------------------------------------
##
## Variable Missing Unique
## Name Type Values Values Values First and last values
## ----------------------------------------------------------------------------------------
## date Date 426 0 426 2016-05-02 ... 1980-12-01
## Apple numeric 426 0 388 90.52 93.173 ... 0.424 0.512
## IBM numeric 426 0 423 147.72 144.545 ... 6.916 7.292
## Intel numeric 426 0 403 29.91 30.021 ... 0.269 0.291
## ----------------------------------------------------------------------------------------
##
## No variable labels
##
## No variable units
to plot multiple time series on the same graph, here by company, still need just one measurement per row, but now we want to refer to all three measurements in a single analysis
reshape the data table to long format, one measurement per row
to reshape, use the melt function from Hadley Wickham’s reshape2 package, that is, “melt” the wide form to the long (narrow) form
library(reshape2)
the melt function refers to three different types of variables
  id.vars: x-axis variable, here the dates to be plotted against
  variable.name: grouping variable Company
    note that the variable Company is created from the existing variables in the wide form of the data table: Apple, IBM, and Intel
  value.name: y-axis variable, the values plotted against the dates
melt presumes by default that after the ID variable, all remaining variables are measured variables, here Apple, IBM and Intel
melt then takes the measurements for these variables and puts them under the variable specified as the value.name, and takes the name of each of the original variables and defines these names as values of the corresponding grouping variable Company
here leave the original wide form of the data in data frame mydata alone and instead create a new data frame in long format called myd
myd <- melt(mydata, id.vars="date", variable.name="Company", value.name="Price")
the newly created variable Company has three values: Apple, IBM, Intel
head(myd, n=3)
## date Company Price
## 1 2016-05-02 Apple 90.520
## 2 2016-04-01 Apple 93.173
## 3 2016-03-01 Apple 108.330
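To make the reshaping concrete, the same wide-to-long conversion can be written out by hand in base R. This is a sketch of what the result looks like, not the implementation of melt; the data frame wide holds two rows copied from the listing of mydata, and the names wide, measured and long are hypothetical.

```r
# two rows of the wide data table
wide <- data.frame(
  date  = as.Date(c("2016-05-02", "2016-04-01")),
  Apple = c(90.520, 93.173),
  IBM   = c(147.720, 144.545),
  Intel = c(29.910, 30.021)
)

# every column other than the id variable is a measured variable
measured <- setdiff(names(wide), "date")

long <- data.frame(
  # the dates are repeated once for each measured variable
  date    = rep(wide$date, times=length(measured)),
  # the original column names become the values of the grouping variable
  Company = factor(rep(measured, each=nrow(wide)), levels=measured),
  # the measurements are stacked into a single column
  Price   = unlist(wide[measured], use.names=FALSE)
)
long
```

Two rows by three measured variables become six rows, one measurement per row.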
the time series for multiple companies can be printed on the same graph
each time series plotted independent of the others on the same graph
ggplot(myd, aes(date, Price, color=Company)) + geom_line()
or
ggplot(myd, aes(date, Price, linetype=Company)) + geom_line()
or, does not work well here, but illustrates,
ggplot(myd, aes(date, Price, size=Company)) + geom_line()
## Warning: Using size for a discrete variable is not advised.
lessR
use the scatter plot function Plot
specify y as three separate variables
use the original wide-format form of the data table, no conversion needed
error: need to retain the date format for the x-axis, an easy fix for next version
note: when dates are on the x-axis, will default to a line plot in next version
Plot(date, c(Apple, IBM, Intel), object="line")
##
## >>> geometric object to plot: object = "line"
## >>> subject of the analysis: topic = "data"
faceted: plot each time series separately in its own graph, stacked vertically for comparison
ggplot(myd, aes(date, Price, fill=Company)) + geom_area() + facet_grid(Company ~ .)
keep the graph gray scale, with a narrow black border at top of each graph
ggplot(myd, aes(date, Price, fill=Company)) +
geom_area(color="black", size=.25, fill="darkgray") +
facet_grid(Company ~ .)
time series stacked area graph
stacked area plot: each area (region) is plotted separately, but stacked on the previous area
ggplot(myd, aes(date, Price, fill=Company)) + geom_area()
change the order of the stacking
first re-order the levels of the grouping variable that defines the stack order
myd <- Transform(Company=factor(Company, levels=c("Intel", "Apple", "IBM")), data=myd)
then reorder the data frame according to this order
syntax is myd[rows,columns], so myd[rows, ] means no change to the columns
myd <- myd[order(myd$Company), ]
plot with new order
ggplot(myd, aes(date, Price, fill=Company)) + geom_area()
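The effect of re-specifying factor levels, and of sorting by the factor afterwards, can be seen on a small vector. A minimal base R sketch with hypothetical variable names:

```r
Company <- factor(c("Apple", "IBM", "Intel", "Apple"))
levels(Company)      # default order is alphabetical

# same values, new level order
Company2 <- factor(Company, levels=c("Intel", "Apple", "IBM"))
levels(Company2)

# order() sorts a factor by its level order, not alphabetically,
# so the Intel row now comes first
idx <- order(Company2)
sorted <- as.character(Company2[idx])
sorted
```

Only the level order changes; the data values themselves are untouched until the rows are re-ordered with the index.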
customize the colors of the stacked layers
package RColorBrewer has functions that generate gradients of colors
library(RColorBrewer)
display the built-in color scales from the RColorBrewer library, which are synchronized with ggplot via special ggplot functions
but to see all the scales, need an RColorBrewer function, so need to load the library
display.brewer.all()
use the ggplot functions scale_fill_brewer or scale_color_brewer: the distinction is between filling in the interior of a region and coloring the border of the region
use the default brewer scale, “Blues”
ggplot(myd, aes(date, Price, fill=Company)) + geom_area() +
scale_fill_brewer()
specify the gray brewer scale
ggplot(myd, aes(date, Price, fill=Company)) + geom_area() +
scale_fill_brewer(palette="Greys")
an explicit gray scale function, from which can specify start and end points of grayness
0 is black, 1 is white
ggplot(myd, aes(date, Price, fill=Company)) + geom_area() +
scale_fill_grey(start=.4, end=.8)
can also manually specify discrete color ranges, here toned down a bit with some alpha transparency
ggplot(myd, aes(date, Price, fill=Company)) + geom_area(alpha=.7) +
scale_fill_manual(values=c(Apple="blue", IBM="red", Intel="green"))
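the default ggplot hue scales (scale_fill_hue, scale_color_hue) are based on the hcl color model: hue, chroma, luminance (lightness)
for the related hsl (or hls) model of hue, saturation and lightness, see the HSL Color Picker: http://www.workwithcolor.com/hsl-color-picker-01.htm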
darken colors: changed default luminosity from 65 down to 45
ggplot(myd, aes(date, Price, fill=Company)) + geom_area() +
scale_fill_hue(l=45)
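The effect of lowering the luminance can be previewed outside of a plot with the base R hcl function (hue, chroma, luminance) from grDevices. This is a rough approximation only: the evenly spaced hues starting at 15 and the chroma of 100 are assumptions meant to mimic the ggplot defaults, and the variable names are hypothetical.

```r
# three evenly spaced hues around the color wheel
hues <- seq(15, 375, length.out=4)[1:3]

# same hues and chroma, two different luminance values
default_cols <- hcl(h=hues, c=100, l=65)
darker_cols  <- hcl(h=hues, c=100, l=45)

rbind(default_cols, darker_cols)
```

Each column pairs a default color with its darker counterpart, so the hue stays the same while only the lightness drops.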
can modify virtually every detail of a ggplot graph with the theme function
labels, theme and theme modification
ggplot(myd, aes(date, Price, fill=Company)) + geom_area(alpha=.8) +
# y-axis label
labs(y="Adjusted Closing Price") +
# plot title
ggtitle("Stock Market Data") +
# change from default theme to black & white theme
theme_bw() +
# panel.grid refers to grid lines
# element_blank removes the element
theme(panel.grid.major.y=element_line(color="grey50"),
panel.grid.major.x=element_blank(),
panel.grid.minor=element_blank(),
# format plot title, rel is relative to the base line of 1.0
plot.title=element_text(size=rel(1.1), face="bold"),
# legend
legend.position="top",
legend.title=element_blank(),
legend.key=element_blank())
mydata <- Read("~/Dropbox/511Stuff/BookNew/data/share_price/PPStech.csv",
colClasses=list(date="Date"))
## [more information with details() for mydata, or details(name)]
##
## Data Types
## ------------------------------------------------------------
## integer: Numeric data values, but integers only
## numeric: Numeric data values with decimal digits
## ------------------------------------------------------------
##
## Variable Missing Unique
## Name Type Values Values Values First and last values
## ----------------------------------------------------------------------------------------
## date Date 426 0 426 2016-05-02 ... 1980-12-01
## Apple numeric 426 0 388 90.52 93.173 ... 0.424 0.512
## IBM numeric 426 0 423 147.72 144.545 ... 6.916 7.292
## Intel numeric 426 0 403 29.91 30.021 ... 0.269 0.291
## ----------------------------------------------------------------------------------------
##
## No variable labels
##
## No variable units
first use R functions to get the forecast
need to get the data re-ordered from earliest to latest for R and lessR
R has no single function to sort a data frame, here use lessR Sort
mydata <- Sort(date)
##
## Sort Specification
## -------------------------------
## date --> ascending
## -------------------------------
##
##
## --------------------------------------------------------------------
## After the Sort, first four rows of data
## --------------------------------------------------------------------
## date Apple IBM Intel
## 426 1980-12-01 0.512 7.292 0.291
## 425 1981-01-02 0.424 6.916 0.269
## 424 1981-02-02 0.398 6.995 0.253
## 423 1981-03-02 0.368 6.791 0.262
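For comparison, the usual base R idiom for the same sort combines the order function with row indexing. A minimal sketch on three rows copied from the data table; the names d and d_sorted are hypothetical.

```r
d <- data.frame(
  date  = as.Date(c("2016-05-02", "2016-04-01", "2016-03-01")),
  Apple = c(90.520, 93.173, 108.330)
)

# order() returns the row positions that put date in ascending order,
# and d[rows, ] applies that order to every column
d_sorted <- d[order(d$date), ]
d_sorted
```

As in the Sort output above, the original row names are retained, so they now run in descending order.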
in standard R, to work with time series data, convert the vector of values to an R class specifically called ts for time series
create the time series object with the R function ts
the dates are specified by the ts function, not read from the data
because ts is an R function, need to specify the relevant data frame for each specified variable
A <- ts(mydata$Apple, start=c(1980,12), frequency=12)
interlude: how R works
view the time series, monthly data from Dec 1980 until May 2016
head(A)
## [1] 0.512 0.424 0.398 0.368 0.426 0.497
use the R class function to find the class of an object, such as integer, Date, ts, etc.
here created an R object of class ts
class(A)
## [1] "ts"
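How the start and frequency arguments attach dates to a plain vector can be checked on a short series. A small sketch using the first four sorted Apple values; the name A_small is hypothetical.

```r
# first four values of the sorted Apple series
x <- c(0.512, 0.424, 0.398, 0.368)

# start=c(1980, 12) is period 12 (December) of 1980, frequency=12 is monthly
A_small <- ts(x, start=c(1980, 12), frequency=12)

start(A_small)   # year and period of the first value
end(A_small)     # year and period of the last value
cycle(A_small)   # position of each value within the year

# extract a sub-series by date rather than by position
jan_feb <- window(A_small, start=c(1981, 1), end=c(1981, 2))
jan_feb
```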
plot is what R calls a generic function, each method is a version of the function adapted to a different class of input
view all the different plot methods with the R methods function
methods(plot)
## [1] plot,ANY-method plot,color-method plot.acf*
## [4] plot.data.frame* plot.decomposed.ts* plot.default
## [7] plot.dendrogram* plot.density* plot.ecdf
## [10] plot.factor* plot.formula* plot.function
## [13] plot.ggplot* plot.gtable* plot.hclust*
## [16] plot.histogram* plot.HoltWinters* plot.isoreg*
## [19] plot.lm* plot.medpolish* plot.mlm*
## [22] plot.ppr* plot.prcomp* plot.princomp*
## [25] plot.profile.nls* plot.raster* plot.regsubsets*
## [28] plot.spec* plot.stepfun plot.stl*
## [31] plot.table* plot.ts plot.tskernel*
## [34] plot.TukeyHSD*
## see '?methods' for accessing help and source code
standard R plot function, here for class ts
plot(A)
same as explicitly indicating ts method for plot
plot.ts(A)
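The dispatch mechanism behind plot and plot.ts can be reproduced with a small generic function of one's own. In this sketch the generic describe and its methods are hypothetical names invented for illustration; only UseMethod, ts and frequency are standard R.

```r
# a generic function only dispatches; the work is done by its methods
describe <- function(x, ...) UseMethod("describe")

# fallback method, used when no class-specific method exists
describe.default <- function(x, ...) "no special method for this class"

# method chosen automatically for objects of class ts
describe.ts <- function(x, ...) {
  paste("time series with", length(x), "values, frequency", frequency(x))
}

A_small <- ts(c(0.512, 0.424, 0.398), start=c(1980, 12), frequency=12)
describe(A_small)   # dispatches to describe.ts
describe(1:3)       # dispatches to describe.default
```

Calling describe.ts(A_small) directly gives the same result as describe(A_small), just as plot.ts(A) matches plot(A).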
the lessR function LineChart also plots a time series object
note that lessR graphics may require first deleting the R graph
lc(A)
## [To view the individual runs: show.runs=TRUE]
many forecasting methods exist, Holt-Winters is one of the most widely used non-linear methods
exponential smoothing with trend and seasonal components is called triple exponential smoothing or Holt-Winters
exponential smoothing means that the model of the current value is a weighted average of all past observations, with exponentially decreasing weights over time
R provides the corresponding function, HoltWinters
here save the result into an R object named Ahw
Ahw <- HoltWinters(A)
result is an object of class HoltWinters
class(Ahw)
## [1] "HoltWinters"
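The phrase "weighted average with exponentially decreasing weights" can be verified on a toy series. The sketch below is simple exponential smoothing only, the simplest case with no trend or seasonal component, not the full Holt-Winters computation; the data values, alpha=0.5 and all variable names are made up for illustration.

```r
alpha <- 0.5
x <- c(10, 12, 11, 15)

# recursive form: new level = alpha * new value + (1 - alpha) * old level
level <- x[1]
for (t in 2:length(x)) {
  level <- alpha * x[t] + (1 - alpha) * level
}

# equivalent weighted-average form: the most recent value gets weight alpha,
# and each step further back multiplies the weight by (1 - alpha)
k <- 0:(length(x) - 2)
w <- alpha * (1 - alpha)^k               # weights for x[4], x[3], x[2]
w0 <- (1 - alpha)^(length(x) - 1)        # remaining weight on the starting value
level_weighted <- sum(w * rev(x[-1])) + w0 * x[1]

c(level, level_weighted)                 # the two forms agree
sum(w) + w0                              # the weights sum to 1
```

A value of alpha close to 1, such as the 0.96 estimated for the Apple series, puts almost all of the weight on the most recent observation.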
Holt-Winters estimates the trend (a and b) and seasonality (s)
display the output by listing the name of the created R object
Ahw
## Holt-Winters exponential smoothing with trend and additive seasonal component.
##
## Call:
## HoltWinters(x = A)
##
## Smoothing parameters:
## alpha: 0.9588847
## beta : 0.009271752
## gamma: 1
##
## Coefficients:
## [,1]
## a 90.071652603
## b 0.433410220
## s1 -0.560771433
## s2 0.566618734
## s3 0.486987162
## s4 -0.180748868
## s5 0.735331266
## s6 0.511771309
## s7 -0.928948610
## s8 -0.802908352
## s9 0.815139939
## s10 0.961134626
## s11 0.001646068
## s12 0.448347397
obtain the forecasts with the R generic predict function
here the method adapted for the HoltWinters class, predict.HoltWinters
Ahw.p <- predict(Ahw, n.ahead=24)
Ahw.p
## Jan Feb Mar Apr May Jun Jul
## 2016 89.94429 91.50509
## 2017 92.73603 94.78748 95.36689 94.84081 95.72092 95.14521 96.70601
## 2018 97.93695 99.98841 100.56781 100.04173 100.92185
## Aug Sep Oct Nov Dec
## 2016 91.85887 91.62454 92.97403 93.18389 92.17658
## 2017 97.05979 96.82547 98.17496 98.38481 97.37750
## 2018
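For the additive model, each forecast is the level plus the trend times the number of steps ahead, plus the seasonal coefficient for that month. The sketch below recomputes the forecasts from the coefficients printed above; the variable names are hypothetical, and the coefficients are copied from the Ahw output.

```r
# coefficients copied from the Ahw output
a <- 90.071652603    # level
b <- 0.433410220     # trend per month
s <- c(-0.560771433,  0.566618734,  0.486987162, -0.180748868,
        0.735331266,  0.511771309, -0.928948610, -0.802908352,
        0.815139939,  0.961134626,  0.001646068,  0.448347397)

h <- 1:24            # months ahead, Jun 2016 through May 2018

# level + h * trend + seasonal term, with the 12 seasonal terms recycled
fc <- a + h * b + s[(h - 1) %% 12 + 1]
round(fc, 5)
```

The first two values reproduce the Jun 2016 and Jul 2016 forecasts shown above, and the last reproduces May 2018.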
plot the data in the time series object A
accommodate room for the plot beyond the actual data for the forecasted values
plot the true forecasts in Ahw.p into the future
plot(A, xlim=c(1980, 2018))
lines(Ahw.p, col="red")
include error bands for the forecasts from package/function forecast
library(forecast)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: timeDate
## This is forecast 7.1
create an object of class forecast
forecast 24 months into the future
obtain the forecast plus the .80 and .95 confidence bands
fhw <- forecast(Ahw, h=24)
“print” the contents of fhw, i.e., print.forecast
fhw
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jun 2016 89.94429 86.68704 93.20154 84.96276 94.92582
## Jul 2016 91.50509 86.97226 96.03792 84.57273 98.43746
## Aug 2016 91.85887 86.32110 97.39664 83.38959 100.32815
## Sep 2016 91.62454 85.22359 98.02550 81.83513 101.41396
## Oct 2016 92.97403 85.80017 100.14790 82.00255 103.94552
## Nov 2016 93.18389 85.30068 101.06709 81.12756 105.24021
## Dec 2016 92.17658 83.63165 100.72150 79.10824 105.24491
## Jan 2017 92.73603 83.56661 101.90544 78.71261 106.75944
## Feb 2017 94.78748 85.02356 104.55141 79.85485 109.72012
## Mar 2017 95.36689 85.03319 105.70059 79.56286 111.17092
## Apr 2017 94.84081 83.95810 105.72352 78.19715 111.48447
## May 2017 95.72092 84.30691 107.13494 78.26469 113.17715
## Jun 2017 95.14521 83.17550 107.11493 76.83912 113.45131
## Jul 2017 96.70601 84.23515 109.17688 77.63347 115.77856
## Aug 2017 97.05979 84.09929 110.02029 77.23842 116.88117
## Sep 2017 96.82547 83.38553 110.26540 76.27086 117.38007
## Oct 2017 98.17496 84.26467 112.08524 76.90101 119.44890
## Nov 2017 98.38481 84.01231 112.75731 76.40397 120.36565
## Dec 2017 97.37750 82.55010 112.20489 74.70095 120.05404
## Jan 2018 97.93695 82.66126 113.21263 74.57480 121.29909
## Feb 2018 99.98841 84.27042 115.70639 75.94982 124.02699
## Mar 2018 100.56781 84.41297 116.72265 75.86112 125.27451
## Apr 2018 100.04173 83.45500 116.62846 74.67452 125.40895
## May 2018 100.92185 83.90776 117.93593 74.90105 126.94264
again, check the class just to know
class(fhw)
## [1] "forecast"
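The relation between the 80 and 95 bands can be checked by hand. The sketch below assumes the bounds are symmetric, normal-based intervals around the point forecast; that assumption is consistent with the numbers printed above but is not stated in the output itself. The values are copied from the Jun 2016 row, and the variable names are hypothetical.

```r
# Jun 2016 row of the forecast table
point <- 89.94429
lo80  <- 86.68704

# under the normal assumption, the 80 band is the point forecast
# plus or minus qnorm(0.90) standard deviations
sd_implied <- (point - lo80) / qnorm(0.90)

# the 95 band uses qnorm(0.975) with the same standard deviation
lo95 <- point - qnorm(0.975) * sd_implied
hi95 <- point + qnorm(0.975) * sd_implied
round(c(lo95, hi95), 3)
```

The result can be compared with the Lo 95 and Hi 95 columns of the same row.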
plot the forecast object fhw, i.e., plot.forecast
plot(fhw)
equivalent plot with ggplot, using the autoplot function
autoplot(fhw)
use the R unclass function to see what is in fhw
not the print view, but the actual contents
statistical output of R functions is in the form of an R list
unclass(fhw)
## $method
## [1] "HoltWinters"
##
## $model
## Holt-Winters exponential smoothing with trend and additive seasonal component.
##
## Call:
## HoltWinters(x = A)
##
## Smoothing parameters:
## alpha: 0.9588847
## beta : 0.009271752
## gamma: 1
##
## Coefficients:
## [,1]
## a 90.071652603
## b 0.433410220
## s1 -0.560771433
## s2 0.566618734
## s3 0.486987162
## s4 -0.180748868
## s5 0.735331266
## s6 0.511771309
## s7 -0.928948610
## s8 -0.802908352
## s9 0.815139939
## s10 0.961134626
## s11 0.001646068
## s12 0.448347397
##
## $level
## [1] 80 95
##
## $mean
## Jan Feb Mar Apr May Jun Jul
## 2016 89.94429 91.50509
## 2017 92.73603 94.78748 95.36689 94.84081 95.72092 95.14521 96.70601
## 2018 97.93695 99.98841 100.56781 100.04173 100.92185
## Aug Sep Oct Nov Dec
## 2016 91.85887 91.62454 92.97403 93.18389 92.17658
## 2017 97.05979 96.82547 98.17496 98.38481 97.37750
## 2018
##
## $lower
## 80% 95%
## [1,] 86.68704 84.96276
## [2,] 86.97226 84.57273
## [3,] 86.32110 83.38959
## [4,] 85.22359 81.83513
## [5,] 85.80017 82.00255
## [6,] 85.30068 81.12756
## [7,] 83.63165 79.10824
## [8,] 83.56661 78.71261
## [9,] 85.02356 79.85485
## [10,] 85.03319 79.56286
## [11,] 83.95810 78.19715
## [12,] 84.30691 78.26469
## [13,] 83.17550 76.83912
## [14,] 84.23515 77.63347
## [15,] 84.09929 77.23842
## [16,] 83.38553 76.27086
## [17,] 84.26467 76.90101
## [18,] 84.01231 76.40397
## [19,] 82.55010 74.70095
## [20,] 82.66126 74.57480
## [21,] 84.27042 75.94982
## [22,] 84.41297 75.86112
## [23,] 83.45500 74.67452
## [24,] 83.90776 74.90105
##
## $upper
## 80% 95%
## [1,] 93.20154 94.92582
## [2,] 96.03792 98.43746
## [3,] 97.39664 100.32815
## [4,] 98.02550 101.41396
## [5,] 100.14790 103.94552
## [6,] 101.06709 105.24021
## [7,] 100.72150 105.24491
## [8,] 101.90544 106.75944
## [9,] 104.55141 109.72012
## [10,] 105.70059 111.17092
## [11,] 105.72352 111.48447
## [12,] 107.13494 113.17715
## [13,] 107.11493 113.45131
## [14,] 109.17688 115.77856
## [15,] 110.02029 116.88117
## [16,] 110.26540 117.38007
## [17,] 112.08524 119.44890
## [18,] 112.75731 120.36565
## [19,] 112.20489 120.05404
## [20,] 113.21263 121.29909
## [21,] 115.70639 124.02699
## [22,] 116.72265 125.27451
## [23,] 116.62846 125.40895
## [24,] 117.93593 126.94264
##
## $x
## Jan Feb Mar Apr May Jun Jul Aug
## 1980
## 1981 0.424 0.398 0.368 0.426 0.497 0.390 0.375 0.302
## 1982 0.306 0.274 0.253 0.221 0.210 0.191 0.203 0.270
## 1983 0.613 0.685 0.634 0.758 0.866 0.733 0.523 0.559
## 1984 0.371 0.394 0.371 0.471 0.441 0.398 0.383 0.398
## 1985 0.435 0.371 0.332 0.319 0.261 0.270 0.238 0.225
## 1986 0.347 0.375 0.424 0.454 0.555 0.538 0.469 0.555
## 1987 0.833 1.050 0.968 1.189 1.187 1.217 1.240 1.625
## 1988 1.252 1.299 1.209 1.239 1.256 1.400 1.344 1.210
## 1989 1.148 1.106 1.086 1.189 1.459 1.261 1.215 1.363
## 1990 1.044 1.048 1.240 1.213 1.275 1.383 1.298 1.147
## 1991 1.726 1.784 2.119 1.714 1.468 1.296 1.445 1.659
## 1992 2.032 2.123 1.832 1.891 1.879 1.512 1.473 1.453
## 1993 1.884 1.682 1.634 1.626 1.800 1.256 0.882 0.846
## 1994 1.050 1.174 1.070 0.965 0.945 0.856 1.088 1.173
## 1995 1.312 1.287 1.149 1.247 1.358 1.518 1.471 1.409
## 1996 0.908 0.904 0.807 0.801 0.859 0.690 0.723 0.797
## 1997 0.547 0.534 0.600 0.559 0.547 0.468 0.575 0.715
## 1998 0.602 0.777 0.904 0.900 0.875 0.943 1.138 1.025
## 1999 1.354 1.144 1.181 1.512 1.449 1.522 1.831 2.145
## 2000 3.411 3.768 4.465 4.078 2.761 3.444 3.341 4.007
## 2001 1.422 1.200 1.451 1.676 1.312 1.529 1.235 1.220
## 2002 1.625 1.427 1.556 1.596 1.532 1.165 1.003 0.970
## 2003 0.944 0.987 0.930 0.935 1.180 1.253 1.386 1.487
## 2004 1.483 1.573 1.778 1.695 1.845 2.139 2.126 2.268
## 2005 5.056 5.899 5.480 4.742 5.228 4.840 5.608 6.166
## 2006 9.929 9.006 8.248 9.256 7.860 7.531 8.937 8.922
## 2007 11.273 11.126 12.217 13.123 15.936 16.048 17.326 18.210
## 2008 17.800 16.440 18.870 22.874 24.820 22.018 20.902 22.293
## 2009 11.852 11.744 13.823 16.546 17.859 18.729 21.485 22.119
## 2010 25.255 26.907 30.902 34.333 33.779 33.076 33.828 31.967
## 2011 44.620 46.446 45.828 46.041 45.739 44.140 51.347 50.604
## 2012 60.026 71.330 78.840 76.792 75.970 76.795 80.314 87.853
## 2013 60.428 58.900 59.068 59.084 60.409 53.263 60.785 65.876
## 2014 68.081 71.996 73.433 80.732 87.086 89.495 92.066 99.202
## 2015 113.882 125.359 121.426 122.129 127.666 122.913 118.866 110.998
## 2016 96.229 96.105 108.330 93.173 90.520
## Sep Oct Nov Dec
## 1980 0.512
## 1981 0.229 0.300 0.279 0.332
## 1982 0.274 0.381 0.478 0.448
## 1983 0.347 0.339 0.306 0.366
## 1984 0.377 0.373 0.371 0.437
## 1985 0.236 0.279 0.302 0.330
## 1986 0.503 0.520 0.600 0.608
## 1987 1.700 1.162 0.995 1.267
## 1988 1.312 1.172 1.144 1.224
## 1989 1.363 1.425 1.359 1.083
## 1990 0.899 0.953 1.143 1.337
## 1991 1.550 1.612 1.593 1.769
## 1992 1.426 1.659 1.820 1.892
## 1993 0.747 0.982 1.010 0.937
## 1994 1.092 1.400 1.211 1.268
## 1995 1.221 1.190 1.253 1.048
## 1996 0.729 0.756 0.793 0.686
## 1997 0.713 0.560 0.584 0.431
## 1998 1.253 1.220 1.050 1.346
## 1999 2.081 2.634 3.218 3.380
## 2000 1.693 1.286 1.085 0.978
## 2001 1.020 1.155 1.400 1.440
## 2002 0.953 1.057 1.019 0.942
## 2003 1.362 1.505 1.375 1.405
## 2004 2.548 3.445 4.408 4.234
## 2005 7.050 7.573 8.918 9.453
## 2006 10.123 10.662 12.053 11.156
## 2007 20.181 24.978 23.962 26.047
## 2008 14.946 14.148 12.186 11.223
## 2009 24.373 24.787 26.288 27.711
## 2010 37.313 39.578 40.916 42.416
## 2011 50.143 53.228 50.258 53.257
## 2012 88.099 78.619 77.647 70.601
## 2013 64.461 70.674 75.625 76.298
## 2014 97.508 104.525 115.603 107.292
## 2015 108.576 117.632 116.950 104.058
## 2016
##
## $xname
## [1] "x"
##
## $fitted
## Jan Feb Mar Apr May
## 1981
## 1982 0.3137852 0.2712819 0.2408326 0.2038439 0.1863747
## 1983 0.4353364 0.5750479 0.6527244 0.5909568 0.7234482
## 1984 0.3604098 0.3356838 0.3566484 0.3326203 0.4347505
## 1985 0.4329456 0.4034586 0.3362276 0.2999731 0.2816534
## 1986 0.3270384 0.3133549 0.3384074 0.3909230 0.4152892
## 1987 0.6109812 0.7977710 1.0132604 0.9449183 1.1531482
## 1988 1.2765079 1.2318770 1.2597219 1.2000728 1.2031492
## 1989 1.2317870 1.1329533 1.0637799 1.0764434 1.1500326
## 1990 1.1007629 1.0290584 1.0052021 1.2265948 1.1873507
## 1991 1.3491735 1.7024549 1.7541708 2.0978595 1.7122688
## 1992 1.7958429 2.0043166 2.1058400 1.8032067 1.8766799
## 1993 1.9291217 1.8611229 1.6563694 1.6073956 1.6080547
## 1994 0.9689655 1.0086205 1.1359069 1.0417400 0.9521239
## 1995 1.3069909 1.2797542 1.2470333 1.1224640 1.2313486
## 1996 1.0963134 0.8800469 0.8554186 0.7844170 0.7857562
## 1997 0.7301164 0.5247079 0.4800627 0.5716812 0.5455426
## 1998 0.4738706 0.5758743 0.7224826 0.8710021 0.8890862
## 1999 1.3924336 1.3448123 1.1087171 1.1487662 1.4910879
## 2000 3.4384664 3.4129165 3.7442633 4.4470061 4.0929580
## 2001 1.0175337 1.4016848 1.1890376 1.3778336 1.6006363
## 2002 1.5002025 1.5931239 1.4358172 1.4910696 1.5036549
## 2003 1.0073406 0.9017365 0.9933265 0.8664296 0.8351199
## 2004 1.4725217 1.4489158 1.5770707 1.7167866 1.6171911
## 2005 4.3355929 5.0291294 5.9139290 5.4684498 4.7298788
## 2006 9.6061190 9.9648811 9.0665160 8.2608194 9.2593325
## 2007 11.3678925 11.2785066 11.1714506 12.2566388 13.0609228
## 2008 26.2973536 18.1978035 16.6356352 18.8988859 22.8398260
## 2009 11.0173742 12.0742402 11.9887969 13.8800696 16.4121509
## 2010 27.6273028 25.6725709 27.3017381 31.0616368 34.2705006
## 2011 42.2944307 45.1449547 47.0881098 46.2835772 46.0466526
## 2012 53.1906621 60.4261053 71.6600522 79.2535037 77.1295875
## 2013 71.2440527 61.6675442 59.4639162 59.1540924 59.1555695
## 2014 76.4906664 69.5845914 72.5226052 73.5685063 80.7146128
## 2015 107.7157490 115.5893693 125.9449329 122.4171416 122.6944808
## 2016 105.1472703 98.4611879 96.2506652 108.6088705 94.2390878
## Jun Jul Aug Sep Oct
## 1981
## 1982 0.2849437 0.1804007 0.1274766 0.1906880 0.3452382
## 1983 0.9382476 0.7378067 0.4660957 0.4827798 0.4268295
## 1984 0.5014708 0.3960529 0.3300696 0.3127996 0.4517521
## 1985 0.3172549 0.2691480 0.1886713 0.1401927 0.3031032
## 1986 0.6070718 0.5420174 0.4271050 0.4726253 0.5710335
## 1987 1.2410491 1.2256134 1.2101583 1.5367554 1.7703422
## 1988 1.3072659 1.4068090 1.3345276 1.1295225 1.3459336
## 1989 1.5029908 1.2738155 1.2015008 1.2846029 1.3868515
## 1990 1.3035257 1.3910759 1.2956093 1.0758209 0.9273056
## 1991 1.5114174 1.3080652 1.4318732 1.5755570 1.5861022
## 1992 1.9167213 1.5479088 1.4719813 1.3667885 1.4589879
## 1993 1.8119194 1.3090717 0.8920152 0.7581296 0.7820585
## 1994 0.9266077 0.8909825 1.0901382 1.0845263 1.1383984
## 1995 1.3353246 1.5596513 1.4837294 1.3260444 1.2836111
## 1996 0.8362313 0.7265018 0.7260410 0.7013434 0.7823404
## 1997 0.5157743 0.5048127 0.5772185 0.6145555 0.7615790
## 1998 0.8459270 0.9835329 1.1451073 0.9368234 1.2855092
## 1999 1.4309150 1.5703520 1.8286096 2.0668524 2.1175867
## 2000 2.8123850 3.4928662 3.3699613 3.9181920 1.8373497
## 2001 1.3871986 1.5474145 1.2831935 1.0162704 1.1353134
## 2002 1.6138817 1.1857922 1.0540301 0.7676126 1.0608496
## 2003 1.2261789 1.2663361 1.4325064 1.2950772 1.4709694
## 2004 1.8888701 2.1549121 2.1825549 2.0822100 2.6496580
## 2005 5.2905300 4.8958740 5.6680039 6.0117689 7.1796438
## 2006 7.9805615 7.6536669 8.9888226 8.8322900 10.2370800
## 2007 15.9856147 16.2911650 17.4006166 18.2159590 20.3136316
## 2008 24.8559395 22.4597852 21.0902836 22.3501392 15.5113722
## 2009 17.6425476 19.0207350 21.6158413 21.8392353 24.8547008
## 2010 33.7565578 33.6104580 34.0640325 31.9479629 37.6668343
## 2011 45.7812467 44.8223627 51.3561032 50.9759818 50.6940677
## 2012 76.2282476 77.9777579 80.4143225 88.1759079 89.0551533
## 2013 60.4392447 54.5682575 60.7014531 65.7287536 64.7753653
## 2014 86.7587681 91.2310260 92.4006954 98.9772736 98.3777897
## 2015 127.5434773 125.1048596 119.9050756 111.1076549 109.8217525
## 2016
## Nov Dec
## 1981 0.4243218
## 1982 0.3696778 0.5346117
## 1983 0.3361822 0.3594997
## 1984 0.3734333 0.4263678
## 1985 0.2803382 0.3571425
## 1986 0.5272579 0.6544207
## 1987 1.2034249 1.0597458
## 1988 1.2119752 1.2212790
## 1989 1.4627899 1.4425179
## 1990 0.9809710 1.2029424
## 1991 1.6512593 1.6645823
## 1992 1.6874437 1.8919252
## 1993 1.0010619 1.0738505
## 1994 1.4126483 1.2796173
## 1995 1.1961863 1.3190815
## 1996 0.7619185 0.8427425
## 1997 0.5742956 0.6255201
## 1998 1.2438156 1.0961093
## 1999 2.6411406 3.2700439
## 2000 1.3250753 1.1297205
## 2001 1.1822579 1.4324630
## 2002 1.0925085 1.0513456
## 2003 1.5404533 1.4132014
## 2004 3.4581589 4.4339358
## 2005 7.6434374 8.9205967
## 2006 10.7889826 12.0470342
## 2007 25.0847412 24.0639065
## 2008 14.1475849 12.3255856
## 2009 24.7967970 26.4397180
## 2010 39.6836257 41.1801122
## 2011 53.3700553 50.7521986
## 2012 79.2446170 78.5057640
## 2013 70.8718050 75.9001889
## 2014 104.9201900 115.7627356
## 2015 118.1792100 116.7400073
## 2016
##
## $residuals
## Jan Feb Mar Apr
## 1981
## 1982 -0.00778522120 0.00271807836 0.01216742647 0.01715609788
## 1983 0.17766364005 0.10995211983 -0.01872439528 0.16704323275
## 1984 0.01059023750 0.05831623160 0.01435163498 0.13837968279
## 1985 0.00205442138 -0.03245864353 -0.00422759959 0.01902690215
## 1986 0.01996163662 0.06164508607 0.08559259709 0.06307704203
## 1987 0.22201877148 0.25222897102 -0.04526041921 0.24408173957
## 1988 -0.02450788350 0.06712299488 -0.05072189456 0.03892718947
## 1989 -0.08378698920 -0.02695332587 0.02222014053 0.11255656203
## 1990 -0.05676292092 0.01894162006 0.23479788949 -0.01359483005
## 1991 0.37682648384 0.08154508693 0.36482918342 -0.38385951640
## 1992 0.23615707681 0.11868340856 -0.27383996281 0.08779330668
## 1993 -0.04512169570 -0.17912294226 -0.02236943147 0.01860442997
## 1994 0.08103446766 0.16537954406 -0.06590685523 -0.07673999326
## 1995 0.00500913440 0.00724584072 -0.09803328470 0.12453602057
## 1996 -0.18831342793 0.02395309305 -0.04841857456 0.01658303343
## 1997 -0.18311642981 0.00929210415 0.11993730625 -0.01268120907
## 1998 0.12812939178 0.20112568282 0.18151743326 0.02899789179
## 1999 -0.03843358437 -0.20081227226 0.07228287008 0.36323377148
## 2000 -0.02746637322 0.35508351645 0.72073671287 -0.36900608138
## 2001 0.40446632214 -0.20168476211 0.26196235075 0.29816644861
## 2002 0.12479751968 -0.16612385072 0.12018277355 0.10493041663
## 2003 -0.06334062229 0.08526349621 -0.06332653303 0.06857038447
## 2004 0.01047829208 0.12408423607 0.20092934367 -0.02178658981
## 2005 0.72040708620 0.86987063496 -0.43392897046 -0.72644978321
## 2006 0.32288097504 -0.95888107810 -0.81851601091 0.99518060923
## 2007 -0.09489247718 -0.15250659807 1.04554938424 0.86636118677
## 2008 -8.49735356155 -1.75780348582 2.23436481686 3.97511411025
## 2009 0.83462583873 -0.33024017137 1.83420314957 2.66593041664
## 2010 -2.37230282817 1.23442914853 3.60026188866 3.27136319294
## 2011 2.32556933349 1.30104532538 -1.26010982782 -0.24257715432
## 2012 6.83533791934 10.90389471585 7.17994784070 -2.46150366168
## 2013 -10.81605273044 -2.76754418568 -0.39591617832 -0.07009241512
## 2014 -8.40966638500 2.41140863946 0.91039476630 7.16349368513
## 2015 6.16625102930 9.76963073649 -4.51893287797 -0.28814155317
## 2016 -8.91827025569 -2.35618785571 12.07933482230 -15.43587051968
## May Jun Jul Aug
## 1981
## 1982 0.02362534844 -0.09394370830 0.02259927805 0.14252339503
## 1983 0.14255181777 -0.20524764154 -0.21480669635 0.09290434463
## 1984 0.00624952911 -0.10347081430 -0.01305288465 0.06793037771
## 1985 -0.02065344926 -0.04725489320 -0.03114804374 0.03632867918
## 1986 0.13971079734 -0.06907182208 -0.07301741325 0.12789499893
## 1987 0.03385184166 -0.02404913975 0.01438659937 0.41484166045
## 1988 0.05285084998 0.09273405104 -0.06280896266 -0.12452763612
## 1989 0.30896735356 -0.24199077588 -0.05881552823 0.16149918912
## 1990 0.08764927203 0.07947429052 -0.09307588804 -0.14860934776
## 1991 -0.24426882947 -0.21541744527 0.13693481501 0.22712679214
## 1992 0.00232009928 -0.40472131881 -0.07490876711 -0.01898130681
## 1993 0.19194526217 -0.55591941788 -0.42707168930 -0.04601516928
## 1994 -0.00712387132 -0.07060772549 0.19701754311 0.08286180904
## 1995 0.12665141068 0.18267535657 -0.08865125675 -0.07472937346
## 1996 0.07324377161 -0.14623127779 -0.00350183912 0.07095898507
## 1997 0.00145742490 -0.04777430118 0.07018725494 0.13778149024
## 1998 -0.01408623000 0.09707299620 0.15446710149 -0.12010733833
## 1999 -0.04208792747 0.09108496483 0.26064797686 0.31639038995
## 2000 -1.33195797197 0.63161502919 -0.15186624895 0.63703867886
## 2001 -0.28863633739 0.14180142607 -0.31241446129 -0.06319349672
## 2002 0.02834508843 -0.44888170653 -0.18279220260 -0.08403009441
## 2003 0.34488013313 0.02682107482 0.11966386244 0.05449359789
## 2004 0.22780887516 0.25012994717 -0.02891212548 0.08544512066
## 2005 0.49812122836 -0.45053001851 0.71212595196 0.49799611789
## 2006 -1.39933250979 -0.44956145644 1.28333306918 -0.06682255417
## 2007 2.87507715922 0.06238527059 1.03483502358 0.80938337074
## 2008 1.98017404993 -2.83793952927 -1.55778524828 1.20271638083
## 2009 1.44684906957 1.08645243883 2.46426496013 0.50315866838
## 2010 -0.49150062457 -0.68055784284 0.21754198733 -2.09703251685
## 2011 -0.30765255903 -1.64124673271 6.52463730365 -0.75210316685
## 2012 -1.15958748207 0.56675238649 2.33624212785 7.43867750309
## 2013 1.25343046304 -7.17624465626 6.21674247675 5.17454688478
## 2014 6.37138722238 2.73623186314 0.83497400834 6.80130461827
## 2015 4.97151915423 -4.63047731541 -6.23885955932 -8.90707557015
## 2016 -3.71908777504
## Sep Oct Nov Dec
## 1981 -0.09232184829
## 1982 0.08331200475 0.03576182008 0.10832216993 -0.08661168659
## 1983 -0.13577976830 -0.08782949005 -0.03018219788 0.00650028181
## 1984 0.06420037345 -0.07875213179 -0.00243326929 0.01063217042
## 1985 0.09580731925 -0.02410316531 0.02166180381 -0.02714252274
## 1986 0.03037466125 -0.05103351330 0.07274211528 -0.04642066089
## 1987 0.16324463512 -0.60834219625 -0.20842491599 0.20725418302
## 1988 0.18247753445 -0.17393362075 -0.06797516144 0.00272099674
## 1989 0.07839711991 0.03814848937 -0.10378992881 -0.35951789895
## 1990 -0.17682089193 0.02569441242 0.16202904649 0.13405758566
## 1991 -0.02555704948 0.02589777403 -0.05825926139 0.10441767143
## 1992 0.05921154141 0.20001206906 0.13255626806 0.00007476418
## 1993 -0.01112963534 0.19994153339 0.00893808801 -0.13685053241
## 1994 0.00747367593 0.26160158218 -0.20164833155 -0.01161725098
## 1995 -0.10504442489 -0.09361113822 0.05681373801 -0.27108147114
## 1996 0.02765658017 -0.02634039290 0.03108153806 -0.15674245621
## 1997 0.09844454920 -0.20157901512 0.00970439212 -0.19452011490
## 1998 0.31617662413 -0.06550916844 -0.19381562055 0.24989067131
## 1999 0.01414755992 0.51641329269 0.57685941934 0.10995612496
## 2000 -2.22519198670 -0.55134969743 -0.24007526350 -0.15172053212
## 2001 0.00372964359 0.01968660211 0.21774213891 0.00753701100
## 2002 0.18538736423 -0.00384960027 -0.07350847324 -0.10934561999
## 2003 0.06692282106 0.03403064243 -0.16545333157 -0.00820143227
## 2004 0.46578997246 0.79534202646 0.94984107403 -0.19993578128
## 2005 1.03823114489 0.39335618755 1.27456257983 0.53240332303
## 2006 1.29070997909 0.42491999495 1.26401743385 -0.89103417865
## 2007 1.96504104106 4.66436837840 -1.12274120289 1.98309349401
## 2008 -7.40413920158 -1.36337223861 -1.96158488948 -1.10258555103
## 2009 2.53376471325 -0.06770075507 1.49120297171 1.27128196000
## 2010 5.36503709731 1.91116566165 1.23237428677 1.23588779518
## 2011 -0.83298175509 2.53393226449 -3.11205532623 2.50480140306
## 2012 -0.07690788459 -10.43615333439 -1.59761700887 -7.90476403235
## 2013 -1.26775359435 5.89863470649 4.75319501639 0.39781114567
## 2014 -1.46927364354 6.14721029148 10.68281001903 -8.47073560213
## 2015 -2.53165486244 7.81024749070 -1.22921002785 -12.68200727224
## 2016
the autoplot function from the forecast package is all that is needed, but for pedagogy, build the plot directly with ggplot
optional: set up a custom range of dates for the x-axis, as objects of R class Date
R internal date storage: days since the origin of “1970-01-01”
for pedagogy, to see how it works
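# state the origin explicitly: older versions of base R require it for numeric input
d1 <- as.Date(3987, origin="1970-01-01") # "1980-12-01"
d2 <- as.Date(18993, origin="1970-01-01") # "2022-01-01"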
in practice, it is usually easier to convert a character string to an object of class Date
still not needed, but can set up an optional specified range of dates for the plot
d1 <- as.Date("1980-12-01")
d2 <- as.Date("2022-01-01")
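a small check of the internal storage described above, using only base R; the names n1, n2, and back are illustrative and not part of the original analysis

```r
# convert the character strings to class Date
d1 <- as.Date("1980-12-01")
d2 <- as.Date("2022-01-01")

# the underlying numbers are days since the origin of 1970-01-01
n1 <- as.numeric(d1)
n2 <- as.numeric(d2)
print(c(n1, n2))

# going the other way, state the origin explicitly so the call
# does not depend on the R version or on which packages are loaded
back <- as.Date(n1, origin="1970-01-01")
print(back)

# subtracting two dates gives the number of days between them
print(d2 - d1)
```

the two numbers printed first should match the values passed to as.Date in the numeric version of d1 and d2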
now begin the work of what is needed in ggplot
get the sequence of dates for the forecast with the R seq function
date <- seq(as.Date("2016-06-01"), as.Date("2018-05-01"), by="month")
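the sequence of dates needs one element for each forecasted value, so it is worth checking its length; the following sketch does that and also shows an equivalent call that specifies the number of periods instead of the end date; h and date2 are hypothetical names, and the value of 24 is inferred from the date range, not stated in the analysis

```r
# monthly dates for the forecast period, same call as above
date <- seq(as.Date("2016-06-01"), as.Date("2018-05-01"), by="month")

print(length(date))   # number of months generated
print(range(date))    # first and last date

# same sequence specified by its length instead of its end date
h <- 24               # hypothetical name for the number of forecast periods
date2 <- seq(as.Date("2016-06-01"), by="month", length.out=h)
print(all(date == date2))
```

specifying length.out can be convenient when the forecast horizon changes, because only one number needs to be updated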
look at the output of unclass(fhw) to see what is in fhw
the second column of fhw$lower and of fhw$upper contains the .95 confidence bounds
fhw$mean contains the predictions
use this information to create a new data frame, called df
df <- data.frame(date, as.numeric(fhw$mean), fhw$lower[,2], fhw$upper[,2])
names(df) <- c("date", "Apple", "lower95", "upper95")
head(df, n=3)
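to see why the second column is selected, the following sketch rebuilds the same kind of data frame from a hand-made list; fake_fc and df_demo are hypothetical stand-ins for fhw and df, the numbers are made up, and the column labels assume the forecast package defaults of 80% and 95% levels

```r
# hypothetical stand-in for fhw: made-up numbers, same layout
fake_fc <- list(
  mean  = ts(c(100, 101, 102), start=c(2016, 6), frequency=12),
  lower = cbind("80%"=c(95, 94, 93), "95%"=c(92, 90, 88)),
  upper = cbind("80%"=c(105, 108, 111), "95%"=c(108, 112, 116))
)

date <- seq(as.Date("2016-06-01"), by="month", length.out=3)

# as.numeric drops the ts attributes; [,2] selects the 95% column
df_demo <- data.frame(date, as.numeric(fake_fc$mean),
                      fake_fc$lower[,2], fake_fc$upper[,2])
names(df_demo) <- c("date", "Apple", "lower95", "upper95")

str(df_demo)
```

the result has one row per forecasted month, with the date as its own column, which is the layout ggplot expects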
now the structure of the df data frame matches the mydata data frame, compare with
head(mydata, n=3)
can optionally add the xlim specification to plot a wider x-axis
just plot the .95 confidence interval here, using geom_smooth
plot the data from data frame mydata
plot the forecasts from data frame df
# a constant color is set outside of aes(), otherwise "red" is treated as a data value
ggplot(mydata, aes(date, Apple)) + geom_line(color="black") +
  #xlim(d1, d2) +
  geom_line(data=df, aes(date, Apple), color="red") +
  geom_smooth(data=df, mapping=aes(ymin=lower95, ymax=upper95), stat="identity") +
  theme(legend.position="none")
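an alternative sketch of the same kind of plot, drawing the interval with geom_ribbon instead of geom_smooth; hist_demo and fc_demo are hypothetical stand-ins for mydata and df with made-up numbers, so the block runs without the data file or the forecast package

```r
library(ggplot2)

# made-up data standing in for mydata (history) and df (forecasts)
hist_demo <- data.frame(
  date  = seq(as.Date("2016-01-01"), by="month", length.out=5),
  Apple = c(96, 97, 105, 93, 99)
)
fc_demo <- data.frame(
  date    = seq(as.Date("2016-06-01"), by="month", length.out=3),
  Apple   = c(100, 101, 102),
  lower95 = c(92, 90, 88),
  upper95 = c(108, 112, 116)
)

# history in black, interval as a shaded band, forecasts in red
# the ribbon gets its own aes() because it uses ymin and ymax, not y
p <- ggplot(hist_demo, aes(date, Apple)) +
  geom_line(color="black") +
  geom_ribbon(data=fc_demo, aes(x=date, ymin=lower95, ymax=upper95),
              inherit.aes=FALSE, fill="gray70", alpha=0.5) +
  geom_line(data=fc_demo, color="red")

print(p)
```

geom_ribbon draws only the band, so the forecast line is added as its own layer, which keeps control of the line color and the band fill separate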