R Deep Learning Essentials - Sample Chapter

R Deep Learning Essentials
Build automatic classification and prediction models using unsupervised learning
Joshua works at Elkhart Group Limited, a statistical consultancy. He earned his PhD from the
University of California, Los Angeles. His research focuses on using advanced
quantitative methods to understand the complex interplays of psychological,
social, and physiological processes in relation to psychological and physical health.
In statistics and data science, Joshua focuses on biostatistics and is interested in
reproducible research and graphical displays of data and statistical models.
Through consulting at Elkhart Group Limited and his former work at the UCLA
Statistical Consulting Group, Joshua has helped a wide array of clients, ranging
from experienced researchers to biotechnology companies. He develops or
codevelops a number of R packages including varian, a package to conduct
Bayesian scale-location structural equation models, and MplusAutomation,
a popular package that links R to the commercial Mplus software.
Preface
This book is about how to train and use deep learning models, or deep neural
networks, in the R programming language and environment. It is not intended
to provide in-depth theoretical coverage of deep neural networks, but it will give
you enough background to understand their basics and to use and interpret the
results. It will also show you some of the packages and functions available to train
deep neural networks, optimize their hyperparameters to improve the accuracy of
your models, and generate predictions or otherwise use the models you build. The
book is intended to provide easy-to-read coverage of the essentials so that you can
get going with real-life examples and applications.
Chapter 6, Tuning and Optimizing Models, explains how to adjust model tuning
parameters to improve and optimize the accuracy and performance of deep learning
models.
Appendix, Bibliography, contains the references for all the citations throughout the
book.
In this chapter, we will cover:
• R packages that train deep learning models such as deep belief networks or deep neural networks
• Connecting R and H2O, the main package we will be using for deep learning
Figure 1.1
Rather than our visual system having neurons that are activated only upon seeing
the gestalt, or entirety, of a square, it can have cells that recognize horizontal and
vertical lines, as shown in the following:
Figure 1.2
In this hypothetical case, there may be two neurons, one that is activated when
it senses horizontal lines and another that is activated when it senses vertical lines.
Finally, a higher-order process recognizes that it is seeing a square when both the
lower-order neurons are activated simultaneously.
Neural networks share some of these same concepts, with inputs being processed
by a first layer of neurons that may go on to trigger another layer. Neural networks
are sometimes shown as graphical models. In Figure 1.3, inputs are data, represented
as squares. These may be pixels in an image, different aspects of sounds, or
something else. The next layer consists of hidden neurons that recognize basic
features, such as horizontal lines, vertical lines, or curved lines. Finally, the
output may be a neuron that is activated by the simultaneous activation of two of the
hidden neurons. In this book, observed data or features are depicted as squares, and
unobserved or hidden layers as circles:
Figure 1.3
The term neural network refers to a broad class of models and algorithms. Hidden
neurons are generated based on some combination of the observed data, similar to
a basis expansion in other statistical techniques; however, rather than the analyst
choosing the form of the expansion, the weights used to create the hidden neurons
are learned from the data. Neural networks can involve a variety of activation
functions, which are transformations of the weighted raw data inputs used to create
the hidden neurons.
A common activation function is the sigmoid, or logistic, function:

f(x) = \frac{1}{1 + \exp(-x)}

Another option is a radial basis function; although there are a variety of these, the Gaussian form is common:

f(x) = \exp\left(-\frac{x^2}{2\sigma^2}\right)
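To make these functions concrete, here is a minimal R sketch of both (the function names are ours, for illustration only):

## sigmoid (logistic) activation: squashes any real input into (0, 1)
sigmoid <- function(x) 1 / (1 + exp(-x))

## Gaussian radial basis activation; sigma controls the width of the bump
gaussian_rbf <- function(x, sigma = 1) exp(-x^2 / (2 * sigma^2))

sigmoid(c(-2, 0, 2))       ## 0.12, 0.50, 0.88
gaussian_rbf(c(-2, 0, 2))  ## 0.14, 1.00, 0.14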
In a shallow neural network such as is shown in Figure 1.3, with only a single hidden
layer, going from the hidden units to the outputs is essentially a standard regression
or classification problem. The hidden units can be denoted by h and the outputs by
Y. Different outputs can be denoted by subscripts i = 1, ..., k and may represent
different possible classifications, such as (in our case) a circle or square. The paths
from each hidden unit to each output are the weights, and for the ith output these are
denoted by w_i. These weights are also learned from the data, just like the weights
used to create the hidden layer. For classification, it is common to use a final softmax
layer, which scales the outputs so that they sum to one and can be interpreted as
probabilities:

Y_i = \frac{e^{w_i^T h}}{\sum_{j=1}^{k} e^{w_j^T h}}
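As a quick illustration, the softmax can be computed in a couple of lines of R (a sketch with made-up hidden unit activations and weights):

softmax <- function(z) exp(z) / sum(exp(z))

## two hidden units; each row of w holds the weights for one of three outputs
h <- c(1.5, -0.5)
w <- matrix(c( 0.2,  1.0,
              -0.3,  0.4,
               0.8, -1.2), nrow = 3, byrow = TRUE)
softmax(w %*% h)  ## three probabilities that sum to one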
This only scratches the surface of the conceptual and practical aspects of neural
networks. For a slightly more in-depth introduction to neural networks, see
Chapter 11 of Hastie, T., Tibshirani, R., and Friedman, J. (2009), which is freely
available at http://statweb.stanford.edu/~tibs/ElemStatLearn/, Chapter 16
of Murphy, K. P. (2012), and Chapter 5 of Bishop, C. M. (2006). Next, we will turn to a
brief introduction to deep neural networks.
Figure 1.4
CNNs are most commonly used in image recognition. CNNs work by having each
neuron respond to overlapping subregions of an image. The benefits of CNNs are
that they require comparatively minimal pre-processing, yet through weight sharing
(for example, across subregions of an image) they still do not require too many
parameters. This is particularly valuable for images, as they are often not consistent.
For example, imagine ten different people taking a picture of the same desk. Some
may be closer or farther away, or at positions resulting in essentially the same image
having different heights, widths, and amounts of image captured around the focal
object.
As with neural networks, this description provides only the briefest of overviews of
what deep neural networks are and some of the use cases to which they can be
applied. For an overview, see Schmidhuber, J. (2015) as well as Chapter 28 of Murphy,
K. P. (2012).
Once you have created the R script, you can uncomment and run the code to install
the checkpoint package. You only need to do this once, so when you are done it's
best to comment the code out again so it is not re-installed each time you run the file.
This is the file we will run each time we want to set up our R environment for this
deep learning project. The checkpoint for this book is 20th February 2016 and we are
using R version 3.2.3. Next, we can add library() calls for some packages we will
need to be available by adding the following code to our checkpoint.R script (but
note that these are not run yet!):
## Chapter 1 ##
## Tools
library(RCurl)
library(jsonlite)
library(caret)
library(e1071)
## basic stats packages
library(statmod)
library(MASS)
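The first few lines of the checkpoint.R script, above the library() calls, are not shown in this excerpt; a minimal sketch of what they might contain, assuming the snapshot date and R version quoted above, is:

## uncomment and run once to install the checkpoint package
# install.packages("checkpoint")
library(checkpoint)
checkpoint("2016-02-20", R.version = "3.2.3")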
Once we have added that code, save the file so that the changes are written to disk,
and then run the first couple of lines: the call to load the checkpoint package and the
call to checkpoint(). The results should look something like Figure 1.5:
Figure 1.5
The checkpoint package asks to create a directory to store specific versions of the
packages used, and then finds all packages and installs them. The next sections show
how to set up some specific R packages for deep learning.
Neural networks
There are several packages in R that can fit basic neural networks. The nnet package
is a recommended package and can fit feed-forward neural networks with one
hidden layer, like the one shown in Figure 1.3. For more details on the nnet package,
see Venables, W. N. and Ripley, B. D. (2002). The neuralnet package also fits shallow
neural networks with one hidden layer, but can train them using back-propagation
and allows custom error and neuron activation functions. Finally, we come to the
RSNNS package, which is an R wrapper of the Stuttgart Neural Network Simulator
(SNNS). The SNNS was originally written in C, but was ported to C++. RSNNS
allows many types of models to be fitted in R. Common models are available using
convenient wrappers, but the RSNNS package also makes many model components
from SNNS available, making it possible to train a wide variety of models. For more
details on the RSNNS package, see Bergmeir, C., and Benítez, J. M. (2012). We will see
examples of how to use these models in Chapter 2, Training a Prediction Model. For
now, we can install them by adding the following code to the checkpoint.R script
and saving it. Saving is important because, if our changes to the R script are not
written to the disk, the checkpoint() function will not see the changes and will not
find and install the new packages:
## neural networks
library(nnet)
library(neuralnet)
library(RSNNS)
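Model training is covered properly in Chapter 2, Training a Prediction Model, but as a quick sanity check that the packages are available, a minimal sketch fitting a small nnet model to the built-in iris data might look like this (size sets the number of hidden neurons):

set.seed(1234)
## a feed-forward network with one hidden layer of five neurons
m <- nnet::nnet(Species ~ ., data = iris, size = 5, trace = FALSE)
## cross-tabulate observed against predicted species
table(iris$Species, predict(m, type = "class"))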
Initializing H2O
To initialize an H2O cluster, we use the h2o.init() function. Initializing a cluster
will also set up a lightweight web server that allows interaction with the software via
a local webpage. Generally, the h2o.init() function has sensible default values, but
we can customize many aspects of it; it may be particularly useful to customize
the number of cores/threads to use as well as how much memory we are willing
to let it use, which can be accomplished with the max_mem_size and nthreads
arguments. In the following code, we initialize an H2O cluster to use two threads
and up to three gigabytes of memory. After the code, R will indicate the location of
log files, the Java version, and details about the cluster:
cl <- h2o.init(
max_mem_size = "3G",
nthreads = 2)
Note:  In case of errors look at the following log files:
    C:\Users\jwile\AppData\Local\Temp\RtmpuelhZm/h2o_jwile_started_from_r.out
    C:\Users\jwile\AppData\Local\Temp\RtmpuelhZm/h2o_jwile_started_from_r.err

    H2O cluster version:        3.6.0.8
    H2O cluster name:           H2O_started_from_R_jwile_ndx127
    H2O cluster total memory:   2.67 GB
    H2O cluster healthy:        TRUE
Once the cluster is initialized, we can interface with it either using R or using the web
interface available at the local host (127.0.0.1:54321); it is shown in Figure 1.6:
Figure 1.6
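The h2oiris object used next holds R's built-in iris data exported to the cluster; if you are following along, a minimal sketch of that step (as.h2o() copies an R data frame into H2O) is:

h2oiris <- as.h2o(iris)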
We can check the results by typing the R object, h2oiris, which is simply an object
that holds a reference to the H2O data. The R API queries H2O when we try to print
it:
h2oiris
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
We can also check the levels of factor variables, such as the Species variable, as
shown in the following:
h2o.levels(h2oiris, 5)
[1] "setosa"     "versicolor" "virginica"
In real-world uses, it is more likely that the data already exists somewhere; rather
than load the data into R only to export it into H2O (a costly operation as it creates an
unnecessary copy of the data in R), we can just load data directly into H2O. First we
will create a CSV file based on the built-in mtcars dataset, then we will tell the H2O
instance to read the data using R. Printing again shows the data:
write.csv(mtcars, file = "mtcars.csv")
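Reading the file into H2O can then be done with h2o.importFile(); a minimal sketch, assuming mtcars.csv sits in the current working directory:

h2omtcars <- h2o.importFile(path = "mtcars.csv")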
h2omtcars
                 C1  mpg cyl disp  hp drat    wt
1         Mazda RX4 21.0   6  160 110 3.90 2.620
2     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875
3        Datsun 710 22.8   4  108  93 3.85 2.320
4    Hornet 4 Drive 21.4   6  258 110 3.08 3.215
5 Hornet Sportabout 18.7   8  360 175 3.62 3.440
6           Valiant 18.1   6  225 105 2.76 3.460
Finally, the data need not be located on the local disk. We can also ask H2O to
read in data from a URL as shown in this last example, which uses a dataset made
available from the UCLA Statistical Consulting Group:
h2obin <- h2o.importFile(
path = "http://www.ats.ucla.edu/stat/data/binary.csv")
h2obin
  admit gre  gpa rank
1     0 380 3.61    3
2     1 660 3.67    3
3     1 800 4.00    1
4     1 640 3.19    4
5     0 520 2.93    4
6     1 760 3.00    2
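When you have finished working with H2O, the cluster can be shut down from R; a minimal sketch (without prompt = FALSE, h2o.shutdown() asks for confirmation first):

h2o.shutdown(prompt = FALSE)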
Summary
This chapter presented a brief introduction to neural networks and deep neural networks. Using
multiple hidden layers, deep neural networks have been a revolution in machine
learning by providing a powerful unsupervised learning and feature extraction
component that can be standalone or integrated as part of a supervised model.
There are many applications of such models, and they are being increasingly used
by large companies such as Google, Microsoft, and Facebook. Examples of tasks
for deep learning are image recognition (for example, automatically tagging faces,
or identifying keywords for an image), voice recognition, and text translation (for
example, to go from English to Spanish, or vice versa). Work is even being done
on text recognition such as sentiment analysis to try to identify whether a sentence
or paragraph is generally positive or negative, particularly useful for evaluating
perceptions about a product or service. Imagine being able to scrape reviews and
social media for any mention of your product and being able to analyze whether it
was being discussed more or less favorably than the month or year before!
This chapter also showed how to set up R and install the necessary software and
packages in a reproducible way, matching the versions used in this book.
In the next chapter, we will begin to train neural networks and generate our own
predictions.