Using Machine Learning to Secure IoT Systems: Janice Cañedo, Anthony Skjellum
Abstract—The Internet of Things (IoT) is a massive group of devices containing sensors or actuators connected together over wired or wireless networks. With an estimate of over 25 billion devices connected together by 2020, IoT has been rapidly growing over the past decade. During the growth, security has been identified as one of the weakest areas in IoT. When implementing security within an IoT network, there are several challenges that need to be addressed, including heterogeneity within the system as well as the quantity of devices.

To address the challenges in securing IoT devices, we propose using machine learning within an IoT gateway to help secure the system. We investigate using Artificial Neural Networks in a gateway to detect anomalies in the data sent from the edge devices. We are convinced that this approach can improve the security of IoT systems.

Index Terms—Internet of Things, Security, Machine Learning

I. INTRODUCTION

The Internet of Things (IoT) is a massive group of devices containing sensors or actuators connected together over wired or wireless networks [1]. IoT has been rapidly growing over the past decade and, during the growth, security has been identified as one of the weakest areas in IoT. There are over six billion estimated devices currently connected to the Internet and an estimate of over 25 billion connected by 2020 [2–4].

IoT devices can be divided into two primary groups: edge devices and gateway devices. An edge device is a low-power, low-resource device containing sensors and/or actuators. Edge devices usually have a single purpose, for example, collecting temperature data and reporting it to a gateway. Gateway devices typically have more resources compared to edge devices. A gateway device is responsible for connecting the edge devices to the Internet and aggregating data from edge devices. With the extensive quantity of devices, the amount of data that travels between the devices, and the impact these devices will have in our everyday lives, security is a necessity.

There are several challenges with implementing security within an IoT network [5]. First, IoT systems are heterogeneous. There are different types of devices, methods of communication, types of data being transferred and shared, various resource levels of devices, and system configurations. Each distinct element adds to the challenge of effectively securing IoT. A second challenge inheres in the number of devices that are connected together. Billions of devices connected together provide a new research area of focus when considering nominal function, resiliency, and security as well.

To address the challenges in securing IoT devices, we propose using machine learning within an IoT gateway to help secure the system. Machine learning is an area of Artificial Intelligence (AI) in which computer programs are enabled to learn from experience, examples, and analogies [6]. As learning occurs, the capabilities within the program become more intelligent and the program becomes capable of making informed decisions. Within machine learning, two of the most popular approaches are artificial neural networks (ANN) and genetic algorithms. ANNs mimic the neurons and synapses within the brain to transfer data for communication, learning, and decision making [6]. ANNs are used within IoT systems to monitor the state of IoT devices and to make informed decisions [7]. We propose the use of ANN to learn the healthy state of a system and connected devices.

The remainder of this paper is organized as follows. Section II provides an overview of the state of IoT Security and the use of machine learning within security. Section III describes our approach to adding machine learning within an IoT gateway. Section IV discusses our experimentation and results including success and failures in adding machine learning within the gateway. Section V offers our conclusion and outlines our future research plans.

II. RELATED WORK

In "Internet of Things (IoT) Security: Current Status, Challenges, and Prospective Measures," Mahmoud et al. presented a survey of the current status of IoT security [5]. They found that there are gaps within security techniques to protect sensor nodes, to maintain trust between devices, and to defend against Man in the Middle attacks, Denial of Service (DoS) attacks, etc. They concluded that there is currently extensive work occurring within IoT authentication and access control protocols but other work needs to be done as well.

In "Research on the Basic Characteristics, the Key Technologies, the Network Architecture and Security Problems of the Internet of Things," Xingmei et al. introduced the key concepts, network architecture, and security problems within the Internet of Things [8]. IoT requires intelligent processing and reliable transmission within the network. To provide this, the network architecture contains three layers: the application layer, the transport layer, and the sensing layer. The application layer contains the logical link between the user and the Internet through intelligent applications. Intelligent applications include smart home furnishings and intelligent
architectures. The application layer uses machine learning, data mining, data processing, and other analytics to process information from the system and provide an output. The transport layer consists of network communications including Wi-Fi, Bluetooth, ZigBee, and 802.15.4. The transport layer contains the gateway or gateways that process the information and relay the information across the network. The sensing layer contains edge devices that are composed of a variety of sensors and actuators that collect data and send it through the transport layer to the application layer for analysis.

The approach proposed by Xingmei et al. provides a framework for reliable communication; however, as noted by the authors, there are many security threats present in the transport layer. Our approach is to add machine learning within the transport layer to help determine if there are interruptions in the data transfer and to monitor the edge devices from the sensing layer. This approach will also address the issues raised by Mahmoud et al. by addressing the entire system security, not simply the authentication and access control protocols.

In "Neural Network Approach to Forecast the State of the Internet of Things Elements," Katenko et al. investigated the use of neural networks to forecast the state of an IoT element [7]. Their approach combined a multi-layered perceptron network with a probabilistic neural network. They discovered that by using the multi-layer perceptron network to explore similar values throughout the past, they could then use a probabilistic neural network to determine the state of the element. They found they were able to reduce the labor costs of the IoT administration and emergency resolution through this technique.

While the technique by Katenko et al. did in fact reduce labor costs and allowed for monitoring and forecasting an element in an IoT network, a way to forecast the state of an entire IoT system is still needed. We propose the use of machine learning techniques including ANN, as mentioned above, both in the gateways to monitor subsystem components and in the application layer of the whole system to monitor the state of the entire system.

III. APPROACH

In examining the approach, we will begin with an overview of our testbed creation and then discuss our machine learning methodology.

A. Testbed Creation

We began by creating an IoT testbed. We used Arduino Uno devices to emulate edge devices. Each Arduino Uno was connected to an ESP8266 WiFi chip and a temperature sensor. We then connected 10 edge devices to a Raspberry Pi Model 3 device to implement a gateway. The Raspberry Pi Model 3 contains 1GB of RAM, which allows for machine learning of smaller data sets. We were limited in the number of devices we could connect because of the wireless driver used in the Raspberry Pi.¹ The temperature reading was sent from each edge device to the gateway approximately every two seconds.

¹ In future work we will be able to address this issue and connect additional devices for a large scale system.

B. Machine Learning Methodology

Machine learning is the use of algorithms within a program to learn from collected data [6, 9]. Within machine learning there are various algorithms that exist to learn from data. We chose to implement an artificial neural network to monitor the system. An artificial neural network (ANN) is a type of machine learning that is modeled after the brain [6]. A solution for ANN was first developed in the mid-1980s. An ANN has an input layer of neurons (nodes). The neurons then send data through synapses and a hidden layer of neurons. The hidden layer learns from the input data through changes in the synapses using weights. Weights are adjusted as the neural network learns from the data. Finally, there is an output of data through output neurons [6, 10].

To create an ANN, we chose to use R. R is a statistical programming tool that allows for computations. Packages are readily available in R for machine learning, statistics, graphing, probability, etc. We chose to use the neuralnet package. The neuralnet package allows us to create a neural network to use for predictions. When training the neural network, the package outputs a plot of the neural network. The lines on the plot are color coded, with black lines showing connections between each layer and weights of each connection, and blue lines showing bias terms added in each step [11].

IV. EXPERIMENTATION AND RESULTS

We began by collecting approximately 4,000 data samples from the edge devices over the course of one hour and stored the data in a MySQL database within the gateway. We approximate the number of data samples because we ran our experimentation multiple times, added additional data samples, and divided the data into random training and testing data sets. We stored the device ID, sensor value, and time stamp of each data transmission. We chose to begin with a simple neural network using the device ID and the sensor value to determine if the reading is valid. We separated our data into two random sets. One set is for training the neural network and the second set is for testing the accuracy of the trained network. The neural network was then trained with the two input neurons, device ID and sensor value, where each entry was valid. We chose to use all valid data under the assumption that the model is being trained based on normal operations. Figure 1 displays the weighted lines of the trained neural network. The black lines connect the layers together and show the connection weights, and the blue lines show the weight bias. The bias is used to shift the curve as necessary for a more accurate fit. We chose to have a five-layer network with three hidden layers.

Once the neural network was trained, the testing data was then used within the neural network to verify that each attribute
was valid. We were able to run our test data through our neural network and each attribute was predicted correctly as a 1.

Figure 1. Plot of Neural Network From R with Two Input Neurons

We then manipulated the sensors to add invalid data to the database for a ten-minute period. The invalid data was then run against the neural network. The neural network was able to detect the differences between the valid and invalid data. Because we trained only with a valid bit equal to 1, and never with a valid bit equal to 0, any predicted value not equal to 1.000000 indicated that the data was out of range. In our prediction, the values that were greater than 1.000000 were not valid readings. Table I shows predicted values of the altered data points. We were able to correctly predict validity over 99% of the time. This data does show valid predictions; however, this is a simple example.

Table I. PREDICTED VALUES OF TEST DATA FOR TWO INPUT NEURONS

[...] layers within our neural network. We then trained our new neural network. When we ran the testing set through our new network, we found that without invalid data against which to train, the model had difficulty correctly predicting invalid data points. Our view is that because we trained with such a small quantity of only valid data (approximately 3,600 data points), the neural network was unable to consistently predict the difference between valid and invalid data. We determined that the model needed to be trained with both valid and invalid readings.² Next, we simulated invalid data and retrained the neural network with both valid and invalid data. Figure 2 is the R output for the neural network when trained with three input neurons. By training with valid and invalid data, we were able to predict invalid data points successfully. For invalid data points, we were able to detect 1) correct delay and incorrect sensor values, 2) incorrect delay and correct sensor values, or 3) both incorrect delay and incorrect sensor values.

Figure 2. Plot of Neural Network From R with Three Input Neurons
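
The valid-bit check described for the two-input network (a reading is treated as valid only when the network's predicted value is 1) can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the experiments. The function names, the tolerance of 0.05, and the sample outputs are all assumptions made for the example; the sample outputs are made up and are not the values reported in Table I.

```python
from typing import List


def classify_predictions(predictions: List[float], tolerance: float = 0.05) -> List[bool]:
    """Mark a reading as valid when the predicted valid bit is close to 1.

    The tolerance is an assumption for this sketch; the text describes
    comparing the predicted value against 1.000000.
    """
    return [abs(p - 1.0) <= tolerance for p in predictions]


def accuracy(predicted_valid: List[bool], actual_valid: List[bool]) -> float:
    """Fraction of readings whose predicted validity matches the known label."""
    if len(predicted_valid) != len(actual_valid):
        raise ValueError("prediction and label lists must have the same length")
    if not actual_valid:
        raise ValueError("at least one reading is required")
    correct = sum(p == a for p, a in zip(predicted_valid, actual_valid))
    return correct / len(actual_valid)


# Hypothetical network outputs for six readings (illustrative values only).
outputs = [1.000000, 0.999998, 1.000001, 1.482311, 2.103377, 1.000000]
# Hypothetical ground truth: the fourth and fifth readings were tampered with.
labels = [True, True, True, False, False, True]

flags = classify_predictions(outputs)
print(flags)
print(accuracy(flags, labels))
```

A small tolerance is used instead of exact equality because floating-point outputs from a trained network rarely equal 1 exactly; the appropriate threshold would have to be chosen from the observed spread of predictions on known-valid data.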