Machine Learning and Mobile Development
Joonas Lauhala
BACHELOR’S THESIS
September 2020
Tampere University of Applied Sciences
Degree Programme in Business Information Systems
Software Development
LAUHALA, JOONAS:
Machine Learning and Mobile Development
The client of this Bachelor’s thesis was Piceasoft LLC. Piceasoft is a company
that develops software for managing the mobile device lifecycle. Piceasoft has
multiple products, such as Diagnostics, Verify, Switch, Report, Eraser, and
Trade-In. Machine learning is being tested as a part of the continuous
application research and development process.
The purpose of this thesis was to increase knowledge of how to develop basic
machine learning applications for mobile platforms. The thesis sought answers to
the questions “As a developer, how much do you need to know about machine
learning to start using it in your applications?” and “Why is development with a
machine learning framework popular?”
A mobile application for the Android and iOS platforms that uses a machine
learning framework was developed as a result of this thesis. The application
classifies handwritten digits by using an existing machine learning model.
The most significant finding of this thesis was that development with a machine
learning framework is simple. A basic understanding of machine learning is often
enough to start developing mobile applications. It was also noted that most of
the existing learning resources for machine learning are not written for
software developers.
TABLE OF CONTENTS
1 Introduction
1.1 Background
1.2 Client
1.3 Goal
2 Machine Learning explained
2.1 What is Machine Learning?
2.2 Basic requirements: Learning methods
2.2.1 Supervised learning
2.2.2 Unsupervised learning
2.3 Basic requirements: Models and training
2.4 On-device Machine Learning
3 Popular Machine Learning frameworks for mobile platform
3.1 Google: TensorFlow Lite
3.2 Apple: Core ML
3.3 Google: ML Kit for Firebase
4 Applications for Machine Learning
4.1 General applications
4.2 Image classification
4.3 Object detection
5 Using Machine Learning framework in a project
5.1 Foreword and project description
5.2 Project dependencies and setup
5.3 The development process in detail
5.3.1 Android-platform
5.3.2 iOS-platform
5.4 Project summary
6 Conclusion and discussion
References
Appendices
Appendix 1. Project source code
API
APK
Abbreviation for Android Package. APK is a package that contains all Android
application resources necessary for the distribution.
Dataset
IDE
Inference function
The real-world equivalent of the reasoning process, but for the computer.
Model
The real-world equivalent of storing past experiences, but for the computer.
SDK
Software platform
Software framework
Android
Linux-based operating system used mostly for mobile devices, such as phones
and tablets, maintained by Google.
iOS
1 INTRODUCTION
1.1 Background
The popularity of machine learning has grown rapidly over the past ten years. In
many fields, such as computer science, engineering and technology, economics,
and medicine, possible use cases for machine learning are tested continuously.
As a result, knowing the basics of developing software that uses machine
learning is becoming more important by the day. Machine learning has potential
that reaches far into the future.
For the reasons mentioned above, this Bachelor’s thesis aims to improve
knowledge of how to develop basic machine learning applications for mobile
platforms. The thesis covers only the basics because machine learning is a
complex topic, and a more detailed treatment could be too much to absorb at
once. Machine learning also has a long history, but the developments that led
to its current popularity are not essential here and are left out.
The main goal of this thesis is to combine all the information necessary to get
started with machine learning application development. Because software
development and machine learning are tightly coupled, the secondary goal of the
thesis is to act as a bridge between the complex world of machine learning and
software development. Machine learning terminology requires some simplification
to make the transition easier for developers, so the core concepts of machine
learning technology are explained in a simplified form.
1.2 Client
Piceasoft LLC is the client of this Bachelor’s thesis. Piceasoft develops
software solutions for managing the mobile device lifecycle. As more machine
learning-centric products are planned, a primer on the underlying technology
can be useful for most software developers.
1.3 Goal
The first goal of this thesis is to create an instruction manual of sorts by
gathering everything essential for developing applications that use machine
learning and the available frameworks, including information about machine
learning methods, models, general applications, techniques, suitability, and
more.
The thesis seeks answers to questions such as “As a developer, how much do you
need to know about machine learning to start using it in your applications?”,
“What is the relationship between machine learning and traditional software
development?”, “Are there any major differences when using a machine learning
framework on different platforms?”, and “Why is development with a machine
learning framework popular?”
2 MACHINE LEARNING EXPLAINED
2.1 What is Machine Learning?
Machine learning is an iterative process for the computer (Figure 1), meaning
that it takes some time to complete (Gopalakrishnan & Venkateswarlu 2018), much
like a person learning to ride a bicycle. Mastering a specific task requires a
great deal of prior experience and trial and error.
Machine learning simplifies many everyday tasks that would be tedious for a
programmer to code traditionally. These are typically tasks that would
otherwise require an impractical amount of code because they lack a
well-defined set of rules: the rules are either non-existent or too abstract
for the programmer to follow. Information processing also takes time, and in
real-time applications, letting the computer make assumptions based on previous
experience reduces the processing time significantly. (Gopalakrishnan &
Venkateswarlu 2018.)
Machine learning was first used on conventional computers, but machine learning
on mobile devices has become a popular trend in mobile development. Modern
mobile devices have enough processing capacity to perform suitable tasks nearly
as well as modern computers do. (Gopalakrishnan & Venkateswarlu 2018.)
Generally speaking, any application that does one of the following is using
machine learning in some way:
• Speech recognition
• Computer vision and image classification
• Gesture recognition
• Translation from one language into another
• Interactive on-device detection of text
• Autonomous vehicles, drone navigation, and robotics
• Patient-monitoring systems and mobile applications interacting with medical
devices
(Gopalakrishnan & Venkateswarlu 2018.)
2.2 Basic requirements: Learning methods
On a larger scale, machine learning can be categorized into supervised and
unsupervised learning methods (Ng 2018). Other methods, such as semi-supervised
and reinforcement learning (Figure 2), are outside the scope of this thesis.
2.2.1 Supervised learning
The main goal of supervised learning is to create a function that maps inputs
to outputs so well that a previously unseen input (x) can be mapped to an
output (y) by predicting where it belongs based on prior knowledge (Ng 2018). A
set of single-digit numbers is a good example, and it is also used in the demo
project later. When images of all ten digits have been labeled and a new image
of a number comes in, the model should be able to predict which number it shows
(Figure 3).
Gopalakrishnan & Venkateswarlu (2018) list multiple algorithms that are
associated with supervised learning, such as K-nearest neighbours, Naive Bayes,
Decision trees, Linear regression, Logistic regression, Support vector
machines, and Random forest.
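To make the idea of mapping an unseen input (x) to an output (y) more concrete, the following is a minimal sketch of the simplest form of K-nearest neighbours (K = 1). It is not taken from the thesis project: the class name NearestNeighbourSketch, the Example type, and the two-feature training data are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of supervised learning with a 1-nearest-neighbour classifier. */
final class NearestNeighbourSketch {

    /** One labeled training example: a feature vector x and its known label y. */
    static final class Example {
        final double[] x;
        final int y;

        Example(double[] x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Squared Euclidean distance between two feature vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Predicts the label of an unseen input by copying the label of the closest example. */
    static int predict(List<Example> training, double[] input) {
        if (training.isEmpty()) {
            throw new IllegalArgumentException("training set must not be empty");
        }
        Example closest = training.get(0);
        for (Example candidate : training) {
            if (squaredDistance(candidate.x, input) < squaredDistance(closest.x, input)) {
                closest = candidate;
            }
        }
        return closest.y;
    }

    public static void main(String[] args) {
        // Hypothetical two-feature examples for two classes (labels 0 and 1).
        List<Example> training = Arrays.asList(
                new Example(new double[] {0.1, 0.9}, 0),
                new Example(new double[] {0.2, 0.8}, 0),
                new Example(new double[] {0.9, 0.1}, 1),
                new Example(new double[] {0.8, 0.3}, 1));
        System.out.println(predict(training, new double[] {0.7, 0.2})); // prints 1
    }
}
```

The labeled examples play the role of the prior knowledge, and predict is the learned mapping from x to y. A real digit classifier would use far more features and examples, but the principle is the same.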
2.2.2 Unsupervised learning
In contrast, unsupervised learning does not learn under supervision. The model
learns from the data that is fed to it and discovers hidden patterns in that
data (Figure 4). This type of learning is useful when there are enormous
amounts of data and the patterns we are looking for are unknown. In general,
unsupervised learning algorithms can provide useful insights about the given
data, such as confirming what we might already know or, in some cases,
predicting what is going to happen next. A suitable example of unsupervised
learning is customer segmentation. When vast amounts of customer data that
capture all customer features are available, unsupervised learning algorithms
could cluster different kinds of customers.
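The customer segmentation idea can be sketched with k-means, one common clustering algorithm. The thesis does not name a specific algorithm, so k-means is an assumption made here for illustration, as are the class name KMeansSketch, the single "monthly spend" feature, and the starting centroids.

```java
import java.util.Arrays;

/** Sketch of unsupervised learning: k-means clustering on one-dimensional data. */
final class KMeansSketch {

    /** Returns, for every value, the index of the nearest centroid. */
    static int[] assign(double[] values, double[] centroids) {
        int[] labels = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (Math.abs(values[i] - centroids[c]) < Math.abs(values[i] - centroids[best])) {
                    best = c;
                }
            }
            labels[i] = best;
        }
        return labels;
    }

    /** Alternates between assigning values and moving centroids until nothing changes. */
    static double[] fit(double[] values, double[] initialCentroids, int maxIterations) {
        double[] centroids = initialCentroids.clone();
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            int[] labels = assign(values, centroids);
            double[] sums = new double[centroids.length];
            int[] counts = new int[centroids.length];
            for (int i = 0; i < values.length; i++) {
                sums[labels[i]] += values[i];
                counts[labels[i]]++;
            }
            boolean changed = false;
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                double mean = sums[c] / counts[c];
                if (mean != centroids[c]) {
                    changed = true;
                }
                centroids[c] = mean;
            }
            if (!changed) {
                break;
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Hypothetical monthly spend per customer; no labels are provided.
        double[] spend = {10, 12, 11, 95, 100, 105};
        double[] centroids = fit(spend, new double[] {10, 105}, 100);
        System.out.println(Arrays.toString(centroids));
        System.out.println(Arrays.toString(assign(spend, centroids)));
    }
}
```

Unlike the supervised case, no labels are given: the two groups of customers emerge only from the structure of the data.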
2.3 Basic requirements: Models and training
Machine learning uses models to store past experiences; the computer then uses
the models to apply the machine learning algorithm to the problem at hand.
Model formats vary between machine learning frameworks. As a side effect, most
of the existing model formats are incompatible with other frameworks.
Training a model from scratch can be a tedious, time-consuming task. It can,
however, yield more reliable results if done correctly. One of the
prerequisites is that the problem is well-defined. A machine learning problem
is well-defined when the following conditions are satisfied: we have the right
problem, the right data, and the right criteria for success. Sometimes training
from scratch is the only way to approach a particular issue because the data
requirements of the application can be precise (Gopalakrishnan & Venkateswarlu
2018).
In both cases, the model requires a useful and extensive dataset, meaning that
the data must be both high in quality and large in quantity. In the end,
training the model is all about following the best practices available. The
list below gives a few examples of these best practices.
2.4 On-device Machine Learning
Ng (2018) says that people nowadays spend more time with mobile devices, so it
makes sense to run machine learning models on the device itself instead of
storing them in the cloud.
The most significant advantage of the on-device model is that the model is
always available without the need to send information back and forth
(Figure 5). Being local is also what makes the model convenient to use.
However, the cloud model has advantages of its own: it can be updated
seamlessly, without users ever noticing the difference. The cloud model is the
way to go if the model file is enormous and would drastically increase the
application size (Figure 6).
FIGURE 6: Relationship between the mobile application and the cloud model
Training models on a device has still not become popular because the computing
power and storage space requirements are quite large (Ng 2018). Mobile is not
the preferred platform for training models, the only exception being tiny
datasets. The retraining phase would also become too complicated
(Gopalakrishnan & Venkateswarlu 2018). However, Core ML 3 for iOS allows the
developer to leverage this functionality for simple tasks.
3 POPULAR MACHINE LEARNING FRAMEWORKS FOR MOBILE PLATFORM
3.1 Google: TensorFlow Lite
• Android
• iOS
Using TensorFlow Lite on the Android and iOS platforms is very
straightforward. Both platforms require the framework as a dependency. The only
real difference is that on iOS, the project must use the CocoaPods dependency
manager.
3.2 Apple: Core ML
• iOS
The iOS SDK contains the Core ML framework by default. The developer must
import the API within the code before using it.
3.3 Google: ML Kit for Firebase
Firebase ML Kit (or ML Kit for Firebase) is a bundle of machine learning
frameworks by Google. It was launched in May 2018. The ML Kit SDK packages the
Google Cloud Vision API, TensorFlow Lite, and the Android Neural Networks API
together. ML Kit can be used to detect faces, scan barcodes, label images, and
for other useful purposes. It supports on-device data models along with cloud
data models. (Wiggers 2018.)
ML Kit for Firebase is an easy way to use machine learning for common
problems, such as reading the contents of a QR code or detecting and reading
text from an image. A lesser-known feature of ML Kit allows the framework to
load and use custom TensorFlow Lite models. However, because the ML Kit API is
geared towards everyday use cases, it does not allow as much control over the
model as TensorFlow Lite does.
• Android
• iOS
Firebase ML Kit can be used on the Android and iOS platforms to a varying
degree. The Android version of the framework requires only a few project
dependencies to function correctly. Some additional steps must be taken when
using ML Kit on the iOS platform. For example, the iOS project must use the
CocoaPods dependency manager, a few API functionalities require a newer version
of the Xcode IDE, and some parts of the framework differ between platforms.
4 APPLICATIONS FOR MACHINE LEARNING
4.1 General applications
Today, many applications rely heavily on using machine learning frameworks for
specific tasks, but still use traditional solutions for most of their functionality.
Machine learning is not a replacement for conventional ways of developing
software but an extension of it. In many cases, the machine learning approach
can simplify the convoluted logic of specific tasks.
• Google Maps
• Facebook
• Snapchat
• Netflix
• Tinder
• Uber
(Gopalakrishnan & Venkateswarlu 2018.)
Many other popular applications that are not listed here also use machine
learning functionality. Using machine learning in widely used applications is
a smart move because, firstly, it can massively improve the user experience
and, secondly, it can provide information with actual business value to the
developer(s) or the company.
4.2 Image classification
The mobile application demo that is implemented in the last chapters of this
thesis uses the image classification technique to classify handwritten digits
as their corresponding numbers.
4.3 Object detection
Object detection is the second computer vision technique briefly covered in
this thesis. It allows the detection of objects within an image or video feed.
Machine learning can be used to detect objects in images (Image 2), for
example, in pictures that contain devices, toys, furniture, and other items.
Object detection is also a prerequisite for manipulating images based on their
content, such as swapping one face with another.
IMAGE 2. Bounding box and confidence percentage of “an apple” and “a banana”
in Tensorflow Lite demo app (Object detection | Tensorflow Lite 2020)
5 USING MACHINE LEARNING FRAMEWORK IN A PROJECT
5.1 Foreword and project description
Application development with machine learning frameworks does not require
in-depth knowledge of machine learning algorithms, model creation/training, or
other intricacies. Only basic mobile application development skills are
necessary. Developers need to know how to use the provided API and import the
model(s), and they can go from there. (Gopalakrishnan & Venkateswarlu 2018.)
The demo application classifies digits as the user draws them on the canvas
and tries to predict the correct number. Both platforms use the same MNIST
model for digit classification. The project (or thesis) does not explain how
to choose the best framework for a specific problem, as that is a discussion of
its own. The most popular and preferable frameworks for the platforms were
selected.
5.2 Project dependencies and setup
“Thesis-project” uses an empty template as the basis on both platforms. The
first step when setting up the project was to modify the layout of the
application that is visible to the end-user. The application layout was kept
similar on both platforms for consistency.
5.3 The development process in detail
The MNIST dataset is a subset of NIST, containing a training set of 60 000 and
a test set of 10 000 examples of handwritten digits by 500 different writers.
The digits are centered, size-normalized, and fixed to 28x28 pixels per image.
The original images from the NIST set are black and white. (LeCun & Cortes &
Burges 2013.) The MNIST-based machine learning model was used on both
platforms.
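Because the model works on 28x28 images, a drawing has to be converted into that shape before it can be classified. The sketch below shows one possible conversion in plain Java. It assumes that the drawing has already been scaled to 28x28 and is available as packed ARGB integers (the form in which Android's Bitmap.getPixels returns pixels), and that the model expects grayscale values between 0 and 1. Whether the colours must also be inverted depends on the model and is not covered. The class name MnistInputSketch is hypothetical and is not part of the thesis project.

```java
/** Sketch: convert a 28x28 ARGB drawing into the grayscale input an MNIST-style model expects. */
final class MnistInputSketch {

    /** Width and height of an MNIST image in pixels. */
    static final int SIZE = 28;

    /** Converts one packed ARGB pixel into a grayscale intensity between 0 and 1. */
    static float toGray(int argb) {
        int red = (argb >> 16) & 0xFF;
        int green = (argb >> 8) & 0xFF;
        int blue = argb & 0xFF;
        // Plain average of the colour channels; the alpha channel is ignored.
        return (red + green + blue) / 3.0f / 255.0f;
    }

    /** Converts all pixels of an already scaled 28x28 drawing, in row-major order. */
    static float[] toModelInput(int[] argbPixels) {
        if (argbPixels.length != SIZE * SIZE) {
            throw new IllegalArgumentException("expected " + (SIZE * SIZE) + " pixels");
        }
        float[] input = new float[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            input[i] = toGray(argbPixels[i]);
        }
        return input;
    }

    public static void main(String[] args) {
        int[] blank = new int[SIZE * SIZE]; // all pixels are transparent black
        blank[0] = 0xFFFFFFFF;              // one opaque white pixel
        float[] input = toModelInput(blank);
        System.out.println(input[0] + " " + input[1]); // prints 1.0 0.0
    }
}
```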
5.3.1 Android-platform
The Android demo application was developed by using the Android Studio IDE
(Image 3). The Android implementation uses the TensorFlow Lite library as its
machine learning framework.
The first steps of development on the Android platform were to create a new
project in Android Studio from an empty template and define the application
layout, along with adding the following dependencies to the app-level Gradle
file. The Android implementation uses the MNIST model in the TFLite format
(“*.tflite”).
After the project dependencies, among other things, had been included, a class
hierarchy was defined. The Android project contains a total of four classes.
The main activity (MainActivity) opens when the user launches the application.
The main activity loads the application layout and handles user interactions.
While the main activity is being constructed, a digit classifier
(DigitClassifier) loads and initializes the TFLite-formatted MNIST model in the
background (Figure 7). A reference to the digit classifier is stored in a
variable within the main activity class. Because the Android SDK does not
provide a built-in view component for drawing lines on the screen, a
custom-made drawable view (DrawableView) is needed to accomplish this task.
DigitClassifier(Context context) {
    // Start loading the model in the background as soon as the classifier is created.
    asyncInit(context);
}
    // ...
        return null;
    });
}
Each visible component of the application responds to user touch events. For
example, the drawable view component responds to the touch coordinates given by
the user and draws lines inside the view based on them. Under the hood, all
buttons have a click listener (Figure 8), meaning that after the user has
pressed a button, the callback function tied to that specific button gets
executed.
        // ...
        break;
    }
    case R.id.classifyButton: {
        // Classify the current drawing in the background and report the outcome.
        mDigitClassifier.asyncClassify(mDrawableView.getRenderView())
            .addOnSuccessListener(this::classificationFinished)
            .addOnFailureListener(this::classificationFailed);
        break;
    }
    }
}
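The listener chain in the listing above attaches one callback for success and one for failure to a task that runs in the background. The same pattern can be sketched with the JDK's CompletableFuture. This is a stand-in used only for illustration and not the task API used in the project, and the class AsyncClassifySketch with its placeholder classify method is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Sketch of the asynchronous classify-then-callback pattern using only the JDK. */
final class AsyncClassifySketch {

    /** Hypothetical stand-in for model inference: treats the first value as the digit. */
    static int classify(int[] pixels) {
        if (pixels.length == 0) {
            throw new IllegalArgumentException("nothing was drawn");
        }
        return pixels[0];
    }

    /** Runs classify() on a background thread and reports the outcome to one of two callbacks. */
    static CompletableFuture<Void> asyncClassify(
            int[] pixels, Consumer<Integer> onSuccess, Consumer<Throwable> onFailure) {
        return CompletableFuture
                .supplyAsync(() -> classify(pixels))
                .<Void>handle((digit, error) -> {
                    if (error != null) {
                        onFailure.accept(error);
                    } else {
                        onSuccess.accept(digit);
                    }
                    return null;
                });
    }

    public static void main(String[] args) {
        asyncClassify(new int[] {7},
                digit -> System.out.println("Classified as " + digit),
                error -> System.out.println("Classification failed: " + error.getMessage()))
                .join(); // wait here only so the demo does not exit before the callback runs
    }
}
```

Keeping the inference off the calling thread is what lets the user interface stay responsive while the classification runs.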
The classification process starts with a button click. A copy of the
user-drawn bitmap gets transformed to suit the MNIST model requirements. The
bitmap then gets sent to the classifier, which uses the MNIST model inference
function that outputs an array of confidence scores, one for each number. A
data class (Classification) gets instantiated with the result array
(Figure 9), and the number with the highest confidence score is stored within
the object along with its confidence score.
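Picking the number with the highest confidence score amounts to finding the index of the largest value in the result array. The sketch below illustrates this under the assumption that the array has one entry per digit and that index 0 corresponds to the digit zero. BestScoreSketch is a hypothetical name and not the Classification class of the project.

```java
/** Sketch: pick the most likely digit from an array of confidence scores. */
final class BestScoreSketch {

    /** Returns the index of the highest score; with one score per digit, the index is the digit. */
    static int indexOfHighest(float[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical model output: one confidence score for each digit from 0 to 9.
        float[] scores = {0.01f, 0.02f, 0.03f, 0.85f, 0.01f, 0.02f, 0.01f, 0.02f, 0.02f, 0.01f};
        int digit = indexOfHighest(scores);
        System.out.println(digit + " with confidence " + scores[digit]); // prints 3 with confidence 0.85
    }
}
```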
5.3.2 iOS-platform
The iOS demo application was developed by using the Xcode IDE (Image 5). The
iOS implementation uses the Core ML library as its machine learning framework.
Vision (Core ML) is a built-in library and only needs an import statement at
the top of the source file before use. Just like on the Android platform, a
class hierarchy was defined, containing a total of five classes.
The AppDelegate class works as the entry point when launching the iOS
application. The SceneDelegate loads the storyboard (or the layout of the iOS
application) and hands control to the ViewController class. The ViewController
is responsible for handling user interaction. In the iOS project, the MNIST
model is initialized when the model gets used for the first time (Figure 10).
    return VNCoreMLRequest(
        model: model,
        completionHandler: { request, error in
            // Wrap the top classification result (if any) in a Classification object.
            let classification = Classification(
                (request.results as? [VNClassificationObservation])?.first
            )
            // Pass the result on to the delegate.
            self.classificationResultDelegate?
                .OnClassificationReceived(classification: classification)
        }
    )
}()
All the visible components on the storyboard are linked to the ViewController.
The standard iOS view components do not include a component that would allow
the user to draw on the screen. Because of this, a custom drawing component
(DrawableView) had to be implemented. Each button on the storyboard has a
callback function assigned to it, which gets called when the user presses the
button (Figure 11).
A button press initiates the classification process, which in turn creates a new
classification request with the current drawing (Figure 12). After the classification
request has finished, the delegate class (Classification) will receive the results.
        do {
            try handler.perform([self.classificationRequest])
        } catch {
            // Report a failed request to the delegate as a missing classification.
            self.classificationResultDelegate?
                .OnClassificationReceived(classification: nil)
        }
    }
}
5.4 Project summary
As can be seen from the simplicity of the application, creating a basic
application that uses a machine learning framework is not complicated for a
developer who already knows the basics of software development. The application
allows users to draw digits and submit them for classification at the press of
a button, as initially intended. The application displays the classification
results to the user as text.
Both applications can correctly classify single-digit numbers drawn by the
user (Image 7 and Image 8), but for some unknown reason, the iOS application
classifies most numbers a bit more reliably. A possible cause is the way the
drawings/images are handled; despite being very similar, the final
implementations differ slightly.
In the case where the user draws a “cross” on the screen (Image 9), the model
classifies it as the number four. This is because the model has not been
trained to detect any characters other than numbers. A possible way to mitigate
this issue would be to retrain the model so that it distinguishes between
single-digit numbers and other characters.
6 CONCLUSION AND DISCUSSION
Because almost every field of study is moving towards using machine learning in
one way or another, this Bachelor’s thesis concludes that knowing about machine
learning at a basic level is very useful. The existing resources for gathering
fundamental knowledge include various books and internet articles. However,
most of those resources seem to be written for data scientists and
mathematicians, who understand the algorithms behind machine learning much
better than a typical software developer would. Had machine learning been
explained in more detail, the page count of this thesis would have increased
drastically.
Machine learning is a complex yet exciting topic. For the most part, machine
learning remains a black box, but developing applications that use machine
learning via a framework is not very difficult. Basic software development
skills are often good enough. In the grand scheme of things, traditional
software development is not going to change much because of machine learning.
However, machine learning will make some tasks more straightforward than ever
before, although some cross-platform frameworks can require more tweaking when
trying to keep the implementations similar.
The next step would have been to delve deeper into machine learning. For
example, the machine learning terminology would need to be improved and
extended with more advanced terms. Supervised, unsupervised, semi-supervised,
and reinforcement learning methods must be ex-
A significant improvement to the demo application would have been to retrain
the MNIST model so that it not only distinguishes between numbers (from zero to
nine) but also detects non-numbers. Another possibility would have been to
reduce the number of erroneous classifications by treating any drawing as a
non-number if its confidence score is below a threshold value, for example,
55 percent. A final solution to this problem would have been to build a
custom-trained model for classifying digits. In this particular case, the final
solution comes with a huge downside, which is the time needed for training the
model. Considering that the existing MNIST model is already well-trained with
tens of thousands of samples, building a model from the ground up would be a
waste of time.
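The threshold idea mentioned above could look roughly like the following sketch. It assumes that the confidence score is expressed as a value between 0 and 1, so that 55 percent corresponds to 0.55. The class and method names are hypothetical, and the behaviour was not implemented in the demo application.

```java
import java.util.OptionalInt;

/** Sketch: reject classifications whose confidence score is below a threshold. */
final class ConfidenceThresholdSketch {

    /** Example threshold of 55 percent, expressed as a fraction. */
    static final float THRESHOLD = 0.55f;

    /** Returns the digit if the model is confident enough, or an empty result for a non-number. */
    static OptionalInt acceptIfConfident(int digit, float confidence) {
        return confidence >= THRESHOLD ? OptionalInt.of(digit) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(acceptIfConfident(7, 0.93f)); // prints OptionalInt[7]
        System.out.println(acceptIfConfident(4, 0.40f)); // prints OptionalInt.empty
    }
}
```

The drawback of this approach is that a fixed threshold can also reject sloppily drawn but valid digits, so the value would need to be tuned against real drawings.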
References
LeCun Y. & Cortes C. & Burges C. 2013. The MNIST Database of handwritten
digits. Published 14.5.2013. Read 27.8.2020.
http://yann.lecun.com/exdb/mnist/
Ng, K. 2018. Machine learning projects for mobile applications: build Android
and iOS applications using TensorFlow Lite and Core ML. Birmingham: Packt
Publishing.
Object detection | Tensorflow Lite. 2020. Digital article. Modified 4.6.2020. Read
1.8.2020.
https://www.tensorflow.org/lite/models/object_detection/overview
Wiggers, K. 2018. Apple’s Core ML 2 vs. Google’s ML Kit: What’s the
difference? Digital article. Published 5.6.2018. Read 1.8.2020.
https://venturebeat.com/2018/06/05/apples-core-ml-2-vs-googles-ml-kit-whats-the-difference/
Appendices
Appendix 1. Project source code
The complete source code for the Android and iOS mobile applications is
publicly available on GitHub.
https://github.com/Jindetta/Thesis-project-for-Android
https://github.com/Jindetta/Thesis-project-for-iOS