Machine Learning on Kubernetes: A practical handbook for building and using a complete open source machine learning platform on Kubernetes
By Faisal Masood and Ross Brigoli
About this ebook
MLOps is an emerging field that aims to bring repeatability, automation, and standardization of the software engineering domain to data science and machine learning engineering. By implementing MLOps with Kubernetes, data scientists, IT professionals, and data engineers can collaborate and build machine learning solutions that deliver business value for their organization.
You'll begin by understanding the different components of a machine learning project. Then, you'll design and build a practical end-to-end machine learning project using open source software. As you progress, you'll understand the basics of MLOps and the value it can bring to machine learning projects. You will also gain experience in building, configuring, and using an open source, containerized machine learning platform. In later chapters, you will prepare data, build and deploy machine learning models, and automate workflow tasks using the same platform. Finally, the exercises in this book will help you get hands-on experience in Kubernetes and open source tools, such as JupyterHub, MLflow, and Airflow.
By the end of this book, you'll have learned how to effectively build, train, and deploy a machine learning model using the machine learning platform you built.
BIRMINGHAM—MUMBAI
Machine Learning on Kubernetes
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Dhruv Jagdish Kataria
Senior Editor: David Sugarman
Content Development Editor: Priyanka Soam
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Nilesh Mohite
Marketing Coordinators: Shifa Ansari, Abeer Riyaz Dawe
First published: June 2022
Production reference: 1190522
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80324-180-7
www.packt.com
To my daughter, Yleana Zorelle – hopefully, this book will help you understand what Papa does for a living.
Ross Brigoli
To my wife, Bushra Arif – without your support, none of this would have become a reality
Faisal Masood
Contributors
About the authors
Faisal Masood is a principal architect at Red Hat. He has been helping teams to design and build data science and application platforms using OpenShift, Red Hat's enterprise Kubernetes offering. Faisal has over 20 years of experience in building software and has been building microservices since the pre-Kubernetes era.
Ross Brigoli is an associate principal architect at Red Hat. He has been designing and building software in various industries for over 18 years. He has designed and built data platforms and workflow automation platforms. Before Red Hat, Ross led a data engineering team as an architect in the financial services industry. He currently designs and builds microservices architectures and machine learning solutions on OpenShift.
About the reviewers
Audrey Reznik is a senior principal software engineer in the Red Hat Cloud Services – OpenShift Data Science team focusing on managed services, AI/ML workloads, and next-generation platforms. She has been working in the IT industry for over 20 years in full stack development relating to data science roles. As a former technical advisor and data scientist, Audrey has been instrumental in educating data scientists and developers about what the OpenShift platform is and how to use OpenShift containers (images) to organize, develop, train, and deploy intelligent applications using MLOps. She is passionate about data science and, in particular, the current opportunities with machine learning and open source technologies.
Cory Latschkowski has made a number of major stops in various IT fields over the past two decades, including high-performance computing (HPC), cybersecurity, data science, and container platform design. Much of his experience was acquired within large organizations, including one Fortune 100 company. His last name is pronounced Latch - cow - ski. His passions are pretty moderate, but he will admit to a love of automation, Kubernetes, RTFM, and bacon. To learn more about his personal bank security questions, ping him on GitHub.
Shahebaz Sayed is a highly skilled certified cloud computing engineer with exceptional development ability and extensive knowledge of scripting and data serialization languages. Shahebaz has expertise in all three major clouds – AWS, Azure, and GCP. He also has extensive experience with technologies such as Kubernetes, Terraform, Docker, and others from the DevOps domain. Shahebaz is also certified with global certifications, including AWS Certified DevOps Engineer Professional, AWS Solution Architect Associate, Azure DevOps Expert, Azure Developer Associate, and Kubernetes CKA. He has also worked with Packt as a technical reviewer on multiple projects, including AWS Automation Cookbook, Kubernetes on AWS, and Kubernetes for Serverless Applications.
Table of Contents
Preface
Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
Chapter 1: Challenges in Machine Learning
Understanding ML
Delivering ML value
Choosing the right approach
The importance of data
Facing the challenges of adopting ML
Focusing on the big picture
Breaking down silos
Fail-fast culture
An overview of the ML platform
Summary
Further reading
Chapter 2: Understanding MLOps
Comparing ML to traditional programming
Exploring the benefits of DevOps
Understanding MLOps
ML
DevOps
ML project life cycle
Fast feedback loop
Collaborating over the project life cycle
The role of OSS in ML projects
Running ML projects on Kubernetes
Summary
Further reading
Chapter 3: Exploring Kubernetes
Technical requirements
Exploring Kubernetes major components
Control plane
Worker nodes
Kubernetes objects required to run an application
Becoming cloud-agnostic through Kubernetes
Understanding Operators
Setting up your local Kubernetes environment
Installing kubectl
Installing minikube
Installing OLM
Provisioning a VM on GCP
Summary
Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes
Chapter 4: The Anatomy of a Machine Learning Platform
Technical requirements
Defining a self-service platform
Exploring the data engineering components
Data engineer workflow
Exploring the model development components
Understanding the data scientist workflow
Security, monitoring, and automation
Introducing ODH
Installing the ODH operator on Kubernetes
Enabling the ingress controller on the Kubernetes cluster
Installing Keycloak on Kubernetes
Summary
Further reading
Chapter 5: Data Engineering
Technical requirements
Configuring Keycloak for authentication
Importing the Keycloak configuration for the ODH components
Creating a Keycloak user
Configuring ODH components
Installing ODH
Understanding and using JupyterHub
Validating the JupyterHub installation
Running your first Jupyter notebook
Understanding the basics of Apache Spark
Understanding Apache Spark job execution
Understanding how ODH provisions Apache Spark cluster on-demand
Creating a Spark cluster
Understanding how JupyterHub creates a Spark cluster
Writing and running a Spark application from Jupyter Notebook
Summary
Chapter 6: Machine Learning Engineering
Technical requirements
Understanding ML engineering
Using a custom notebook image
Building a custom notebook container image
Introducing MLflow
Understanding MLflow components
Validating the MLflow installation
Using MLflow as an experiment tracking system
Adding custom data to the experiment run
Using MLflow as a model registry system
Summary
Chapter 7: Model Deployment and Automation
Technical requirements
Understanding model inferencing with Seldon Core
Wrapping the model using Python
Containerizing the model
Deploying the model using the Seldon controller
Packaging, running, and monitoring a model using Seldon Core
Introducing Apache Airflow
Understanding DAG
Exploring Airflow features
Understanding Airflow components
Validating the Airflow installation
Configuring the Airflow DAG repository
Configuring Airflow runtime images
Automating ML model deployments in Airflow
Creating the pipeline by using the pipeline editor
Summary
Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform
Chapter 8: Building a Complete ML Project Using the Platform
Reviewing the complete picture of the ML platform
Understanding the business problem
Data collection, processing, and cleaning
Understanding data sources, location, and the format
Understanding data processing and cleaning
Performing exploratory data analysis
Understanding sample data
Understanding feature engineering
Data augmentation
Building and evaluating the ML model
Selecting evaluation criteria
Building the model
Deploying the model
Reproducibility
Summary
Chapter 9: Building Your Data Pipeline
Technical requirements
Automated provisioning of a Spark cluster for development
Writing a Spark data pipeline
Preparing the environment
Understanding data
Designing and building the pipeline
Using the Spark UI to monitor your data pipeline
Building and executing a data pipeline using Airflow
Understanding the data pipeline DAG
Building and running the DAG
Summary
Chapter 10: Building, Deploying, and Monitoring Your Model
Technical requirements
Visualizing and exploring data using JupyterHub
Building and tuning your model using JupyterHub
Tracking model experiments and versioning using MLflow
Tracking model experiments
Versioning models
Deploying the model as a service
Calling your model
Monitoring your model
Understanding monitoring components
Configuring Grafana and a dashboard
Summary
Chapter 11: Machine Learning on Kubernetes
Identifying ML platform use cases
Considering AutoML
Commercial platforms
ODH
Operationalizing ML
Setting the business expectations
Dealing with dirty real-world data
Dealing with incorrect results
Maintaining continuous delivery
Managing security
Adhering to compliance policies
Applying governance
Running on Kubernetes
Avoiding vendor lock-ins
Considering other Kubernetes platforms
Roadmap
Summary
Further reading
Other Books You May Enjoy
Preface
Machine Learning (ML) is the new black. Organizations are investing in adopting and uplifting their ML capabilities to build new products and improve customer experience. The focus of this book is on assisting organizations and teams to get business value out of ML initiatives. By implementing MLOps with Kubernetes, data scientists, IT operations professionals, and data engineers will be able to collaborate and build ML solutions that create tangible outcomes for their business. This book enables teams to take a practical approach to work together to bring the software engineering discipline to the ML project life cycle.
You'll begin by understanding why MLOps is important and discover the different components of an ML project. Later in the book, you'll design and build a practical end-to-end MLOps project that'll use the most popular OSS components. As you progress, you'll get to grips with the basics of MLOps and the value it can bring to your ML projects, as well as gaining experience in building, configuring, and using an open source, containerized ML platform on Kubernetes. Finally, you'll learn how to prepare data, build and deploy models quickly, and automate tasks for an efficient ML pipeline using a common platform. The exercises in this book will help you get hands-on with using Kubernetes and integrating it with OSS, such as JupyterHub, MLflow, and Airflow.
By the end of this book, you'll have learned how to effectively build, train, and deploy an ML model using the ML platform you built.
Who this book is for
This book is for data scientists, data engineers, IT platform owners, AI product owners, and data architects who want to use open source components to compose an ML platform. Although this book starts with the basics, a good understanding of Python and Kubernetes, along with knowledge of the basic concepts of data science and data engineering, will help you grasp the topics covered in this book much better.
What this book covers
Chapter 1, Challenges in Machine Learning, discusses the challenges organizations face in adopting ML and why a good number of ML initiatives may not deliver the expected outcomes. The chapter further discusses the top few reasons why organizations face these challenges.
Chapter 2, Understanding MLOps, builds on the set of problems identified in Chapter 1, Challenges in Machine Learning, and discusses how we can tackle the challenges in adopting ML. The chapter defines MLOps and explains how it helps organizations to get value out of their ML initiatives. The chapter also provides a blueprint for how companies can adopt MLOps in their ML projects.
Chapter 3, Exploring Kubernetes, first describes why we have chosen Kubernetes as the basis for MLOps in this book. The chapter further defines the core concept of Kubernetes and assists you in creating an environment where the code can be tested. The world is changing fast and part of this high-velocity disruption is the availability of the cloud and cloud-based solutions. This chapter provides an overview of how the Kubernetes-based platform can give you the flexibility to run your solution anywhere.
Chapter 4, The Anatomy of a Machine Learning Platform, takes a 1,000-foot view of what an ML platform looks like. You already know what problems MLOps solves. This chapter defines the components of an MLOps platform in a technology-agnostic way. You will build a solid foundation on the core components of an MLOps platform.
Chapter 5, Data Engineering, covers an important part of any ML project that is often missed. A good number of ML tutorials/books start with a clean dataset, maybe a CSV file to build your model against. The real world is different. Data comes in many shapes and sizes and it is important that you have a well-defined strategy to harvest, process, and prepare data at scale. This chapter will define the role data engineering plays in a successful ML project. It will discuss OSS tools that can provide the basis for data engineering. The chapter will then talk about how you can install these toolsets on the Kubernetes platform.
Chapter 6, Machine Learning Engineering, will move the discussion to the model building, tuning, and deployment activities of an ML development life cycle. The chapter will discuss providing a self-service solution to data scientists so they can work more efficiently and collaborate with data engineering teams and fellow data scientists using the same platform. It will also discuss OSS tools that can provide the basis for model development. The chapter will then talk about how you can install these toolsets on the Kubernetes platform.
Chapter 7, Model Deployment and Automation, covers the deployment phase of the ML project life cycle. The model you build knows the data you provided to it. In the real world, however, the data changes. This chapter discusses the tools and techniques to monitor your model performance. This performance data could be used to decide whether the model needs retraining on a new dataset or whether it's time to build a new model for the given problem.
Chapter 8, Building a Complete ML Project Using the Platform, will define a typical ML project and how each component of the platform is utilized in every step of the project life cycle. The chapter will define the outcomes and requirements of the project and focus on how the MLOps platform facilitates the project life cycle.
Chapter 9, Building Your Data Pipeline, will show how a Spark cluster can be used to ingest and process data. The chapter will show how the platform enables the data engineer to read the raw data from any storage, process it, and write it back to another storage. The main focus is to demonstrate how a Spark cluster can be created on-demand and how workloads could be isolated in a shared environment.
Chapter 10, Building, Deploying, and Monitoring Your Model, will show how the JupyterHub server can be used to build, train, and tune models on the platform. The chapter will show how the platform enables the data scientist to perform the modeling activities in a self-service fashion. This chapter will also introduce MLflow as the model experiment tracking and model registry component. Now that you have a working model, how do you share it for other teams to consume? This chapter will show how the Seldon Core component allows non-programmers to expose their models as REST APIs. You will see how the deployed APIs automatically scale out using the Kubernetes capabilities.
Chapter 11, Machine Learning on Kubernetes, will take you through some of the key ideas to carry forward as you further your knowledge of the subject. This chapter will cover identifying use cases for the ML platform, operationalizing ML, and running on Kubernetes.
To get the most out of this book
You will need a basic working knowledge of Kubernetes and Python to get the most out of this book's technical exercises. The platform uses multiple software components to cover the full ML development life cycle. You will need the recommended hardware to run all the components with ease.
Running the platform requires a good amount of compute resources. If you do not have the required number of CPU cores and memory on your desktop or laptop computer, we recommend running a virtual machine on Google Cloud or any other cloud platform.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
A good follow-up after you finish with this book is to create a proof of concept within your team or organization using the platform. Assess the benefits and learn how you can further optimize your organization's data science and ML project life cycle.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Machine-Learning-on-Kubernetes. If there's an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781803241807_ColorImages.pdf.
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Notice that you will need to adjust the following command and change the quay.io/ml-on-k8s/ part before executing the command."
A block of code is set as follows:
docker tag scikit-notebook:v1.1.0 quay.io/ml-on-k8s/scikit-notebook:v1.1.0
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
gcloud compute project-info add-metadata --metadata enable-oslogin=FALSE
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: The installer will present the following License Agreement screen. Click I Agree.
Tips or Important Notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Reviews
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
Share Your Thoughts
Once you've read Machine Learning on Kubernetes, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
In this section, we will define what MLOps is and why it is critical to the success of your AI journey. You will go through the challenges organizations may encounter in their AI journey and how MLOps can assist in overcoming those challenges.
The last chapter of this section will provide a refresher on Kubernetes and the role it plays in bringing MLOps to the OSS community. This is by no means a guide to Kubernetes, and you should consult other sources for a guide on Kubernetes.
This section comprises the following chapters:
Chapter 1, Challenges in Machine Learning
Chapter 2, Understanding MLOps
Chapter 3, Exploring Kubernetes
Chapter 1: Challenges in Machine Learning
Many people believe that artificial intelligence (AI) is all about the idea of a humanoid robot or an intelligent computer program that takes over humanity. The shocking news is that we are not even close to this. A better term for such incredible machines is human-like intelligence or artificial general intelligence (AGI).
So, what is AI? A more straightforward answer would be a system that uses a combination of data and algorithms to make predictions. AI practitioners call it machine learning or ML. A particular subset of ML, called deep learning (DL), refers to ML algorithms that use a series of steps, or layers, of computation (Goodfellow, Bengio, and Courville, 2017). This technique employs deep neural networks (DNNs) with multiple layers of artificial neurons that mimic the architecture of the human brain. Although this sounds sophisticated, it does not mean that a DL system will always perform better than other AI algorithms or even a traditional programming approach.
ML is not always about DL. Sometimes, a basic statistical model may be a better fit for the problem you are trying to solve than a complex DNN. One of the challenges of implementing ML is selecting the right approach. Moreover, delivering an ML project comes with other challenges, not only on the business and technology side but also in people and processes. These challenges are the primary reasons why most ML initiatives fail to deliver their expected value.
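To make the idea of "data plus an algorithm that makes predictions" concrete, here is a minimal sketch of a basic statistical model: a straight line fitted by ordinary least squares in plain Python. The data values and the names fit_line and predict are hypothetical, chosen only for illustration; they do not come from the book's code repository.

```python
# Illustrative sketch only: the data and function names are assumptions,
# not taken from the book or its GitHub repository.

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: the "learning" step derives the rule from examples
# instead of a programmer writing the rule by hand.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

model = fit_line(hours, scores)
prediction = predict(model, 6.0)
print(model, prediction)
```

For a problem this simple, a two-parameter model like this is easy to inspect and cheap to train, which is the kind of trade-off to weigh before reaching for a DNN.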
In this chapter, we will revisit a basic understanding of ML and understand the challenges in delivering ML projects that can lead to a project not delivering its promised value.
The following topics will be covered:
Understanding ML
Delivering ML value
Choosing the right approach
Facing the challenges of adopting ML
An overview of the ML platform
Understanding ML
In traditional computer programming, a human programmer must write a clear set of instructions in order for a computer program to perform an operation or provide an answer to a question. In ML, however, a human (usually an ML engineer or data scientist) uses data and an