Basics of Machine Learning
Machine learning
Machine learning is the branch of Artificial Intelligence that
focuses on developing models and algorithms that let computers
learn from data and improve from previous experience without
being explicitly programmed for every task.
ML aims to let systems learn from data in a way loosely comparable
to how humans learn from experience.
A machine learning system is trained on past experience (data) and
improves its performance over time.
Machine learning helps to make predictions from massive amounts of data.
It can deliver fast and accurate results, which helps organisations
identify profitable opportunities.
02/04/2024 2
Types of Machine Learning
There are several types of machine learning, each with special
characteristics and applications.
Some of the main types of machine learning algorithms are as
follows:
Supervised Machine Learning
Unsupervised Machine Learning
Semi-Supervised Machine Learning
Reinforcement Learning
1. Supervised Machine Learning
Supervised learning is when a model is trained on a
“Labelled Dataset”.
Labelled datasets contain both the inputs and the corresponding outputs.
In supervised learning, algorithms learn a mapping from
inputs to the correct outputs.
Both the training and the validation datasets are labelled.
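The idea of learning from a labelled training set and checking against a labelled validation set can be sketched in a few lines of plain Python. This is only an illustration: the feature values, the split, and the nearest-centroid rule (`fit_centroids`, `predict`) are assumptions chosen for brevity, not a method prescribed by these slides.

```python
# Toy labelled dataset: each example pairs an input (two made-up features)
# with an output label. All values are invented for illustration.
labelled_data = [
    ((1.0, 1.2), "cat"), ((0.8, 1.0), "cat"), ((1.2, 0.9), "cat"), ((0.9, 1.1), "cat"),
    ((3.0, 3.2), "dog"), ((3.1, 2.9), "dog"), ((2.8, 3.0), "dog"), ((3.2, 3.1), "dog"),
]

# Both splits keep their labels: one to learn from, one to check against.
training_set = labelled_data[0:3] + labelled_data[4:7]
validation_set = [labelled_data[3], labelled_data[7]]


def fit_centroids(examples):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        running = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            running[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}


def predict(centroids, features):
    """Map an input to the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = fit_centroids(training_set)
correct = sum(predict(centroids, x) == y for x, y in validation_set)
validation_accuracy = correct / len(validation_set)
print(f"validation accuracy: {validation_accuracy:.2f}")
```

The validation examples are never used in `fit_centroids`, so the accuracy reflects how well the learned mapping carries over to inputs the model has not seen.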
Example: Consider a scenario where you have to build an image
classifier to differentiate between cats and dogs.
If you feed labelled images of dogs and cats to the
algorithm, the machine will learn to distinguish a dog from a
cat using these labelled images.
When we input new dog or cat images that it has never seen
before, it will use the learned model to predict whether each is a
dog or a cat. This is how supervised learning works; this particular
task is image classification.
There are two main categories of supervised learning that are
mentioned below:
Classification
Regression
Classification
Classification deals with predicting categorical target variables,
which represent discrete classes or labels. For instance,
classifying emails as spam or not spam, or predicting whether a
patient has a high risk of heart disease. Classification algorithms
learn to map the input features to one of the predefined classes.
Here are some classification algorithms:
Logistic Regression
Support Vector Machine
Random Forest
Decision Tree
K-Nearest Neighbors (KNN)
Naive Bayes
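As a rough illustration of one algorithm from this list, K-Nearest Neighbors can be written with the standard library alone. The email features (link count, count of "urgent" words), the data, and the function name `knn_predict` are hypothetical; this is a sketch under those assumptions rather than a production classifier.

```python
from collections import Counter
import math


def knn_predict(training_examples, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(training_examples, key=lambda ex: math.dist(ex[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features per email: (number of links, number of "urgent" words).
emails = [
    ((0, 0), "not spam"), ((1, 0), "not spam"), ((0, 1), "not spam"),
    ((7, 4), "spam"), ((9, 5), "spam"), ((6, 6), "spam"),
]

print(knn_predict(emails, (8, 5)))   # query close to the spam examples
print(knn_predict(emails, (1, 1)))   # query close to the non-spam examples
```

KNN has no separate training step: the labelled examples themselves are the model, and each prediction is a vote among the closest ones.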
Regression
Regression, on the other hand, deals with predicting continuous
target variables, which represent numerical values. For example,
predicting the price of a house based on its size, location, and
amenities, or forecasting the sales of a product. Regression
algorithms learn to map the input features to a continuous
numerical value.
Here are some regression algorithms:
Linear Regression
Polynomial Regression
Ridge Regression
Lasso Regression
Decision Tree
Random Forest
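A minimal sketch of Linear Regression with a single feature, using the closed-form least-squares solution. The house sizes and prices are invented (prices are exactly 3 times the size so the arithmetic is easy to follow), and `fit_line` is a name chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: house size in square metres and price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept   # continuous prediction for a 100 m^2 house
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted_price:.1f}")
```

Unlike a classifier, the output here is a number on a continuous scale, which is what distinguishes regression from classification.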
Advantages of Supervised Machine Learning
Supervised learning models can have high accuracy as they are
trained on labelled data.
The decision-making process of simpler supervised models, such as
decision trees and linear models, is often interpretable.
Pre-trained supervised models can often be reused, which saves time and
resources compared with developing new models from scratch.
Disadvantages of Supervised Machine Learning
It is limited to the patterns in its training data and may struggle with
unseen or unexpected patterns that are not present there.
It can be time-consuming and costly, as it relies on labeled data.
It may overfit the training data and generalize poorly to new data.
Applications of Supervised Learning
Supervised learning is used in a wide variety of applications, including:
Image classification: Identify objects, faces, and other features in images.
Natural language processing: Extract information from text, such as sentiment,
entities, and relationships.
Speech recognition: Convert spoken language into text.
Recommendation systems: Make personalized recommendations to users.
Predictive analytics: Predict outcomes, such as sales, customer churn, and
stock prices.
Medical diagnosis: Detect diseases and other medical conditions.
Fraud detection: Identify fraudulent transactions.
Autonomous vehicles: Recognize and respond to objects in the environment.
Email spam detection: Classify emails as spam or not spam.
Quality control in manufacturing: Inspect products for defects.
Credit scoring: Assess the risk of a borrower defaulting on a loan.
Gaming: Recognize characters, analyze player behavior, and create NPCs.
Customer support: Automate customer support tasks.
Weather forecasting: Make predictions for temperature, precipitation, and other
meteorological parameters.
Sports analytics: Analyze player performance, make game predictions, and
optimize strategies.
2. Unsupervised Machine Learning
Unsupervised learning is a type of machine learning technique in
which an algorithm discovers patterns and relationships using
unlabeled data.
Unlike supervised learning, unsupervised learning doesn’t involve
providing the algorithm with labeled target outputs.
The primary goal of Unsupervised learning is often to discover
hidden patterns, similarities, or clusters within the data, which can
then be used for various purposes, such as data exploration,
visualization, dimensionality reduction, and more.
Example: Consider a dataset that contains
information about the purchases customers made at a shop.
Through clustering, the algorithm can group customers with similar
purchasing behavior, which reveals potential
customer segments without predefined labels.
This type of information can help businesses target customers
as well as identify outliers.
There are two main categories of unsupervised learning that are mentioned
below:
Clustering
Association
Clustering
Clustering is the process of grouping data points into clusters based on
their similarity. This technique is useful for identifying patterns and
relationships in data without the need for labeled examples.
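Here are some clustering algorithms:
K-Means Clustering algorithm
Mean-shift algorithm
DBSCAN Algorithm
Related unsupervised techniques for dimensionality reduction and signal
separation (these are not clustering algorithms):
Principal Component Analysis
Independent Component Analysis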
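To make the grouping-by-similarity idea concrete, here is a bare-bones K-Means sketch in plain Python. The points, the choice of initial centroids, and the fixed iteration count are assumptions made for illustration; real implementations usually add smarter initialisation and a convergence check.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    return centroids, clusters


# Unlabelled 2-D points (invented); note that no labels are supplied anywhere.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The algorithm only ever sees distances between points, so the two groups it finds come from the structure of the data rather than from any labelled examples.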
Association
Association rule learning is a technique for discovering
relationships between items in a dataset. It identifies rules
indicating that the presence of one item implies the presence of another
item with a specific probability.
Here are some association rule learning algorithms:
Apriori Algorithm
Eclat
FP-growth Algorithm
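The "specific probability" attached to a rule is usually expressed through two measures, support and confidence. The sketch below computes both for one rule over a handful of invented shopping baskets. It is not an implementation of Apriori, Eclat, or FP-growth, which are strategies for searching efficiently over many candidate rules; the helper names `support` and `confidence` are chosen for this example.

```python
# Invented market-basket data: each transaction is a set of purchased items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here bread and butter appear together in 3 of 5 baskets, and 3 of the 4 baskets containing bread also contain butter, which is the sense in which bread "implies" butter.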
Advantages of Unsupervised Machine Learning
It helps to discover hidden patterns and various relationships
between the data.
It is used for tasks such as customer segmentation, anomaly
detection, and data exploration.
It does not require labeled data and reduces the effort of data
labeling.
Disadvantages of Unsupervised Machine Learning
Without labels, it may be difficult to evaluate the quality of the
model’s output.
Clusters may be hard to interpret and may not correspond to
meaningful groups.
Extracting meaningful features from raw data often needs extra
techniques such as autoencoders and dimensionality reduction.
Applications of Unsupervised Learning
Clustering: Group similar data points into clusters.
Anomaly detection: Identify outliers or anomalies in data.
Dimensionality reduction: Reduce the dimensionality of data while
preserving its essential information.
Recommendation systems: Suggest products, movies, or content to
users based on their historical behavior or preferences.
Topic modeling: Discover latent topics within a collection of documents.
Density estimation: Estimate the probability density function of data.
Image and video compression: Reduce the amount of storage required
for multimedia content.
Data preprocessing: Help with data preprocessing tasks such as
data cleaning, imputation of missing values, and data scaling.
Market basket analysis: Discover associations between products.
Genomic data analysis: Identify patterns or group genes with
similar expression profiles.
Image segmentation: Segment images into meaningful regions.
Community detection in social networks: Identify communities or
groups of individuals with similar interests or connections.
Customer behavior analysis: Uncover patterns and insights for
better marketing and product recommendations.
3. Semi-Supervised Learning
Semi-supervised learning is a machine learning approach that sits
between supervised and unsupervised learning, so it uses both labelled
and unlabelled data.
It’s particularly useful when obtaining labeled data is costly,
time-consuming, or resource-intensive, for example when labelling
requires specialist skills and resources.
We use these techniques when only a small portion of the data is
labeled and the rest, the larger portion, is unlabeled.
We can use unsupervised techniques to predict labels and then feed
these labels to supervised techniques. This technique is mostly applicable
to image data sets, where usually not all images are labeled.
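One common way to combine a few labelled points with a larger unlabelled pool is self-training (pseudo-labelling): fit a model on the labelled data, label the unlabelled points it is confident about, and retrain on both. The sketch below is a variation on the approach described above, not necessarily the exact procedure the slides have in mind. The 1-D data, the nearest-centroid model, and the margin threshold of 2.0 are all assumptions.

```python
def fit_centroids(labelled):
    """Mean feature value per label (a 1-D nearest-centroid classifier)."""
    groups = {}
    for value, label in labelled:
        groups.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def predict(centroids, value):
    """Return (label, margin): margin is how much closer the best centroid
    is than the runner-up, used here as a crude confidence score."""
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(value - centroids[runner_up]) - abs(value - centroids[best])
    return best, margin


labelled = [(1.0, "small"), (9.0, "large")]   # the few labelled points
unlabelled = [1.5, 2.0, 8.0, 8.5, 5.1]        # the larger unlabelled pool

centroids = fit_centroids(labelled)
pseudo_labelled = []
for value in unlabelled:
    label, margin = predict(centroids, value)
    if margin >= 2.0:   # hypothetical confidence threshold
        pseudo_labelled.append((value, label))

# Retrain on the labelled data plus the confident pseudo-labels.
centroids = fit_centroids(labelled + pseudo_labelled)
print(pseudo_labelled)
print(centroids)
```

The ambiguous point 5.1 is left out because it sits almost midway between the two centroids; adding uncertain pseudo-labels is one way unlabelled data can hurt rather than help.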
Example: Consider that we are building a language translation
model. Having labeled translations for every sentence pair can be
resource-intensive. Semi-supervised learning allows the model to learn
from both labeled sentence pairs and unlabeled sentences, making it
more accurate. This
technique has led to significant improvements in the quality of
machine translation services.
Advantages of Semi-Supervised Machine Learning
It can lead to better generalization than supervised learning on the
labeled data alone, as it uses both labeled and unlabeled data.
It can be applied to a wide range of data.
Disadvantages of Semi-Supervised Machine Learning
Semi-supervised methods can be more complex to implement
compared to other approaches.
It still requires some labeled data that might not always be
available or easy to obtain.
Noisy or unrepresentative unlabeled data can degrade model
performance.
Applications of Semi-Supervised Learning
Image Classification and Object Recognition: Improve the accuracy of models by
combining a small set of labeled images with a larger set of unlabeled images.
Natural Language Processing (NLP): Enhance the performance of language models and
classifiers by combining a small set of labeled text data with a vast amount of unlabeled
text.
Speech Recognition: Improve the accuracy of speech recognition by leveraging a limited
amount of transcribed speech data and a more extensive set of unlabeled audio.
Recommendation Systems: Improve the accuracy of personalized recommendations by
supplementing a sparse set of user-item interactions (labeled data) with a wealth of
unlabeled user behavior data.
Healthcare and Medical Imaging: Enhance medical image analysis by utilizing a small
set of labeled medical images alongside a larger set of unlabeled images.
4. Reinforcement Machine Learning
A reinforcement learning algorithm is a learning method that interacts
with the environment by producing actions and discovering errors.
Trial-and-error search and delayed reward
are the most relevant characteristics of reinforcement learning. In
this technique, the model keeps improving its performance
using reward feedback to learn the behavior or pattern. These
algorithms are tailored to a particular problem, e.g. Google's
self-driving car, or AlphaGo, where a bot competes with humans and
even with itself to become a better and better player of the game of Go.
Each interaction produces new experience, which is added to the
agent's training data. So the more it interacts, the better
trained and more experienced it becomes.
Here are some of the most common reinforcement learning algorithms:
Q-learning: Q-learning is a model-free RL algorithm that learns a
Q-function, which maps state-action pairs to values. The Q-function
estimates the expected cumulative reward of taking a particular action
in a given state.
SARSA (State-Action-Reward-State-Action): SARSA is another model-free
RL algorithm that learns a Q-function. However, unlike Q-learning,
SARSA updates the Q-function using the action that was actually taken
in the next state, rather than the optimal action.
Deep Q-learning: Deep Q-learning is a combination of Q-learning and
deep learning. Deep Q-learning uses a neural network to represent the
Q-function, which allows it to learn complex relationships between
states and actions.
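A small tabular Q-learning sketch may help make the Q-function and its update concrete. The environment is a hypothetical five-cell corridor in which the agent starts in cell 0 and is rewarded only on reaching cell 4. The hyperparameters, the `step` function, and the epsilon-greedy rule are assumptions for illustration, not part of any particular library.

```python
import random

N_STATES = 5            # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)      # action index 0 moves left, index 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative hyperparameters


def step(state, action_index):
    """Hypothetical environment: reward 1 only on reaching the last cell."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, rng):
    """Epsilon-greedy: mostly exploit, sometimes explore; ties broken at random."""
    if rng.random() < EPSILON:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, value in enumerate(q_row) if value == best])


rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]   # q_table[state][action]

for _ in range(500):
    state, done = 0, False
    while not done:
        action = choose_action(q_table[state], rng)
        next_state, reward, done = step(state, action)
        # Q-learning target uses the best next action (off-policy).
        # SARSA would instead use the action actually chosen in next_state.
        best_next = 0.0 if done else max(q_table[next_state])
        target = reward + GAMMA * best_next
        q_table[state][action] += ALPHA * (target - q_table[state][action])
        state = next_state

greedy_policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(greedy_policy)
```

After training, the greedy policy read off the table should prefer moving right in every non-terminal cell, since that is the only direction that leads to the reward. Deep Q-learning keeps the same update but replaces `q_table` with a neural network.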
Example: Consider that you are training an AI agent to play a game like chess. The agent explores
different moves and receives positive or negative feedback based on the outcome. Reinforcement
learning is also applied in other settings where agents learn to perform tasks by interacting with
their surroundings.
Types of Reinforcement Machine Learning
There are two main types of reinforcement learning:
Positive reinforcement
Rewards the agent for taking a desired action.
Encourages the agent to repeat the behavior.
Examples: Giving a treat to a dog for sitting, providing a point in a game for a correct
answer.
Negative reinforcement
Removes an undesirable stimulus to encourage a desired behavior.
Encourages the agent to repeat the behavior that removed the stimulus.
Examples: Turning off a loud buzzer when a lever is pressed, avoiding a penalty by
completing a task.
Advantages of Reinforcement Machine Learning
It supports autonomous decision-making and is well suited to tasks
that require learning a sequence of decisions, like robotics
and game-playing.
This technique is preferred for achieving long-term results that are
otherwise very difficult to achieve.
It can solve complex problems that cannot be solved by
conventional techniques.
Disadvantages of Reinforcement Machine Learning
Training reinforcement learning agents can be computationally
expensive and time-consuming.
Reinforcement learning is not preferable for solving simple
problems.
It needs a lot of data and a lot of computation, which can make it
impractical and costly.
Applications of Reinforcement Machine Learning
Game Playing: RL can teach agents to play games, even complex
ones.
Robotics: RL can teach robots to perform tasks autonomously.
Autonomous Vehicles: RL can help self-driving cars navigate and
make decisions.
Recommendation Systems: RL can enhance recommendation
algorithms by learning user preferences.
Healthcare: RL can be used to optimize treatment plans and drug
discovery.
Natural Language Processing (NLP): RL can be used in dialogue
systems and chatbots.
Energy Management: RL can be used to optimize energy
consumption.
Game AI: RL can be used to create more intelligent and adaptive
NPCs in video games.
Adaptive Personal Assistants: RL can be used to improve
personal assistants.
Virtual Reality (VR) and Augmented Reality (AR): RL can be used
to create immersive and interactive experiences.
Industrial Control: RL can be used to optimize industrial
processes.
Education: RL can be used to create adaptive learning systems.