BookSlides 11 The Art of Machine Learning For Predictive Data Analytics

Download as pdf or txt
Download as pdf or txt
You are on page 1of 27

Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Fundamentals of Machine Learning for


Predictive Data Analytics
Chapter 11: The Art of Machine Learning for Predictive
Data Analytics

John Kelleher and Brian Mac Namee and Aoife D’Arcy

[email protected] [email protected] [email protected]


Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

1 Different Perspectives on Prediction Models

2 Choosing a Machine Learning Approach


Matching Machine Learning Approaches to Projects
Matching Machine Learning Approaches to Data

3 Your Next Steps


Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Predictive data analytics projects use machine learning to


build models that capture the relationships in large
datasets between descriptive features and a target feature.
Machine Learning ≈ inductive learning
1 a model learned by induction is not guaranteed to be
correct.
2 learning cannot occur unless the learning process is biased
in some way.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

En masse all of the questions that must be answered to


successfully complete a predictive data analytics project
can seem overwhelming.
This is why we recommend using the CRISP-DM process
to manage a project through its lifecycle.
CRISP-DM Open Questions Chapter
Business
What is the organizational problem be- Chapter 2
Understanding
ing addressed? In what ways could a
prediction model address the organiza-
tional problem? Do we have situational
fluency? What is the capacity of the or-
ganization to utilize the output of a pre-
diction model? What data is available?
Data
What is the prediction subject? What are Chapter 2
Understanding
the domain concepts? What is the target
feature? What descriptive features will be
used?
CRISP-DM Open Questions Chapter
Data
Are there data quality issues? Chapter 3
Preparation
How will we handle missing val-
ues? How will we normalize our
features? What features will we
include?
Modelling
What types of models will we Chapters 4, 5, 6
use? How will we set the pa- and 7
rameters of the machine learn-
ing algorithms? Have underfit-
ting or overfitting occurred?
CRISP-DM Open Questions Chapter
Evaluation
What evaluation process will Chapter 8
we follow? What performance
measures will we use? Is the
model fit for purpose?
Deployment
How will we continue to eval- Section 8.4.6 and
uate the model after deploy- Chapters 9 and
ment? How will the model be in- 10
tegrated into the organization?
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Different Perspectives on
Prediction Models
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

X
H(t, D) = − (P(t = l) × log2 (P(t = l))) (1)
l∈levels(t)

v
u m
uX
dist(q, d) = t (q[i] − d[i])2 (2)
i=1

P(q|t = l)P(t = l)
P(t = l|q) = (3)
P(q)
n
1X
L2 (Mw , D) = (ti − Mw (di ))2 (4)
2
i=1
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

parametric versus non-parametric


generally describes whether the the size of the domain
representation used to define a model is solely
determined by the number of features in the domain or if it
is affected by the number of instances in the dataset.
parametric model the size of the domain representation is
independent of the number of instances in the dataset
(e.g., naive Bayes’, regression models)
non-parametric model the number of parameters used by
the model increases as the number of instances increases
(e.g., decision trees)
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

generative versus discriminative


A model is generative if it can be used to generate data
that will have the same characteristics as the dataset from
which the model was produced.
A discriminative models learn the boundary between
classes rather than the characteristics of the distributions
of the different classes.

Generative and discriminative models attempt to learn


different things!
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

A generative model works by:


1 learning P(d|tl ) (class conditional densities) and P(tl );
2 then using Bayes’ theorem to compute P(tl |d);
3 applying a decision rule over the class posteriors to return
a target level.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

A discriminative model works by:


1 learning the class posterior probability P(tl |d) directly from
the data
2 and then applying a decision rule over the class posteriors
to return a target level.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

0.5

1.0
P(t=l1|d) P(t=l2|d)
P(d|t=l1) P(d|t=l2)
0.4

0.8
Class Conditional Density

Class Posteriors
0.3

0.6
0.2

0.4
0.1

0.2
0.0

0.0
0 2 4 6 8 10 0 2 4 6 8 10
d d

(a) (b)

Figure: (a) The class conditional densities for two classes (’l1’,’l2’)
with a single descriptive feature d. (b) The class posterior
probabilities plotted for each class for different values of d.
P(t = ’l1’|d) is not affected by the multimodal structure of the
corresponding class conditional density P(d|t = ’l1’). Based on
Figure 1.27 from (Bishop, 2006).
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Table: A taxonomy of models based on the parametric versus


non-parametric and generative versus discriminative distinctions.
Parametric/ Generative/
Model Non-Parametric Discriminative
k nearest neighbor Non-Parametric Generative
Decision Trees Non-Parametric Discriminative
Bagging/Boosting Parametric1 Discriminative
Naive Bayes Parametric Generative
Bayesian Network Parametric Generative
Linear Regression Parametric Discriminative
Logistic Regression Parametric Discriminative
SVM Non-Parametric Discriminative

1
While the individual models in an ensemble could be non-parametric (for
example when decision trees are used) the ensemble model itself is
considered parametric.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Choosing a Machine Learning


Approach
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

there is not one best approach that always outperforms the


others; no free lunch theorem.
We can see the assumptions encoded in each algorithm
reflected in the distinctive characteristics of the decision
boundaries that they learn for categorical prediction tasks.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

‘bad’ ‘bad’ ‘bad’


Data Sets
F2

F2
‘good’ F2 ‘good’
‘good’

F1 F1 F1
F1

F2
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Typically choose a number of different approaches and to


run experiments to evaluate which best the particular
project.
There are two questions to consider in the selection of a
set of initial approaches:
1 Does a machine learning approach match the requirements
of the project?
2 Is the approach suitable for the type of prediction we want
to make and the types of descriptive features we are using?
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Matching Machine Learning Approaches to Projects

Project Requirements
Accuracy
Prediction speed
Capacity for retraining
Interpretability
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Matching Machine Learning Approaches to Data

Data Considerations
continuous target → error based models
categorical target → information and probability models
continuous descriptive features → (+cat. target) similarity
based models / (+cont. target) error based models
categorical descriptive features → information and
probability models
lots of descriptive features (curse of dimensionality) →
feature selection
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Your Next Steps


Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

The easy part of a predictive data analytics project is


building the models.
What makes predictive data analytics difficult, but also
fascinating, is figuring out how to answer all the questions
that surround the modelling phase of a project.
Intuition, experience, and experimentation!
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Key tasks for an analyst


become situationally fluent so that we can converse with
experts in the application domain;
explore the data to understand it correctly;
spend time cleaning the data;
think hard about the best ways to represent features;
spend time designing the evaluation process correctly.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

Machine learning is a huge topic.


There are lots of topics we haven’t covered: deep
learning, graphical models, multi-label classification,
association mining, clustering, . . .
But, we hope this course has given you the knowledge and
skills that you will need to explore machine learning
yourself.
Different Perspectives on Prediction Models Choosing a Machine Learning Approach Your Next Steps

1 Different Perspectives on Prediction Models

2 Choosing a Machine Learning Approach


Matching Machine Learning Approaches to Projects
Matching Machine Learning Approaches to Data

3 Your Next Steps

You might also like