Machine Learning Using Python

This document provides an overview of a machine learning course using Python. The course covers topics such as accessing and preparing data with Pandas, data exploration and visualization, machine learning algorithms for regression and classification like linear regression, logistic regression, decision trees and KNN. Ensemble methods like random forests and bagging/boosting are also covered. The course includes hands-on exercises using real-world datasets and case studies in areas like store sales prediction, customer churn prediction, and customer segmentation. The course is taught over 2 days and requires basic Python and data analysis skills.

Uploaded by

Narendra Singh

Machine Learning using Python

Data Science is emerging as a hot new profession and academic discipline, and machine learning is a key area in data science. Harvard Business Review calls Data Scientist the Sexiest Job of the 21st Century. But demand for data scientists is racing ahead of supply. People with the necessary skills are scarce, primarily because the discipline is so new. This course is designed as an introduction to this new discipline. It is spread across 2 days and includes plenty of hands-on exercises using real-world datasets.

Topics:
• Accessing, preparing and exploring data with Pandas & SciPy
• Data Exploration and visualization & Basic Statistical Analysis
• Machine Learning Basics – Loss Function, Gradient Descent, Bias Variance Trade Off, Underfit and Overfit of Models
• Building linear and non-linear models for Regression and classification problems
• Regularization and Parameter Tuning
• Applying various algorithms – Linear Regression, Logistic Regression, Decision Trees, KNN
• Ensemble Methods – Bagging and Boosting, Random Forest
• Understanding Model Evaluation Metrics

Pre-requisites:
• A basic understanding of data and programming
• Programming knowledge using Python is essential

Hardware & Software:
• A desktop or notebook with 64 bit OS (Windows/Mac)
• 8 GB RAM
• High-speed Internet connection (256 kbps+)
• Latest Anaconda (Continuum) platform for Python 3.5

Duration: 2 Days



Instructor Profile

Manaranjan Pradhan has 16+ years of industry experience working on Cloud Computing, Big Data, Data Science &
Machine Learning. He has worked with TCS, HP, and iGATE on large-scale projects for customers like
Motorola, Home Depot, CKWB Bank and P&G in the roles of solution and technical architect. He is a freelancer who provides
consulting and training on Cloud Computing, Big Data & Data Science, including Machine Learning. He has been teaching
Big Data and Machine Learning for 3 years and has trained more than 500 people from several large MNCs, including EMC,
CISCO, HP, YODLEE, YAHOO, SAMSUNG, VeriSign, Success Factors & Goldman Sachs. He is also a guest lecturer on Big
Data and Machine Learning at IIM Bangalore.

He is an alumnus of the Indian Institute of Management (IIM), Bangalore, where he completed the certification program in Business
Analytics and Intelligence. He holds data science and scalable machine learning certifications from Coursera and
edx.org.

He has published an analytics case with Harvard Business Publishing:
https://cb.hbsp.harvard.edu/cbmp/product/IMB573-PDF-ENG

He writes a blog at http://www.awesomestats.in/

Connect with him on LinkedIn: http://in.linkedin.com/pub/manaranjan-pradhan/a/6bb/314





Introduction to Data Science and Setting up data analysis environment:
• Introduction to Data Science
• Setting up Python Environment for Data Analysis
• Overview of Data Analysis Stack - NumPy, Pandas, Matplotlib, SciPy and Scikit-learn

Accessing and preparing data with Pandas:
• Loading data from Different Sources
• Data manipulation - Filtering, Grouping, Ordering, Joining
• Dealing with missing Data

Data Exploration, Visualizations & Statistical Analysis:
• Histograms, Bar charts
• Density Plots, Box Plots, Scatter Plots, Heat Maps
• Understanding Basic Statistics, Distributions, Correlations

Algorithms for Regression and Classification Problems:
• Understanding loss function and gradient descent approach for loss minimization
• Linear Regression, Logistic Regression
• Decision Trees, KNN
• Bias & Variance Trade-off
• Regularization & Parameter Tuning

Ensemble Methods:
• Random Forest
• Bagging & Boosting

Clustering:
• K-means clustering
• Finding optimal number of clusters

Model Evaluation:
• Creating Training, validation and Test Data Sets
• Cross validation
• Understanding Evaluation Metrics: RMSE, R-square, ROC, Confusion Matrix, Precision, Recall, Accuracy etc.
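
As a rough illustration of some of the evaluation metrics named above, the sketch below computes a confusion matrix, accuracy, precision, recall and RMSE in plain Python. The function names and the sample labels are hypothetical and made up for this example; they are not taken from the course material.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        # Guard against division by zero when nothing is predicted positive
        # or when there are no actual positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up labels and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(labels, predictions)
print(confusion_counts(labels, predictions), metrics)
```

Precision and recall are reported separately because they answer different questions: precision asks how many predicted positives were correct, recall asks how many actual positives were found.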

The following case studies will be worked through using the above techniques.

Store Sales Prediction:


Store managers need to predict daily sales up to several weeks in advance to ensure they do not end up with empty shelves,
which could mean unhappy customers. Equally, store managers do not want to end up with a lot of leftover inventory, which
could mean additional overhead cost to the store. Store sales are influenced by many factors, such as seasonality, competition,
promotions and store location.

• Explore the data to understand the effect of various factors on store sales.
• Build a predictive model to forecast the store sales.
• Evaluate the model accuracy.
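
The three steps above could be sketched roughly as follows. This is a minimal example under stated assumptions, not the course's own solution: the data is synthetic, the feature names (`promo`, `weekend`, `competitor_km`) are hypothetical, and ordinary least squares via NumPy stands in for whichever model the course actually builds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic store-day records (made up for illustration, not course data).
n_rows = 200
promo = rng.integers(0, 2, size=n_rows)              # 1 if a promotion ran
weekend = rng.integers(0, 2, size=n_rows)            # 1 if the day is a weekend
competitor_km = rng.uniform(0.5, 10.0, size=n_rows)  # distance to nearest competitor
noise = rng.normal(0.0, 50.0, size=n_rows)
sales = 1000 + 300 * promo + 150 * weekend + 20 * competitor_km + noise

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n_rows), promo, weekend, competitor_km])

# Hold out the last 25% of rows to evaluate on data the model has not seen.
split = int(n_rows * 0.75)
X_train, X_test = X[:split], X[split:]
y_train, y_test = sales[:split], sales[split:]

# Fit a linear model by ordinary least squares.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Evaluate with RMSE and compare against a naive "predict the training mean" baseline.
pred = X_test @ coef
rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
baseline_rmse = float(np.sqrt(np.mean((y_test - y_train.mean()) ** 2)))
print("coefficients:", coef.round(1))
print("RMSE:", round(rmse, 1), "baseline RMSE:", round(baseline_rmse, 1))
```

Comparing against a naive baseline gives the RMSE some context: a model is only useful if it beats simply predicting the average.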


Customer Churn Prediction:
Companies invest a significant amount of money to acquire new customers in anticipation of future revenue. Losing a customer means
losing the initial investment in acquisition as well as possible future revenue. So it is important for companies to detect early signs that a
customer is about to churn, and then engage those customers or offer them incentives to stay.

• Understand factors influencing churn
• Build a model to predict if a customer is about to churn
• Predict the probability of a customer churning in the future
• Evaluate the model accuracy.
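
One way to sketch these steps is a small logistic regression trained by gradient descent on the log loss, which also ties back to the loss function and gradient descent topics in the outline. Everything here is an assumption for illustration: the data is synthetic, the features (`support_calls`, `tenure_years`) are hypothetical, and the course may well use a library implementation instead.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Minimise the mean log loss by batch gradient descent; returns the weights."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)  # gradient of the mean log loss
        w -= lr * grad
    return w


rng = np.random.default_rng(1)

# Synthetic customers (made up for illustration, not course data).
n = 400
support_calls = rng.poisson(2.0, size=n).astype(float)
tenure_years = rng.uniform(0.0, 5.0, size=n)
true_logit = -0.5 + 0.8 * support_calls - 0.9 * tenure_years
churned = (rng.uniform(size=n) < sigmoid(true_logit)).astype(float)

# Train/test split, standardising features with training statistics only.
features = np.column_stack([support_calls, tenure_years])
train, test = slice(0, 300), slice(300, n)
mean, std = features[train].mean(axis=0), features[train].std(axis=0)
X = np.column_stack([np.ones(n), (features - mean) / std])

w = fit_logistic(X[train], churned[train])

# Predicted churn probabilities and accuracy on the held-out customers.
churn_prob = sigmoid(X[test] @ w)
predicted = (churn_prob >= 0.5).astype(float)
accuracy = float((predicted == churned[test]).mean())
print("weights:", w.round(2), "held-out accuracy:", round(accuracy, 2))
```

The signs of the fitted weights indicate which factors push churn up or down, while `churn_prob` gives a per-customer probability that can be used to prioritise retention offers.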

Customer Segmentation:
RFM analysis is a customer segmentation technique that can help retailers maximize the return on their marketing investments.
Under RFM analysis, each customer is scored on three factors: Recency, Frequency, and Monetary value.

• Calculate RFM attributes for each customer
• Create customer segments using clustering techniques
• Find the optimal number of clusters or segments
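
A minimal sketch of these steps, assuming a transaction table with hypothetical columns `customer_id`, `date` and `amount`: RFM attributes are computed with a Pandas group-by, then a small hand-written k-means (a stand-in for a library implementation) is run for several values of k so the within-cluster sum of squares can be compared. The transactions and the snapshot date are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up transactions, for illustration only.
tx = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "C", "C", "D", "D", "D", "D"],
    "date": pd.to_datetime([
        "2016-01-05", "2016-02-10", "2016-03-01",
        "2015-09-15",
        "2016-02-20", "2016-02-25",
        "2015-10-01", "2015-10-20", "2015-11-05", "2015-11-30",
    ]),
    "amount": [50.0, 70.0, 30.0, 20.0, 200.0, 150.0, 10.0, 15.0, 12.0, 8.0],
})
snapshot = pd.Timestamp("2016-03-02")  # hypothetical "today" for recency

# Recency: days since last purchase; Frequency: purchase count; Monetary: total spend.
rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda d: (snapshot - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, within-cluster sum of squares)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    inertia = float(((points - centers[labels]) ** 2).sum())
    return labels, inertia


# Standardise so that no single attribute dominates the distance.
values = rfm.to_numpy(dtype=float)
scaled = (values - values.mean(axis=0)) / values.std(axis=0)

# Compare the within-cluster sum of squares across k (the "elbow" heuristic).
inertia_by_k = {k: kmeans(scaled, k)[1] for k in range(1, 5)}
print(rfm)
print(inertia_by_k)
```

With real data, the k at which the within-cluster sum of squares stops dropping sharply is a common, if informal, choice for the number of segments.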
