Machine Learning Using Python
Data Science is emerging as a hot new profession and academic discipline, and machine learning is a key area in data science. Harvard Business Review calls Data Scientist the "Sexiest Job of the 21st Century". But demand for data scientists is racing ahead of supply. People with the necessary skills are scarce, primarily because the discipline is so new. This course is designed as an introduction to this new discipline. It is spread across 2 days and includes plenty of hands-on exercises using real-world datasets.

Topics:
• Accessing, preparing and exploring data with Pandas & Scipy
• Data Exploration and Visualization & Basic Statistical Analysis
• Machine Learning Basics: Loss Function, Gradient Descent, Bias-Variance Trade-off, Underfitting and Overfitting of Models
• Building linear and non-linear models for Regression and Classification problems
• Regularization and Parameter Tuning
• Applying various algorithms: Linear Regression, Logistic Regression, Decision Trees, KNN
• Ensemble Methods: Bagging and Boosting, Random Forest
• Understanding Model Evaluation Metrics

Duration: 2 Days

Pre-requisites:
✓ A basic understanding of data and programming
✓ Programming knowledge using Python is essential

Hardware & Software:
✓ A desktop or notebook with 64-bit OS (Windows/Mac)
✓ 8 GB RAM
✓ High-speed Internet connection (256 kbps or faster)
✓ Latest Anaconda distribution from Continuum for Python 3.5
Instructor Profile
Manaranjan Pradhan has more than 16 years of industry experience working on Cloud Computing, Big Data, Data Science &
Machine Learning. He has worked with TCS, HP, and iGATE on large-scale projects for customers like
Motorola, Home Depot, CKWB Bank, and P&G in the roles of solution and technical architect. He is a freelancer who provides
consulting and training on Cloud Computing, Big Data & Data Science, including Machine Learning. He has been teaching
Big Data and Machine Learning for 3 years and has trained more than 500 people from several large MNCs, including EMC,
CISCO, HP, YODLEE, YAHOO, SAMSUNG, VeriSign, Success Factors & Goldman Sachs. He is also a guest lecturer on Big
Data and Machine Learning at IIM Bangalore.
He is an alumnus of the Indian Institute of Management (IIM), Bangalore, where he completed the certification program in Business
Analytics and Intelligence. He has data science and scalable machine learning certifications from Coursera and
edx.org.
He has published an Analytics Case with Harvard Business Publishing:
https://cb.hbsp.harvard.edu/cbmp/product/IMB573-PDF-ENG
Introduction to Data Science and Setting up data analysis environment
• Introduction to Data Science
• Setting up Python Environment for Data Analysis
• Overview of Data Analysis Stack: Numpy, Pandas, Matplotlib, Scipy and Scikit-learn

Accessing and preparing data with Pandas
• Loading data from Different Sources
• Data manipulation: Filtering, Grouping, Ordering, Joining
• Dealing with missing Data

Data Exploration, Visualizations & Statistical Analysis
• Histograms, Bar charts
• Density Plots, Box Plots, Scatter Plots, Heat Maps
• Understanding Basic Statistics, Distributions, Correlations

Algorithms for Regression and Classification Problems
• Understanding loss function and gradient descent approach for loss minimization
• Linear Regression, Logistic Regression
• Decision Trees, KNN
• Bias & Variance Trade-off
• Regularization & Parameter Tuning

Ensemble Methods
• Random Forest
• Bagging & Boosting

Clustering
• K-means clustering
• Finding optimal number of clusters

Model Evaluation
• Creating Training, Validation and Test Data Sets
• Cross validations
• Understanding Evaluation Metrics: RMSE, R-square, ROC, Confusion Matrix, Precision, Recall, Accuracy etc.

The following case studies will be explained using the above techniques.
Customer Churn Prediction:
Companies invest a significant amount of money to acquire new customers in anticipation of future revenues. Losing a customer means
losing the initial investment in acquisition as well as possible future revenue. So it is important for companies to detect early signs that a
customer is about to churn, and then engage with those customers or offer them incentives to stay.
Customer Segmentation:
RFM analysis is a customer segmentation technique that can help retailers maximize the return on their marketing investments.
Under RFM analysis, each customer is scored on three factors: Recency (how recently they purchased), Frequency (how often they purchase), and Monetary value (how much they spend).
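
As a rough illustration of RFM scoring, the sketch below uses Pandas on a small, made-up transactions table. The column names, the snapshot date, and the choice of three rank-based score bins are all assumptions for this example; the course may use a different dataset and scoring scheme.

```python
import pandas as pd

# Hypothetical transactions, one row per purchase (all values are made up).
transactions = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "B", "C"],
    "order_date": pd.to_datetime([
        "2016-01-05", "2016-02-10", "2016-03-01",
        "2016-01-20", "2016-02-15",
        "2015-11-30",
    ]),
    "amount": [120.0, 80.0, 100.0, 40.0, 60.0, 30.0],
})

# Assumed reference date from which recency is measured.
snapshot_date = pd.Timestamp("2016-03-02")

# Aggregate raw R, F and M values per customer.
rfm = transactions.groupby("customer_id").agg(
    last_order=("order_date", "max"),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)
rfm["recency_days"] = (snapshot_date - rfm["last_order"]).dt.days


def score(series, n_bins=3, higher_is_better=True):
    """Map values to scores 1..n_bins using equal-sized rank bins.

    Binning on ranks is an assumption made for this sketch; it avoids
    duplicate bin edges when many customers share the same value.
    """
    ranks = series.rank(method="first", ascending=higher_is_better)
    labels = list(range(1, n_bins + 1))
    return pd.qcut(ranks, n_bins, labels=labels).astype(int)


# A more recent purchase (fewer days) is better, so recency is scored in reverse.
rfm["r_score"] = score(rfm["recency_days"], higher_is_better=False)
rfm["f_score"] = score(rfm["frequency"])
rfm["m_score"] = score(rfm["monetary"])
rfm["rfm_total"] = rfm["r_score"] + rfm["f_score"] + rfm["m_score"]

print(rfm[["recency_days", "frequency", "monetary", "rfm_total"]])
```

Customers with a high combined score bought recently, buy often, and spend more, so they are natural candidates for retention and loyalty campaigns, while low scorers may need re-engagement offers. Summing the three scores is only one possible way to combine them.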