Data Science Syllabus
DATA SCIENCE OVERVIEW – Data Science: Why All the Excitement? Demand for Data
Science Professionals, Brief Introduction to Big Data and Data Analytics, Life Cycle of
Data Science, What Does a Data Scientist Do? Tools and Technologies Used in Data Science.
STATISTICS
FUNDAMENTALS OF MATHEMATICS AND PROBABILITY – Basic
understanding of linear algebra, linear regression, Matrices and Vectors, Addition and
Multiplication of Matrices, Fundamentals of Probability, Probability Distribution Function
and Cumulative Distribution Function, Problem Solving Using R for Vector Manipulation,
Problem solving for probability assignments.
MACHINE LEARNING
INTRODUCTION TO MACHINE LEARNING – What is Machine Learning? What is
the Challenge? Introduction to Supervised Learning, Unsupervised Learning, What is
Reinforcement Learning?
DECISION TREES AND SUPERVISED LEARNING – Decision Trees, Data Sets, How
to Build a Decision Tree? Understanding the CART Model, Classification Rules, Overfitting
Problem, Stopping Criteria and Pruning, How to Find the Final Size of Trees? Modeling a
Decision Tree, Naive Bayes, Random Forests and Support Vector Machines,
Interpretation of Model Outputs, Business Case Study for the CART Model, Business Case
Study for Random Forest, and Business Case Study for SVM.
FILE I/O AND EXCEPTION HANDLING – Opening and Closing Files, the open()
Function, File Object Attributes, the close() Method, Read, Write, Seek. Exception Handling,
the try-finally Clause, Raising Exceptions, User-Defined Exceptions. Regular Expressions –
Search and Replace, Regular Expression Modifiers, Regular Expression Patterns and the re
Module.
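
As an illustration of the topics in this unit, here is a minimal Python sketch that uses only the standard library. The file name, the sample text, and the EmptyFileError class are hypothetical and chosen for the example. They are not part of the course material.

```python
import os
import re
import tempfile


class EmptyFileError(Exception):
    """User-defined exception (hypothetical name) raised for empty files."""


def read_first_line(path):
    """Return the first line of a file, closing the file even on error."""
    f = open(path, "r")
    try:
        f.seek(0)                      # move to the start of the file
        line = f.readline()
        if line == "":
            raise EmptyFileError(f"{path} is empty")
        return line.rstrip("\n")
    finally:
        f.close()                      # runs whether or not an exception was raised


# Write a small sample file (hypothetical content), then read it back.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w") as out:
    out.write("Data Science 101\nsecond line\n")

first_line = read_first_line(sample_path)

# Regular expression search and replace; re.IGNORECASE is a modifier (flag).
match = re.search(r"\d+", first_line)
replaced = re.sub(r"data science", "DS", first_line, flags=re.IGNORECASE)
```

The try-finally clause guarantees that close() is called whether the read succeeds or fails. In everyday code a with statement, as used for writing the sample file, achieves the same result more compactly.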
NUMPY – Introduction to NumPy, Array Creation, Printing Arrays, Basic Operations:
Indexing, Slicing and Iterating. Shape Manipulation: Changing Shape, Stacking and Splitting
of Arrays, Vector Stacking.
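
The NumPy topics above can be sketched in a few lines. This is a minimal example that assumes NumPy is installed. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

a = np.arange(6)                  # array creation: values 0 through 5
m = a.reshape(2, 3)               # changing shape: 2 rows, 3 columns
doubled = m * 2                   # basic operation, applied elementwise
first_column = m[:, 0]            # slicing: every row, column 0
row_sums = [int(row.sum()) for row in m]   # iterating over rows

stacked_v = np.vstack((m, doubled))        # stack vertically: shape (4, 3)
stacked_h = np.hstack((m, doubled))        # stack horizontally: shape (2, 6)
left, right = np.hsplit(stacked_h, 2)      # split back into two (2, 3) arrays

# Vector stacking: 1-D arrays become the columns of a 2-D array.
columns = np.column_stack((np.array([1, 2]), np.array([3, 4])))

print(m)                          # printing arrays
```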
PANDAS – Introduction to Pandas, Importing Data into Python, Pandas DataFrames,
Indexing DataFrames, Basic Operations with DataFrames, Renaming Columns, Subsetting
and Filtering a DataFrame.
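
A minimal sketch of the Pandas topics above, assuming pandas is installed. The CSV content and column names are hypothetical. They are inlined as a string so that the example does not depend on an external file.

```python
import io

import pandas as pd

# Hypothetical CSV data, inlined so the example is self-contained.
csv_text = "name,dept,score\nAsha,ML,82\nBen,Stats,67\nChen,ML,91\n"
df = pd.read_csv(io.StringIO(csv_text))          # importing data into Python

df = df.rename(columns={"score": "exam_score"})  # renaming a column

second_row_name = df.loc[1, "name"]              # label-based indexing
first_row_dept = df.iloc[0, 1]                   # position-based indexing
mean_score = df["exam_score"].mean()             # basic operation on a column

subset = df[["name", "exam_score"]]              # subsetting columns
ml_high = df[(df["dept"] == "ML") & (df["exam_score"] > 85)]  # filtering rows
```

In practice, pd.read_csv would usually be given a file path rather than an in-memory string.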