Lecturezero
(INT232)
Lecture #0
Course Overview
• L T P: 2 0 2
• Text Book
1. DATA ANALYTICS USING R BY SEEMA ACHARYA
• Reference Books:
1. DATA ANALYSIS USING STATISTICS AND PROBABILITY WITH R LANGUAGE BY BISHNU PARTHA SARATHI, BHATTACHERJEE VANDANA
2. DATA SCIENCE AND MACHINE LEARNING IN R BY REEMA THAREJA
3. DATA ANALYTICS BY ANIL MAHESHWARI
Marks Breakup
• Credits: 3
• Marks Breakup:
Activity                    Marks
Attendance                  5
Continuous Assessment       45
End-Term Practical (ETP)    50
Total                       100
Details of academic tasks
• AT1: Quiz
• AT2: Assignment Based Test
• AT3: Project
INT232: Data Science Toolbox: R Programming
• Unit 1: Introduction to basics
• Unit 2: Vectors and matrices; Factors; Data frames; Lists
  Reading material (website), INT232, 100, https://www.tutorialspoint.com/r/index.htm
• Unit 3: R syntax; Data input and output in R
  Reading material (website), INT232, 100, https://www.tutorialspoint.com/r/index.htm
• Unit 4: Advanced R programming; Data manipulation with R using
  Reading material (website), INT232, 100, https://www.tutorialspoint.com/r/index.htm
• Unit 5: Text mining in R; Social media data mining
  Reading material (website), INT232, 100, https://www.tutorialspoint.com/r/index.htm
• Unit 6: Data Visualization With R
  Reading material (website), INT232, 100, https://www.tutorialspoint.com/r/index.htm
Program Outcomes
• PO1
Engineering Knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
• PO2
Problem Analysis: Identify, formulate, research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
• PO3
Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with
appropriate consideration for the public health and safety, and the cultural, societal, and
environmental considerations.
• PO4
Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.
• PO5
Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.
• PO6
The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
• PO7
Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
• PO8
Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
• PO9
Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
• PO10
Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as being able to comprehend and
write effective reports and design documentation, make effective presentations, and give
and receive clear instructions.
• PO11
Project management and finance: Demonstrate knowledge and understanding of
engineering and management principles and apply these to one’s own work, as a member
or a leader in a team, to manage projects efficiently in respective disciplines and
multidisciplinary environments after consideration of economic and financial factors.
• PO12
Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.
• PO13
Competitive Skills: Ability to compete in national and international technical events and
to build a competitive spirit along with a good digital footprint.
Revised Bloom’s taxonomy (RBT)
Why R programming?
R Studio
• R provides a wide variety of statistical
techniques (linear and nonlinear modelling, classical
statistical tests, time-series analysis,
classification, clustering).
• One of R’s strengths is the ease with
which well-designed publication-quality
plots can be produced, including
mathematical symbols and formulae
where needed.
• The R environment consists of an integrated suite of software
facilities designed for data manipulation, calculation, and graphical
display. The environment features:
• A high-performance data storage and handling facility
• A suite of operators for array calculations, mainly on matrices (an array is
created with the array() function, which takes a vector as input and arranges
its values into the fixed dimensions given)
• A vast, easily understandable, integrated collection of intermediate
tools dedicated to data analysis
• Graphical facilities for data analysis and display that work either
on-screen or on hardcopy
• A well-developed, simple, and effective programming language,
featuring user-defined recursive functions, loops, conditionals, and
input and output facilities.
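The array, matrix, and language features listed above can be tried in a few lines of base R. This is only a minimal sketch: the variable names and the helper `factorial_rec` are made up for illustration and are not part of the course material.

```r
# Build a 2 x 3 x 2 array from a vector; values fill column by column.
arr <- array(1:12, dim = c(2, 3, 2))
dim(arr)        # dimensions of the array
arr[2, 3, 1]    # row 2, column 3 of the first slice -> 6

# A matrix is a two-dimensional array with its own operators.
m <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
m * 2                        # element-wise multiplication
m %*% m                      # matrix multiplication
t(m)                         # transpose

# A user-defined recursive function with a conditional
# (hypothetical helper, for illustration only).
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop with simple output: print the first five factorials.
for (i in 1:5) {
  cat(i, "! = ", factorial_rec(i), "\n", sep = "")
}
```

Note that `*` works element by element, while `%*%` performs true matrix multiplication; mixing the two up is a common beginner mistake.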
About R
• Data analysis
• Statistical inference
• Machine learning algorithms
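As a rough illustration of the data analysis and statistical inference uses listed above, the sketch below fits a linear model and runs a classical t-test. It assumes the `cars` data set that ships with base R; the split at a speed of 15 is an arbitrary choice made only for this example.

```r
# 'cars' ships with base R: speed and stopping distance of 50 cars.
data(cars)

# Linear modelling: stopping distance as a function of speed.
fit <- lm(dist ~ speed, data = cars)
coef(fit)                 # intercept and slope
summary(fit)$r.squared    # share of variance explained by the model

# Classical statistical test: compare stopping distances of
# slower and faster cars (threshold of 15 chosen for illustration).
slow <- cars$dist[cars$speed <= 15]
fast <- cars$dist[cars$speed > 15]
test <- t.test(fast, slow)   # Welch two-sample t-test
test$p.value
```

Calling `summary(fit)` prints the full coefficient table, which is usually the first thing to inspect after fitting a model.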
Unit 1- Installation And Development
Environment Overview, Introduction To Basics
Unit 5- Text mining in R
• Text Mining Functions
• String Functions Used In R
• Analyzing Text Data For Mining
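A few of the base R string functions typically used for the text mining topics above are sketched here. The sample sentences are invented for illustration, and only base R is used (no text mining package is assumed).

```r
# Two made-up sentences to work with.
text <- c("R makes text mining easy",
          "Text mining starts with clean text")

nchar(text)                  # number of characters in each string
toupper(text[1])             # convert to upper case
substr(text[1], 1, 7)        # characters 1 to 7 -> "R makes"
grepl("mining", text)        # does each string contain the pattern?
gsub("text", "data", text[2], ignore.case = TRUE)   # replace a word

# A simple word-frequency count: lower-case, split on whitespace, tabulate.
words <- unlist(strsplit(tolower(text), "\\s+"))
freq  <- sort(table(words), decreasing = TRUE)
freq
```

Lower-casing before splitting makes "Text" and "text" count as the same word, which is a typical cleaning step before analyzing text data.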
Social Media Data Mining
• Facebook Data Analysis
• Twitter Data Analysis
Unit 6- Data Visualization With R
• Explanation and implementation of basic types of
graphs (scatter plot, line chart, bar chart, pie
chart)
• Explanation and implementation of advanced types
of graphs (word cloud, heat map, Bollinger Bands,
donut chart, etc.)
• Dynamic visualization using ggplot2
• Advanced visualization using plotly
• Implementation of dashboards using
R Markdown
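The four basic graph types named above can each be drawn with a single base R function. The sketch below uses invented monthly figures (`sales`, `costs`) purely for illustration; packages such as ggplot2 and plotly offer alternatives but are not used here.

```r
# Hypothetical monthly figures, for illustration only.
months <- c("Jan", "Feb", "Mar", "Apr")
sales  <- c(12, 18, 15, 22)
costs  <- c(8, 11, 10, 14)

# Scatter plot: one point per month.
plot(costs, sales, main = "Sales vs. costs",
     xlab = "Costs", ylab = "Sales", pch = 19)

# Line chart: sales over time, with month names on the x axis.
plot(sales, type = "o", xaxt = "n", main = "Sales by month",
     xlab = "Month", ylab = "Sales")
axis(1, at = seq_along(months), labels = months)

# Bar chart: barplot() returns the x positions of the bar midpoints.
bar_mids <- barplot(sales, names.arg = months, main = "Sales by month")

# Pie chart: share of each month in total sales.
pie(sales, labels = months, main = "Share of sales")
```

In an interactive session (for example in RStudio) each call opens or replaces a plot; when run as a script, R writes the plots to a file instead.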
Learning Outcomes
• Use the essential data structures, functions and packages
in R.
• Apply the basic commands and packages provided by
the R tool.
• Use the advanced R functions for
analysis.
• Apply various text mining functions in R.
• Use and customize the various graphical packages for creating
various types of graphs, plots and charts.
• Analyze real-life business problems by using various statistical
methods.
• Integrate data to provide mashed-up dashboards.
MOOCs
• R Programming
https://www.coursera.org/learn/r-programming
Questions?