Lecture Zero


R Programming

(INT232)
Lecture #0
Course Overview
• L T P: 2 0 2

• Text Book
1. DATA ANALYTICS USING R BY SEEMA ACHARYA
• Reference Books:
1. DATA ANALYSIS: USING STATISTICS AND PROBABILITY WITH R LANGUAGE BY BISHNU PARTHA SARATHI, BHATTACHERJEE VANDANA
2. DATA SCIENCE AND MACHINE LEARNING IN R BY REEMA THAREJA
3. DATA ANALYTICS BY ANIL MAHESHWARI

Marks Breakup
• Credits: 3
• Marks Breakup:
Activity | Marks
Attendance | 5
Continuous Assessment | 45
End-Term Practical (ETP) | 50
Total | 100

* Best 2 CAs out of 3, each CA of 30 marks

Detail of academic task
• AT1: Quiz
• AT2: Assignment Based Test
• AT3: Project

*** best 2 out of 3 ***


OPEN EDUCATIONAL RESOURCE
Course Code | Course Title | Unit mapped | Broad topic / Sub topic | OER Type | Title of OER | % of unit mapped with OER (approx.) | Source URL
INT232 | Data Science Toolbox: R Programming | Unit 1 | Installation and development environment overview; Introduction to basics | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm
INT232 | Data Science Toolbox: R Programming | Unit 2 | Vectors and matrices; Factors; Data frames; Lists | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm
INT232 | Data Science Toolbox: R Programming | Unit 3 | R syntax; Data input and output in R | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm
INT232 | Data Science Toolbox: R Programming | Unit 4 | Advanced R programming; Data manipulation with R using | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm
INT232 | Data Science Toolbox: R Programming | Unit 5 | Text mining in R; Social media data mining | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm
INT232 | Data Science Toolbox: R Programming | Unit 6 | Data Visualization With R | Reading material (website) | INT232 | 100 | https://www.tutorialspoint.com/r/index.htm

Lovely Professional University, Phagwara


Data Science
Data science is the study of data to extract meaningful insights for
business. It is a multidisciplinary approach that combines principles
and practices from the fields of mathematics, statistics, artificial
intelligence, and computer engineering to analyze large amounts of
data.
This analysis helps data scientists to ask and answer questions like
what happened, why it happened, what will happen, and what can be
done with the results.



Future of Data Science
• Artificial intelligence and machine learning innovations have
made data processing faster and more efficient. Industry
demand has created an ecosystem of courses, degrees, and
job positions within the field of data science.
• Because of the cross-functional skillset and expertise
required, data science shows strong projected growth over
the coming decades.

Job Opportunities
• Data Analyst [data visualizations]
• Data Engineer [building and testing big data
ecosystems, designing and managing DBMS]
• Database Administrator [working on databases and
managing the data]
• Machine Learning Engineer
• Business Analyst [identifies how big data can be
linked to actionable business insights for
business growth]

Course Outcomes
• CO1 :: Analyze and configure R software for a statistical programming
environment [for analyzing] and describe generic programming language
concepts implemented in a high-level statistical language
• CO2 :: Demonstrate the programs in the R environment to create custom
analytical models to meet the dynamic business needs
• CO3 :: Evaluate and verify the analysis findings by using various packages
in R programming
• CO4 :: Visualize and customize the various graphical packages for creating
various types of graphs, plots and charts.
• CO5 :: Review advanced data science concepts using predictive analytics
fundamentals
• CO6 :: Appraise and verify the analysis findings by conducting various
statistical tests

Program Outcomes
• PO1
Engineering Knowledge:: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
• PO2
Problem Analysis:: Identify, formulate, research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
• PO3
Design/development of solutions:: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with
appropriate consideration for the public health and safety, and the cultural, societal, and
environmental considerations.
• PO4
Conduct investigations of complex problems:: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.

Program Outcomes
• PO5
Modern tool usage:: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.
• PO6
The engineer and society:: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
• PO7
Environment and sustainability:: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
• PO8
Ethics:: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
• PO9
Individual and team work:: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.

Program Outcomes
• PO10
Communication:: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give
and receive clear instructions.
• PO11
Project management and finance:: Demonstrate knowledge and understanding of the
engineering, management principles and apply the same to one’s own work, as a member
or a leader in a team, manage projects efficiently in respective disciplines and
multidisciplinary environments after consideration of economic and financial factors.
• PO12
Life-long learning:: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.
• PO13
Competitive Skills:: Ability to compete in national and international technical events and
building the competitive spirit along with having a good digital footprint.

Revised Bloom’s taxonomy (RBT)

Why R Programming?
R Studio
• R provides a wide variety of statistical techniques
(linear and nonlinear modelling, classical
statistical tests, time-series analysis,
classification, clustering)
• One of R’s strengths is the ease with
which well-designed publication-quality
plots can be produced, including
mathematical symbols and formulae
where needed.
• The R environment consists of an integrated suite of software
facilities designed for data manipulation, calculation, and graphical
display. The environment features:
• A high-performance data storage and handling facility
• A suite of operators for calculations on arrays, mainly matrices (an array is
created with the array() function, which takes a vector as input and arranges
its values into the fixed dimensions given)
• A vast, easily understandable, integrated collection of intermediate
tools dedicated to data analysis
• Graphical facilities for data analysis and display, either on-screen
or on hardcopy
• A well-developed, simple, and effective programming language,
featuring user-defined recursive functions, loops, conditionals, and
input and output facilities.
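The array() description above can be illustrated with a minimal sketch. The variable names (values, arr, m, product) are chosen here for illustration only and do not come from the slides.

```r
# Build a 2 x 3 x 2 array from the vector 1:12.
# R fills arrays column by column (first index varies fastest).
values <- 1:12
arr <- array(values, dim = c(2, 3, 2))

# A matrix is a two-dimensional array.
m <- matrix(1:6, nrow = 2, ncol = 3)

# Arithmetic operators act element-wise on arrays and matrices.
doubled <- m * 2

# %*% is matrix multiplication: (2 x 3) %*% (3 x 2) gives a 2 x 2 result.
product <- m %*% t(m)

print(dim(arr))       # dimensions of the array
print(arr[2, 3, 1])   # element in row 2, column 3, first slice
print(product)
```

Indexing with arr[row, column, slice] selects a single element, while leaving an index empty (for example arr[, , 1]) selects the whole first slice as a matrix.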
About R

History of R

Features of R

What is R Used For?

• Although R is a popular language used by many programmers, it is
especially effective when used for:

• Data analysis
• Statistical inference
• Machine learning algorithms
Unit 1- Installation And Development
Environment Overview, Introduction To Basics

• Downloading And Installing R From CRAN
• Installing R On Your Windows Computer
• Installing RStudio
• Libraries In R And RStudio
• Installing Packages
• Using R Reference Card
• Discover The Basic Data Types And Operators
In R
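As a rough preview of the Unit 1 topics, the sketch below shows how a package is typically installed and loaded, followed by the basic data types and a few operators. The package name "dplyr" is only an example, and the install lines are commented out because they need internet access. All variable names are illustrative.

```r
# Installing and loading a package (commented out: needs network access).
# install.packages("dplyr")
# library(dplyr)

# Basic data types
x_num  <- 3.14     # numeric (double)
x_int  <- 42L      # integer (the L suffix marks an integer literal)
x_chr  <- "R"      # character
x_lgl  <- TRUE     # logical
x_cplx <- 2 + 3i   # complex

types <- c(class(x_num), class(x_int), class(x_chr),
           class(x_lgl), class(x_cplx))
print(types)

# Arithmetic, relational and logical operators
quotient  <- 7 %/% 2                        # integer division
remainder <- 7 %% 2                         # modulo
power     <- 2 ^ 10                         # exponentiation
in_range  <- (x_int > 10) & (x_int <= 100)  # relational + logical AND
```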
Unit 2- Detailed Data Types
• Vectors And Matrices : Learn How To Work
With Vectors And Matrices In R
• Factors : R Stores Categorical Data In
Factors, Learn How To Create, Subset And
Compare Categorical Data
• Data Frames : Creating, Merging, Naming,
Filtering, Indexing And Selection In Data
Frames
• Lists : Naming, Extracting, Adding, Deleting
Components From Lists, Subsetting A List [A
list in R is a generic object consisting of an ordered collection of
objects. Lists are one-dimensional, heterogeneous data structures.]
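The four Unit 2 structures can be compared side by side in a short sketch. The student names, scores and list fields below are made-up sample data used only for illustration.

```r
# Vector: one-dimensional and homogeneous (all elements share one type).
scores <- c(72, 85, 91)

# Factor: categorical data with a fixed set of levels.
grade <- factor(c("B", "A", "A"), levels = c("A", "B", "C"))
grade_counts <- table(grade)   # counts per level, including empty levels

# Data frame: columns of equal length, each column with its own type.
students <- data.frame(name = c("Asha", "Ravi", "Meena"),
                       score = scores,
                       grade = grade)
top <- students[students$score > 80, ]   # filter rows by a condition

# List: ordered collection of objects of different types.
course <- list(code = "INT232", credits = 3, units = 1:6)
course$title   <- "R Programming"   # add a component
course$credits <- NULL              # delete a component
unit_vec <- course[["units"]]       # extract: returns the component itself
sub      <- course["code"]          # subset: returns a (shorter) list
```

The difference between [[ ]] and [ ] on a list is worth noting: the first returns the stored object, the second returns a list containing it.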
Unit 3- R Syntax And Data Input And
Output
• Conditional Statements
• Loops
• Functions And Packages In R
• CSV (Comma Separated Values) Files
• Excel Files And SQL With R
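A minimal sketch of the Unit 3 topics follows: a user-defined function containing a conditional, a loop that calls it, and a CSV round trip through a temporary file. The function name classify, the thresholds and the sample marks are hypothetical.

```r
# User-defined function with an if / else if / else chain.
classify <- function(mark) {
  if (mark >= 80) {
    "high"
  } else if (mark >= 50) {
    "medium"
  } else {
    "low"
  }
}

# for loop over the positions of a vector.
marks  <- c(35, 64, 88)
labels <- character(length(marks))
for (i in seq_along(marks)) {
  labels[i] <- classify(marks[i])
}

# Write a data frame to a CSV file and read it back.
result <- data.frame(mark = marks, label = labels)
path <- tempfile(fileext = ".csv")
write.csv(result, path, row.names = FALSE)
reloaded <- read.csv(path)
print(reloaded)
```

Reading Excel files or querying SQL databases requires additional packages, so those steps are not sketched here.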
Unit 4- Advanced R programming and Data
manipulation
• Mathematical Functions
• Apply Family Of Functions
• Regular Expressions
• Dates And Timestamps
• Data Filters
• Handling Missing Data
• dplyr (a data manipulation package)
• tidyr (simplifies the process of creating tidy data)
• Pipe operator
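Several of the Unit 4 topics can be previewed using base R alone. The sketch below deliberately avoids dplyr and tidyr, since those packages must be installed first. The sample values and the simplified e-mail pattern are assumptions made for illustration.

```r
# Apply family
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)                      # 1 = apply over rows
squares  <- sapply(1:3, function(x) x^2)          # returns a vector
lens     <- lapply(list(a = 1:3, b = letters), length)  # returns a list

# Regular expressions: a deliberately simple e-mail check.
emails   <- c("a@x.org", "not-an-email")
is_email <- grepl("^[^@]+@[^@]+\\.[a-z]+$", emails)

# Dates: arithmetic and formatting.
d        <- as.Date("2024-01-31")
next_day <- d + 1
fmt      <- format(next_day, "%Y-%m-%d")

# Missing data: detect, drop, or ignore NA values.
x      <- c(4, NA, 10)
mean_x <- mean(x, na.rm = TRUE)
clean  <- x[!is.na(x)]

# Mathematical functions
rounded <- round(sqrt(2), 3)
```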

Unit 5- Text mining in R
• Text Mining Functions
• String Functions Used In R
• Analyzing Text Data For Mining
Social Media Data Mining
• Facebook Data Analysis
• Twitter Data Analysis
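The string functions mentioned in Unit 5 can be combined into a very small word-frequency sketch in base R. The sample sentence is invented, and dedicated text mining packages are not used here.

```r
text <- "R makes text mining simple. Text mining in R is fun!"

# Remove punctuation and convert to lower case.
clean <- tolower(gsub("[[:punct:]]", "", text))

# Split on whitespace; strsplit returns a list, so take its first element.
words <- strsplit(clean, "\\s+")[[1]]

# Count how often each word occurs, most frequent first.
freq <- sort(table(words), decreasing = TRUE)
print(freq)

# A few other common string functions.
upper  <- toupper("r")
joined <- paste("text", "mining", sep = "_")
part   <- substr("mining", 1, 3)
```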

Unit 6- DATA VISUALIZATION WITH R
• Explanation And Implementation Of Basic Types Of
Graphs (SCATTER PLOT, LINE CHART, BAR CHART, PIE
CHART)
• Explanation And Implementation Of Advanced Types
Of Graphs (Word Cloud, Heat Map, Bollinger Band,
Donut Chart Etc.)
• Dynamic Visualization Using ggplot2
• Advanced Visualization Using plotly
• Implementation Of Dashboards Using
R Markdown
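The four basic graph types listed for Unit 6 can be drawn with base R graphics, as in the sketch below. It writes to a temporary PDF file so that it also runs without a display. The data values and region names are made up, and packages such as ggplot2 and plotly are not used here.

```r
out <- tempfile(fileext = ".pdf")
pdf(out)                 # open a PDF graphics device
par(mfrow = c(2, 2))     # arrange the plots in a 2 x 2 grid

x <- 1:10
y <- x^2
sales <- c(North = 12, South = 7, East = 9)

plot(x, y, main = "Scatter plot")             # points
plot(x, y, type = "l", main = "Line chart")   # type = "l" draws lines
barplot(sales, main = "Bar chart")
pie(sales, main = "Pie chart")

invisible(dev.off())     # close the device so the file is written
```

In an interactive RStudio session the pdf() and dev.off() calls can be omitted, and the plots appear in the Plots pane instead.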

Learning Outcomes
• Get to know and use the essential data structures,
functions and packages in R
• Students will learn about the basic commands and packages
provided by the R tool.
• Students will learn how to use the advanced R functions for
Analysis.
• Learn about various text mining functions in R.
• Use and customize the various graphical packages for creating
various types of graphs, plots and charts
• Analyze real life business problems by using various statistical
methods
• Integrate data to provide mashed-up dashboards
MOOCs
• R Programming
https://www.coursera.org/learn/r-programming

Questions???
