R Commands: Firstly, We Need To Install Package

The document provides commands for importing, cleaning, manipulating, and visualizing data using R. Some key commands include:
- getwd() and setwd() to get and set the working directory
- read.csv() and read_excel() to import data from CSV and Excel files
- is.na(), sum(is.na()), and na.omit() to check for and remove missing values
- mean() and var() to calculate descriptive statistics
- plot(), hist(), and the ggplot2 package to create visualizations
- cbind() and rbind() to bind vectors as columns or rows
- dimnames to add row and column names to a matrix

Uploaded by

Namit Baser
Copyright
© All Rights Reserved

R Commands

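To get the working directory: getwd()
To set the working directory: setwd("path/to/folder"), replacing the placeholder path with your own folder
To create a vector: v <- c(1, 2, 3)
To create a matrix (filled column-wise by default): M1 <- matrix(1:20, nrow = 4, ncol = 5)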

To enter data row-wise: M1 <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)
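
The effect of byrow can be seen by building the same matrix both ways. This is a minimal sketch; m_col and m_row are illustrative names that do not appear in the original notes.

```r
# Default: values 1..20 fill the matrix one column at a time
m_col <- matrix(1:20, nrow = 4, ncol = 5)

# byrow = TRUE: values 1..20 fill the matrix one row at a time
m_row <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)

m_col[1, ]   # first row with column-wise fill: 1 5 9 13 17
m_row[1, ]   # first row with row-wise fill:    1 2 3 4 5
```

Both matrices hold the same 20 values; only their arrangement differs.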
To get the value of a vector at a particular position: v[2]
To load a package from the library: library(ggplot2)
To list the objects created in the current workspace: ls()

To get a dataset from the datasets package: datasets::mtcars

To view the dataset: View(mtcars)
To store a dataset in your own object:
    File1 <- datasets::mtcars
    View(File1)
    Note that data(mtcars) only loads the dataset into the workspace and returns its name as a character string, so File1 <- data(mtcars) would not store the data itself.
To find out which class an object belongs to: class(File1)
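
The class of the stored object shows whether the data itself or only its name was saved. The sketch below uses the built-in mtcars dataset; name_only is an illustrative variable name, not one from the original notes.

```r
# data() loads the dataset into the workspace and returns only its name
name_only <- data(mtcars)
class(name_only)   # "character"

# Assigning the dataset itself keeps the full data frame
File1 <- datasets::mtcars
class(File1)       # "data.frame"
dim(File1)         # 32 rows and 11 columns
```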
To import data without a header row: Myfile <- read.csv(file.choose(), sep = "", header = FALSE)
(sep = "" treats any run of white space as the separator.)
To copy data from one data frame to another: Myfile <- as.data.frame(mtcars)

This stores the data from the mtcars data frame in the Myfile data frame.
To show the first 6 rows of the data frame: head(Myfile)
To show the last 6 rows of the data frame: tail(Myfile)
To find out the structure of the data frame: str(Myfile)
To get descriptive statistics for the whole data frame: summary(Myfile)
To get descriptive statistics for a particular variable in the data frame: summary(Myfile$mpg)
The $ operator selects the variable whose statistics you want. Here mpg is a variable in the data frame named Myfile.
To get the variance of a particular variable in the data frame: var(Myfile$mpg)

To get the standard deviation of a particular variable: sqrt(var(Myfile$mpg)), or equivalently sd(Myfile$mpg)
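
As a quick check of the descriptive statistics commands, the following sketch runs them on the built-in mtcars data and confirms that the square root of the variance matches base R's sd() function.

```r
# Copy the built-in dataset into a data frame named Myfile
Myfile <- as.data.frame(mtcars)

summary(Myfile$mpg)          # minimum, quartiles, mean, maximum
mean(Myfile$mpg)             # arithmetic mean
var(Myfile$mpg)              # sample variance
sqrt(var(Myfile$mpg))        # standard deviation computed from the variance
sd(Myfile$mpg)               # standard deviation computed directly

# Both routes to the standard deviation agree
all.equal(sqrt(var(Myfile$mpg)), sd(Myfile$mpg))   # TRUE
```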

To import the data file Advertising.txt: Advertising <- read.csv(file.choose(), sep = " ")

Then type the object name to print it:
    Advertising
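
Because file.choose() opens an interactive dialog, the import step is easier to try out with a file created in code. The sketch below is only an illustration: the temporary file, the object name demo, and the values in it are made up and are not the real Advertising.txt.

```r
# Write a small space-separated file to a temporary location (made-up values)
tmp <- tempfile(fileext = ".txt")
writeLines(c("City TVAds Sales",
             "A 10 100",
             "B 20 150"), tmp)

# sep = " " tells read.csv that fields are separated by a single space
demo <- read.csv(tmp, sep = " ")
demo          # print the imported data frame
str(demo)     # check the column types

unlink(tmp)   # remove the temporary file
```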
To import an Excel file, first install the package:
    install.packages("readxl")

Then load the package:
    library("readxl")

To open the Excel file:
    my_data <- read_excel(file.choose())

Cleaning of Data
- Mismatched data
- Missing values
- Irrelevant data
- Outliers
- Infeasible values (e.g., age can never be negative)
- Redundant data

To check for missing values in the whole dataset (named Advertising): is.na(Advertising)
To check whether the dataset contains any missing values: any(is.na(Advertising))
The output is either TRUE or FALSE.
To get the total number of missing values in the dataset: sum(is.na(Advertising))
To find the exact location of each missing value: which(is.na(Advertising), arr.ind = TRUE)
With arr.ind = TRUE the result lists the row and column of each NA; without it, which() returns positions counted down the columns.
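
The missing-value checks can be tried on a small data frame. The values below are made up for illustration and are not the real Advertising data; na_pos is an illustrative name.

```r
# Toy data frame: one NA in NewspaperAds (row 2), one NA in Sales (row 3)
Advertising <- data.frame(
  TVAds        = c(230, 44, 17, 151),
  NewspaperAds = c(69, NA, 45, 58),
  Sales        = c(22, 10, NA, 18)
)

any(is.na(Advertising))       # TRUE: at least one value is missing
sum(is.na(Advertising))       # 2: total number of missing values
colSums(is.na(Advertising))   # number of missing values per column

# Row and column index of every missing value
na_pos <- which(is.na(Advertising), arr.ind = TRUE)
na_pos
```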

Remove Missing Value

If the dataset has missing values, we can handle them in two ways:

- Eliminate the whole row that has the missing value
- Replace the missing values with the mean value

To remove the rows that have missing values:
    Advertising_new <- na.omit(Advertising)
    Advertising_new

To replace the missing values with the mean at the positions where NA appears:
    sum(Advertising$NewspaperAds, na.rm = TRUE)
    This gives the total of all observations other than NA.

    mean(Advertising$NewspaperAds, na.rm = TRUE)
    This gives the mean of all observations under NewspaperAds, not including NA.

    Avg_newspaper <- mean(Advertising$NewspaperAds, na.rm = TRUE)
    This stores the mean in the variable Avg_newspaper.

    Advertising$NewspaperAds[is.na(Advertising$NewspaperAds)] <- Avg_newspaper
    This puts the mean at each position where NA appears.

    View(Advertising$NewspaperAds)
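
Both ways of handling missing values can be compared on a small example. The data frame below uses made-up values, chosen so the mean is easy to verify by hand.

```r
# Toy data frame with one missing value in NewspaperAds (made-up values)
Advertising <- data.frame(
  TVAds        = c(230, 44, 17, 151),
  NewspaperAds = c(69, NA, 45, 57)
)

# Option 1: drop every row that contains an NA
Advertising_new <- na.omit(Advertising)
nrow(Advertising_new)   # 3 rows remain

# Option 2: replace the NA with the mean of the observed values
Avg_newspaper <- mean(Advertising$NewspaperAds, na.rm = TRUE)   # (69 + 45 + 57) / 3 = 57
Advertising$NewspaperAds[is.na(Advertising$NewspaperAds)] <- Avg_newspaper
Advertising$NewspaperAds   # 69 57 45 57
```

Dropping rows loses the other values in those rows, while mean replacement keeps every row but reduces the variability of the column.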

To Rename the Column Headers

To rename the header of each column:
    names(Advertising) <- c("City", "TVAds", "RadioAds", "NewspaperAds", "Sales", "StoreType")
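
Renaming works the same way on any data frame, as long as the vector supplies one name per column. A minimal sketch with a made-up three-column data frame (df and its values are illustrative only):

```r
# A small data frame with default-style column names
df <- data.frame(V1 = c("A", "B"), V2 = c(10, 20), V3 = c(100, 150))

# Replace all column names at once: one name per column, in order
names(df) <- c("City", "TVAds", "Sales")
names(df)   # "City" "TVAds" "Sales"

# Rename a single column by its position
names(df)[2] <- "TelevisionAds"
names(df)   # "City" "TelevisionAds" "Sales"
```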

Convert into different datatype

To convert numeric into integer, for example:
    V1 <- 10
    This is not an integer; it is stored as numeric (double).
    To convert it into an integer:
    V1 <- as.integer(10)
    Now it is stored as an integer.

OR

Put "L" after the number to create it as an integer:
    V1 <- 10L
    Now the value is stored as an integer.
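
The storage type can be confirmed with class() and is.integer(). In this short sketch, V2 and V3 are illustrative names added for comparison.

```r
V1 <- 10
class(V1)            # "numeric": a plain number is stored as a double
is.integer(V1)       # FALSE

V2 <- as.integer(V1)
class(V2)            # "integer"

V3 <- 10L            # the L suffix creates an integer directly
identical(V2, V3)    # TRUE

as.integer(10.9)     # 10: the fractional part is dropped, not rounded
```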

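Visualization
hist() and plot() are part of base R and need no extra package. Before plotting with ggplot2 functions such as ggplot(), load the package with:
    library(ggplot2)
To create a histogram: hist(Advertising$RadioAds)
To create a scatter plot of two variables: plot(Advertising$RadioAds, Advertising$Sales)
(plot(Advertising$RadioAds) on its own plots the values against their row index.)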
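
The base plotting commands can be tried without the Advertising file. This sketch substitutes the built-in mtcars dataset, and the titles and axis labels are illustrative choices.

```r
# Histogram of one variable; hist() also returns the bin counts it drew
h <- hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
h$counts   # number of cars in each bin

# Scatter plot of two variables: weight on the x axis, mpg on the y axis
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "mpg versus weight")
```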

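Binding the rows/columns

To bind two vectors column-wise:
    x <- 21:23
    y <- 7:9
    cbind(x, y)
This works cleanly when both vectors have the same length. If the lengths differ, R recycles the shorter vector, with a warning when the longer length is not a multiple of the shorter.

The output will be
          x y
    [1,] 21 7
    [2,] 22 8
    [3,] 23 9

To bind vectors row-wise:
    x <- 21:23
    y <- 7:9
    rbind(x, y)
The same length rule applies as for cbind().

The output will be
      [,1] [,2] [,3]
    x   21   22   23
    y    7    8    9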
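
The shapes of the two results make the difference between cbind() and rbind() concrete. In this minimal sketch, by_col and by_row are illustrative names.

```r
x <- 21:23
y <- 7:9

by_col <- cbind(x, y)   # each vector becomes a column
by_row <- rbind(x, y)   # each vector becomes a row

dim(by_col)             # 3 2: three rows, two columns
dim(by_row)             # 2 3: two rows, three columns

by_col[2, "y"]          # 8: second element of y
all(by_row == t(by_col))   # TRUE: one result is the transpose of the other
```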

Dimension Names of a Matrix

To assign names to the rows and columns of a matrix:
    m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), c("c", "d", "e")))
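
Once a matrix has dimension names, its elements can be looked up by name instead of by position. A short sketch using the same matrix:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("a", "b"), c("c", "d", "e")))

rownames(m)    # "a" "b"
colnames(m)    # "c" "d" "e"

# Look up a single element by row name and column name
m["b", "d"]    # 4

# Select a whole column by name
m[, "e"]       # values 5 and 6, labelled a and b
```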
