Quiz 2 Formula Sheet

This document contains R commands and formulas useful for Quiz 2: 1) commonly used R commands for loading data, appending columns, calculating summary statistics, and plotting; 2) formulas for independent and bivariate distributions, covariance, correlation, properties of covariance, conditional expectations, boxplots, the central limit theorem, and confidence intervals.

Uploaded by

dawn
Copyright
© All Rights Reserved
AB1202. Quiz 2 formula sheet.

R-command list (you may also use the built-in help file in R during the quiz)

• read.csv(file.choose(), header=TRUE): load a CSV file chosen from the computer
• A$newcolumn <- oldcolumn: append a column to an existing data structure A under the column name "newcolumn"
• rowsum(x, group): return the sums of x for each value of a grouping variable
• library("datasets"): load the package of illustrative datasets
• cov(): return the (sample) covariance
• cor(): return the (sample) correlation coefficient
• mean(): return the mean
• median(): return the median
• var(): return the sample variance
• sd(): return the sample standard deviation
• range(): return the minimum and maximum
• summary(): return a summary table with the minimum, quartiles, median, mean, and maximum
• quantile(): return the minimum, Q1, median, Q3, and maximum of the sample
• quantile(x, probs=0.25): return the first quartile
• quantile(x, probs=0.75): return the third quartile
• IQR(x) or quantile(x, probs=0.75) - quantile(x, probs=0.25): return the IQR
• boxplot(X~Y): draw a boxplot of X for each group in Y
• sample(c(x,y,z), n, prob=c(a,b,c), replace=TRUE): draw a random sample of size n from the distribution Pr(X=x)=a, Pr(X=y)=b, Pr(X=z)=c
• For loop: for (index in index_vector) { commands you want to repeat }
• hist(): draw a histogram
• qt(q, n-1): return the t-score with cumulative probability q and n-1 degrees of freedom
• qnorm(q, u, s): return the value of a normally distributed RV (mean=u, standard deviation=s) with cumulative probability q
• rnorm(n, u, s): draw a random sample of size n from a normal distribution with mean u and standard deviation s
• runif(n, min, max): draw a random sample of size n from a continuous uniform distribution on [min, max]
• set.seed(x): set the random seed to x, so that random draws can be reproduced
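
The commands above can be combined as in the following minimal sketch. It runs on simulated data rather than a quiz dataset, and every name in it (A, x, y, ysq, means) and every number (the seed, sample sizes, probabilities) is an assumption made only for illustration.

```r
# Illustrative only: simulated data and hypothetical variable names.
set.seed(1202)                        # make the random draws reproducible

# 200 draws from the discrete distribution
# Pr(X=1)=0.2, Pr(X=2)=0.5, Pr(X=3)=0.3
x <- sample(c(1, 2, 3), 200, prob = c(0.2, 0.5, 0.3), replace = TRUE)

# 200 draws from a normal distribution with mean 10 and sd 2
y <- rnorm(200, 10, 2)

# Build a data frame and append a new column to it
A <- data.frame(x = x, y = y)
A$ysq <- A$y^2

# Summary statistics of y
mean(A$y)
median(A$y)
var(A$y)
sd(A$y)
range(A$y)
summary(A$y)

# Quartiles and interquartile range
q1  <- quantile(A$y, probs = 0.25)
q3  <- quantile(A$y, probs = 0.75)
iqr <- IQR(A$y)                       # equals q3 - q1

# Sample covariance and correlation between x and y
cov(A$x, A$y)
cor(A$x, A$y)

# Sum of y within each group of x
rowsum(A$y, A$x)

# Plots: boxplot of y by group x, and a histogram of y
boxplot(y ~ x, data = A)
hist(A$y)

# For loop: store the means of 5 fresh samples of size 30
means <- numeric(5)
for (i in 1:5) {
  means[i] <- mean(rnorm(30, 10, 2))
}
means
```

With a real dataset, the first step would instead be A <- read.csv(file.choose(), header=TRUE).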

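Formula list:

• When $X$ and $Y$ are independent RVs, the bivariate distribution factorizes: $f(x, y) = f_X(x)\, f_Y(y)$

• Definition of covariance: $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$

• Definition of correlation: $\rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$

• Properties of covariance: $\mathrm{Var}(aX_1 + bX_2) = a^2\,\mathrm{Var}(X_1) + b^2\,\mathrm{Var}(X_2) + 2ab\,\mathrm{Cov}(X_1, X_2)$, where $\mathrm{Cov}(X_1, X_2) = \rho_{X_1 X_2}\,\sigma_{X_1}\sigma_{X_2}$

• Conditional expectations
For discrete random variables:
$E(X \mid Y = y) = \sum_{\text{all values } x \text{ of } X} x\, f(X = x \mid Y = y)$
$E(g(X) \mid Y = y) = \sum_{\text{all values } x \text{ of } X} g(x)\, f(X = x \mid Y = y)$

• Box-plot:
Lower limit = $Q_1 - 1.5 \times \mathrm{IQR}$ or the min value, whichever is higher
Upper limit = $Q_3 + 1.5 \times \mathrm{IQR}$ or the max value, whichever is lower

• By the CLT, for a random sample $X_1, X_2, \ldots, X_n$ drawn iid from a distribution $f(x)$ with mean $\mu$ and variance $\sigma^2$, the distribution of the sample mean $\bar{X}_n$ becomes close to a normal distribution with mean $\mu$ and variance $\sigma^2/n$ when the sample size $n$ is very large

• The p% confidence interval, with $q = \left(p + \dfrac{100 - p}{2}\right)\%$ (for example, $p = 95$ gives $q = 97.5\%$):
$\left[\bar{x}_n - z_q \times \dfrac{\sigma}{\sqrt{n}},\ \bar{x}_n + z_q \times \dfrac{\sigma}{\sqrt{n}}\right]$ when the population variance $\sigma^2$ is known
$\left[\bar{x}_n - t_q(n-1) \times \dfrac{S_n}{\sqrt{n}},\ \bar{x}_n + t_q(n-1) \times \dfrac{S_n}{\sqrt{n}}\right]$ when the population variance $\sigma^2$ is unknown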
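
The CLT statement and the two confidence intervals can be checked numerically with the commands from the R-command list. The sketch below is only an illustration: the seed, the population values (mu, sigma), the sample size n, the number of repetitions, and all variable names are assumptions, not part of the quiz material.

```r
# Illustrative only: hypothetical population values and variable names.
set.seed(1202)
mu <- 5; sigma <- 2; n <- 50

# CLT check: means of 2000 samples of size n from a uniform distribution.
# A uniform distribution on [0, 12] has mean 6 and variance 12^2 / 12 = 12.
xbar <- numeric(2000)
for (i in 1:2000) {
  xbar[i] <- mean(runif(n, 0, 12))
}
mean(xbar)   # should be close to 6
var(xbar)    # should be close to 12 / n = 0.24
hist(xbar)   # should look roughly bell-shaped

# Confidence intervals for the mean of one normal sample
x <- rnorm(n, mu, sigma)
p <- 95
q <- (p + (100 - p) / 2) / 100        # cumulative probability, 0.975 for p = 95

zq <- qnorm(q, 0, 1)                  # z-score, used when sigma is known
tq <- qt(q, n - 1)                    # t-score with n - 1 degrees of freedom

# Population variance known: use sigma
ci_known <- mean(x) + c(-1, 1) * zq * sigma / sqrt(n)

# Population variance unknown: use the sample standard deviation
ci_unknown <- mean(x) + c(-1, 1) * tq * sd(x) / sqrt(n)

ci_known
ci_unknown
```

Because the t-score is larger than the z-score for the same q, the two intervals generally differ in width, and the gap shrinks as n grows.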
