R Programming
Copyright © Grethe Hystad
Chapter 3
The Normal Distribution
In this chapter we will discuss the following topics:
The normal density and the density curve with the R-function named dnorm (density).
The cumulative normal distribution function with the R-function named pnorm (probability).
The quantiles with the R-function named qnorm (quantile).
There are three arguments to the functions dnorm(x, μ, σ) and pnorm(x, μ, σ), where x is
an observation from a normal distribution that has mean μ and standard deviation σ. The
argument p to the function qnorm(p, μ, σ) is the proportion of observations in the normal
distribution that are less than or equal to the corresponding quantile x.
The density of a continuous distribution at x measures how likely values close to x
are; it is not itself a probability. Continuous random variables are described by a
density because the probability of any single point is zero, P(X = x) = 0. For the
normal distribution we compute the density using the function dnorm(x, μ, σ).
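As a small illustration of the density, the sketch below evaluates dnorm at one point, compares it with the normal density formula, and draws the density curve. The mean 0 and standard deviation 1 are assumed here only for illustration; they are not taken from a problem in this chapter.

```r
# Illustrative parameters (an assumption, not from the text): standard normal
mu <- 0
sigma <- 1

# Density at x = 0 computed by dnorm
d0 <- dnorm(0, mu, sigma)

# Same value from the normal density formula
# f(x) = 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
f0 <- 1 / (sigma * sqrt(2 * pi)) * exp(-(0 - mu)^2 / (2 * sigma^2))

print(d0)                  # 0.3989423
print(all.equal(d0, f0))   # TRUE

# Draw the density curve over mu +/- 4 standard deviations
curve(dnorm(x, mu, sigma), from = mu - 4 * sigma, to = mu + 4 * sigma,
      xlab = "x", ylab = "density")
```

The value 0.3989423 is the height of the curve at x = 0, not a probability; probabilities correspond to areas under the curve.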
The function pnorm(x, μ, σ) computes the proportion of observations in the normal
distribution that are less than or equal to x; that is, P(X ≤ x), where X is N(μ, σ).
The function pnorm(x, μ, σ, lower.tail = FALSE) computes the proportion of observations
in the normal distribution that are greater than x; that is, P(X > x), where X is N(μ, σ).
Since the distribution is continuous, this equals P(X ≥ x).
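The relation between the two tails can be checked directly. The following is a minimal sketch; the values x = 1, mean 0 and standard deviation 1 are assumed purely for illustration.

```r
# Illustrative values (an assumption, not from the text)
x <- 1
mu <- 0
sigma <- 1

lower <- pnorm(x, mu, sigma)                      # P(X <= x)
upper <- pnorm(x, mu, sigma, lower.tail = FALSE)  # P(X > x)

print(lower)           # 0.8413447
print(upper)           # 0.1586553
print(lower + upper)   # 1
```

The two areas always sum to 1, so the upper tail could also be computed as 1 - pnorm(x, mu, sigma).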
The function qnorm(p, μ, σ) returns the quantile for which there is a probability of p
of getting a value less than or equal to it. Thus, the quantile is the value x such that
P(X ≤ x) = p for a given p. In other words, qnorm converts proportions to quantiles
while pnorm converts quantiles to proportions, which means that qnorm and pnorm
are inverse functions of each other.
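The inverse relationship between qnorm and pnorm can be verified with a short round trip. This is only a sketch; the mean 100, standard deviation 15, proportion 0.9 and observation 120 are assumed values chosen for illustration.

```r
# Illustrative parameters (an assumption, not from the text)
mu <- 100
sigma <- 15

# Proportion -> quantile -> proportion
p <- 0.9
x <- qnorm(p, mu, sigma)
p_back <- pnorm(x, mu, sigma)
print(p_back)   # 0.9

# Quantile -> proportion -> quantile
x0 <- 120
x_back <- qnorm(pnorm(x0, mu, sigma), mu, sigma)
print(x_back)   # 120
```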
Solution. We want to find the proportion of observations in the distribution that are less
than or equal to 310. That is, we want the area under the curve to the left of the x-value
310. This can be done as:
> pnorm(310,527,105)
[1] 0.01938279
Thus, P(X ≤ 310) ≈ 0.019.
Problem. The amount of monsoon rain in Tucson is approximately normally distributed
with mean 5.89 inches and standard deviation 2.23 inches (data from 1895-2013) [1].
In what percentage of all years is the monsoon rainfall in Tucson between 4 inches and 7 inches?
Solution. Let X be the amount of rain in inches. Here we want to find P(4 ≤ X ≤ 7). This
can be written as P(X ≤ 7) − P(X ≤ 4), where X is N(5.89, 2.23). In R this can be done as:
> pnorm(7,5.89,2.23)-pnorm(4,5.89,2.23)
[1] 0.4923238
Thus, the monsoon rainfall in Tucson is between 4 inches and 7 inches in about 49.23% of all years.
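The same difference of two cumulative probabilities can be wrapped in a small function. The helper prob_between below is hypothetical (it is not part of R) and is only a sketch; as a cross-check it also integrates the density over the interval with R's integrate function.

```r
# Hypothetical helper (not part of base R):
# P(a <= X <= b) for X ~ N(mu, sigma)
prob_between <- function(a, b, mu, sigma) {
  stopifnot(a <= b, sigma > 0)
  pnorm(b, mu, sigma) - pnorm(a, mu, sigma)
}

# Monsoon rainfall example from the text
rain <- prob_between(4, 7, mu = 5.89, sigma = 2.23)
print(rain)   # 0.4923238

# Cross-check: area under the density curve between 4 and 7
area <- integrate(dnorm, lower = 4, upper = 7, mean = 5.89, sd = 2.23)$value
print(all.equal(rain, area, tolerance = 1e-6))   # TRUE
```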
Solution. Here we want to find the value of z such that the area to the left of z is 0.25.
That is, we wish to find z such that P(Z ≤ z) = 0.25, where Z is standard normal:
> qnorm(0.25)
[1] -0.6744898
Thus, the first quartile is −0.674.
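Since qnorm accepts a vector of proportions, all three quartiles of the standard normal distribution can be obtained in one call. The sketch below goes slightly beyond the solution above, which asks only for the first quartile.

```r
# Quartiles of the standard normal distribution (mean 0, standard deviation 1)
quartiles <- qnorm(c(0.25, 0.50, 0.75))
print(quartiles)   # -0.6744898, 0, 0.6744898

# By symmetry about 0, the first and third quartiles differ only in sign
print(all.equal(quartiles[1], -quartiles[3]))   # TRUE
```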
Problem. The math SAT scores among U.S. college students are approximately normally
distributed with a mean of 500 and standard deviation of 100. Alf scored 600. What was
his percentile? (A percentile is the value below which a specified percentage of
observations fall.)
Solution. Let X be the SAT score for the student Alf. We want to find P(X ≤ 600). Thus,
in R we obtain:
> pnorm(600,500,100)
[1] 0.8413447
Hence, Alf's SAT score is at the 84.13th percentile.
Alternatively, we can first standardize the score such that z = (600 − 500)/100 = 1.00 and
compute P(Z ≤ 1.00), where Z is standard normal. We obtain:
> pnorm(1)
[1] 0.8413447
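The equivalence of the two approaches can be sketched in a few lines, using the values from the SAT problem. The variable names are chosen here for illustration.

```r
# Values from the SAT problem
x <- 600
mu <- 500
sigma <- 100

# Standardize: z = (x - mu) / sigma
z <- (x - mu) / sigma
print(z)   # 1

# Both calls give the same proportion
p_direct <- pnorm(x, mu, sigma)   # on the original scale
p_standard <- pnorm(z)            # on the standard normal scale
print(p_direct)     # 0.8413447
print(p_standard)   # 0.8413447
```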
References
[1]