Lecture 7 9
Measures of Variation
The first quartile is the middle observation of the lower half, and the third quartile is the
middle observation of the upper half. This process is demonstrated in Example 2, below.
The interquartile range is a useful measure of variability and is given by the difference
between the upper and lower quartiles. The interquartile range is not vulnerable to outliers and,
whatever the distribution of the data, we know that about 50% of observations lie between the
lower and upper quartiles.
Sample Variance
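For reference, the usual definition of the sample variance can be written as follows. The notation is an assumption, since the notes do not fix one: x_i are the observations, x̄ is the sample mean, and n is the number of observations. The standard deviation s is the square root of the variance.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
```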
Summary:
What are the 4 main measures of variability?
Range: The difference between the highest and lowest values
Interquartile range: The range of the middle half of a distribution
Standard Deviation: The square root of the variance; a typical distance from the mean
Variance: Average of squared distances from the mean (dividing by n - 1 for a sample)
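The four measures above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names and the data set `scores` are hypothetical, and the quartiles are taken as the medians of the lower and upper halves (leaving out the middle value when the number of observations is odd), which is only one of several conventions in use.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # lower half (middle value excluded if n is odd)
    upper = data[-half:]   # upper half (middle value excluded if n is odd)
    return statistics.median(lower), statistics.median(upper)


def variation_summary(values):
    """Range, interquartile range, sample variance and standard deviation."""
    q1, q3 = quartiles(values)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sample_variance": statistics.variance(values),  # divides by n - 1
        "std_dev": statistics.stdev(values),             # sqrt of the above
    }


# Hypothetical data, chosen only for illustration.
scores = [2, 4, 4, 5, 7, 9, 10, 12]
summary = variation_summary(scores)
for name, value in summary.items():
    print(name, value)
```

The same function could be applied to each data set in the exercises that follow; it needs at least two observations.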
Exercises:
1. Find the sample variance and the standard deviation for the following data set: 1245, 1255, 1654,
1547, 1787, 1989, 1878, 2011, 2145, 2545, 2656.
2. You survey households in your area to find the average rent they are paying. Find the sample
variance and the standard deviation from the following data in pesos: 1550, 1700, 900, 850, 1000,
950.
3. Find the range, interquartile range, sample variance, and standard deviation of the following data:
132, 144, 895, 441, 623, 325, 366, 412, 530, 332, 225, 239, 661, 754, 354, 554, 874, 771, 664, 334,
198, 178, 213, 423, 324, 133, 534, 654, 541, 367, 569, 756, 881, 159.
LECTURE 8
What is Probability?
Probability describes how likely an outcome of a random event is. It measures the extent
to which an event is likely to happen. For example, when we flip a coin in the air, what is
the probability of getting a head? The answer to this question is based on the number of
possible outcomes. Here there are two equally likely outcomes, head or tail, so the
probability of getting a head is ½.
Probability is the measure of the likelihood that an event will happen. When all outcomes
are equally likely, the formula for probability is given by:
P(E) = Number of Favourable Outcomes/Number of total outcomes
P(E) = n(E)/n(S)
Here,
n(E) = Number of outcomes favourable to event E
n(S) = Total number of outcomes
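The counting formula can be sketched directly in Python. This is a minimal illustration that assumes every outcome in the sample space is equally likely; the function name `probability` and the example events are hypothetical.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Return P(E) = n(E) / n(S), assuming equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))


coin = ["head", "tail"]
die = [1, 2, 3, 4, 5, 6]

p_head = probability({"head"}, coin)     # 1/2
p_even = probability({2, 4, 6}, die)     # 1/2
p_at_least_5 = probability({5, 6}, die)  # 1/3
print(p_head, p_even, p_at_least_5)
```

Using `Fraction` keeps the results exact, so they can be compared with hand calculations.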
Workshop
1. There are 18 tickets marked with numbers 1 to 18. What is the probability of selecting
a ticket with each of the following properties:
a) even number
b) number divisible by 3
c) prime number
d) number divisible by 6
2. Determine the probability of the following results when throwing 2 dice (a red one
and a blue one):
a) sum equal to 8
b) sum divisible by 5
c) even sum
What is Statistics?
Statistics is the study of the collection, analysis, interpretation, presentation, and
organization of data. It is a method of collecting and summarizing the data. This has many
applications from a small scale to large scale. Whether it is the study of the population of
the country or its economy, statistics are used for all such data analysis. Statistics has a
huge scope in many fields such as sociology, psychology, geology, weather forecasting,
etc. The data collected for analysis can be quantitative or qualitative. Quantitative
data are of two types: discrete and continuous. Discrete data take distinct, countable
values, whereas continuous data can take any value within a range. There are many terms
and formulas used in this subject.
Random Experiment
An experiment whose result cannot be predicted until it is observed is called a random
experiment. For example, when we throw a die, the result is uncertain to us.
We can get any outcome from 1 to 6. Hence, this experiment is random.
Sample Space
A sample space is the set of all possible results or outcomes of a random experiment.
For example, if we throw a die, the sample space for this experiment is the set of all
possible outcomes of the throw: S = {1, 2, 3, 4, 5, 6}.
Random Variables
The variables which denote the possible outcomes of a random experiment are called
random variables. They are of two types:
• Discrete Random Variables
• Continuous Random Variables
Discrete random variables take only distinct, countable values, whereas continuous
random variables can take any value within an interval.
Independent Event
When the occurrence of one event has no impact on the probability of another event,
the two events are said to be independent of each other. For example, if you flip a
coin and at the same time throw a die, the probability of getting a ‘head’ is
independent of the probability of getting a 6 on the die.
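The coin-and-die example can be checked by listing every combined outcome. The sketch below relies on the standard product rule for independent events, P(A and B) = P(A) × P(B), which is not stated in the notes above; the variable and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

coin = ["head", "tail"]
die = [1, 2, 3, 4, 5, 6]
sample_space = list(product(coin, die))  # 12 equally likely (coin, die) pairs


def probability(predicate):
    """Fraction of outcomes in the sample space that satisfy the predicate."""
    favourable = sum(1 for outcome in sample_space if predicate(outcome))
    return Fraction(favourable, len(sample_space))


p_head = probability(lambda o: o[0] == "head")
p_six = probability(lambda o: o[1] == 6)
p_head_and_six = probability(lambda o: o[0] == "head" and o[1] == 6)

# For independent events the joint probability equals the product.
independent = p_head_and_six == p_head * p_six
print(p_head, p_six, p_head_and_six, independent)
```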
Mean
The mean of a random variable is the average of its possible values, weighted by their
probabilities. In simple terms, it is the long-run average of the outcomes when the
random experiment is repeated again and again, a large number n of times. It
is also called the expectation of a random variable.
Expected Value
Expected value is the mean of a random variable: the probability-weighted average of
its possible values. It is also called expectation, mathematical expectation or first
moment. For example, if we roll a fair die having six faces, then the expected value
will be the average value of all the possible outcomes, i.e. 3.5.
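The value 3.5 for a six-faced die can be reproduced with a short calculation. This is a sketch that assumes a fair die, so each face has probability 1/6; the function name `expected_value` is hypothetical.

```python
from fractions import Fraction


def expected_value(pmf):
    """E[X] = sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())


# Assumed fair die: every face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean_score = expected_value(fair_die)
print(mean_score)         # 7/2
print(float(mean_score))  # 3.5
```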
Variance
The variance tells us how the values of the random variable are spread around
the mean value. It is the expected value of the squared deviation from the mean.
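As an illustration, the variance of a random variable can be computed as the probability-weighted average of squared deviations from the mean. The sketch below assumes a fair six-faced die; the function name `variance` is hypothetical.

```python
from fractions import Fraction


def variance(pmf):
    """Var(X) = sum of (x - mean)^2 * P(X = x) over all values x."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


# Assumed fair die: every face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

die_variance = variance(fair_die)
print(die_variance)  # 35/12
```

Note that this is the variance of a random variable (a population quantity), so there is no division by n - 1 as in the sample variance.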
Exercises
Example 1: A bucket contains 5 blue, 4 green and 5 red balls. Sudheer is asked to pick
2 balls randomly from the bucket without replacement and then one more ball is to be
picked. What is the probability he picked 2 green balls and 1 blue ball?
Probability of drawing
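One way to work through Example 1 is with exact fractions. The sketch assumes the intended reading is that the first two balls drawn are green and the third is blue, with no ball returned to the bucket; if the order of the colours does not matter, the three possible orderings are equally likely and the result is three times larger.

```python
from fractions import Fraction

blue, green, red = 5, 4, 5
total = blue + green + red  # 14 balls in the bucket

p_first_green = Fraction(green, total)           # 4/14
p_second_green = Fraction(green - 1, total - 1)  # 3/13
p_third_blue = Fraction(blue, total - 2)         # 5/12

# Assumed reading: green, green, then blue, in that order.
p_green_green_blue = p_first_green * p_second_green * p_third_blue
print(p_green_green_blue)  # 5/182

# Alternative reading: 2 green and 1 blue in any order (3 orderings).
p_any_order = 3 * p_green_green_blue
print(p_any_order)  # 15/182
```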
Example 2: A bowl contains 3 red, 2 black and 5 green marbles. If Ram chooses a marble
at random, what is the probability that it is not black?
Find the number of marbles that are not black and divide by the total number of
marbles.
P(not black) = (3 + 5)/10
= 8/10
= 4/5
Probability Distribution
Outcome    Probability
Heads      0.5
Tails      0.5
A frequency distribution describes a specific sample or dataset. It’s the number of times
each possible value of a variable occurs in the dataset. The number of times a value
occurs in a sample is determined by its probability of occurrence. Probability is a number
between 0 and 1 that says how likely something is to occur: 0 means it is impossible and
1 means it is certain.
The higher the probability of a value, the higher its frequency in a sample. More
specifically, the probability of a value is its relative frequency in an infinitely large sample.
Infinitely large samples are impossible in real life, so probability distributions are
theoretical. They’re idealized versions of frequency distributions that aim to describe the
population the sample was drawn from.
Probability distributions are used to describe the populations of real-life variables, like
coin tosses or the weight of chicken eggs. They’re also used in hypothesis testing to
determine p values.
Suppose a farmer weighs a sample of eggs and records the frequency distribution of their
weights. She can get a rough idea of the probability of different egg sizes directly from
this frequency distribution. For example, she can see that there’s a high probability of
an egg being around 1.9 oz., and there’s a low probability of an egg being bigger than 2.1 oz.
Suppose the farmer wants more precise probability estimates. One option is to improve
her estimates by weighing many more eggs.
A better option is to recognize that egg size appears to follow a common probability
distribution called a normal distribution. The farmer can make an idealized version of
the egg weight distribution by assuming the weights are normally distributed.
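A rough sketch of how such a normal model could be used is given here. The parameters are assumptions made only for illustration: the mean of 1.9 oz echoes the text, while the standard deviation of 0.1 oz is invented, since the notes do not give one.

```python
from statistics import NormalDist

# Hypothetical parameters: mean 1.9 oz (from the text), standard
# deviation 0.1 oz (assumed purely for illustration).
egg_weights = NormalDist(mu=1.9, sigma=0.1)

# Probability that an egg is bigger than 2.1 oz under this model.
p_bigger_than_2_1 = 1 - egg_weights.cdf(2.1)

# Probability that an egg weighs between 1.8 oz and 2.0 oz.
p_between_1_8_and_2_0 = egg_weights.cdf(2.0) - egg_weights.cdf(1.8)

print(round(p_bigger_than_2_1, 4))
print(round(p_between_1_8_and_2_0, 4))
```

With real data, the mean and standard deviation would be estimated from the weighed eggs rather than assumed.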
Random Variable
Tossing a coin gives either Heads or Tails. Let's give them the values Heads=0 and
Tails=1 and we have a Random Variable "X".
In Short: X = {0, 1}
Note: We could choose Heads=100 and Tails=150 or other values if we want! It is our
choice.
So:
A Random Variable has a whole set of values and it could take on any of those values,
randomly.
Example: X = {0, 1, 2, 3}
• X could be 0, 1, 2, or 3 randomly.
• And they might each have a different probability.
We use a capital letter, like X or Y, to avoid confusion with the Algebra type of variable.
Sample Space
A Random Variable's set of values is the Sample Space.
Example: Throw a die once
Random Variable X = "The score shown on the top face".
X = {1, 2, 3, 4, 5, 6}
We can show the probability of any one value using this style:
P(X = value) = probability of that value
In this case they are all equally likely, so the probability of any one is 1/6
P(X = 1) = 1/6
P(X = 2) = 1/6
P(X = 3) = 1/6
P(X = 4) = 1/6
P(X = 5) = 1/6
P(X = 6) = 1/6
Workshops:
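Example: Toss three coins. The Random Variable is X = "The number of Heads".
In this case, there could be 0 Heads (if all the coins land Tails up), 1 Head, 2 Heads or 3
Heads.
Listing the 8 equally likely outcomes (HHH, HHT, HTH, THH, HTT, THT, TTH, TTT), we see
just 1 case of Three Heads, but 3 cases of Two Heads, 3 cases of One Head, and 1 case of
Zero Heads. So: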
P(X = 3) = 1/8
P(X = 2) = 3/8
P(X = 1) = 3/8
P(X = 0) = 1/8
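These probabilities can be reproduced by enumerating every way three coins can land. The sketch assumes fair coins, so all 8 outcomes are equally likely; the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely outcomes, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# X = number of Heads in each outcome.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

pmf = {x: Fraction(count, len(outcomes))
       for x, count in sorted(heads_counts.items())}
for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```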
Example: Two dice are tossed. The Random Variable is X = "The sum of the scores on the two dice".
Let's make a table of all possible values:
There are 6 × 6 = 36 possible outcomes, and the Sample Space (the set of possible sums of
the scores on the two dice) is {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
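Let's count how often each value occurs, and work out the probabilities:
Sum           2     3     4     5     6     7     8     9     10    11    12
Count         1     2     3     4     5     6     5     4     3     2     1
Probability   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36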
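The counts and probabilities for the sum of two dice can be generated by enumerating all 36 outcomes. This is a sketch assuming two fair six-faced dice; the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 (red, blue) pairs give each sum.
counts = Counter(red + blue for red, blue in product(faces, repeat=2))
total = sum(counts.values())  # 36

pmf = {s: Fraction(c, total) for s, c in sorted(counts.items())}
for s, p in pmf.items():
    print(f"sum {s:2d}: {counts[s]} of {total} outcomes, P = {p}")
```

`Fraction` prints each probability in lowest terms, so 6/36 appears as 1/6.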
A Range of Values
We could also calculate the probability that a Random Variable takes on a range of
values.
Example (continued): What is the probability that the sum of the scores is 5, 6, 7 or 8?
P(5 ≤ X ≤ 8) = P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8)
= (4+5+6+5)/36
= 20/36
= 5/9
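The probability of a range of values can be checked by counting the outcomes whose sum falls inside the range. The sketch assumes two fair dice; the function name `probability_sum_between` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def probability_sum_between(low, high):
    """P(low <= X <= high), where X is the sum of the two scores."""
    favourable = sum(1 for a, b in outcomes if low <= a + b <= high)
    return Fraction(favourable, len(outcomes))


print(probability_sum_between(5, 8))  # 5/9
```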
Solving
• X is the Random Variable "The sum of the scores on the two dice".
• x is a value that X can take.