Difference Between (Median, Mean, Mode, Range, Midrange) (Descriptive Statistics)


 Individuals: the entities (rows or identifiers) in a dataset that the variables describe

 Categorical variable: a variable that takes one of a fixed set of categories. Ex: color (red, green, ...) or coffee type (hot/cold)
 Quantitative variable: a variable whose values are numbers that measure or count something
Statistics types:

1- Descriptive (describes and summarizes existing data)

2- Inferential (uses sample data to draw conclusions about a larger population)

Frequency tables:

 Two-way frequency table: most commonly used to show the counts (frequencies) for combinations of two categorical variables
 Two-way relative frequency table: shows each count as a percentage of a total (for example its column total) and helps check whether there is an association between the two variables
 You can visualize a frequency table by drawing a dot plot

Graphs:

 Pictograph: a graph that represents data with pictures related to the data


 Bar graph
 Other graphs….

Difference between (median, mean, mode, range, midrange) (Descriptive statistics):


 Median (middle): the middle value of a data set after sorting it
Ex: [10, 32, 54, 32, 39] --> sorted: [10, 32, 32, 39, 54] --> the median = 32
Ex: if there are 2 middle numbers (even-sized data set), take their mean
 Mean (average): the sum of the values divided by how many there are
 The mean is always the balancing point of the values
Ex: mean of {3, 2, 1} = 2
 (the total distance of the values below the mean always equals the total distance of the values above it)
Ex: {1, 3, 4, 5, 7}, mean = 4: above = (5 is 1 away, 7 is 3 away, total 4), below = (3 is 1 away, 1 is 3 away, total 4)
 Mode: the most common value
Ex: [100, 100, 100, 334, 66, 7] --> the mode = 100
 Range: the difference between the highest and the lowest value (measures the spread of the data set)
 Midrange: the average of the highest and the lowest value = (highest + lowest) / 2
 Mean absolute deviation: the average distance between each data point and the mean of the data set
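The measures above can be checked with a short Python sketch. It reuses the median example data; the variable names are illustrative, and the standard-library `statistics` module is an assumed tool, not something these notes prescribe.

```python
import statistics

data = [10, 32, 54, 32, 39]

median = statistics.median(data)         # middle value after sorting
mean = statistics.mean(data)             # sum of values / number of values
mode = statistics.mode(data)             # most common value
data_range = max(data) - min(data)       # highest - lowest
midrange = (max(data) + min(data)) / 2   # average of the two extremes

# Mean absolute deviation: average distance of each point from the mean
mad = sum(abs(x - mean) for x in data) / len(data)

print(median, mean, mode, data_range, midrange, mad)
```

Note that the range (44) and the midrange (32.0) differ: the first measures spread, the second is a measure of center.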
IQR (interquartile range):
 the distance between the first quartile Q1 (the median of the lower half of the sorted data set, the 25% point) and the third quartile Q3 (the median of the upper half, the 75% point); it measures the spread of the middle half of the data set: IQR = Q3 - Q1

Outliers: data points whose values are far from the rest of the data points in the data set
 a common rule flags a point as an outlier if it is < Q1 – 1.5 * IQR or > Q3 + 1.5 * IQR
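A minimal sketch of the IQR and the 1.5 * IQR outlier rule, assuming the "median of each half" quartile convention described here (other conventions give slightly different quartiles). The helper `quartiles` and the sample data are hypothetical, and the helper assumes at least two values.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When the number of values is odd, the middle value is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return statistics.median(lower), statistics.median(upper)

data = [1, 3, 4, 5, 7, 8, 9, 40]
q1, q3 = quartiles(data)
iqr = q3 - q1

low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, q3, iqr, outliers)
```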

To measure spread of data (dispersion):


1- Range
2- Population variance (σ²) = the average of the squared differences between each data point and the mean

3- Population standard deviation (σ) = the square root of the variance (roughly the typical amount by which a single measurement differs from the mean)

4- IQR (preferred when there are outliers and you use the median instead of the mean for central tendency)

Variance and standard deviation for sample:


1- Sample variance (s²): computed like the population variance, but the sum of squared differences is divided by n-1 instead of N
https://www.statisticshowto.com/wp-content/uploads/2009/08/computational.png
2- Sample standard deviation (s): the square root of the sample variance
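As a rough illustration of the N versus n-1 difference, the sketch below computes both versions with Python's `statistics` module. The data set is made up for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, sum of squared differences is 32

pop_var = statistics.pvariance(data)   # divides by N      -> 32 / 8
pop_std = statistics.pstdev(data)      # square root of the population variance
samp_var = statistics.variance(data)   # divides by n - 1  -> 32 / 7
samp_std = statistics.stdev(data)      # square root of the sample variance

print(pop_var, pop_std, samp_var, samp_std)
```

The sample versions are always a little larger than the population versions for the same data, because the divisor is smaller.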

Percentile and cumulative relative frequency graph:

 Percentile: the percentage of the data points that are below (or, by another convention, at or below) the data point in question
 Cumulative relative frequency graph: plots the data values on the x axis against their percentiles (cumulative relative frequencies) on the y axis
Z-score: measures exactly how many standard deviations above or below the mean a data point is.
1- Z-score = (data point – mean) / standard deviation
2- A positive z-score says the data point is above average / A negative z-score says the data point is below average / A z-score close to 0 says the data point is close to average.

Z-table: gives the proportion (percentage) of data below a certain value in a normal distribution

1- First: compute the z-score of the point

2- Second: look up that z-score in the z-table to find what percentage of the distribution lies below that point
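The two steps can be sketched in Python, with `statistics.NormalDist().cdf` standing in for the printed z-table. The mean, standard deviation, and data point below are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

mean, std = 70.0, 10.0   # hypothetical normal distribution
x = 85.0                 # hypothetical data point

# Step 1: z-score = (data point - mean) / standard deviation
z = (x - mean) / std

# Step 2: proportion of the distribution below that z-score
# (the cumulative probability of the standard normal distribution)
proportion_below = NormalDist().cdf(z)

print(z, round(proportion_below, 4))
```

Here z is 1.5, and about 93.3% of the distribution lies below the point.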

Effect of dataset transformations on (mean, standard deviation, median, IQR):

1- (adding – subtracting a constant):
 Mean, median: increase/decrease by the constant added or subtracted
 IQR, standard deviation: stay the same
2- (multiplying – dividing by a constant):
 Mean, median: are multiplied/divided by the constant
 Standard deviation, IQR: are multiplied/divided by the absolute value of the constant
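A small sketch that checks these rules numerically. The helper `summarize` and the data are hypothetical, and `statistics.quantiles` uses its own quartile convention, which may differ slightly from the median-of-halves method; the transformation rules hold either way.

```python
import math
import statistics

def summarize(values):
    """Return the four summary statistics discussed above."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
        "iqr": q3 - q1,
    }

data = [2, 4, 4, 4, 5, 5, 7, 9]
base = summarize(data)
shifted = summarize([x + 10 for x in data])   # add a constant
scaled = summarize([x * 3 for x in data])     # multiply by a constant

print(base)
print(shifted)   # mean and median move by 10; std and IQR unchanged
print(scaled)    # all four are multiplied by 3
```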

Distribution:

1- Marginal distribution
2- Conditional distribution

Each distribution has:

1- Center
2- Spread/variability

Distribution shapes:
1- Right-tailed distribution (right skewed) --> the mean is usually greater than the median
2- Left-tailed distribution (left skewed) --> the mean is usually smaller than the median
3- Symmetrical (e.g. normal distribution) --> the mean equals the median at the center; the normal distribution has a bell shape

 Standard normal distribution (mean = 0, standard deviation = 1)

 We can use a histogram to display the distribution of data by grouping the data into "bins" of equal width. Each bin is plotted as a bar whose height corresponds to how many data points are in that bin

Density Curves (PDF):


1- A way of representing the distribution of a continuous random variable: the total area under the curve equals 1 (it covers all possible values), and the probability of falling in an interval is the area under the curve over that interval

Empirical Rule (68-95-99.7 Rule):

1- About 68% of the area under a normal distribution lies within 1 standard deviation above and below the mean.
2- About 95% of the area under a normal distribution lies within 2 standard deviations above and below the mean.
3- About 99.7% of the area under a normal distribution lies within 3 standard deviations above and below the mean.
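The three percentages can be reproduced from the standard normal distribution. This is a quick check using `statistics.NormalDist` (an assumed tool, not part of the original notes).

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area between -k and +k standard deviations = cdf(k) - cdf(-k)
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The printed areas round to 68%, 95%, and 99.7%, which is where the rule gets its name.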
Scatterplot (bivariate relationship): a plot used to analyze the relationship between two variables: whether there is a linear relationship (correlation) between them, whether it is positive or negative, how strong it is, and whether there are outliers

 Types of correlation: (positive correlation – negative correlation)


 Sometimes the data points in a scatter plot form distinct groups; these groups are called clusters
Correlation coefficient (ρ in population) (r in sample):

 measures the direction and strength of a linear relationship


 It always has a value between -1 and 1.
 Strong positive linear relationships have values of r closer to 1.
 Strong negative linear relationships have values of r closer to -1
 Weaker relationships have values of r closer to 0
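To make the properties above concrete, here is a minimal sketch that computes r from its definition (sum of products of deviations, divided by the square root of the product of the sums of squared deviations). The function name and the three small data sets are invented for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_pos = correlation(xs, [2, 4, 6, 8, 10])   # perfectly increasing line
r_neg = correlation(xs, [10, 8, 6, 4, 2])   # perfectly decreasing line
r_weak = correlation(xs, [3, 1, 4, 1, 5])   # scattered points

print(r_pos, r_neg, round(r_weak, 4))
```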

Linear regression:

 When we see a relationship in a scatterplot, we can use a line to summarize the relationship in the data.
We can also use that line to make predictions in the data. This process is called linear regression.
 We draw a line through the data and find its equation to make predictions
Residuals:
 A residual is a measure of how well a line fits an individual data point
 It is the vertical distance between the data point and the line. For data points above the line, the residual is positive, and for data points below the line, the residual is negative
 Residual = actual value – predicted value (from the line)
 A squared residual is called a squared error
 The sum of squared residuals is called the prediction error; comparing it between two lines shows which one fits better, and the best-fit line is the one that minimizes it

Residual plot:
 A plot of the residuals against x, used to check whether the line is a good fit (the residuals should be scattered randomly around 0 with no pattern)

R-squared (coefficient of determination):

 R-squared measures how much of the prediction error we eliminated by using the regression line instead of the mean of y

 Also called: coefficient of determination (ranges from 0 to 1 and indicates how well the line fits)
 It is the percentage of the variation in y that is described by the variation in x (by the regression line)
 R-squared = 1 – (percentage of the variation in y that is not described by the regression line)

 To calculate the percentage of variation that is not described by the regression line:
1- It equals (total squared error from the line) / (total variation in y)
2- Total squared error from the line = the sum of squared residuals = the variation in y not described by the regression line
3- Note: total variation in y = the sum of squared differences between each y value and the mean of y
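A short sketch of the R-squared calculation, assuming a small made-up data set and taking the line y = 0.6x + 2.2 as given (it is the least-squares line for these five points). All names are illustrative.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = 0.6, 2.2   # assumed regression line for this data

predicted = [m * x + b for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]   # actual - predicted

mean_y = sum(ys) / len(ys)
squared_error_line = sum(r ** 2 for r in residuals)     # variation NOT described by the line
total_variation = sum((y - mean_y) ** 2 for y in ys)    # total variation in y

r_squared = 1 - squared_error_line / total_variation

print(round(squared_error_line, 4), total_variation, round(r_squared, 4))
```

For this data the line leaves 2.4 of the 6.0 total variation unexplained, so R-squared is 0.6.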

Standard deviation of residuals (root mean square error):

1- Is the square root of the mean of the squared residuals (errors)

Calculating equation of regression line:


1- Least squares regression:
 Calculate the mean of x and of y, the standard deviation of x and of y, and the correlation coefficient r
 The point (mean of x, mean of y), where the two mean lines intersect, is always on the regression line
 The equation of the line is: y = mx + b
 m is the slope: m = r * (std of y / std of x)
 After calculating the slope, substitute the mean point to get b: b = mean of y – m * mean of x
2- Regression equation for m and b:
 Search at khan academy for the equations for (m,b)
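The least-squares steps listed above can be sketched as follows, using a hypothetical five-point data set and sample standard deviations. This is one way to follow the recipe, not the only formulation.

```python
import statistics

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Means and (sample) standard deviations of x and y
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)

# Correlation coefficient r, matching the n - 1 used by the sample std
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * std_x * std_y)

m = r * std_y / std_x     # slope
b = mean_y - m * mean_x   # intercept: the line passes through (mean_x, mean_y)

print(round(r, 4), round(m, 4), round(b, 4))
```

With these points the line comes out as y = 0.6x + 2.2.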

Stem and leaf plot:

 A stem and leaf plot display numerical data by splitting each data point into a "leaf" (usually the last digit)
and a "stem" (the leading digit or digits).

Population vs samples:

 Population: the whole group (all of the individuals or data) we want to study

 Sample: a smaller group selected from the population
 Random sample types:
1- Simple random sample: every member and every set of members has an equal chance of being included in the sample, for example by using a random number generator
2- Stratified random sample: the population is first split into groups. The overall sample consists of some members from every group. The members from each group are chosen randomly.
3- Cluster random sample: the population is first split into groups. The overall sample consists of every member from some of the groups. The groups are selected at random.
4- Systematic random sample: members of the population are put in some order. A starting point is selected at random, and every nth member is selected to be in the sample

 Types of bias in samples:

1- Undercoverage: when the researcher systematically excludes some members of the population from the sample
2- Response bias: when people systematically give inaccurate or dishonest answers, for example because of the way the question is asked
3- Voluntary response sample: when members of the population choose for themselves whether or not to be in the sample
4- Convenience sample: when the researcher chooses a sample that is readily available, without using any randomization
5- Nonresponse: when people chosen for the sample cannot be reached or refuse to participate

 Types of statistical studies:


1- Sample study: we take a random sample and use a statistic from it to estimate a parameter of the whole population
2- Observational study: we measure or survey members of a sample without trying to affect them, to see whether there is a correlation between two factors
3- Experiment: we assign people or things to groups and apply some treatment to one group (the treatment group) while the other group (the control group) does not receive it, to find out whether the treatment causes an effect

 Experiment setup:
 Each experiment consists of:
1- Explanatory variable: a variable that explains changes in another variable
2- Response variable: the variable that measures the result of the study
3- Treatment group: a group that receives the treatment whose effect we want to test
4- Control group: a group that does not receive the treatment, for comparison with the treatment group

Note: When we say there's potential bias, we should also be able to argue if the results will probably be an
overestimate or an underestimate. 

Causality vs correlation:
 Causality: a change in a causes a change in b
 Correlation: a and b tend to occur together (whenever I see a I tend to see b, and vice versa); this alone does not show that one causes the other

Probability:
 Probability is simply how likely something is to happen (chance)
 Probability of an event = (number of ways the event can happen) / (total number of equally likely outcomes)
 The probability of event A, for example, is often written as P(A)
 If P(A) > P(B), then event A has a higher chance of occurring than event B
 If P(A) = P(B), then events A and B are equally likely to occur.
Theoretical vs experimental probability:
 Theoretical probability: what we expect to happen (the ideal form of probability)
 Experimental probability: what actually happens when we try it out
 The more trials the experiment runs, the closer the experimental probability tends to get to the theoretical one
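A small simulation can illustrate how the experimental probability approaches the theoretical one as the number of trials grows. The sketch assumes a fair coin and a seeded random generator so that runs are repeatable; the function name and trial counts are arbitrary choices.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
theoretical = 0.5        # fair coin: P(heads) = 1/2

def experimental_probability(flips):
    """Fraction of simulated flips that come up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

estimates = {n: experimental_probability(n) for n in (10, 1_000, 100_000)}

for n, estimate in estimates.items():
    print(f"{n:>7} flips: experimental = {estimate:.4f}, theoretical = {theoretical}")
```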
Sample space:
 A set that contains all the possible outcomes of an experiment
Sample space for compound events:
 A set that contains all the possible outcomes of an experiment made of several events (for example, the outcomes HH, HT, TH, TT of two coin flips, used for P(HH), or the outcomes of three flips, used for P(HHH))
Compound probability of independent events:
 Compound means the event combines several simple events. For example: the probability of getting two heads in a row from flipping a coin twice is P(HH) = ¼
 Independent means the probability of the second flip does not depend on the first flip, so for example: P(HH) = P(H1) * P(H2) = ½ * ½ = ¼
 More generally, for n flips in a row: P(n heads) = P(H)**n
Conditional probability (Dependent Event) :
 When the probability of one event depends on the outcome of another event
 Ex: if we draw 2 name tags, giving the first name the first prize and the second name the second prize, then the probabilities for the second draw depend on which name was picked first
 When we calculate probabilities involving one event AND another event that depends on the first, we multiply their probabilities (multiplication rule): P(A and B) = P(B∣A) * P(A) = P(A∣B) * P(B)
 Rearranging the multiplication rule gives Bayes' theorem: P(A∣B) = P(B∣A) * P(A) / P(B), which describes the probability of an event based on prior knowledge of conditions that might be related to the event
Probabilities involving "at least one" success:
 Rule: P(at least 1 success) = 1 - P(all failures)
 Ex: surgeries involving implants sometimes result in the patient's body rejecting the implant. A certain surgery has a rejection rate of 11%. The rest of the patients successfully accept the implant. Assume that the results for each patient are independent. In a random sample of 8 of these surgeries, find the probability that at least one patient rejects the implant.

Solution: P(accept) = 1 − 0.11 = 0.89

P(at least one rejects) = 1 − P(all 8 accept) = 1 − 0.89**8 ≈ 0.606
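The implant example can be worked through in a few lines of Python. The variable names are illustrative; the numbers come from the example itself.

```python
p_reject = 0.11   # rejection rate for one surgery
n = 8             # independent surgeries in the sample

p_accept = 1 - p_reject
p_all_accept = p_accept ** n         # independence: multiply 0.89 eight times
p_at_least_one = 1 - p_all_accept    # complement of "nobody rejects"

print(round(p_all_accept, 4), round(p_at_least_one, 4))
```

So roughly 61% of such samples of 8 would contain at least one rejection.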

Probability without equally likely events:


 Ex: for a weighted coin where heads is more likely than tails, for example P(H) = 60% and P(T) = 40%
 The flips are still independent, so we multiply the individual probabilities: P(HH) = P(H1) * P(H2) = 0.6 * 0.6 = 0.36, and P(HTH) = P(H1) * P(T2) * P(H3) = 0.6 * 0.4 * 0.6 = 0.144

Permutation:
 The number of ways of arranging members of a set into a sequence (order matters)
 Ex: the number of ways to seat 5 people on 3 chairs
 Law: n! / (n-k)! --> where n is the number of people and k is the number of chairs
Combination:
 The number of ways to choose a group when order does not matter. Ex: how many different groups of 3 people (out of 5) can sit on the 3 chairs, ignoring who sits where
 Or: how many ways you can pick k things from n things without caring about the order
 Law: permutations / k! = n! / (k! * (n-k)!) --> k! is the number of ways we can arrange k things in k places
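The chair example can be checked with Python's `math.perm` and `math.comb` (available in Python 3.8 and later). The sketch also confirms the two laws by computing them from factorials.

```python
import math

n, k = 5, 3   # 5 people, 3 chairs

permutations = math.perm(n, k)   # ordered seatings: n! / (n - k)!
combinations = math.comb(n, k)   # unordered groups: permutations / k!

# The same values from the factorial formulas
perm_from_formula = math.factorial(n) // math.factorial(n - k)
comb_from_formula = perm_from_formula // math.factorial(k)

print(permutations, combinations)
```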
Binomial coefficient (combination):
 Counts the number of possible outcomes with a given number of successes
 For example, how many ways there are to get exactly 3 heads when flipping 5 coins
Solution: the first of the three heads has 5 positions to choose from, the second has 4, and the third has 3, so taking order into consideration there are 5*4*3 = 60 ways of placing 3 heads
 If we do not want to count different orderings of the same heads, we divide by the factorial of the number of heads: 3! = 3*2*1 = 6, so the total without taking order into consideration is 60/6 = 10
 This is the same general rule as the combination
Note: to calculate, for an unfair coin with P(H) = 80%, the probability of exactly 4 heads out of 6 flips (binomial probability):
 First calculate the probability of one particular sequence, for example HHHTHT: 0.8**4 * 0.2**2
 Then count how many sequences have exactly 4 heads in 6 flips using the binomial coefficient
 Then multiply the number of sequences by the probability of one sequence to get the overall probability
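The three steps for the unfair coin (P(H) = 80%, exactly 4 heads in 6 flips) can be sketched as below. Variable names are illustrative.

```python
import math

p_head = 0.8
n, k = 6, 4   # 6 flips, exactly 4 heads

# Step 1: probability of one particular sequence with 4 heads, e.g. HHHTHT
p_one_sequence = p_head ** k * (1 - p_head) ** (n - k)

# Step 2: number of sequences with exactly 4 heads (binomial coefficient)
number_of_sequences = math.comb(n, k)

# Step 3: multiply the two
p_exactly_k = number_of_sequences * p_one_sequence

print(number_of_sequences, round(p_one_sequence, 6), round(p_exactly_k, 5))
```

There are 15 such sequences, each with probability about 0.016384, giving an overall probability of about 0.24576.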

Random variables types:


1- Discrete (represented by PMF)
2- Continuous (represented by PDF)
Expected value: the mean of a random variable. Ex: E(X) = mean of X
 E(X) = mean = x1 * p(x1) + x2 * p(x2) + x3 * p(x3) + x4 * p(x4) + ….
 Var(X) = (x1 - E(X))**2 * p(x1) + (x2 - E(X))**2 * p(x2) + (x3 - E(X))**2 * p(x3) + ….
 Std(X) = square root(Var(X))
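As an illustration of these three formulas, the sketch below applies them to a fair six-sided die (an example chosen here, not taken from the notes). `fractions.Fraction` keeps the arithmetic exact.

```python
import math
from fractions import Fraction

# Probability mass function of a fair die: each face has probability 1/6
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

expected = sum(x * p for x, p in pmf.items())                     # E(X)
variance = sum((x - expected) ** 2 * p for x, p in pmf.items())   # Var(X)
std = math.sqrt(variance)                                         # Std(X)

print(expected, variance, round(std, 4))
```

The die has E(X) = 7/2 and Var(X) = 35/12, so its standard deviation is about 1.708.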

Operations on combining random variables:


 Suppose we have random variables X and Y
1- E(X+Y) = E(X) + E(Y)
2- E(X-Y) = E(X) – E(Y)
3- Var(X+Y) = Var(X) + Var(Y) --> only for independent random variables
4- Var(X-Y) = Var(X) + Var(Y) --> also only for independent random variables (the variances still add)
Binomial random variable:
 Made up of a finite number of independent trials
 Its value is the number of successes in that finite number of trials
 Is a random variable that satisfies the following conditions:
1- It is made of independent trials
2- Each trial can be classified as either a success or a failure
3- The number of trials is fixed
4- The probability of success on each trial is constant
Ex: number of heads after 10 flips of a coin, with P(H) = 60% and P(T) = 40%
 Each trial is independent, each trial is either a success (head) or a failure (tail), the fixed number of trials is 10, and the probability of success is constant at 60% (failure at 40%)

10% Rule:
 When we sample without replacement (for example, surveying people at a mall as they exit, where nobody is asked twice), the trials are not truly independent. If the sample size is at most 10% of the population size, the dependence is small enough that we can treat the trials as independent and model the count of successes as a binomial variable

Binomial probability distribution:


 The probability distribution of a binomial variable
 Formula: P(exactly k successes) = nCk ⋅ p**k ⋅ (1−p)**(n−k)
Expected value of binomial variables:
 E(X) = np --> n is the number of trials, p is the probability of success
Variance of binomial variables:
 Var(X) = np(1-p)
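A quick numerical check, assuming the coin example with n = 10 and p = 0.6: the probabilities from the binomial formula should sum to 1, and the mean and variance computed from them should match np and np(1-p). The function name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials) = nCk * p**k * (1 - p)**(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.6
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                                # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))                  # should be n * p
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))   # should be n * p * (1 - p)

print(round(total, 6), round(mean, 6), round(variance, 6))
```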

Geometric random variable:


 It represents how many trials it takes until the first success
 Ex: how many rolls until we get a 6
 Ex: you draw a card from a deck hoping for a king and put the card back after each draw. What is the probability that the first king comes on the 5th draw, knowing the probability of drawing a king = 1/13?
Solution: P(X=5) = P(not king) * P(not king) * P(not king) * P(not king) * P(king) = (12/13)**4 * (1/13)
Expected value of geometric random variable:
 E(X) = 1/p
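The king example can be computed exactly with fractions. The helper name `geometric_pmf` is hypothetical; it encodes "k - 1 failures followed by one success".

```python
from fractions import Fraction

p_king = Fraction(1, 13)   # probability of drawing a king, with replacement

def geometric_pmf(k, p):
    """P(first success happens on trial k) = (1 - p)**(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

p_fifth = geometric_pmf(5, p_king)   # four non-kings, then a king
expected_draws = 1 / p_king          # E(X) = 1 / p

print(p_fifth, float(p_fifth), expected_draws)
```

The probability is 20736/371293, about 5.6%, and on average it takes 13 draws to see the first king.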

Central Limit Theorem:


 The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal.
 This distribution is called the sampling distribution of the sample mean
 Its mean is the same as the mean of the population
 The larger the sample size n, the less skewed and the tighter around the mean the distribution is, and the closer it is to a normal distribution (a common rule of thumb is n > 30)
 The variance of the sampling distribution of the sample mean = population variance / n, where n is the sample size
 The standard error of the mean is the standard deviation of the sampling distribution of the sample mean = square root(population variance / n)
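A rough simulation can illustrate these claims. It assumes a right-skewed (exponential) population generated with a fixed seed; the population size, sample size, and number of samples are arbitrary choices for the sketch.

```python
import random
import statistics

rng = random.Random(42)

# A clearly non-normal, right-skewed population
population = [rng.expovariate(1.0) for _ in range(50_000)]
pop_mean = statistics.mean(population)
pop_std = statistics.pstdev(population)

n = 40   # sample size
# Draw many samples with replacement and record each sample mean
sample_means = [statistics.mean(rng.choices(population, k=n)) for _ in range(2_000)]

mean_of_means = statistics.mean(sample_means)
std_of_means = statistics.pstdev(sample_means)
standard_error = pop_std / n ** 0.5   # square root(population variance / n)

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(standard_error, 3), round(std_of_means, 3))
```

The mean of the sample means should land close to the population mean, and their standard deviation close to the standard error.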
Sampling distribution of sample proportion:
 Calculated using the Bernoulli and binomial distributions
 Watch: https://www.khanacademy.org/math/statistics-probability/sampling-distributions-library/sample-proportions/v/sampling-distribution-of-sample-proportion-part-1?modal=1

Confidence intervals and margin of error:

 To calculate a confidence interval, calculate the margin of error, which for 95% confidence is about 2 standard errors; the interval is the sample statistic ± the margin of error
 Each sample has its own interval for a specific confidence level, like 95%
 We can interpret it as: about 95% of the intervals built this way from different samples contain the population proportion
 Or: we are 95% confident that the population proportion is within the margin of error of the sample proportion
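A minimal sketch of a 95% confidence interval for a proportion, using the "2 standard errors" rule from these notes (1.96 is the more precise multiplier). The survey counts are hypothetical, and the standard error formula square root(p̂(1 − p̂)/n) is the usual one for a sample proportion.

```python
import math

successes, n = 540, 1000   # hypothetical survey: 540 "yes" answers out of 1000

p_hat = successes / n                                  # sample proportion
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of p_hat
margin_of_error = 2 * standard_error                   # about 95% confidence
interval = (p_hat - margin_of_error, p_hat + margin_of_error)

print(round(margin_of_error, 4), tuple(round(v, 4) for v in interval))
```

For this made-up sample the interval is roughly 0.508 to 0.572.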
