Biostats Lesson 3
REVERSED J-SHAPED → most of the values are at the lower part of the x-axis.
RIGHT-SKEWED → positively skewed. The skew refers to the tail.
LEFT-SKEWED → negatively skewed.
BIMODAL → two modes; two bars stand out.
PLATYKURTIC → flat; light (thin) tails.
→ Pareto charts represent a frequency distribution for categorical variables, with the frequencies arranged from highest to lowest.
→ The bars do not denote that the data are continuous, so they should not touch each other; a Pareto chart is still not a histogram.
TIME SERIES GRAPH
COMPOUND TIME SERIES GRAPH
TRADITIONAL STATISTICS
There are three concepts:
Central Tendency: measures of how the data are centered, or their average.
Variation: measures of how dispersed the data are.
Position: measures describing the position of a data value in relation to the other values in the data set.
• Average and mean are synonymous with each other except in statistics, because average is an all-encompassing term which can be mean, median, or mode.
PARAMETER VS STATISTIC
Statistic: a characteristic or measure obtained by using the data values from a sample.
Parameter: a characteristic or measure obtained by using all the data values from a specific population.
▪ n = sample size
▪ N = population size
• Rounding should not be done until the final answer is calculated.
• The calculated mean/SD should have 1 more decimal place than the data values.
MEASURES OF CENTRAL TENDENCY
→ Describe where the distribution may be “centered”, or where the bulk of the data lies.
→ There are various concepts of center:
(1) Mean
→ Widely used measure of central tendency.
→ Layman’s concept of average.
→ Affected by the presence of outliers in the data.
Sample (statistic) x̄ = Σx / n
Population (parameter) µ = Σx / N
• Mean for Ungrouped data:
Comes from raw data.
The normal way of computing.
• Mean for Grouped data:
Comes from the frequency distribution table.
Makes use of the summation of all the (midpoint multiplied by the frequency), divided by n.
Only an approximation.
Midpoint = (lower + upper) / 2
(2) Median
→ Middlemost value.
→ Obtained by sorting the values from lowest to highest and getting the value in the middle.
→ Preferred over the mean as the typical value (or center) when the distribution is skewed.
˃ Median is less affected by outliers than the Mean.
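As a minimal sketch of the mean and median notes above, the Python below computes the ungrouped mean, the grouped-mean approximation, and the median. The data values, the class limits, and the helper name `grouped_mean` are made up for illustration; they are not from the lesson.

```python
from statistics import median

# Hypothetical raw (ungrouped) data; 40 is a deliberate outlier.
values = [2, 3, 3, 4, 5, 6, 40]
without_outlier = values[:-1]

# Mean for ungrouped data: x̄ = Σx / n
mean_all = sum(values) / len(values)
mean_trimmed = sum(without_outlier) / len(without_outlier)

# Median: sort the values and take the middle one
# (statistics.median averages the two middle values when n is even).
median_all = median(values)
median_trimmed = median(without_outlier)


def grouped_mean(classes):
    """Approximate the mean from (lower, upper, frequency) rows."""
    total_frequency = sum(freq for _, _, freq in classes)
    weighted_sum = sum(((lower + upper) / 2) * freq  # midpoint * frequency
                       for lower, upper, freq in classes)
    return weighted_sum / total_frequency


# Hypothetical frequency distribution table.
table = [(1, 5, 3), (6, 10, 5), (11, 15, 2)]

print(f"mean:   {mean_trimmed:.2f} -> {mean_all:.2f} once the outlier is added")
print(f"median: {median_trimmed} -> {median_all} once the outlier is added")
print(f"grouped mean (approximation): {grouped_mean(table)}")
```

With these made-up numbers the single outlier moves the mean from about 3.8 to 9.0, while the median only moves from 3.5 to 4.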
(3) Mode
→ Most frequently occurring value; the most typical.
→ Most descriptive when distributions are highly peaked (leptokurtic), suggesting a large concentration on a single value.
→ Can be used in categorical data.
(4) Midrange
MEASURES OF VARIATION
→ Measure the spread or variability of the values from each other.
Range
→ Simplest measure of dispersion, used to get a quick idea of the spread.
→ The difference between the highest and lowest value.
→ Wastes information: the rest of the values are not used.
(3) Standard Deviation
→ One extra step beyond the variance: the square root of the variance.
→ Measured in the same unit as the data values.
Sample (statistic) s
Population (parameter) σ
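The mode, range, and standard deviation notes above can be sketched with Python's standard `statistics` module. The data are hypothetical, and the divisors (N for a population, n - 1 for a sample) are the usual conventions rather than something the notes state explicitly.

```python
from statistics import multimode, pstdev, stdev

# Hypothetical data values (any unit).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Mode: most frequently occurring value; multimode returns every tied mode.
modes = multimode(data)
# The mode also works for categorical data (hypothetical blood types).
blood_type_modes = multimode(["A", "O", "B", "O", "AB", "O"])

# Range: highest value minus lowest value; the other values are ignored.
data_range = max(data) - min(data)

# Standard deviation: square root of the variance, in the data's own unit.
# Assumption: the usual conventions, dividing by N for a population (sigma)
# and by n - 1 for a sample (s); the notes do not spell these out.
sigma = pstdev(data)
s = stdev(data)

print("mode(s):", modes, blood_type_modes)
print("range:", data_range)
print(f"population SD: {sigma:.1f}, sample SD: {s:.1f}")
```

The sample value `s` comes out slightly larger than `sigma` for the same numbers because of the smaller divisor.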
Coefficient of Variation
→ Ratio of the standard deviation to the mean.
→ Used to compare the spread between sets of data that are measured in different units.
→ Expressed as a percentage. (CVar = standard deviation / mean × 100%)
MEASURES OF POSITION
▪ Quartiles divide the data into 4 equal parts.
▪ Deciles divide the data into 10 equal parts.
Boxplot
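A rough sketch of the coefficient of variation and the measures of position above, again with hypothetical data. The helper name `coefficient_of_variation` is illustrative, and `statistics.quantiles` uses its default interpolation method here, which may differ slightly from the hand method taught in class.

```python
from statistics import mean, quantiles, stdev


def coefficient_of_variation(values):
    """CVar as a percentage: sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical samples measured in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]
cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

# Hypothetical scores for the measures of position.
scores = [2, 4, 4, 5, 7, 9, 10, 12, 15, 18, 21]
q1, q2, q3 = quantiles(scores, n=4)   # 3 cut points -> 4 equal parts
deciles = quantiles(scores, n=10)     # 9 cut points -> 10 equal parts

# A boxplot is drawn from the five-number summary.
five_number_summary = (min(scores), q1, q2, q3, max(scores))

print(f"CVar height: {cv_height:.1f}%, CVar weight: {cv_weight:.1f}%")
print("five-number summary:", five_number_summary)
```

Both made-up samples have the same standard deviation, yet the weights have the larger CVar because their mean is smaller, which is why the ratio is the fairer comparison across units.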