STAT 1770 Lab 2-2
January 22
Submission Guidelines: Although group discussions are encouraged, each student must submit
their work individually to Crowdmark by the end of the day.
The concepts introduced in this tutorial will be vital for understanding future lectures and for
completing the upcoming assignment. Make sure you are comfortable with them before moving on
to more complex topics.
• Bar Plot: A bar plot represents categorical data with rectangular bars. The length of each
bar is proportional to the count or frequency of the category it represents.
• Pie Chart: A pie chart displays categorical data in the form of a circle divided into sectors.
Each sector represents a category, and its size is proportional to the count or frequency of
that category.
• Histogram: A histogram is similar to a bar plot but is used for continuous or numeric
data. The data is divided into bins, and the frequency or count of data points in each bin is
represented by the height of the bar.
• Polygon Plot: A polygon plot (frequency polygon) connects the midpoints of the tops of the
bars in a histogram with straight lines, creating a polygon. It is often used to highlight the
shape of the distribution of the data.
• Standard Deviation: The standard deviation, denoted by σ for a population or s for a sample,
measures the amount of variation or dispersion in a set of values. A low standard deviation
means that the values tend to be close to the mean, while a high standard deviation means the
values are spread out over a wider range.
• Mode: The mode is the value that appears most frequently in a data set. A data set may
have one mode (unimodal), two modes (bimodal), or multiple modes (multimodal).
• Median: The median is the middle number in a sorted list of numbers. If the list has an odd
number of observations, the median is the middle number. If the list has an even number of
observations, the median is the average of the two middle numbers.
• Percentile: The Pth percentile is the value below which a given percentage (P) of the data
falls. For example, the 20th percentile is the value below which 20% of the data falls.
• Q1 and Q3 (First and Third Quartiles): Q1 (25th percentile) is the middle value of the
lower half of the sorted data set, so about 25 percent of the data points lie below it.
Similarly, Q3 (75th percentile) is the middle value of the upper half of the sorted data set,
so about 75 percent of the data points lie below it.
• Suspected Outliers: Outliers are data points that are significantly different from other
observations. They may reflect genuine variability in the data or errors. In a box-and-whisker
plot, outliers are usually drawn as individual points that fall outside the “whiskers.”
• Box Plot: A box plot is a graphical representation of statistical data based on a five-number
summary (minimum, first quartile (Q1), median, third quartile (Q3), and maximum). It can
also show outliers. The box spans the interquartile range (the range between Q1 and
Q3), the line inside the box is the median, and the lines or “whiskers” extend to the smallest
and largest observations that are not marked as outliers.
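The numerical summaries defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set `sample` is hypothetical, the helper `quartiles_by_halves` is a name introduced here for the "middle value of each half" definition of Q1 and Q3, and the 1.5 × IQR fences are one common convention for flagging suspected outliers that this handout does not itself specify. The method taught in the lectures takes precedence where it differs.

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median of each half of the sorted data.

    For an odd number of observations the middle value is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # observations below the middle position
    upper = data[half + (n % 2):]   # observations above the middle position
    return statistics.median(lower), statistics.median(data), statistics.median(upper)


# Hypothetical data, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]

q1, med, q3 = quartiles_by_halves(sample)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are suspected outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
suspected = [x for x in sample if x < lower_fence or x > upper_fence]

modes = statistics.multimode(sample)   # list of all most frequent values
s = statistics.stdev(sample)           # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(sample)      # population standard deviation (divides by n)

print("Q1, median, Q3:", q1, med, q3)
print("IQR:", iqr)
print("Suspected outliers:", suspected)
print("Mode(s):", modes)
```

Other quartile conventions (for example, interpolating between observations) can give slightly different values of Q1 and Q3 on the same data, so the results here should be compared against the definition used in class.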
Question: Consider the following data set representing the number of hours studied by
students for a final exam: 10, 12, 12, 15, 15, 15, 15, 17, 18, 20, 20, 20, 23, 23, 26, 26, 26, 27,
27, 40.
You are expected to follow the materials and methods taught in the lectures. You
are not allowed to use Excel.
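As a rough way to check a hand-built frequency table for these data, the sketch below counts the observations in each histogram bin and lists the bin midpoints that a polygon plot would connect. The bin start (10) and bin width (5) are assumptions made for this illustration, as are the names `bin_counts`, `BIN_START`, and `BIN_WIDTH`; use the bins and methods specified in the lectures for the submitted work.

```python
hours = [10, 12, 12, 15, 15, 15, 15, 17, 18, 20,
         20, 20, 23, 23, 26, 26, 26, 27, 27, 40]

BIN_START = 10   # assumed left edge of the first bin
BIN_WIDTH = 5    # assumed bin width


def bin_counts(values, start, width):
    """Count integer values in left-inclusive bins [edge, edge + width)."""
    n_bins = (max(values) - start) // width + 1
    counts = [0] * n_bins
    for v in values:
        counts[(v - start) // width] += 1
    return counts


counts = bin_counts(hours, BIN_START, BIN_WIDTH)

# Midpoint of each bin: the x-coordinates of the polygon plot's vertices.
midpoints = [BIN_START + BIN_WIDTH * (i + 0.5) for i in range(len(counts))]

# Text rendering of the histogram: one row per bin.
for mid, count in zip(midpoints, counts):
    print(f"{mid:5.1f} | {'#' * count} ({count})")
```

A different choice of bin width or starting point changes the counts and can change the apparent shape of the distribution, which is why the binning should be stated alongside any histogram.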