Python Data Visualization
WITH
Quizzes & Assignments to test and reinforce key concepts, with step-by-step
solutions
Interactive demos to keep you engaged and apply your skills throughout the
course
*Copyright Maven Analytics, LLC
COURSE OUTLINE
1 Intro to Data Visualization: Cover key data visualization best practices for clear communication, with tips for choosing the right chart, formatting it effectively, and using it to tell a story
2 Matplotlib Fundamentals: Introduce the Matplotlib library and use it to build & customize several chart types, including line charts, bar charts, pie charts, scatterplots, and histograms
PROJECT: Visualizing Coffee Industry Data
3 Advanced Customization: Apply advanced customization techniques in Matplotlib, including multi-chart figures, custom layouts & colors, style sheets, and more
PROJECT: Consolidating Coffee Industry Data into a Report
4 Data Viz with Seaborn: Visualize data with Seaborn, another Python library that introduces new chart types and layouts, and interacts well with Matplotlib
THE ASSIGNMENT
Your task is to effectively visualize data from these industries to deliver key insights to MCG’s clients.
This will range from analyzing hotel customer demographics to understanding the major players in the global coffee industry.
NOTE: You can rename your folder by clicking “Rename” in the top left
corner
2) Open your new coursework folder and launch your first Jupyter
notebook!
NOTE: You can rename your notebook by clicking on the title at the top of the
screen
In this section we’ll cover key data visualization best practices for clear
communication,
with tips for choosing the right chart, formatting it effectively, and using it to tell a
story
Prefrontal Cortex
• Located in the frontal lobe
• Responsible for cognitive functioning & problem solving
• Helps us make sense of non-visual information (like raw data)
• Slow & conscious
Visual Cortex
• Located in the occipital lobe
• Responsible for visual perception & understanding
• Helps us make sense of colors, patterns, shapes, sizes, etc.
• Instantaneous & subconscious
Data visualization puts both our prefrontal and visual cortex to work,
combining
the power of cognition (slow and conscious) and perception (instantaneous)
THE TEN SECOND
RULE
In 10 seconds, what can you learn from the data
below?
TIME’S
UP!
Comparison & composition
Avoid using too many bins!
“Perfection is achieved not when there is nothing more to add, but when there is nothing left to take
away”
Antoine de Saint-Exupery
STORYTELLING
Descriptive titles and data labels can be used to tell a clear story within your
visuals
What does each line represent?
What are these values?
What does each period represent?
Tell a story with the data to guide the user to the insights
• Use titles, strategic labels, and callouts to create a clear narrative
INTRO TO
MATPLOTLIB
In this section we’ll introduce the Matplotlib library and use it to build & customize
several
chart types, including line charts, bar charts, pie charts, scatterplots, and histograms
Matplotlib is an open-source Python library built for data visualization that lets
you produce a wide variety of highly customizable charts & graphs
Matplotlib can plot many data types, including base Python sequences,
NumPy Arrays, and Pandas Series & DataFrames
PyPlot API: Charts are created with the plot() function, and modified with additional functions
Object Oriented: Charts are created by defining a plot object, and modified using figure & axis methods
1. Create the figure object and assign it to the ‘fig’ variable
2. Add a chart, or axis, object to the figure and assign it to the ‘ax’ variable
3. Call the axis plot() method to draw the chart
We’ll mostly focus on the Object-Oriented approach, as it provides clearer control over customization
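The three object-oriented steps can be sketched as follows. This is a minimal illustration, not the course's own notebook code; the sample data is hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data (not from the course files)
years = [2018, 2019, 2020, 2021]
revenue = [10, 12, 9, 14]

fig = plt.figure()                # 1. create the figure object
ax = fig.add_subplot(111)         # 2. add a chart (axis) object to the figure
line, = ax.plot(years, revenue)   # 3. call the axis plot() method to draw the chart
```

In practice, steps 1 and 2 are often combined with fig, ax = plt.subplots(), which returns both objects at once.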
Hi!
Thanks!
section02_assignments.ipynb
Hi!
I need someone who knows Matplotlib for help with some client work.
Can you plot Lodging Revenue and Other Revenue over time for our hotel client?
Thanks!
Plot the DataFrame
section02_solutions.ipynb
LEGEND LOCATION
You can change the legend location with the “loc” or “bbox_to_anchor” arguments
• “loc” lets you set a predetermined location option: best (default), upper right, upper left, upper center, lower right, lower left, lower center, center right, center left, center
• “bbox_to_anchor” lets you set specific (x, y) coordinates, where (0, 0) is the bottom left of the chart and (1, 1) is the top right
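A short sketch of both legend options, using hypothetical data and series labels. When the two arguments are combined, "loc" names the corner of the legend box that is pinned to the bbox_to_anchor point.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Hypothetical series, for illustration only
ax.plot([1, 2, 3], [2, 4, 3], label="Lodging Revenue")
ax.plot([1, 2, 3], [1, 2, 2], label="Other Revenue")

# Option 1: a predetermined location
legend = ax.legend(loc="lower right")

# Option 2: specific (x, y) coordinates; (1, 1) is the top-right corner of
# the chart, so this places the legend just outside the plot area.
# Calling legend() again replaces the previous legend.
legend = ax.legend(loc="upper left", bbox_to_anchor=(1, 1))
```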
For more info on annotations, visit: https://matplotlib.org/stable/tutorials/text/annotations.html#sphx-glr-tutorials-text-annotations-
REMOVING CHART
BORDERS
You can remove specific chart borders with
ax.spines[].set_visible(False)
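For example, the top and right borders can be hidden by passing each border name to ax.spines. This is a minimal sketch with hypothetical data; the valid names are "top", "bottom", "left", and "right".

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 3])  # hypothetical data

# Hide the top and right borders, keep the left and bottom ones
for side in ["top", "right"]:
    ax.spines[side].set_visible(False)
```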
Hi there!
The data you plotted earlier looks good, but can you clean
up the chart a little bit? I want it to look polished for
our client. This is my last day in my summer internship
and I want to get hired back!
Thanks!
section02_assignments.ipynb
Hi there!
The data you plotted earlier looks good, but can you
clean up the chart a little bit? I want it to look polished
for our client.
This is my last day in my summer internship and I want
to get hired back!
Thanks!
section02_solutions.ipynb
PRO TIPS
Pivot tabular data to turn each unique series into a DataFrame column, and set the datetime as the index
Divide your series by the appropriate units while plotting to simplify the y-axis scale
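Both tips can be sketched together. The DataFrame, column names, and values below are hypothetical stand-ins, not the course's hotel data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical tabular data: one row per date and revenue type
sales = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-01-01", "2023-02-01", "2023-02-01"]),
    "revenue_type": ["Lodging", "Other", "Lodging", "Other"],
    "revenue": [1_200_000, 300_000, 1_500_000, 450_000],
})

# Pivot: each unique series becomes a column, the datetime becomes the index
wide = sales.pivot(index="date", columns="revenue_type", values="revenue")

fig, ax = plt.subplots()
# Divide by 1,000,000 while plotting so the y-axis reads in millions
ax.plot(wide.index, wide["Lodging"] / 1_000_000, label="Lodging")
ax.plot(wide.index, wide["Other"] / 1_000_000, label="Other")
ax.set_ylabel("Revenue ($ millions)")
ax.legend()
```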
Hey again,
Thanks!
section02_assignments.ipynb
Hey again,
Thanks!
section02_solutions.ipynb
Categories as the index
Values in a single column
PRO TIPS
Use .groupby() and .agg() to aggregate your data by category and push the labels into the index
Use Seaborn or the Pandas plot API for grouped bar charts
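The aggregation tip might look like this in practice. The column names and values are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical booking data
bookings = pd.DataFrame({
    "country": ["PRT", "PRT", "GBR", "FRA", "FRA"],
    "room_nights": [2, 4, 3, 5, 1],
})

# Aggregate by category; the category labels become the index
totals = bookings.groupby("country").agg({"room_nights": "sum"})

fig, ax = plt.subplots()
bars = ax.bar(totals.index, totals["room_nights"])  # index as labels, column as values
```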
Hello,
-S
section02_assignments.ipynb
Hello,
-S
section02_solutions.ipynb
Hello,
-S
section02_assignments.ipynb
Hello,
-S
section02_solutions.ipynb
Values in a single column
Labels as the index
PRO TIPS
Keep the number of slices low (<7) to enhance readability – you can group “others” into a single slice
Use bar charts if you want to compare the categories – pies are for showing how they make up a whole
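One possible way to group small categories into a single "Other" slice before plotting. The category names and values are hypothetical, and keeping the top 3 is an arbitrary choice for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical values with labels as the index
values = pd.Series([40, 25, 15, 8, 5, 4, 3], index=list("ABCDEFG"))

# Keep the largest categories and sum everything else into "Other"
top = values.nlargest(3)
other = pd.Series({"Other": values.drop(top.index).sum()})
grouped = pd.concat([top, other])

fig, ax = plt.subplots()
ax.pie(grouped, labels=grouped.index, autopct="%.0f%%")
```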
Hello,
Need it ASAP.
Thx
section02_assignments.ipynb
Hello,
Need it ASAP.
Thx
section02_solutions.ipynb
PRO TIPS
Modify the alpha (transparency) level to make overlapping points more visible
Bubble charts can be useful in some cases, but they often add confusion rather than
clarity
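A minimal sketch of the alpha tip, using randomly generated (hypothetical) data so that many points overlap.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical correlated data with heavy overlap
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + rng.normal(size=500)

fig, ax = plt.subplots()
# alpha < 1 makes points semi-transparent, so dense regions appear darker
points = ax.scatter(x, y, alpha=0.3)
```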
SCATTERPLOTS
numerical
series
PRO TIPS
Modify the alpha (transparency) level to plot multiple distributions on the same axis
Set density=True to normalize each histogram so its bar areas sum to 1, which puts distributions of different sizes on a comparable y-axis scale
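Both histogram tips in one sketch. The two groups are hypothetical, randomly generated samples of different sizes.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical samples of different sizes
rng = np.random.default_rng(1)
group_a = rng.normal(50, 10, 1000)
group_b = rng.normal(60, 15, 400)

fig, ax = plt.subplots()
# alpha keeps both distributions visible where they overlap;
# density=True rescales each histogram so its bar areas sum to 1
heights_a, edges_a, _ = ax.hist(group_a, bins=20, alpha=0.5, density=True, label="Group A")
heights_b, edges_b, _ = ax.hist(group_b, bins=20, alpha=0.5, density=True, label="Group B")
ax.legend()
```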
section02_assignments.ipynb
section02_solutions.ipynb
Matplotlib has two methods for plotting data: PyPlot API & Object
Oriented
• Both can visualize many data types (lists, DataFrames, etc.), but object-oriented plots are easier to fully
customize
section03_coffee_project_part1.ipynb
SUBPLOTS
Subplots let you create a grid of equally sized charts in a single figure
• fig, ax = plt.subplots(rows, columns) – this creates a grid with the specified rows & columns
• Specify ax[row][column] to create and modify individual subplots; in a 2x2 grid the positions are (0, 0), (0, 1), (1, 0), and (1, 1)
Use the “sharex” & “sharey” arguments to set the same axis limits on all the plots
• This is set as “none” by default, but can be set to “all”, “row”, or “col”
Subplots can be any chart type, and do not have to be the same
type
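A small sketch of a 2x2 grid with a shared y-axis and a different chart type in each position. All data is hypothetical.

```python
import matplotlib.pyplot as plt

# 2 rows x 2 columns, with the same y-axis limits on every chart
fig, ax = plt.subplots(2, 2, sharey="all")

x = [1, 2, 3, 4]  # hypothetical data
ax[0][0].plot(x, [1, 4, 9, 16])       # row 0, column 0: line chart
ax[0][1].bar(x, [3, 1, 4, 2])         # row 0, column 1: bar chart
ax[1][0].scatter(x, [2, 3, 2, 5])     # row 1, column 0: scatterplot
ax[1][1].hist([1, 2, 2, 3, 3, 3, 4])  # row 1, column 1: histogram
```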
Hey there,
more! Wendy
Section04_assignments.ipynb
Hey there,
more! Wendy
Section04_solutions.ipynb
GRIDSPEC
You can build layouts with charts of varying sizes by setting a gridspec object
• This creates a grid with a specified number of rows & columns
• Each axis, or chart, can then occupy a group of squares in the grid
• Use a slice to specify the ranges of rows and columns for each axis
(Figure: a grid with rows 0 to 7, in which axes such as ax1 and ax3 each span a group of rows)
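A sketch of a gridspec layout, assuming an 8x8 grid. The particular slices chosen for ax1, ax2, and ax3 are illustrative and not taken from the slides.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 8))

# Create a grid with 8 rows and 8 columns
gs = fig.add_gridspec(nrows=8, ncols=8)

# Each axis occupies a group of squares, selected with slices
ax1 = fig.add_subplot(gs[0:4, :])      # rows 0-3, all columns (wide chart on top)
ax2 = fig.add_subplot(gs[4:8, 0:4])    # rows 4-7, columns 0-3 (bottom left)
ax3 = fig.add_subplot(gs[4:8, 4:8])    # rows 4-7, columns 4-7 (bottom right)
```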
Hi there,
Thanks!
section04_assignments.ipynb
Hi there,
Thanks!
section04_solutions.ipynb
You can also loop through a list of colors to pass them to separate series in a
plot
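For instance, zip() can pair each series with a color from a list. The hex codes and series below are hypothetical choices.

```python
import matplotlib.pyplot as plt

# Hypothetical custom palette and series
colors = ["#1f3b4d", "#e07a5f", "#81b29a"]
series = {"A": [1, 2, 3], "B": [2, 3, 5], "C": [3, 3, 4]}

fig, ax = plt.subplots()
# Pair each series with the next color in the list
for (name, values), color in zip(series.items(), colors):
    ax.plot(values, color=color, label=name)
ax.legend()
```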
Hi again,
Love the layout, HATE the colors! Let’s show some polish by
getting away from the defaults.
Thanks,
Sarah
section04_assignments.ipynb
Thanks,
Sarah
section04_solutions.ipynb
The “fivethirtyeight” style has larger font sizing, and adds gridlines and a background color
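A sketch of applying the style to a single chart. plt.style.use("fivethirtyeight") would instead apply it to every chart created afterwards; the context manager below limits it to one block. The data is hypothetical.

```python
import matplotlib.pyplot as plt

# Apply the style only inside this block
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])  # hypothetical data
    grid_on = plt.rcParams["axes.grid"]  # gridlines are switched on by the style
```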
Hi,
Thx
-S
section04_assignments.ipynb
Hi,
Thx
-S
section04_solutions.ipynb
Viewing the parameters of a style sheet can help format charts properly and
provide inspiration for your own formatting changes
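One way to view a style sheet's parameters is through plt.style.library, a dictionary keyed by style name. This sketch prints the settings of the "fivethirtyeight" style.

```python
import matplotlib.pyplot as plt

# Look up the settings that the style sheet overrides
params = plt.style.library["fivethirtyeight"]

for key, value in params.items():
    print(key, "=", value)
```

Any of these keys can also be set individually through plt.rcParams to borrow a single setting from a style.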
Clarissa
section05_coffee_project_part2.ipynb
We’ll cover integration with Matplotlib later, which is where you’ll be able
to leverage the chart formatting skills you’ve learned throughout the
course
Note that Seaborn automatically aggregates the data for the plot, using unique category values as the
labels for the bars, the mean of each category for the bar length, and the column headers as the axis
labels
BAR
CHARTS
Bar charts can be created in Seaborn with sns.barplot()
• Simply specify the desired category labels and series values as “x” & “y”
arguments
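A minimal sketch of sns.barplot() on hypothetical data. Each country has several rows, so Seaborn plots the mean of "room_nights" per country and uses the column names as axis labels.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data with multiple rows per category
hotels = pd.DataFrame({
    "country": ["PRT", "PRT", "GBR", "GBR", "FRA", "FRA"],
    "room_nights": [2, 4, 3, 5, 6, 8],
})

# Category labels as x, series values as y; bars show the mean per category
ax = sns.barplot(x="country", y="room_nights", data=hotels)
```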
Hi,
Thanks
section06_assignments.ipynb
Hi,
Build a bar chart with the average room nights stayed for our top 5 countries.
Thanks
section06_solutions.ipynb
Boxplot statistics:
• Median (50th percentile)
(Figure labels: Min, Q1, Median, Q3, Max, IQR, Q3+1.5*IQR)
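The boxplot statistics can be computed by hand to see what the chart encodes. This sketch uses hypothetical values and NumPy's default percentile interpolation; it then draws the same data with sns.boxplot().

```python
import numpy as np
import seaborn as sns

# Hypothetical values with one unusually large observation
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1                  # interquartile range: the width of the box
upper_fence = q3 + 1.5 * iqr   # points above this are drawn as outliers

outliers = values[values > upper_fence]

ax = sns.boxplot(x=values)
```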
Hi,
Sarah
section06_assignments.ipynb
Hi,
Sarah
section06_solutions.ipynb
sns.jointplot(x, y, kind, data): Creates a scatterplot and adds the distribution for each variable
sns.pairplot(cols): Creates a matrix of scatterplots comparing multiple variables, and shows the distribution for each one
Hi there,
First for all the data and then for each top 5 country.
Best,
Wendy
section06_assignments.ipynb
Hi there,
First for all the data and then for each top 5 country.
Best,
Wendy
section06_solutions.ipynb
Hi there,
matrix. Thanks,
Wendy
section06_assignments.ipynb
Hi there,
matrix. Thanks,
Wendy
section06_solutions.ipynb
This plots a histogram of “price” for each “color” in the DataFrame
Seaborn adds new chart types that are useful in exploring data
• Boxplots, violin plots, and linear model plots help profile data and identify relationships between
variables
Thanks
section07_final_project.ipynb