Matplotlib in Python
Introduction to Data Visualization in Python
• Matplotlib is a powerful Python plotting library for creating static, animated, and interactive visualizations.
• It was originally designed to emulate MATLAB's plotting abilities in Python.
• Matplotlib is popular due to its ease of use, extensive documentation, and wide range of plotting capabilities.
• Other packages build on Matplotlib for data visualization, most notably pandas, and it is routinely used alongside NumPy and SciPy.
• Other visualization libraries include seaborn, Altair, ggpy, Bokeh, and plotly.
• Some of these are built on top of Matplotlib, while others are independent.

In Matplotlib, a figure is the top-level container that holds all the elements of a plot. It represents the entire window or page where the plot is drawn. The parts of a Matplotlib figure include:
• Figure (the canvas)
• Axes (the coordinate system)
• Axis (the x- and y-axis)
• Markers
• Lines
• Title
• Axis labels
• Ticks and tick labels
• Legend
• Gridlines
• Spines (borders of the plot area)

• The package is imported into a Python script by adding the following statement:

    from matplotlib import pyplot as plt

• Here pyplot is the most commonly used module of the Matplotlib library (it is a module, not a function); its functions are used to plot 2D data.

Pyplot in Matplotlib
• Pyplot is a Matplotlib module that provides a MATLAB-like interface.
• Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels.
• The plots available through pyplot include line plots, histograms, scatter plots, 3D plots, images, contour plots, and polar plots.

Basic Functions for Chart Creation
• Use the plot() function of matplotlib.pyplot to draw the graph. It takes the x values, the y values, and an optional format string (line style and color) as arguments.
• Use the show() function of matplotlib.pyplot to display the graph window. It is normally called without arguments.
• Use the title() function of matplotlib.pyplot to give the graph a title. It takes the title string as its argument.
• Use the xlabel() function of matplotlib.pyplot to label the x-axis. It takes the label string as its argument.
• Use the ylabel() function of matplotlib.pyplot to label the y-axis. It takes the label string as its argument.
• Use the savefig() function of matplotlib.pyplot to save the result to a file.
• Use the annotate() function of matplotlib.pyplot to highlight specific locations in the chart.
• Use the legend() function of matplotlib.pyplot to add a legend to the chart.
• The subplot() function lets you plot different things in the same figure. Its first argument specifies the number of rows in the grid, the second specifies the number of columns, and the third specifies the index of the active subplot.
• Use the bar() function to draw a bar graph instead of a line graph. E.g.

    plt.bar(x, y, color='g', align='center')

• Use the hist() function to draw a histogram, a graphical representation of the frequency distribution of data. It draws rectangles of equal width, one per class interval (called a bin), whose heights correspond to the frequencies. It takes the input array and bins as its two main parameters. When bins is given as an array, its successive elements act as the boundaries of each bin.
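The basic functions listed above can be combined in one short script. The following is only a minimal sketch: the data, the bin edges, and the output file name are made up for illustration, and the Agg backend is selected as an assumption so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical sample data, made up for this sketch
days = [1, 2, 3, 4, 5]
sales = [10, 14, 9, 17, 12]
ages = [22, 25, 31, 35, 38, 41, 47, 52, 58, 63]
bin_edges = [20, 30, 40, 50, 60, 70]  # successive values bound each bin

# subplot(nrows, ncols, index): grid of 1 row and 2 columns, first cell active
plt.subplot(1, 2, 1)
plt.plot(days, sales, "g--", label="sales")  # format string: green dashed line
plt.title("Line plot")
plt.xlabel("Day")
plt.ylabel("Units sold")

# annotate() marks the highest point with a text label and an arrow
peak_day = days[sales.index(max(sales))]
plt.annotate("peak", xy=(peak_day, max(sales)), xytext=(1.5, 16),
             arrowprops={"arrowstyle": "->"})
plt.legend()

# Second cell of the grid: histogram with explicit bin boundaries
plt.subplot(1, 2, 2)
counts, edges, patches = plt.hist(ages, bins=bin_edges, edgecolor="black")
plt.title("Histogram")
plt.xlabel("Age")
plt.ylabel("Frequency")

plt.tight_layout()
plt.savefig("basic_functions.png")  # hypothetical output file name
```

With an interactive backend, plt.show() can be called after plt.savefig() to open the figure window.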
Example

    # importing matplotlib module
    from matplotlib import pyplot as plt

    # x-axis values
    x = [5, 2, 9, 4, 7]
    # y-axis values
    y = [10, 5, 8, 4, 2]

    # function to plot
    plt.plot(x, y)
    # save the figure to a file
    plt.savefig("line_plot.png")
    # function to show the plot
    plt.show()

Note: Remember to call plt.savefig() before plt.show(); once the figure window has been closed, the saved file may come out blank.

    import matplotlib.pyplot as plt

    # Define X and Y data points
    X = [12, 34, 23, 45, 67, 89]
    Y = [1, 3, 67, 78, 7, 5]

    # Plot the graph using matplotlib
    plt.plot(X, Y, marker='o', markerfacecolor='r')
    plt.xlabel("X-Axis")
    plt.ylabel("Y-Axis")

    # Add gridlines to the plot
    plt.grid(color='green', linestyle='--', linewidth=0.5)
    # `plt.grid()` also works

    # displaying the title (pad must be a number, not a string)
    plt.title(label='Number of Users of a particular Language',
              fontweight=10, pad=2.0)

    # Function to view the plot
    plt.show()

Plotting Multiple Lines in a Line Plot

    import matplotlib.pyplot as plt
    import numpy as np

    # create data
    x = [1, 2, 3, 4, 5]
    y = [3, 3, 3, 3, 3]

    # plot lines
    plt.plot(x, y, label="line 1", linestyle="-")
    plt.plot(y, x, label="line 2", linestyle="--")
    plt.plot(x, np.sin(x), label="curve 1", linestyle="-.")
    plt.plot(x, np.cos(x), label="curve 2", linestyle=":")
    plt.legend()
    plt.show()

Bar Plot

    from matplotlib import pyplot as plt

    # x-axis values
    x = [5, 2, 9, 4, 7]
    # y-axis values
    y = [10, 5, 8, 4, 2]

    # Function to plot the bar
    plt.bar(x, y)
    # function to show the plot
    plt.show()

Horizontal Bar Chart

    import matplotlib.pyplot as plt

    y = ['one', 'two', 'three', 'four', 'five']
    # getting values against each value of y
    x = [5, 24, 35, 67, 12]
    plt.barh(y, x)

    # setting label of y-axis
    plt.ylabel("pen sold")
    # setting label of x-axis
    plt.xlabel("price")
    plt.title("Horizontal bar graph")
    plt.show()

Stacked Bar Chart

    import matplotlib.pyplot as plt
    import pandas as pd

    data = [['A', 10, 20, 10, 26], ['B', 20, 25, 15, 21],
            ['C', 12, 15, 19, 6], ['D', 10, 18, 11, 19]]
    df = pd.DataFrame(data, columns=['Team', 'Round 1', 'Round 2',
                                     'Round 3', 'Round 4'])
    print(df)

    # plot the data as a stacked bar chart
    df.plot(x='Team', kind='bar', stacked=True,
            title='Stacked Bar Graph by dataframe')
    plt.show()

2 Bar Plots in a Graph

    from matplotlib import pyplot as plt
    from matplotlib import style

    style.use('ggplot')
    plt.bar([0.25, 1.25, 2.25, 3.25, 4.25], [50, 40, 70, 80, 20],
            label="BMW", color='g', width=.5)   # 1st bar
    plt.bar([.75, 1.75, 2.75, 3.75, 4.75], [80, 20, 20, 50, 60],
            label="Audi", color='r', width=.5)  # 2nd bar
    plt.legend()                   # legend
    plt.xlabel('Days')             # x-axis label
    plt.ylabel('Distance (kms)')   # y-axis label
    plt.title('Information')       # chart title
    plt.show()

Histogram

    from matplotlib import pyplot as plt

    # values whose frequency distribution is plotted
    y = [10, 5, 5, 8, 4, 10, 10, 2]

    # Function to plot histogram
    plt.hist(y)
    # Function to show the plot
    plt.show()

Plotting 2 Histograms in the Same Graph

    import matplotlib.pyplot as plt

    # giving two age groups data
    age_g1 = [1, 3, 5, 10, 15, 17, 18, 16, 19, 21, 23, 28, 30, 31, 33, 38,
              32, 40, 45, 43, 49, 55, 53, 63, 66, 85, 80, 57, 75, 93, 95]
    age_g2 = [6, 4, 15, 17, 19, 21, 28, 23, 31, 36, 39, 32, 50, 56, 59, 74,
              79, 34, 98, 97, 95, 67, 69, 92, 45, 55, 77, 76, 85]

    # plotting first histogram
    plt.hist(age_g1, label='Age group1', bins=14, edgecolor='red')
    # plotting second histogram
    plt.hist(age_g2, label="Age group2", bins=14, edgecolor='yellow')
    plt.legend()

    # Showing the plot using plt.show()
    plt.show()

Scatter Plot

    from matplotlib import pyplot as plt

    # x-axis values
    x = [5, 2, 9, 4, 7]
    # y-axis values
    y = [10, 5, 8, 4, 2]

    # Function to plot scatter
    plt.scatter(x, y)
    # function to show the plot
    plt.show()

Another Example for Scatter Plot

    import matplotlib.pyplot as plt
    from matplotlib import style

    style.use('ggplot')  # apply the ggplot style
    x = [1, 1.5, 2, 2.5, 3, 3.5, 3.6]
    y = [7.5, 8, 8.5, 9, 9.5, 10, 10.5]
    x1 = [8, 8.5, 9, 9.5, 10, 10.5, 11]
    y1 = [3, 3.5, 3.7, 4, 4.5, 5, 5.2]
    plt.scatter(x, y, label='high income low saving', color='r')     # 1st scatter plot
    plt.scatter(x1, y1, label='low income high savings', color='b')  # 2nd scatter plot
    plt.xlabel('saving*100')    # x-axis label
    plt.ylabel('income*1000')   # y-axis label
    plt.title('Scatter Plot')   # chart title
    plt.legend()                # legend
    plt.show()                  # plot display

Pie Plot in Python

    import matplotlib.pyplot as plt

    slices = [7, 2, 2, 13]                                     # slices in pie plot
    activities = ['sleeping', 'eating', 'working', 'playing']  # labels of pie plot
    cols = ['c', 'm', 'r', 'b']                                # colors in pie plot
    plt.pie(slices, labels=activities, colors=cols)
    plt.title('Pie Plot')  # plot title
    plt.show()             # displaying the plot

The seaborn library in Python
• Seaborn is a library mostly used for statistical plotting in Python.
• It is built on top of Matplotlib and provides attractive default styles and color palettes for statistical plots.

Plotting using seaborn
• We will plot a simple line plot using the iris dataset.
• The iris dataset contains five columns: petal length, petal width, sepal length, sepal width, and species.
• It is one of the sample datasets that seaborn makes available through load_dataset().

Step 1 -> pip install seaborn
Step 2 -> import seaborn as sns
Step 3 -> sns.load_dataset("iris")

The iris dataset is returned as a DataFrame.

Creating a Basic Line Plot with seaborn in Python

    # importing packages
    import seaborn as sns

    # loading dataset
    data = sns.load_dataset("iris")

    # draw lineplot
    sns.lineplot(x="sepal_length", y="sepal_width", data=data)

Using seaborn with Matplotlib

    import seaborn as sns
    import matplotlib.pyplot as plt

    # loading dataset
    data = sns.load_dataset("iris")

    # draw lineplot
    sns.lineplot(x="sepal_length", y="sepal_width", data=data)

    # setting the lower x limit of the plot to 5
    plt.xlim(5)
    plt.show()

Heatmap
• A heatmap is a graphical representation of data that uses colors to visualize the values of a matrix.
• Brighter colors, typically reddish ones, represent more common values or higher activity, while darker colors represent less common values or lower activity.
Basic Heatmap in Python

    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt

    # generating a 2-D 10x10 matrix of random integers from 1 to 99
    # (the upper bound `high` is exclusive)
    data = np.random.randint(low=1, high=100, size=(10, 10))
    print("The data to be plotted:\n")
    print(data)

    # plotting the heatmap; annot=True writes the data values in the cells
    hm = sns.heatmap(data=data, annot=True)

    # displaying the plotted heatmap
    plt.show()

seaborn.heatmap() function

Syntax:

    seaborn.heatmap(data, *, vmin=None, vmax=None, cmap=None, center=None,
                    annot=None, fmt='.2g', annot_kws=None, linewidths=0,
                    linecolor='white', cbar=True, **kwargs)

Important Parameters:
• data: 2D dataset that can be coerced into an ndarray.
• vmin, vmax: Values to anchor the colormap; otherwise they are inferred from the data and other keyword arguments.
• cmap: The mapping from data values to color space.
• center: The value at which to center the colormap when plotting divergent data.
• annot: If True, write the data value in each cell.
• fmt: String formatting code to use when adding annotations.
• linewidths: Width of the lines that will divide each cell.
• linecolor: Color of the lines that will divide each cell.
• cbar: Whether to draw a colorbar.

All the parameters except data are optional.
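The optional seaborn.heatmap() parameters listed earlier can be tried out together. The following is a minimal sketch under stated assumptions: the 6x6 matrix of signed integers, the seed, the "coolwarm" colormap, and the output file name are all chosen for illustration, and the Agg backend is selected so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical data: a fixed seed keeps the random matrix reproducible
rng = np.random.default_rng(seed=0)
data = rng.integers(low=-50, high=51, size=(6, 6))  # integers from -50 to 50

ax = sns.heatmap(
    data,
    vmin=-50, vmax=50,   # anchor the colormap to a fixed value range
    cmap="coolwarm",     # diverging colormap, suited to signed data
    center=0,            # value mapped to the middle of the colormap
    annot=True,          # write the data value in each cell
    fmt="d",             # format the annotations as integers
    linewidths=0.5,      # width of the lines dividing the cells
    linecolor="white",   # color of the dividing lines
    cbar=True,           # draw the colorbar
)
ax.set_title("Heatmap with optional parameters")
plt.savefig("heatmap_options.png")  # hypothetical output file name
```

Fixing vmin and vmax keeps the color scale comparable across several heatmaps, whereas leaving them unset lets seaborn infer the range from each dataset.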