Unit3-v1-Plotting and Visualization.pptx

yerrasaniayyapparedd 52 views 73 slides Aug 02, 2024

Slide Content

Unit 3
Customizing Plots: Introduction to Matplotlib, plots, making subplots, controlling axes, ticks, labels and legends, annotations and drawing on subplots, saving plots to files, matplotlib configuration using different plot styles, Seaborn library.
Making sense of data through advanced visualization: Controlling line properties of a chart, creating multiple plots, scatter plot, line plot, bar plot, histogram, box plot, pair plot, playing with text, styling your plot, 3D plot of a surface.

Introduction to Matplotlib: Plotting and Visualization
Making informative visualizations (sometimes called plots) is one of the most important tasks in data analysis. It may be a part of the exploratory process, for example, to help identify outliers or needed data transformations, or as a way of generating ideas for models.
There are two primary uses for data visualization:
- To explore data
- To communicate data
matplotlib API Primer: Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.

Installation of Matplotlib
If you have Python and pip already installed on a system, installing Matplotlib is straightforward. Install it using this command:
    pip install matplotlib
Most of the Matplotlib utilities lie under the pyplot submodule and are usually imported under the plt alias. With matplotlib, we use the following import convention:
    import matplotlib.pyplot as plt

Plotting x and y points
The plot() function is used to draw points (markers) in a diagram. By default, the plot() function draws a line from point to point. The function takes parameters for specifying points in the diagram:
- Parameter 1 is an array containing the points on the x-axis (the horizontal axis).
- Parameter 2 is an array containing the points on the y-axis (the vertical axis).
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays, [1, 8] and [3, 10], to the plot function.
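The line from (1, 3) to (8, 10) described above can be sketched as follows. This is a minimal illustration; the variable names xpoints and ypoints are chosen here and are not from the slides, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# x-coordinates of the two end points, then their y-coordinates
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

fig, ax = plt.subplots()
# plot() connects consecutive points with a line by default
(line,) = ax.plot(xpoints, ypoints)
ax.set_xlabel("x-axis (horizontal)")
ax.set_ylabel("y-axis (vertical)")
```

In an interactive session, calling plt.show() afterwards displays the figure.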

Ex1: Creating a simple plot
    import numpy as np
    import matplotlib.pyplot as plt
    x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
    y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
    plt.plot(x, y)
    plt.title("Sports Watch Data")
    plt.xlabel("Average Pulse")
    plt.ylabel("Calorie Burnage")
    plt.show()

Line Plot

Figures and Subplots
Plots in matplotlib reside within a Figure object. You can create a new figure with plt.figure:
    fig = plt.figure()
In IPython, an empty plot window will appear, but in Jupyter nothing will be shown until we use a few more commands. plt.figure has a number of options; notably, figsize will guarantee the figure has a certain size and aspect ratio if saved to disk.
You can't make a plot with a blank figure. You have to create one or more subplots using add_subplot:
    ax1 = fig.add_subplot(2, 2, 1)
This means that the figure should be 2 × 2 (so up to four plots in total), and we're selecting the first of four subplots (numbered from 1).

If you create the next two subplots, you'll end up with a visualization that looks like Figure 9-2:
    ax2 = fig.add_subplot(2, 2, 2)
    ax3 = fig.add_subplot(2, 2, 3)

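If we add the following command, you'll get something like Figure 9-3:
    plt.plot(np.random.randn(50).cumsum(), 'k--')
The 'k--' is a style option instructing matplotlib to plot a black dashed line. A pyplot-level command like plt.plot draws on the last figure and subplot used, so here the line appears in the third subplot.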
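The subplot steps above can be collected into one runnable sketch. The histogram and scatter contents, the figsize, and the seeded random generator are illustrative choices made here, not part of the slides; calling methods on each axes object directly avoids relying on which subplot is currently active.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the figure is reproducible

fig = plt.figure(figsize=(8, 6))
ax1 = fig.add_subplot(2, 2, 1)  # top-left cell of a 2 x 2 grid
ax2 = fig.add_subplot(2, 2, 2)  # top-right cell
ax3 = fig.add_subplot(2, 2, 3)  # bottom-left cell; the fourth cell stays empty

# Draw on each subplot through its own axes object
ax1.hist(rng.standard_normal(100), bins=20, color="k", alpha=0.3)
ax2.scatter(np.arange(30), np.arange(30) + 3 * rng.standard_normal(30))
ax3.plot(rng.standard_normal(50).cumsum(), "k--")  # black dashed line
```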

Adjusting the spacing around subplots
By default matplotlib leaves a certain amount of padding around the outside of the subplots and spacing between subplots. You can change the spacing using the subplots_adjust method on Figure objects, also available as a top-level function:
    subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=None, hspace=None)
wspace and hspace control the percent of the figure width and figure height, respectively, to use as spacing between subplots:
    plt.subplots_adjust(wspace=0, hspace=0)
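As a rough sketch of the effect, the following builds a 2 × 2 grid with shared axes and removes the spacing between the subplots. The histogram data and the use of plt.subplots with sharex and sharey are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)

# 2 x 2 grid whose subplots share both axes
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i, j].hist(rng.standard_normal(500), bins=50, color="k", alpha=0.5)

# Shrink the horizontal and vertical gaps between subplots to zero
fig.subplots_adjust(wspace=0, hspace=0)
```

With the gaps removed, tick labels of neighbouring subplots can overlap, so the tick locations may need adjusting by hand.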

Colors, Markers, and Line Styles
Matplotlib's main plot function accepts arrays of x and y coordinates and optionally a string abbreviation indicating color and line style. For example, to plot x versus y with green dashes, you would execute:
    ax.plot(x, y, 'g--')
The same plot can be expressed more explicitly as:
    ax.plot(x, y, linestyle='--', color='g')
Line plots can additionally have markers to highlight the actual data points. The marker can be part of the style string, which must have color followed by marker type and line style:
    from numpy.random import randn
    plt.plot(randn(30).cumsum(), 'ko--')

O/p

Ticks, Labels, and Legends
The marker example could also have been written more explicitly as:
    plt.plot(randn(30).cumsum(), color='k', linestyle='dashed', marker='o')
The pyplot interface, designed for interactive use, consists of functions like xlim and xticks. These control the plot range and the tick locations and labels, respectively. They can be used in two ways:
- Called with no arguments, they return the current parameter value (e.g., plt.xlim() returns the current x-axis plotting range).
- Called with parameters, they set the parameter value (e.g., plt.xlim([0, 10]) sets the x-axis range to 0 to 10).
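The two calling styles can be seen side by side in a short sketch. The plotted data and the tick label strings are made up for the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.arange(20))

default_range = plt.xlim()  # no arguments: returns the current x-axis range
plt.xlim([0, 10])           # with arguments: sets the x-axis range
new_range = plt.xlim()

# xticks takes the tick locations and, optionally, the labels to show there
plt.xticks([0, 5, 10], ["start", "middle", "end"])
tick_labels = [label.get_text() for label in ax.get_xticklabels()]
```

These pyplot functions act on the active axes; the axes methods ax.get_xlim, ax.set_xlim, and ax.set_xticks do the same job on a specific subplot.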

Setting the title, axis labels, ticks, and ticklabels
To illustrate customizing the axes, create a simple figure and plot a random walk (see Figure 9-8):
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(np.random.randn(1000).cumsum())

To change the x-axis ticks, it's easiest to use set_xticks and set_xticklabels. The former instructs matplotlib where to place the ticks along the data range; by default these locations will also be the labels. But we can set any other values as the labels using set_xticklabels:
    ticks = ax.set_xticks([0, 250, 500, 750, 1000])
    labels = ax.set_xticklabels(['one', 'two', 'three', 'four', 'five'],
                                rotation=30, fontsize='small')
The rotation option sets the x tick labels at a 30-degree rotation. Lastly, set_xlabel gives a name to the x-axis and set_title sets the subplot title (see Figure 9-9 for the resulting figure):
    ax.set_title('My first matplotlib plot')
    ax.set_xlabel('Stages')

Adding legends
Legends identify plot elements. The easiest way to add one is to pass the label argument when adding each piece of the plot:
    from numpy.random import randn
    fig = plt.figure(); ax = fig.add_subplot(1, 1, 1)
    ax.plot(randn(1000).cumsum(), 'k', label='one')
    ax.plot(randn(1000).cumsum(), 'k--', label='two')
    ax.plot(randn(1000).cumsum(), 'k.', label='three')

Once you've done this, you can either call ax.legend() or plt.legend() to automatically create a legend. The resulting plot is in Figure 9-10:
    ax.legend(loc='best')

Annotations and Drawing on a Subplot
You may wish to draw your own plot annotations, which could consist of text, arrows, or other shapes. You can add annotations and text using the text, arrow, and annotate functions. text draws text at given coordinates (x, y) on the plot with optional custom styling:
    ax.text(x, y, 'Hello world!', family='monospace', fontsize=10)
Annotations can draw both text and arrows arranged appropriately.

    from datetime import datetime
    import matplotlib.pyplot as plt
    import pandas as pd
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    data = pd.read_csv('examples/spx.csv', index_col=0, parse_dates=True)
    spx = data['SPX']
    spx.plot(ax=ax, style='k-')
    crisis_data = [
        (datetime(2007, 10, 11), 'Peak of bull market'),
        (datetime(2008, 3, 12), 'Bear Stearns Fails'),
        (datetime(2008, 9, 15), 'Lehman Bankruptcy'),
    ]
    # Label each date with an arrow pointing at the index value on that day
    for date, label in crisis_data:
        ax.annotate(label,
                    xy=(date, spx.asof(date) + 75),
                    xytext=(date, spx.asof(date) + 225),
                    arrowprops=dict(facecolor='black', headwidth=4, width=2, headlength=4),
                    horizontalalignment='left', verticalalignment='top')
    # Zoom in on 2007-2010
    ax.set_xlim(['1/1/2007', '1/1/2011'])
    ax.set_ylim([600, 1800])
    ax.set_title('Important dates in the 2008-2009 financial crisis')

Annotation

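Saving Plots to File
You can save the active figure to a file using plt.savefig. The file type is inferred from the file extension:
    plt.savefig('figpath.svg')
    plt.savefig('figpath.png', dpi=400, bbox_inches='tight')
Here dpi controls the dots-per-inch resolution and bbox_inches='tight' trims the whitespace around the figure. savefig doesn't have to write to disk; it can also write to any file-like object, such as a BytesIO:
    from io import BytesIO
    buffer = BytesIO()
    plt.savefig(buffer)
    plot_data = buffer.getvalue()
See Table 9-2 for a list of some other options for savefig.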
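A small self-contained sketch of the in-memory variant is shown here. The plotted data is arbitrary, and format="png" is passed explicitly as an assumption of this example, since a buffer has no file extension to infer the type from.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from io import BytesIO

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Write the figure into an in-memory buffer instead of a file on disk
buffer = BytesIO()
fig.savefig(buffer, format="png", dpi=100, bbox_inches="tight")
plot_data = buffer.getvalue()  # raw PNG bytes

# Every PNG file starts with the same 8-byte signature
is_png = plot_data[:8] == b"\x89PNG\r\n\x1a\n"
```

The resulting bytes can be served from a web application or embedded in a report without touching the file system.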

matplotlib Configuration
One way to modify the configuration programmatically from Python is to use the rc method; for example, to set the global default figure size to be 10 × 10, you could enter:
    plt.rc('figure', figsize=(10, 10))
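The following sketch shows the rc call taking effect on a newly created figure and then being undone. The extra 'lines' settings and the call to plt.rcdefaults() are additions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# First argument: the component to customize; keyword arguments: its new defaults
plt.rc("figure", figsize=(10, 10))
plt.rc("lines", linewidth=2, linestyle="--")

figsize_setting = list(plt.rcParams["figure.figsize"])

# Figures created from now on pick up the new default size
fig = plt.figure()
size = list(fig.get_size_inches())

# Restore matplotlib's built-in defaults so later plots are unaffected
plt.rcdefaults()
```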

Plotting with pandas and seaborn
In pandas we may have multiple columns of data, along with row and column labels. pandas itself has built-in methods that simplify creating visualizations from DataFrame and Series objects. Another library is seaborn, a statistical graphics library created by Michael Waskom. Seaborn simplifies creating many common visualization types.

Line Plots
Series and DataFrame each have a plot attribute for making some basic plot types. By default, plot() makes line plots (see Figure 9-13):
    s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
    s.plot()

DataFrame's plot method plots each of its columns as a different line on the same subplot, creating a legend automatically (see Figure 9-14):
    df = pd.DataFrame(np.random.randn(10, 4).cumsum(0),
                      columns=['A', 'B', 'C', 'D'],
                      index=np.arange(0, 100, 10))
    df.plot()

DataFrame Plot

Bar Plots
plot.bar() and plot.barh() make vertical and horizontal bar plots, respectively. In this case, the Series or DataFrame index will be used as the x (bar) or y (barh) ticks (see Figure 9-15):
    fig, axes = plt.subplots(2, 1)
    data = pd.Series(np.random.rand(16), index=list('abcdefghijklmnop'))
    data.plot.bar(ax=axes[0], color='k', alpha=0.7)
    data.plot.barh(ax=axes[1], color='k', alpha=0.7)  # h = horizontal

The options color='k' and alpha=0.7 set the color of the plots to black and use partial transparency on the filling.

Refer: Textbook 286, Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, Wes McKinney.

For example, making simple plots (like Figure 3-1) is pretty simple:
    from matplotlib import pyplot as plt
    years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
    gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]
    # create a line chart, years on x-axis, gdp on y-axis
    plt.plot(years, gdp, color='green', marker='o', linestyle='solid')
    # add a title
    plt.title("Nominal GDP")
    # add a label to the y-axis
    plt.ylabel("Billions of $")
    plt.show()

Simple line chart

Bar Charts
A bar chart is a good choice when you want to show how some quantity varies among some discrete set of items. For instance, Figure 3-2 shows how many Academy Awards were won by each of a variety of movies:
    movies = ["Annie Hall", "Ben-Hur", "Casablanca", "Gandhi", "West Side Story"]
    num_oscars = [5, 11, 3, 8, 10]
    # plot bars at x-coordinates [0, 1, 2, 3, 4] with heights [num_oscars]
    plt.bar(range(len(movies)), num_oscars)
    plt.title("My Favorite Movies")     # add a title
    plt.ylabel("# of Academy Awards")   # label the y-axis
    # label x-axis with movie names at bar centers
    plt.xticks(range(len(movies)), movies)
    plt.show()

Bar chart

A bar chart can also be a good choice for plotting histograms of bucketed numeric values, as in Figure 3-3, in order to visually explore how the values are distributed:
    from collections import Counter
    grades = [83, 95, 91, 87, 70, 0, 85, 82, 100, 67, 73, 77, 0]
    # Bucket grades by decile, but put 100 in with the 90s
    histogram = Counter(min(grade // 10 * 10, 90) for grade in grades)
    plt.bar([x + 5 for x in histogram.keys()],  # Shift bars right by 5
            histogram.values(),                 # Give each bar its correct height
            10,                                 # Give each bar a width of 10
            edgecolor=(0, 0, 0))                # Black edges for each bar
    plt.axis([-5, 105, 0, 5])                   # x-axis from -5 to 105, y-axis from 0 to 5
    plt.xticks([10 * i for i in range(11)])     # x-axis labels at 0, 10, ..., 100
    plt.xlabel("Decile")
    plt.ylabel("# of Students")
    plt.title("Distribution of Exam 1 Grades")
    plt.show()

Bar chart for histogram

Line Charts
As we saw already, we can make line charts using plt.plot. These are a good choice for showing trends, as illustrated in Figure 3-6:
    variance = [1, 2, 4, 8, 16, 32, 64, 128, 256]
    bias_squared = [256, 128, 64, 32, 16, 8, 4, 2, 1]
    total_error = [x + y for x, y in zip(variance, bias_squared)]
    xs = [i for i, _ in enumerate(variance)]
    # We can make multiple calls to plt.plot
    # to show multiple series on the same chart
    plt.plot(xs, variance, 'g-', label='variance')        # green solid line
    plt.plot(xs, bias_squared, 'r-.', label='bias^2')     # red dot-dashed line
    plt.plot(xs, total_error, 'b:', label='total error')  # blue dotted line

Bias-variance tradeoff
    # Because we've assigned labels to each series,
    # we can get a legend for free (loc=9 means "top center")
    plt.legend(loc=9)
    plt.xlabel("model complexity")
    plt.xticks([])
    plt.title("The Bias-Variance Tradeoff")
    plt.show()

Scatterplots A scatterplot is the right choice for visualizing the relationship between two paired sets of data. For example, Figure 3-7 illustrates the relationship between the number of friends your users have and the number of minutes they spend on the site every day.
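The slide gives no code for this figure, so the following is only a sketch of how such a scatterplot could be drawn. The friends and minutes values and the single-letter user labels are illustrative data, not results from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Illustrative paired data: one (friends, minutes) pair per user
friends = [70, 65, 72, 63, 71, 64, 60, 64, 67]
minutes = [175, 170, 205, 120, 220, 130, 105, 145, 190]
labels = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]

fig, ax = plt.subplots()
points = ax.scatter(friends, minutes)

# Put each user's label slightly offset from its point
for label, friend_count, minute_count in zip(labels, friends, minutes):
    ax.annotate(label,
                xy=(friend_count, minute_count),
                xytext=(5, -5),
                textcoords="offset points")

ax.set_title("Daily Minutes vs. Number of Friends")
ax.set_xlabel("# of friends")
ax.set_ylabel("daily minutes spent on the site")
```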

Scatter plot

Additional Materials
Refer: https://www.geeksforgeeks.org/overlapping-histograms-with-matplotlib-in-python/?ref=lbp
Practise Programs:
    # import the library
    import matplotlib.pyplot as plt
    # Creation of Data
    x1 = ['math', 'english', 'science', 'Hindi', 'social studies']
    y1 = [92, 54, 63, 75, 53]
    y2 = [86, 44, 65, 98, 85]
    # Plotting the Data
    plt.plot(x1, y1, label='Semester1')
    plt.plot(x1, y2, label='semester2')
    plt.xlabel('subjects')
    plt.ylabel('marks')
    plt.title("marks obtained in 2010")
    # Overlay thick green lines with circle markers on the same points
    plt.plot(y1, 'og', linestyle='--', linewidth=8)
    plt.plot(y2, 'og', linestyle=':', linewidth=8)
    plt.legend()
    plt.show()
Output:

Histograms and Density Plots
A histogram is a kind of bar plot that gives a discretized display of value frequency. The data points are split into discrete, evenly spaced bins, and the number of data points in each bin is plotted. In other words, a histogram shows a frequency distribution: the number of observations within each given interval.
Example: Say you ask for the height of 250 people; you might end up with a histogram like this:

Histogram
You can read from the histogram:
- 2 people from 140 to 145cm
- 5 people from 145 to 150cm
- 15 people from 151 to 156cm
- 31 people from 157 to 162cm
- 46 people from 163 to 168cm
- 53 people from 168 to 173cm
- 45 people from 173 to 178cm
- 28 people from 179 to 184cm
- 21 people from 185 to 190cm
- 4 people from 190 to 195cm

The hist() function takes an array-like dataset and plots a histogram, which is a graphical representation of the distribution of the data. Here's how you can use the hist() function to create a basic histogram:
    import matplotlib.pyplot as plt
    data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
    plt.hist(data)
    plt.show()
    # Output: a histogram with the x-axis representing the data
    # and the y-axis representing the frequency.

The ‘bins’ parameter in the hist() function determines the number of equal-width bins in the range. Let's see how changing the ‘bins’ parameter affects the histogram:
    import matplotlib.pyplot as plt
    data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
    plt.hist(data, bins=20)
    plt.show()
    # Output: a histogram with the x-axis representing the data
    # and the y-axis representing the frequency.
The range is now split into 20 narrower bins instead of the default 10. For this small dataset most of the bins are empty, so the visible bars simply become thinner.

Working with ‘range’
The ‘range’ parameter specifies the lower and upper range of the bins. Anything outside the range is ignored.
    import matplotlib.pyplot as plt
    data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
    plt.hist(data, range=[2, 3])
    plt.show()
    # Output: a histogram that only includes data within the specified range.
In this example, we've set the ‘range’ to [2, 3]. As a result, the histogram only includes the data points between 2 and 3.

Exploring ‘density’
The ‘density’ parameter, when set to True, normalizes the histogram so that the total area (the integral) under the histogram is 1. This is useful when you want to visualize the probability distribution.
    import matplotlib.pyplot as plt
    data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
    plt.hist(data, density=True)
    plt.show()
    # Output: a histogram with the x-axis representing the data and the y-axis
    # representing the probability density. The total area under the histogram is 1.
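The normalization can be checked numerically from the values that hist() returns. In this sketch, the choice of three bins over the range 1 to 4 is an assumption made to keep the arithmetic easy to follow.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
fig, (ax_count, ax_density) = plt.subplots(1, 2, figsize=(8, 3))

# Three bins of width 1: [1, 2), [2, 3), [3, 4]; the last bin includes both edges
counts, edges, _ = ax_count.hist(data, bins=3, range=(1, 4))

# density=True divides each count by (number of points * bin width)
heights, _, _ = ax_density.hist(data, bins=3, range=(1, 4), density=True)

# Area under the histogram = sum of bar height * bar width
area = float(np.sum(heights * np.diff(edges)))
```

Here counts comes out as 1, 2, and 7 (the values 3 and 4 share the last bin), and the bar areas add up to 1 within floating-point rounding.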

Histograms with Seaborn and Pandas
Seaborn: An Enhanced Visualization Library
Seaborn is a statistical plotting library built on top of Matplotlib. It provides a high-level interface for creating attractive graphics, including histograms.
    import seaborn as sns
    data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
    sns.histplot(data)
    # Output: a histogram similar to Matplotlib's but with a different style.

In this example, we use the histplot() function from Seaborn to create a histogram. The output is similar to Matplotlib's histogram, but it comes with a distinct Seaborn style.

pandas can also draw a histogram directly from a DataFrame column:
    import pandas as pd
    data = pd.DataFrame([1, 2, 2, 3, 3, 3, 4, 4, 4, 4], columns=['Values'])
    data['Values'].plot(kind='hist')
    # Output: a histogram similar to Matplotlib's but created from a DataFrame.

Scatter or Point Plots
Point plots or scatter plots can be a useful way of examining the relationship between two one-dimensional data series. For example, here we load the macrodata dataset from the statsmodels project, select a few variables, then compute log differences:
    macro = pd.read_csv('examples/macrodata.csv')
    data = macro[['cpi', 'm1', 'tbilrate', 'unemp']]
    trans_data = np.log(data).diff().dropna()
    trans_data[-5:]
Output:
              cpi        m1  tbilrate     unemp
    198 -0.007904  0.045361 -0.396881  0.105361
    199 -0.021979  0.066753 -2.277267  0.139762
    200  0.002340  0.010286  0.606136  0.160343
    201  0.008419  0.037461 -0.200671  0.127339
    202  0.008894  0.012202 -0.405465  0.042560

We can then use seaborn's regplot method, which makes a scatter plot and fits a linear regression line (see Figure 9-24). Recent seaborn releases expect the column names as the keyword arguments x and y:
    sns.regplot(x='m1', y='unemp', data=trans_data)
    plt.title('Changes in log %s versus log %s' % ('m1', 'unemp'))

In exploratory data analysis it's helpful to be able to look at all the scatter plots among a group of variables; this is known as a pairs plot or scatter plot matrix. Making such a plot from scratch is a bit of work, so seaborn has a convenient pairplot function, which supports placing histograms or density estimates of each variable along the diagonal (see Figure 9-25 for the resulting plot):
    sns.pairplot(trans_data, diag_kind='kde', plot_kws={'alpha': 0.2})

Pair plot

Three-dimensional Plotting in Python using Matplotlib
3D plots are important tools for visualizing data that have three dimensions, such as data with two independent variables and one dependent variable. Plotting such data in 3D can give a deeper understanding of how the three variables relate. Matplotlib provides several functions for 3D plots. We start by creating the 3D axes: change the projection parameter of plt.axes() from None to '3d'.
    import numpy as np
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = plt.axes(projection='3d')

3d plot
With the above syntax, three-dimensional axes are enabled and data can be plotted in three dimensions. In an interactive window a 3D graph can be rotated, which makes the data easier to explore. As with 2D graphs, there are different ways to plot 3D graphs: we can make a scatter plot, contour plot, surface plot, and so on. Graphs with lines and points are the simplest three-dimensional graphs. We will use the ax.plot3D and ax.scatter functions to plot line and point graphs, respectively.
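A minimal sketch of a 3D line and a 3D scatter on the same axes follows. The spiral data is invented for the illustration, and the mpl_toolkits import is included only as a precaution for older Matplotlib releases.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (only needed on older Matplotlib)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# A spiral: z rises steadily while (x, y) circle outwards
z = np.linspace(0, 1, 100)
x = z * np.sin(25 * z)
y = z * np.cos(25 * z)

ax.plot3D(x, y, z, "green")  # line graph through all points
# point graph through every fifth point, coloured by height
ax.scatter(x[::5], y[::5], z[::5], c=z[::5], cmap="viridis")

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```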

Creating 3D surface Plot
The axes3d module in Matplotlib's mpl_toolkits.mplot3d toolkit provides the functions used to create 3D surface plots. Surface plots are created with the ax.plot_surface() function.
Syntax: ax.plot_surface(X, Y, Z), where X and Y are 2D arrays of x and y points and Z is a 2D array of heights.
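One way to build the 2D arrays that plot_surface expects is np.meshgrid, as in this sketch. The function plotted, the grid size, and the colormap are arbitrary choices for the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# 1D coordinates expanded into 2D grids of x and y points
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)

# Height of the surface at every (x, y) grid point
Z = np.sin(np.sqrt(X**2 + Y**2))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
surf = ax.plot_surface(X, Y, Z, cmap="viridis")
fig.colorbar(surf, ax=ax, shrink=0.5)  # colour scale for the heights
ax.set_title("Surface plot of sin(sqrt(x^2 + y^2))")
```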

Pie chart
A pie chart is a circular statistical plot that can display only one series of data. The whole circle represents 100% of the data, and the area of each wedge is proportional to that item's share of the total. Pie charts are commonly used in business presentations on sales, operations, survey results, resources, and so on, as they provide a quick summary. The Matplotlib API has a pie() function in its pyplot module, which creates a pie chart representing the data in an array.
Syntax: matplotlib.pyplot.pie(data, explode=None, labels=None, colors=None, autopct=None, shadow=False)

Pie chart
    # Import libraries
    from matplotlib import pyplot as plt
    import numpy as np
    # Creating dataset
    cars = ['AUDI', 'BMW', 'FORD', 'TESLA', 'JAGUAR', 'MERCEDES']
    data = [23, 17, 35, 29, 12, 41]
    # Creating plot
    fig = plt.figure(figsize=(10, 7))
    plt.pie(data, labels=cars)
    # show plot
    plt.show()

Box Plot
A box plot, also known as a whisker plot, displays a summary of a set of data values: the minimum, first quartile, median, third quartile, and maximum. In the box plot, a box is drawn from the first quartile to the third quartile, and a line goes through the box at the median. In the default vertical orientation, each dataset gets one box at its own position along the x-axis, and the y-axis shows the data values. The matplotlib.pyplot module of the matplotlib library provides the boxplot() function, with which we can create box plots.
Syntax: matplotlib.pyplot.boxplot(data, notch=None, vert=None, patch_artist=None, widths=None)

The data values given to the ax.boxplot() method can be a NumPy array, a Python list, or a tuple of arrays. Let us create the box plot by using numpy.random.normal() to create some random data; it takes the mean, standard deviation, and the desired number of values as arguments.
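A sketch of that box plot is given below. The mean of 100, standard deviation of 20, sample size of 200, and the seed are illustrative values chosen here, not taken from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(10)  # fixed seed so the random sample is reproducible
# Arguments: mean, standard deviation, number of values
data = np.random.normal(100, 20, 200)

fig, ax = plt.subplots(figsize=(10, 7))
result = ax.boxplot(data)  # returns a dict of the artists that make up the plot

# The box spans the first to the third quartile, with a line at the median
q1, median, q3 = np.percentile(data, [25, 50, 75])
median_line_y = result["medians"][0].get_ydata()[0]
```

Passing a list of several arrays to ax.boxplot() draws one box per array side by side.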

Add Text Inside the Plot in Matplotlib
The matplotlib.pyplot.text() function is used to add text inside the plot. It adds text at an arbitrary location of the axes and also supports mathematical expressions.
Syntax: matplotlib.pyplot.text(x, y, s, fontdict=None, **kwargs)

Adding Mathematical Equations as Text Inside the Plot
This example uses Matplotlib and NumPy to generate a plot of the parabolic function y = x^2 over the range -10 to 10. The code adds the text label "Parabola $Y = x^2$" at coordinates (-5, 60) within the plot, plots the parabola in green, sets the axis labels, and displays the plot.
    import matplotlib.pyplot as plt
    import numpy as np
    x = np.arange(-10, 10, 0.01)
    y = x**2
    # adding text inside the plot
    plt.text(-5, 60, 'Parabola $Y = x^2$', fontsize=22)
    plt.plot(x, y, c='g')
    plt.xlabel("X-axis", fontsize=15)
    plt.ylabel("Y-axis", fontsize=15)
    plt.show()

Textbook: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, Wes McKinney, Second Edition.