Measures of Relative Standing and Boxplots

mmirfattah 901 views 15 slides Jul 09, 2021

About This Presentation

Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 3: Describing, Exploring, and Comparing Data
3.3: Measures of Relative Standing and Boxplots


Slide Content

Elementary Statistics
Chapter 3: Describing, Exploring, and Comparing Data
3.3 Measures of Relative Standing and Boxplots

Chapter 3: Describing, Exploring, and Comparing Data
3.1 Measures of Center
3.2 Measures of Variation
3.3 Measures of Relative Standing and Boxplots
Objectives:
Summarize data, using measures of central tendency, such as the mean, median, mode, and midrange.
Describe data, using measures of variation, such as the range, variance, and standard deviation.
Identify the position of a data value in a data set, using various measures of position, such as percentiles, deciles, and quartiles.
Use the techniques of exploratory data analysis, including boxplots and five-number summaries, to discover various aspects of data.

z Scores
A z score (or standard score or standardized value) is the number of standard deviations that a given value x is above or below the mean. It is obtained by subtracting the mean from the value and dividing the result by the standard deviation.
Sample: z = (x − x̄) / s
Population: z = (x − μ) / σ
Significant values: z scores ≤ −2.00 or z scores ≥ 2.00
Important Properties of z Scores
A z score is the number of standard deviations that a given value x is above or below the mean.
z scores are expressed as numbers with no units of measurement.
A data value is significantly low if its z score is less than or equal to −2, and significantly high if its z score is greater than or equal to +2.
If an individual data value is less than the mean, its corresponding z score is a negative number.
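The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sample formula z = (x − x̄) / s; the function names `z_scores` and `classify` and the small data set are hypothetical and do not come from the slides.

```python
import statistics


def z_scores(values):
    """Return the z score of each value, using the sample mean and sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - mean) / s for x in values]


def classify(z):
    """Apply the cutoffs from the text: z <= -2 is significantly low, z >= 2 is significantly high."""
    if z <= -2:
        return "significantly low"
    if z >= 2:
        return "significantly high"
    return "not significant"


# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9, 30]
for x, z in zip(sample, z_scores(sample)):
    print(f"x = {x:>2}  z = {z:+.2f}  {classify(z)}")
```

Because z scores have no units, the same `classify` cutoffs apply whatever the original measurement scale is.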

Example 1
Which of the following two data values is more extreme relative to the data set from which it came?
The 4000 g weight of a baby
The temperature of an adult
Compare the z scores of the two values: the value whose z score is farther from 0 is the more extreme one.

Example 2
A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; she scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative positions on the two tests.
Calculus: z = (65 − 50) / 10 = 1.50
History: z = (30 − 25) / 5 = 1.00
Because 1.50 > 1.00, she has a higher relative position in the calculus class.
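The comparison in Example 2 can be checked with a short Python sketch. The helper name `z_score` is an assumption made for illustration; the numbers are the ones given in the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Values from Example 2.
calculus_z = z_score(65, mean=50, sd=10)
history_z = z_score(30, mean=25, sd=5)

# The larger z score marks the better relative position.
better = "calculus" if calculus_z > history_z else "history"
print(calculus_z, history_z, better)  # 1.5 1.0 calculus
```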

Example 3
The lowest platelet count in a dataset is 75 (platelet counts are measured in 1000 cells/μL). Is this significantly low?
Interpretation: The platelet count of 75 is significantly low, which by the criterion for z scores means its z score is less than or equal to −2. Platelets clump together and form clots to stop the bleeding during injury.

Percentiles
Percentiles are measures of location, denoted P1, P2, ..., P99, which divide a set of data into 100 groups with about 1% of the values in each group.
The percentile rank of a datum is the percentage of data values below the datum (round the result to the nearest whole number):
Percentile of value x = (number of values less than x) / (total number of values) × 100
To convert a percentile to a data value, compute the locator L = (k / 100) · n, where:
n = total number of values in the data set
k = percentile being used (Example: For the 25th percentile, k = 25.)
L = locator that gives the position of a value (Example: For the 12th value in the sorted list, L = 12.)
Pk = kth percentile (Example: P25 is the 25th percentile.)

Example 4
Fifty (50) cell phone data speeds listed below are arranged in increasing order. Find the percentile for the data speed of 11.8 Mbps.
0.8 1.4 1.8 1.9 3.2 3.6 4.5 4.5 4.6 6.2
6.5 7.7 7.9 9.9 10.2 10.3 10.9 11.1 11.1 11.6
11.8 12.0 13.1 13.5 13.7 14.1 14.2 14.7 15.0 15.1
15.5 15.8 16.0 17.5 18.2 20.2 21.1 21.5 22.2 22.4
23.1 24.5 25.7 28.5 34.6 38.5 43.0 55.6 71.3 77.8
Solution: 20 of the 50 values are less than 11.8, so the percentile of 11.8 = 20 / 50 × 100 = 40.
Interpretation: A data speed of 11.8 Mbps is in the 40th percentile and separates the lowest 40% of values from the highest 60% of values. We have P40 = 11.8 Mbps.
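The count-and-divide calculation in Example 4 can be sketched in Python as follows. The function name `percentile_of` is hypothetical; the data are the 50 speeds listed in the example.

```python
# The 50 cell phone data speeds (Mbps) from Example 4, already sorted.
speeds = [
    0.8, 1.4, 1.8, 1.9, 3.2, 3.6, 4.5, 4.5, 4.6, 6.2,
    6.5, 7.7, 7.9, 9.9, 10.2, 10.3, 10.9, 11.1, 11.1, 11.6,
    11.8, 12.0, 13.1, 13.5, 13.7, 14.1, 14.2, 14.7, 15.0, 15.1,
    15.5, 15.8, 16.0, 17.5, 18.2, 20.2, 21.1, 21.5, 22.2, 22.4,
    23.1, 24.5, 25.7, 28.5, 34.6, 38.5, 43.0, 55.6, 71.3, 77.8,
]


def percentile_of(values, x):
    """Percentage of values strictly less than x, rounded to a whole number."""
    below = sum(1 for v in values if v < x)
    return round(100 * below / len(values))


print(percentile_of(speeds, 11.8))  # 40
```

Note that Python's built-in `round` sends exact halves to the nearest even integer, which can differ from rounding by hand in borderline cases.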

Example 5
For the 50 cell phone data speeds listed, find a) the 20th percentile (P20), and b) the 87th percentile (P87).
0.8 1.4 1.8 1.9 3.2 3.6 4.5 4.5 4.6 6.2
6.5 7.7 7.9 9.9 10.2 10.3 10.9 11.1 11.1 11.6
11.8 12.0 13.1 13.5 13.7 14.1 14.2 14.7 15.0 15.1
15.5 15.8 16.0 17.5 18.2 20.2 21.1 21.5 22.2 22.4
23.1 24.5 25.7 28.5 34.6 38.5 43.0 55.6 71.3 77.8
Converting a Percentile to a Data Value
Procedure:
Rank the data (low to high).
Find L = (k / 100) · n (k = percentile, n = sample size).
If L is not a whole number, round it up to the next whole number; the Lth value is the kth percentile.
If L is a whole number, the kth percentile is the average of the Lth value and the next one in your data.
a) Solution: k = 20, n = 50, so L = (20 / 100) · 50 = 10.
Whole number: The value of the 20th percentile is between the Lth (10th) value and the (L + 1)st (11th) value.
The 20th percentile: P20 = (6.2 + 6.5) / 2 = 6.35 Mbps.
b) Solution: k = 87, n = 50, so L = (87 / 100) · 50 = 43.5.
Not a whole number: always round up, so L = 44. The value of the 87th percentile is the 44th value: P87 = 28.5 Mbps.
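The procedure above can be written as a small Python function. This is a sketch of the locator rule as described on the slide (round up when L is not whole, average two neighbors when it is), not of the interpolation methods that statistical software commonly uses; the name `percentile_value` is an assumption.

```python
# The 50 cell phone data speeds (Mbps) from Example 5, already sorted.
speeds = [
    0.8, 1.4, 1.8, 1.9, 3.2, 3.6, 4.5, 4.5, 4.6, 6.2,
    6.5, 7.7, 7.9, 9.9, 10.2, 10.3, 10.9, 11.1, 11.1, 11.6,
    11.8, 12.0, 13.1, 13.5, 13.7, 14.1, 14.2, 14.7, 15.0, 15.1,
    15.5, 15.8, 16.0, 17.5, 18.2, 20.2, 21.1, 21.5, 22.2, 22.4,
    23.1, 24.5, 25.7, 28.5, 34.6, 38.5, 43.0, 55.6, 71.3, 77.8,
]


def percentile_value(sorted_values, k):
    """Return P_k for k in 1..99 using the locator L = (k / 100) * n."""
    if not 1 <= k <= 99:
        raise ValueError("k must be between 1 and 99")
    n = len(sorted_values)
    # Integer arithmetic avoids floating-point doubt about whether L is whole.
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        # L is a whole number: average the L-th and (L + 1)-th values (1-based).
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # L is not whole: round up, then take that value (1-based position whole + 1).
    return sorted_values[whole]


print(percentile_value(speeds, 20))  # about 6.35 (average of the 10th and 11th values)
print(percentile_value(speeds, 87))  # 28.5 (the 44th value)
```

Different technologies use different percentile conventions, so library routines may not reproduce these hand-calculated values exactly.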

Quartiles
Quartiles are measures of location, denoted Q1, Q2, and Q3, which divide a set of data into four groups with about 25% of the values in each group (Q1 = P25, Q2 = MD, the median, Q3 = P75).
Q1 (first quartile, or P25): It separates the bottom 25% of the sorted values from the top 75%.
Q2 (second quartile, or P50): It is the same as the median and separates the bottom 50% of the sorted values from the top 50%.
Q3 (third quartile, or P75): It separates the bottom 75% of the sorted values from the top 25%.
Caution: Just as there is not universal agreement on a procedure for finding percentiles, there is not universal agreement on a single procedure for calculating quartiles, and different technologies often yield different results.
Deciles separate the data set into 10 equal groups: D1 = P10, D4 = P40.
The interquartile range is IQR = Q3 − Q1.
Finding the quartiles:
Step 1 Arrange the data in order from lowest to highest.
Step 2 Find the median of the data values. This is the value for Q2.
Step 3 Find the median of the data values that fall below Q2. This is the value for Q1.
Step 4 Find the median of the data values that fall above Q2. This is the value for Q3.

Example 7
Given the fifty (50) data speeds listed, find the 5-number summary.
0.8 1.4 1.8 1.9 3.2 3.6 4.5 4.5 4.6 6.2
6.5 7.7 7.9 9.9 10.2 10.3 10.9 11.1 11.1 11.6
11.8 12.0 13.1 13.5 13.7 14.1 14.2 14.7 15.0 15.1
15.5 15.8 16.0 17.5 18.2 20.2 21.1 21.5 22.2 22.4
23.1 24.5 25.7 28.5 34.6 38.5 43.0 55.6 71.3 77.8
The 5-number summary consists of:
Minimum
First quartile, Q1
Second quartile, Q2 (same as the median)
Third quartile, Q3
Maximum
Solution:
Min = 0.8 Mbps and Max = 77.8 Mbps
The median is MD = Q2 = (13.7 + 14.1) / 2 = 13.9 Mbps.
Q1 = 7.9 Mbps (median of the bottom half)
Q3 = 21.5 Mbps (median of the top half)
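The median-of-halves approach used in Example 7 can be sketched in Python. The function name `five_number_summary` is hypothetical, and the handling of an odd number of values (leaving the median itself out of both halves) is an assumption, since the example only covers an even count.

```python
import statistics

# The 50 cell phone data speeds (Mbps) from Example 7.
speeds = [
    0.8, 1.4, 1.8, 1.9, 3.2, 3.6, 4.5, 4.5, 4.6, 6.2,
    6.5, 7.7, 7.9, 9.9, 10.2, 10.3, 10.9, 11.1, 11.1, 11.6,
    11.8, 12.0, 13.1, 13.5, 13.7, 14.1, 14.2, 14.7, 15.0, 15.1,
    15.5, 15.8, 16.0, 17.5, 18.2, 20.2, 21.1, 21.5, 22.2, 22.4,
    23.1, 24.5, 25.7, 28.5, 34.6, 38.5, 43.0, 55.6, 71.3, 77.8,
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # values below the median position
    upper = data[(n + 1) // 2:]   # values above the median position
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


minimum, q1, q2, q3, maximum = five_number_summary(speeds)
print(minimum, q1, round(q2, 1), q3, maximum)
```

For the 50 speeds this gives the same quartiles as the hand calculation: Q1 is the 13th value and Q3 is the 38th value.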

Example 8
Given the fifty (50) data speeds listed, construct a boxplot. Identify outliers, if any.
0.8 1.4 1.8 1.9 3.2 3.6 4.5 4.5 4.6 6.2
6.5 7.7 7.9 9.9 10.2 10.3 10.9 11.1 11.1 11.6
11.8 12.0 13.1 13.5 13.7 14.1 14.2 14.7 15.0 15.1
15.5 15.8 16.0 17.5 18.2 20.2 21.1 21.5 22.2 22.4
23.1 24.5 25.7 28.5 34.6 38.5 43.0 55.6 71.3 77.8
Boxplots
A boxplot (or box-and-whisker diagram) is a graph that consists of a line extending from the minimum to the maximum value, and a box with lines drawn at the first quartile Q1, the median, and the third quartile Q3.
Constructing a Boxplot
Find the 5-number summary (Min, Q1, Q2, Q3, Max).
Construct a line segment extending from the minimum to the maximum data value.
Construct a box (rectangle) extending from Q1 to Q3, and draw a line in the box at the value of Q2 (median).
Solution:
Min = 0.8, Q1 = 7.9, Q2 = 13.9, Q3 = 21.5, Max = 77.8
IQR = Q3 − Q1 = 21.5 − 7.9 = 13.6, and 1.5 × IQR = 20.4
Q1 − 1.5 × IQR = 7.9 − 20.4 = −12.5 and Q3 + 1.5 × IQR = 21.5 + 20.4 = 41.9
Any numbers smaller than −12.5 or larger than 41.9 are considered outliers.
Outliers: 43.0, 55.6, 71.3, and 77.8

Skewness
A boxplot can often be used to identify skewness. A distribution of data is skewed if it is not symmetric and extends more to one side than to the other.
Modified Boxplots
A modified boxplot is a regular boxplot constructed with these modifications: a special symbol (such as an asterisk or point) is used to identify outliers as defined below, and the solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.
Identifying Outliers for Modified Boxplots
An outlier is a value that lies very far away from the vast majority of the other values in a data set.
Find the quartiles Q1, Q2, and Q3.
Find the interquartile range (IQR), where IQR = Q3 − Q1.
Evaluate 1.5 × IQR.
In a modified boxplot, a data value is an outlier if it is above Q3 by an amount greater than 1.5 × IQR, or below Q1 by an amount greater than 1.5 × IQR.
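The outlier rule and the whisker endpoints of a modified boxplot can be sketched in Python. The function name `modified_boxplot_parts` and its dictionary layout are assumptions made for illustration; the quartiles Q1 = 7.9 and Q3 = 21.5 are the values found for the 50 data speeds in the examples.

```python
# The 50 cell phone data speeds (Mbps) used in the examples.
speeds = [
    0.8, 1.4, 1.8, 1.9, 3.2, 3.6, 4.5, 4.5, 4.6, 6.2,
    6.5, 7.7, 7.9, 9.9, 10.2, 10.3, 10.9, 11.1, 11.1, 11.6,
    11.8, 12.0, 13.1, 13.5, 13.7, 14.1, 14.2, 14.7, 15.0, 15.1,
    15.5, 15.8, 16.0, 17.5, 18.2, 20.2, 21.1, 21.5, 22.2, 22.4,
    23.1, 24.5, 25.7, 28.5, 34.6, 38.5, 43.0, 55.6, 71.3, 77.8,
]


def modified_boxplot_parts(values, q1, q3):
    """Return the outlier cutoffs, the outliers, and the whisker endpoints."""
    iqr = q3 - q1
    low_cutoff = q1 - 1.5 * iqr
    high_cutoff = q3 + 1.5 * iqr
    # Outliers lie more than 1.5 * IQR below Q1 or above Q3.
    outliers = [v for v in values if v < low_cutoff or v > high_cutoff]
    # Whiskers stop at the most extreme values that are not outliers.
    inside = [v for v in values if low_cutoff <= v <= high_cutoff]
    return {
        "cutoffs": (low_cutoff, high_cutoff),
        "outliers": outliers,
        "whiskers": (min(inside), max(inside)),
    }


parts = modified_boxplot_parts(speeds, q1=7.9, q3=21.5)
print(parts["outliers"])  # [43.0, 55.6, 71.3, 77.8]
print(parts["whiskers"])  # (0.8, 38.5)
```

The long upper whisker and the four high outliers both point to a distribution that extends further to the right than to the left.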

Example 9
Given the fifty (50) data speeds listed, find the 40th percentile, denoted by P40.
0.8 1.4 1.8 1.9 3.2 3.6 4.5 4.5 4.6 6.2
6.5 7.7 7.9 9.9 10.2 10.3 10.9 11.1 11.1 11.6
11.8 12.0 13.1 13.5 13.7 14.1 14.2 14.7 15.0 15.1
15.5 15.8 16.0 17.5 18.2 20.2 21.1 21.5 22.2 22.4
23.1 24.5 25.7 28.5 34.6 38.5 43.0 55.6 71.3 77.8
Converting a Percentile to a Data Value
Procedure:
Rank the data (low to high).
Find L = (k / 100) · n (k = percentile, n = sample size).
If L is not a whole number, round it up to the next whole number; the Lth value is the kth percentile.
If L is a whole number, the kth percentile is the average of the Lth value and the next one in your data.
Solution: k = 40, n = 50, so L = (40 / 100) · 50 = 20.
Whole number: The value of the 40th percentile is between the Lth (20th) value and the 21st value.
The 40th percentile is P40 = (11.6 + 11.8) / 2 = 11.7 Mbps.

Example 10
A teacher gives a 20-point test to 10 students, with scores 18, 15, 12, 6, 8, 2, 3, 5, 20, 10.
a) Find the percentile rank of a score of 12.
b) Find the value corresponding to the 25th percentile.
Sort in ascending order: 2, 3, 5, 6, 8, 10, 12, 15, 18, 20
a) There are 6 values below 12. Adding 0.5 to that count (the convention that yields the 65% reported here) gives a percentile rank of (6 + 0.5) / 10 × 100 = 65.
Interpretation: A student whose score was 12 did better than 65% of the class.
b) k = 25, n = 10, so L = (25 / 100) · 10 = 2.5. L is not a whole number, so round up to 3; the 3rd value in the sorted list is 5.
Interpretation: The value 5 corresponds to the 25th percentile.