DATA ANALYSIS FOR BUSINESS ch02-Discriptive Statistics_Tabular and Graphical Methods.ppt

kellymeinhold327 27 views 44 slides May 09, 2024

About This Presentation

DATA ANALYSIS FOR BUSINESS CH2


Slide Content

1
Chapter 2
Descriptive Statistics: Tabular and
Graphical Methods
Graphically Summarizing Qualitative Data
Graphically Summarizing Quantitative Data
Stem-and-leaf Display
Misleading Graphs and Charts

2
2.1 Graphically Summarizing Qualitative Data
With qualitative data, names identify the different
categories
This data can be summarized using a frequency
distribution
Frequency distribution: A table that summarizes
the number (or frequency) of items in each of
several non-overlapping classes.

2-3
Describing Pizza Preferences
A business entrepreneur plans to open a pizza restaurant
in a college town, and wishes to study the pizza
preferences of the college students.
Table 2.1 lists pizza preferences of 50 college students
Table 2.1 does not reveal much useful information
Table 2.1
Example 2.1

4
A frequency distribution is a
useful summary
The frequency distribution
shows us how the
preferences are distributed
among the six restaurants.
Papa John's is the most popular restaurant.
Papa John's is roughly twice as popular as each of the next three runners-up: Bruno's, Little Caesars, and Will's.
Pizza Hut and Domino’s are the least
preferred restaurants
Table 2.2

5
Relative Frequency and Percent Frequency
Relative frequency summarizes the proportion (or fraction)
of items in each class
If the data set consists of n observations,
$\text{relative frequency of a class} = \dfrac{\text{frequency of the class}}{n}$
Multiply by 100 to obtain the percent frequency.
Table 2.3

2-6
Bar Charts and Pie Charts
Bar chart: A vertical or horizontal rectangle
represents the frequency for each category
Height can be frequency, relative frequency, or
percent frequency
Pie chart: A circle divided into slices where
the size of each slice represents its relative
frequency or percent frequency

2-7
Excel Bar and Pie Chart of the Pizza
Preference Data
Figures 2.1 and 2.2

8
Exercise 2.1
Jeep Model       Frequency   Relative Frequency   Percent Frequency
Commander        71          0.2829               28.29%
Grand Cherokee   70          0.2789               27.89%
Liberty          80          0.3187               31.87%
Wrangler         30          0.1195               11.95%
Total            251         1.0000               100.00%
Table 2.4
Table 2.4 is the frequency distribution of
vehicles sold in 2006 by the Greater
Cincinnati Jeep dealers.
Please find the relative frequency and
percent frequency.
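The relative and percent frequencies asked for in Exercise 2.1 can be checked with a short script. This is a minimal sketch in Python using the frequencies from Table 2.4; the function name relative_frequencies is chosen here for illustration and is not part of the slides.

```python
# Frequencies from Table 2.4 (Jeep vehicles sold in 2006).
jeep_sales = {
    "Commander": 71,
    "Grand Cherokee": 70,
    "Liberty": 80,
    "Wrangler": 30,
}


def relative_frequencies(freqs):
    """Return {category: frequency / n}, where n is the total count."""
    n = sum(freqs.values())
    return {category: count / n for category, count in freqs.items()}


rel = relative_frequencies(jeep_sales)
for model, r in rel.items():
    # Percent frequency is the relative frequency multiplied by 100.
    print(f"{model:15s} {jeep_sales[model]:4d} {r:.4f} {r * 100:.2f}%")
```

The relative frequencies should sum to 1, which is a quick check on the arithmetic.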

9
Comparison
Percentage of Automobiles Sold by Manufacturer, 1970
versus 1997
Figures 2.3 and 2.4

2-10
2.2 Graphically Summarizing Quantitative
Data
We often need to summarize and describe the shape of the distribution of a population or sample of measurements.
Summarize quantitative data by using:
Frequency distribution: a list of data classes with the count or “frequency” of values that belong to each class
“Classify and count”
The frequency distribution is a table
Histogram: a picture of the frequency distribution

11
Constructing the frequency distribution
Steps in making a frequency distribution:
1. Determine the number of classes K
2. Determine the class length
3. Form non-overlapping classes of equal width
4. Tally and count the number of measurements in each class
5. Graph the histogram

1212
Example 2.2
The Payment Time Case: Reducing
Payment Times
In order to assess the effectiveness of the system, the
consulting firm will study the payment times for invoices
processed during the first three months of the system’s
operation.
During this period, 7,823 invoices are processed using
the new system. To study the payment times of these
invoices, the consulting firm numbers the invoices from
0001 to 7823 and uses random numbers to select a
random sample of 65 invoices. The resulting 65 payment
times are given in Table 2.5

13
22 29 16 15 18 17 12 13 17 16 15
19 17 10 21 15 14 17 18 12 20 14
16 15 16 20 22 14 25 19 23 15 19
18 23 22 16 16 19 13 18 24 24 26
13 18 17 15 24 15 17 14 18 17 21
16 21 25 19 20 27 16 17 16 21
Table 2.5 A Sample of Payment Times (in Days)
for 65 Randomly Selected Invoices.
Example 2.2 #2
Table 2.5

14
Step 1: The number of classes K
Group all of the n data into K classes
K is the smallest whole number for which $2^K > n$
In Example 2.2, n = 65
For K = 6, $2^6 = 64 < n$
For K = 7, $2^7 = 128 > n$
So use K = 7 classes
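The rule for choosing K can be written as a small loop. This is a sketch only; the function name number_of_classes is an assumption made for illustration.

```python
def number_of_classes(n):
    """Return the smallest whole number K for which 2**K > n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0
    while 2 ** k <= n:
        k += 1
    return k


# Example 2.2: n = 65 payment times.
print(number_of_classes(65))
```

For n = 65 this returns 7, because 2**6 = 64 is not greater than 65 while 2**7 = 128 is.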

15
Step 2: Class Length L
Class length L is the step size from one class to the next
$L = \dfrac{\text{largest value} - \text{smallest value}}{K}$
In Example 2.2, The Payment Time Case, the largest value is 29 days and the smallest value is 10 days, so
$L = \dfrac{29 - 10 \text{ days}}{7 \text{ classes}} = \dfrac{19 \text{ days}}{7 \text{ classes}} = 2.7143 \text{ days/class}$
Arbitrarily round the class length up to 3 days/class
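The class length calculation is a one-line formula followed by rounding. The sketch below assumes that "round up" means rounding up to the next whole day, which matches the 3 days/class used in Example 2.2; the function name class_length is hypothetical.

```python
import math


def class_length(smallest, largest, k):
    """Return (raw length, length rounded up to the next whole unit)."""
    raw = (largest - smallest) / k
    # Assumption: round up to a whole number of days, as in Example 2.2.
    return raw, math.ceil(raw)


raw, rounded = class_length(smallest=10, largest=29, k=7)
print(f"raw = {raw:.4f} days/class, rounded up = {rounded} days/class")
```

Rounding up rather than down matters: with L = 3 the seven classes span 21 days, which covers the 19-day range of the data.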

16
Step 3: Form non-overlapping classes of equal width (define the boundaries of the classes)
The classes start at the smallest data value. This is the lower boundary of the first class. The upper boundary of the first class is the smallest value + L.
•In Example 2.2, the lower boundary of the first class is 10 and the upper boundary of the first class is 10 + 3 = 13. So the first class, 10 days and less than 13 days (10 ≤ x < 13, where x is a payment time), includes 10, 11, and 12 days.
The lower boundary of the second class is the upper boundary of the first class. The upper boundary of the second class is found by adding L to this lower boundary.
•In Example 2.2, the second class, 13 days and less than 16 days (13 ≤ x < 16), includes 13, 14, and 15 days.
And so on
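The boundaries can be generated by repeatedly adding L to the smallest value. This is a minimal sketch with a hypothetical helper name, class_boundaries; each pair is read as lower boundary ≤ value < upper boundary.

```python
def class_boundaries(smallest, length, k):
    """Return k (lower, upper) pairs; a class holds lower <= x < upper."""
    return [
        (smallest + i * length, smallest + (i + 1) * length)
        for i in range(k)
    ]


# Example 2.2: smallest value 10, L = 3, K = 7.
classes = class_boundaries(10, 3, 7)
for lower, upper in classes:
    print(f"{lower} < {upper}")
```

Because each upper boundary is reused as the next lower boundary, the classes neither overlap nor leave gaps.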

17
Step 4: Tallies and Frequencies
Classes (days)   Tally                          Frequency
10 < 13          |||                             3
13 < 16          ||||| ||||| ||||               14
16 < 19          ||||| ||||| ||||| ||||| |||    23
19 < 22          ||||| ||||| ||                 12
22 < 25          ||||| |||                       8
25 < 28          ||||                            4
28 < 31          |                               1
Total                                           65
Check: All frequencies must sum to n
Table 2.6
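The tally step can be reproduced from the raw payment times. The sketch below uses the 65 values as read from Table 2.5 and assumes the classes of Example 2.2 (first lower boundary 10, L = 3, K = 7); the function name tally is chosen for illustration.

```python
# The 65 payment times (in days) as read from Table 2.5.
payment_times = [
    22, 29, 16, 15, 18, 17, 12, 13, 17, 16, 15,
    19, 17, 10, 21, 15, 14, 17, 18, 12, 20, 14,
    16, 15, 16, 20, 22, 14, 25, 19, 23, 15, 19,
    18, 23, 22, 16, 16, 19, 13, 18, 24, 24, 26,
    13, 18, 17, 15, 24, 15, 17, 14, 18, 17, 21,
    16, 21, 25, 19, 20, 27, 16, 17, 16, 21,
]


def tally(values, smallest, length, k):
    """Count how many values fall in each class [lower, lower + length)."""
    counts = [0] * k
    for x in values:
        index = int((x - smallest) // length)
        if not 0 <= index < k:
            raise ValueError(f"{x} falls outside the classes")
        counts[index] += 1
    return counts


freqs = tally(payment_times, smallest=10, length=3, k=7)
print(freqs)
print(sum(freqs))  # check: all frequencies must sum to n
```

The counts agree with the Frequency column of Table 2.6 and sum to n = 65.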

18
Step 5: Graph the histogram
Show the frequency distribution in a histogram
Figure 2.5

19
A graph in which rectangles represent the
classes
The base of the rectangle represents the class
length
The height of the rectangle represents
the frequency in a frequency histogram, or
the relative frequency in a relative frequency
histogram
Histogram

20
The relative frequency of a class is the proportion or fraction of data that is contained in that class
Calculated by dividing the class frequency by the total number of data values
Relative frequency may be expressed as either a decimal or a percent (percent frequency distribution)
A relative frequency distribution is a list of all the data classes and their associated relative frequencies
Relative Frequency, Percent Frequency
For example:
Classes (days)   Frequency   Relative Frequency   Percent Frequency
10 < 13          3           3/65 = 0.0462        4.62%
13 < 16          14          14/65 = 0.2154       21.54%
…                …           …                    …

21
Classes (days)   Frequency   Relative Frequency
10 < 13          3           3/65 = 0.0462
13 < 16          14          14/65 = 0.2154
16 < 19          23          0.3538
19 < 22          12          0.1846
22 < 25          8           0.1231
25 < 28          4           0.0615
28 < 31          1           0.0154
Total            65          1.0000
Check: All relative frequencies must sum to 1
Relative Frequency: Example 2.2
Table 2.7

22
Relative Frequency Histogram
Example 2.2: The Payment Times Case
Figure 2.6
The tail on the right appears to be longer than the tail on
the left. We say: the distribution is skewed to the right.

23
Remarks
The procedure introduced is not the only way to construct a histogram.
e.g. it is not necessary to set the lower boundary of the 1st class equal to the smallest measurement.
Sometimes it is desirable to let the nature of the problem determine the histogram classes.
e.g. 10-year lengths for ages of the residents in a city
Sometimes a histogram with unequal class lengths is better. e.g. open-ended classes
Figure 2.7

24
Some common distribution shapes
Right Skewed   Left Skewed   Symmetric
Figure 2.8

25
Skewness
Skewed distributions are not symmetrical about their center. Rather, they are lop-sided with a longer tail on one side or the other.
•A population is distributed according to its relative frequency curve
•The skew is the side with the longer tail
Right Skewed   Left Skewed   Symmetric
Figure 2.9

26
Frequency Polygons
Plot a point above each class midpoint at a height
equal to the frequency of the class
Useful when comparing two or more distributions
Table 2.8
Example 2.3 Comparing Two Grade Distributions
32 63 69 85 91
45 64 69 86 92
50 64 72 87 92
56 65 76 87 93
58 66 78 88 93
60 67 81 89 94
61 67 83 90 96
61 68 83 90 98
Scores for Statistics Exam 1
(in increasing order)
Classes   Frequency   Percent Frequency

27
Scores for Statistics Exam 2
(in increasing order)
55 74 80 87 93
62 74 82 88 94
63 74 83 89 94
66 75 84 90 95
67 76 85 91 97
67 77 86 91 99
71 77 86 92
73 78 87 93
Table 2.9 and Figures 2.11, 2.12, 2.13
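The points of a frequency polygon can be computed directly from the two score lists. The sketch below is illustrative only: the classes (width 10, starting at 30, so 30 < 40 up to 90 < 100) are an assumption, since the class table itself is not reproduced in this text, and the function name polygon_points is hypothetical. Percent frequencies are used for the heights because the two exams have different numbers of scores.

```python
# Scores as listed for Statistics Exam 1 and Exam 2 (increasing order).
exam1 = [32, 45, 50, 56, 58, 60, 61, 61, 63, 64, 64, 65, 66, 67, 67, 68,
         69, 69, 72, 76, 78, 81, 83, 83, 85, 86, 87, 87, 88, 89,
         90, 90, 91, 92, 92, 93, 93, 94, 96, 98]
exam2 = [55, 62, 63, 66, 67, 67, 71, 73, 74, 74, 74, 75, 76, 77, 77, 78,
         80, 82, 83, 84, 85, 86, 86, 87, 87, 88, 89,
         90, 91, 91, 92, 93, 93, 94, 94, 95, 97, 99]


def polygon_points(scores, lower=30, width=10, k=7):
    """Return (class midpoint, percent frequency) for each class.

    Assumed classes: [lower, lower + width), [lower + width, ...), k in all.
    """
    counts = [0] * k
    for s in scores:
        counts[(s - lower) // width] += 1
    n = len(scores)
    return [
        (lower + i * width + width / 2, 100 * c / n)
        for i, c in enumerate(counts)
    ]


points1 = polygon_points(exam1)
points2 = polygon_points(exam2)
for (mid, p1), (_, p2) in zip(points1, points2):
    print(f"midpoint {mid:5.1f}: Exam 1 {p1:5.1f}%   Exam 2 {p2:5.1f}%")
```

Plotting each list of points and connecting them with line segments gives one polygon per exam on the same axes.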

2-28
Cumulative Distributions
Another way to summarize a distribution is to
construct a cumulative distribution
To do this, use the same number of classes, class
lengths, and class boundaries used for the
frequency distribution
Rather than a count, we record the number of
measurements that are less than the upper
boundary of that class
In other words, a running total
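The running total can be illustrated with the payment time frequencies from Example 2.2. This is a short sketch using itertools.accumulate from the Python standard library; the variable names are chosen for illustration.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from Example 2.2.
upper_boundaries = [13, 16, 19, 22, 25, 28, 31]
frequencies = [3, 14, 23, 12, 8, 4, 1]

# Each cumulative frequency is the number of measurements that are
# less than the upper boundary of that class.
cumulative = list(accumulate(frequencies))

for upper, total in zip(upper_boundaries, cumulative):
    print(f"less than {upper} days: {total}")
```

The last cumulative frequency equals n = 65, since every payment time is less than 31 days.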

2-29
Various Frequency Distributions
Table 2.10

2-30
Ogive
Ogive: A graph of a cumulative distribution
Plot a point above each upper class boundary at
height of cumulative frequency
Connect points with line segments
Can also be drawn using:
Cumulative relative frequencies
Cumulative percent frequencies
Figure 2.14
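The coordinates of an ogive follow from the cumulative distribution. The sketch below returns the points to plot for cumulative frequencies, cumulative relative frequencies, or cumulative percent frequencies; the function name ogive_points and its kind parameter are assumptions made for illustration, and the data are the payment time classes of Example 2.2.

```python
from itertools import accumulate


def ogive_points(upper_boundaries, frequencies, kind="frequency"):
    """Return (upper boundary, cumulative height) pairs for an ogive.

    kind is "frequency", "relative", or "percent".
    """
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    if kind == "frequency":
        heights = cumulative
    elif kind == "relative":
        heights = [c / n for c in cumulative]
    elif kind == "percent":
        heights = [100 * c / n for c in cumulative]
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    return list(zip(upper_boundaries, heights))


uppers = [13, 16, 19, 22, 25, 28, 31]
freqs = [3, 14, 23, 12, 8, 4, 1]
for upper, pct in ogive_points(uppers, freqs, kind="percent"):
    print(f"({upper}, {pct:.2f}%)")
```

Connecting consecutive points with line segments gives the ogive; the last point always reaches n, 1, or 100% depending on the kind chosen.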

2-31
2.3 Stem-and-Leaf Displays
Purpose is to see the overall pattern of the
data, by grouping the data into classes
the variation from class to class
the amount of data in each class
the distribution of the data within each class
Best for small to moderately sized data
distributions

2-32
Car Mileage Example
Table 2.11
Example 2.4

33
The stem-and-leaf display of car mileages:
29 | 8
30 | 13455677888
31 | 0012334444455667778899
32 | 011123344557788
33 | 03
29 + 0.8 = 29.8
33 + 0.0 = 33.0
33 + 0.3 = 33.3
Figure 2.15
Stem unit =1, Leaf unit =0.1

34
Splitting The Stems
There are no rules that dictate the number of stem values, so we can split the stems as needed
Starred classes (*) extend from 0.0 to 0.4
Unstarred classes extend from 0.5 to 0.9
29   | 8
30 * | 1 3 4
30   | 5 5 6 7 7 8 8 8
31 * | 0 0 1 2 3 3 4 4 4 4 4
31   | 5 5 6 6 7 7 7 8 8 9 9
32 * | 0 1 1 1 2 3 3 4 4
32   | 5 5 7 7 8
33 * | 0 3
Figure 2.16

35
Looking at the last stem-and-leaf display, the distribution appears almost “symmetrical”
The upper portion of the display…
Stems 29, 30*, 30, and 31*
… is almost a mirror image of the lower portion of
the display
Stems 31, 32*, 32, and 33*

36
Constructing a Stem-and-Leaf Display
1. Decide what units will be used for the stems and the leaves. As a general rule, choose units for the stems so that there will be somewhere between 5 and 20 stems.
2. Place the stems in a column with the smallest stem at the top of the column and the largest stem at the bottom.
3. Enter the leaf for each measurement into the row corresponding to the proper stem. The leaves should be single-digit numbers (rounded values).
4. If desired, rearrange the leaves so that they are in increasing order from left to right.

2-37
Constructing a Stem-and-Leaf Display
It is possible to construct a stem-and-leaf display
from measurements containing any number of digits.
Example 2.5
Table 2.13
Number of DVD players sold for each of last 12 months
13,502   15,932   14,739
15,249   14,312   17,111
19,010   16,121   16,708
17,886   15,665   16,475
Stem and Leaf plot for Players Sold
stem unit = 1000
leaf unit = 100
Frequency   Stem   Leaf
1           13     5
2           14     3 7
3           15     2 7 9
3           16     1 5 7
2           17     1 9
0           18
1           19     0
12
Figure 2.17
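The display for the DVD player data can be rebuilt by rounding each value to the nearest leaf unit and splitting off the last digit. This is a minimal sketch for non-negative whole numbers; the function name stem_and_leaf is hypothetical, and rounding halves up is an assumption (none of these 12 values falls exactly on a half).

```python
from collections import defaultdict

# Number of DVD players sold in each of the last 12 months (Table 2.13).
players_sold = [13502, 15932, 14739, 15249, 14312, 17111,
                19010, 16121, 16708, 17886, 15665, 16475]


def stem_and_leaf(values, leaf_unit):
    """Return {stem: sorted leaves}; stem unit is 10 * leaf_unit."""
    rows = defaultdict(list)
    for v in values:
        # Round to the nearest leaf unit (halves rounded up).
        rounded = (v + leaf_unit // 2) // leaf_unit
        stem, leaf = divmod(rounded, 10)
        rows[stem].append(leaf)
    # Include empty stems so gaps in the data stay visible.
    return {
        stem: sorted(rows.get(stem, []))
        for stem in range(min(rows), max(rows) + 1)
    }


display = stem_and_leaf(players_sold, leaf_unit=100)
for stem, leaves in display.items():
    print(f"{len(leaves):2d}  {stem} | {' '.join(map(str, leaves))}")
```

Note that 17,886 becomes stem 17 with leaf 9 because the leaf is a rounded value, not a truncated one.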

Back-to-Back Stem-and-Leaf Display
         Exam 1 | Stem | Exam 2
              2 |  3   |
                |  3   |
                |  4   |
              5 |  4   |
              0 |  5   |
            8 6 |  5   | 5
    4 4 3 1 1 0 |  6   | 2 3
    9 9 8 7 6 5 |  6   | 6 7 7
              2 |  7   | 1 3 4 4 4
            8 6 |  7   | 5 6 7 7 8
          3 3 1 |  8   | 0 2 3 4
    9 8 7 7 6 5 |  8   | 5 6 6 7 7 8 9
4 3 3 2 2 1 0 0 |  9   | 0 1 1 2 3 3 4 4
            8 6 |  9   | 5 7 9
We can construct a Back-to-Back Stem-and-Leaf Display if we wish to compare two distributions.
Conclusion:
Exam 1: two concentrations of scores (bimodal)
Exam 2: almost single peaked and somewhat skewed to the left
Figure 2.18
Example 2.6

Description of Quantitative Data
Table and Graph
Stem-and-leaf display
Frequency distributions
Histogram
Dot plot

40
2.4 Misleading Graphs and Charts
Scale Break
Break the vertical scale to exaggerate effect
Mean Salaries at a Major University, 2002-2005
Figure 2.19

41
Misleading Graphs and Charts: Scale Effects
Compress vs. stretch the vertical axis to exaggerate or minimize the effect
Mean Salary Increases at a Major University, 2002-2005
Figure 2.20

42
Chapter Summary
Frequency distribution
Bar chart and pie chart
Histogram
Shape of the distribution
Stem-and-leaf display
Misleading graphs and charts

43
Appendix:
Excel: Bar Chart and Pie Chart

44
Appendix: MegaStat