Coefficient of Variation Business Statistics

AbdullahAbdullah76320 34 views 32 slides Aug 07, 2024

About This Presentation

Data analysis


Slide Content

Measures of variation
Heights of the 5 starting players from two basketball teams:
A: 72, 73, 76, 76, 78
B: 67, 72, 76, 76, 84
Mean, Median & Mode

Measures of Variation
Ex. 1 continued. To describe the difference in the
two data sets, we use a descriptive measure that
indicates the amount of spread, or dispersion, in a
data set.
Range: difference between maximum and
minimum values of the data set.

Measures of Variation
Range of team A: 78-72=6
Range of team B: 84-67=17
Advantage of range: easy to compute
Disadvantage: only two values are
considered.
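
As a rough sketch of the computation above, the range of each team can be checked in a few lines of Python. The function and variable names are illustrative and not part of the original slides.

```python
# Data for the five starting players on each team (from the slides)
team_a = [72, 73, 76, 76, 78]
team_b = [67, 72, 76, 76, 84]

def value_range(data):
    """Range = maximum value minus minimum value."""
    return max(data) - min(data)

range_a = value_range(team_a)
range_b = value_range(team_b)
print(range_a, range_b)
```

Only the largest and smallest values enter the calculation, which is the disadvantage noted above.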

Unlike the range, the sample standard deviation
takes into account all data values. The following
procedure is used to find the sample standard
deviation:
Step 1: Find the mean of the data:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{72 + 73 + 76 + 76 + 78}{5} = 75$

Step 2: Find the deviation of each score from the mean.
$x$     $x - \bar{x}$
72    72 - 75 = -3
73    73 - 75 = -2
76    76 - 75 = 1
76    76 - 75 = 1
78    78 - 75 = 3
Note that the sum of the deviations is 0: $\sum (x - \bar{x}) = 0$

The sum of the deviations from the mean will always be zero.
This can be used as a check to determine if your calculations
are correct.
Note that $\sum (x - \bar{x}) = 0$
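
One way to see why the deviations always sum to zero, assuming the usual definition of the mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (so that $\sum_{i=1}^{n} x_i = n\bar{x}$):

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```

For team A this gives (-3) + (-2) + 1 + 1 + 3 = 0.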

Step 3: Square each deviation from the mean. Find the sum of
the squared deviations.
Height   Deviation   Squared deviation
72       -3          9
73       -2          4
76        1          1
76        1          1
78        3          9
Sum of squared deviations: $\sum_{i=1}^{n} (x_i - \bar{x})^2 = 24$

Step 4: The sample variance is determined by dividing the sum of
the squared deviations by (n-1) (the number of scores minus one)
Note that sum of squared deviations is 24
Sample variance is
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{24}{5 - 1} = 6$

The four steps can be combined into one mathematical
formula for the sample standard deviation. The sample
standard deviation is the square root of the quotient of the sum
of the squared deviations and (n-1)
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Sample Standard Deviation: $s = \sqrt{6} \approx 2.449$

Four step procedure to calculate sample standard
deviation:
1. Find the mean of the data
2. Set up a table that lists the data in the left-hand
column and the deviations from the mean in the next
column.
3. In the third column from the left, square each
deviation and then find the sum of the squares of the
deviations.
4. Divide the sum of the squared deviations by (n-1)
and then take the positive square root of the result.
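
The four steps above can be sketched in Python as follows. This is a minimal illustration using team A's data from the earlier slides; the function name `sample_std` is hypothetical, and the standard library's `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics

def sample_std(data):
    """Follow the four-step procedure for the sample standard deviation."""
    n = len(data)
    mean = sum(data) / n                      # Step 1: mean
    deviations = [x - mean for x in data]     # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]    # Step 3: squared deviations
    variance = sum(squared) / (n - 1)         # Step 4a: divide the sum by (n - 1)
    return mean, variance, math.sqrt(variance)  # Step 4b: positive square root

team_a = [72, 73, 76, 76, 78]
mean_a, var_a, sd_a = sample_std(team_a)
print(mean_a, var_a, round(sd_a, 3))

# Cross-check against the standard library (also divides by n - 1)
assert math.isclose(sd_a, statistics.stdev(team_a))
```

For team A this reproduces the mean of 75 and the sample variance of 6 found above; the standard deviation is the square root of 6.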

Problem for students:
By hand: Find variance and
standard deviation of data: 5, 8, 9,
7, 6
Answer: The mean is 7, so the variance is
10/4 = 2.5 and the standard deviation is
the square root of 2.5, approximately
1.581

Standard deviation of grouped data:
1. Find each class midpoint.
2. Find the deviation of each class midpoint
from the mean.
3. Square each deviation and multiply it
by the class frequency.
4. Find the sum of these values, divide
the result by (n-1) (one less than the
total number of observations), and take
the positive square root.

$s = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n - 1}}$
where $x_i$ is the midpoint of class i, $f_i$ is its frequency, and k is the number of classes.

Here is the frequency distribution of the number of rounds of golf
played by a group of golfers. The class midpoints are in the second
column. The mean is 29.35. The third column is the square of the
difference between the class midpoint and the mean. The fifth column is the
product of the frequency and the value in the third column. The final result,
the standard deviation, is given below the table.

Class     Midpoint x   (x - mean)^2   Frequency f   (x - mean)^2 * f   x * f
[0,7)     3.5          668.3948       0             0                  0
[7,14)    10.5         355.4482       2             710.8963556        21
[14,21)   17.5         140.5015       10            1405.015111        175
[21,28)   24.5         23.55484       21            494.6517333        514.5
[28,35)   31.5         4.608178       23            105.9880889        724.5
[35,42)   38.5         83.66151       14            1171.261156        539
[42,49)   45.5         260.7148       5             1303.574222        227.5
Total                                 75            5191.386667
Mean = 29.35333
Standard deviation s = 8.37579094
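
The grouped-data calculation can be reproduced with a short Python sketch. The midpoints and frequencies are taken from the golf table; the variable names are illustrative, and treating every observation in a class as equal to the class midpoint is the usual grouped-data approximation.

```python
import math

# (class midpoint, frequency) pairs from the golf frequency distribution
classes = [
    (3.5, 0), (10.5, 2), (17.5, 10), (24.5, 21),
    (31.5, 23), (38.5, 14), (45.5, 5),
]

n = sum(f for _, f in classes)                      # total number of observations
mean = sum(x * f for x, f in classes) / n           # grouped mean
sum_sq = sum((x - mean) ** 2 * f for x, f in classes)  # sum of (x - mean)^2 * f
variance = sum_sq / (n - 1)                         # grouped sample variance
std_dev = math.sqrt(variance)                       # grouped sample standard deviation

print(n, round(mean, 5), round(sum_sq, 6), round(std_dev, 5))
```

The values agree with the totals in the table: n = 75, a mean of about 29.353, a sum of about 5191.387, and a standard deviation of about 8.376.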

 



Variance For Grouped Data
$s^2 = \frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n - 1}$

Interpreting the standard deviation
1. The more variation in a data set, the greater the
standard deviation.
2. The larger the standard deviation, the more
“spread” in the shape of the histogram representing
the data.
3. Standard deviation is used for quality control in
business and industry. If there is too much variation
in the manufacturing of a certain product, the
process is out of control and adjustments to the
machinery must be made to ensure more uniformity
in the production process.

Three standard deviations rule
"Almost all" the data will lie within 3 standard deviations
of the mean.
Mathematically, nearly 100% of the data will fall in the
interval
$(\bar{x} - 3s, \bar{x} + 3s)$

Empirical Rule
If a data set is "mound-shaped" or "bell-shaped",
then:
1. Approximately 68% of the data lies within one
standard deviation of the mean.
2. Approximately 95% of the data lies within two standard
deviations of the mean.
3. About 99.7% of the data lies within three standard
deviations of the mean.

Yellow region is 68% of the total area. This includes all data within one
standard deviation of the mean.
Yellow region plus brown regions include 95% of the total area. This
includes all data that are within two standard deviations from the
mean.

Question
A company produces a lightweight valve that is
specified to weigh 1365 grams. Unfortunately, because
of imperfections in the manufacturing process not all of
the valves produced weigh exactly 1365 grams. In fact,
the weights of the valves produced are normally
distributed with a mean weight of 1365 grams and a
standard deviation of 294 grams. Within what range of
weights would approximately 95% of the valve weights
fall? Approximately 16% of the weights would be more
than what value? Approximately 0.15% of the weights
would be less than what value?

Solution
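
One possible way to work this question with the empirical rule is sketched below in Python. It assumes the weights are symmetric about the mean, so the share of data outside an interval splits equally between the two tails: 68% within one standard deviation leaves 16% in each tail, and 99.7% within three leaves 0.15% in each tail. Variable names are illustrative.

```python
mean = 1365   # grams
sd = 294      # grams

# About 95% of weights lie within 2 standard deviations of the mean
low_95 = mean - 2 * sd
high_95 = mean + 2 * sd

# About 68% lie within 1 standard deviation, so about 16% lie above mean + 1 sd
upper_16 = mean + sd

# About 99.7% lie within 3 standard deviations, so about 0.15% lie below mean - 3 sd
lower_015 = mean - 3 * sd

print(low_95, high_95, upper_16, lower_015)
```

Under these assumptions, about 95% of the valves weigh between 777 and 1953 grams, about 16% weigh more than 1659 grams, and about 0.15% weigh less than 483 grams.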

Chebyshev’s Theorem
The empirical rule applies only when data
are known to be approximately normally
distributed.
Chebyshev’s theorem applies to all
distributions regardless of their shape
and thus can be used whenever the data
distribution shape is unknown or is non-
normal.

Question
In the computing industry the average age of
professional employees tends to be younger than in
many other business professions. Suppose the
average age of a professional employed by a
particular computer firm is 28 with a standard
deviation of 6 years. A histogram of professional
employee ages with this firm reveals that the data
are not normally distributed but rather are amassed
in the 20s and that few workers are over 40. Apply
Chebyshev’s theorem to determine within what
range of ages would at least 80% of the workers’
ages fall.
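
Chebyshev's theorem states that at least 1 - 1/k^2 of the values lie within k standard deviations of the mean, for any k > 1. A possible way to work the question above is to solve 1 - 1/k^2 = 0.80 for k and then form the interval from mean - k*sd to mean + k*sd. The sketch below does this in Python; the function name `chebyshev_interval` is hypothetical.

```python
import math

def chebyshev_interval(mean, sd, proportion):
    """Return k and the interval that Chebyshev's theorem guarantees
    contains at least `proportion` of the data.
    Solving 1 - 1/k**2 = proportion gives k = 1 / sqrt(1 - proportion)."""
    k = 1 / math.sqrt(1 - proportion)
    return k, mean - k * sd, mean + k * sd

k, low, high = chebyshev_interval(mean=28, sd=6, proportion=0.80)
print(f"k = {k:.3f}, ages from {low:.2f} to {high:.2f}")
```

With a mean of 28 and a standard deviation of 6, k is about 2.236, so at least 80% of the ages fall between roughly 14.6 and 41.4 years.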

Coefficient of Variation
The coefficient of variation is a
statistic that is the ratio of the
standard deviation to the mean,
expressed as a percentage, and is
denoted CV.
CV = (σ/μ) × 100
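
As an illustration, the coefficient of variation can be computed for the two basketball teams from the earlier slides. This sketch assumes the sample standard deviation s and the sample mean are used in place of σ and μ, since the data are samples; the function name is hypothetical.

```python
import statistics

def coefficient_of_variation(data):
    """CV = (standard deviation / mean) * 100, using the sample
    standard deviation (divisor n - 1) as an assumption."""
    return statistics.stdev(data) / statistics.mean(data) * 100

team_a = [72, 73, 76, 76, 78]
team_b = [67, 72, 76, 76, 84]

cv_a = coefficient_of_variation(team_a)
cv_b = coefficient_of_variation(team_b)
print(f"Team A: {cv_a:.2f}%  Team B: {cv_b:.2f}%")
```

Both teams have a mean of 75, so the larger CV for team B (about 8.33% versus about 3.27% for team A) reflects only its larger standard deviation.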

Interquartile Range
The interquartile range is the range of values
between the first and third quartiles.
Essentially, it is the range of the middle 50% of the
data and is determined by computing the value of
Q3 - Q1.
The interquartile range is especially useful in
situations where data users are more interested in
values toward the middle and less interested in
extremes.
In describing a real estate housing market, Realtors
might use the interquartile range as a measure of
housing prices when describing the middle half of the
market for buyers who are interested in houses in
the midrange.

Interquartile range = Q3 - Q1
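
A small sketch of the Q3 - Q1 computation in Python follows. The house prices are hypothetical values (in thousands) invented for illustration only. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of the standard library's `statistics.quantiles`.

```python
import statistics

# Hypothetical house prices in thousands (illustrative data, not from the slides)
prices = [150, 180, 210, 240, 260, 300, 340, 420, 900]

# quantiles with n=4 returns the three quartile cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(prices, n=4, method="inclusive")
iqr = q3 - q1
full_range = max(prices) - min(prices)

print(q1, q3, iqr, full_range)
```

Here the interquartile range is 130 while the full range is 750: the single extreme price of 900 affects the range but not the spread of the middle half of the market.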