STAT CH 3 stastices for managment, business managment, marketing and accounting student.pptx

blenwerke8 7 views 52 slides Oct 21, 2025

About This Presentation

Statistics is a course common to all disciplines, since it enables us to analyze data for decision making, for research, and for scientific work. This document contains more about statistics for social sciences and techniques to conduct data analysis to reliabl...


Slide Content

CH-3 Measures of Central Tendency and Dispersion
Summation notation: The symbol ∑ is the capital Greek letter sigma, which means “the sum of”. Given n numbers $x_1, x_2, x_3, \dots, x_n$, a concise way of representing their sum is

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$$

This is read as “the sum of the terms $x_i$, where i assumes the values from 1 to n inclusive.” The numbers 1 and n are called the lower and upper limits of summation, respectively.
For example, if $x_1 = 5$, $x_2 = 6$, $x_3 = 8$, and $x_4 = 10$, then

$$\sum_{i=1}^{4} x_i = 5 + 6 + 8 + 10 = 29$$

… = 3×2 + 3×4 = 6 + 12 = 18
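The summation notation above maps directly onto a loop over indices. The following is a minimal Python sketch using the four example values; the variable names are illustrative and not part of the slides.

```python
# Values from the example: x_1 = 5, x_2 = 6, x_3 = 8, x_4 = 10
x = [5, 6, 8, 10]

# Sigma notation: add x_i for i = 1 .. n.
# Python lists are indexed from 0, so the index runs over 0 .. n-1.
n = len(x)
total = sum(x[i] for i in range(n))

print(total)  # 29
```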

Measures of Central Tendency
In studying a population with respect to a characteristic in which we are interested, we may get a large number of observations. It is not possible to grasp the characteristic by looking at all the observations, so it is better to summarize each group with one number. That number must be a good representative of all the observations and give a clear picture of the characteristic. Such a representative number can be a central value for the observations. This central value is called a measure of central tendency, an average, or a measure of location. There are five averages, divided into simple averages and special averages. The mean, median, and mode are called simple averages.

The geometric mean and the harmonic mean are called special averages. The meaning of an average is captured in the following definitions:
- A measure of central tendency is a typical value around which other figures congregate.
- An average stands for the whole group of which it forms a part, yet represents the whole.
- One of the most widely used sets of summary figures is known as measures of location.

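Arithmetic mean or mean
The arithmetic mean, or simply the mean, is the sum of the observations divided by the number of observations. If the variable x assumes n values $x_1, x_2, \dots, x_n$, then the mean $\bar{x}$ is given by

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$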

This formula is for ungrouped or raw data.
Example: A student’s marks in 5 subjects are 2, 4, 6, 8, and 10. Find his average mark.
Solution: $\bar{x} = (2 + 4 + 6 + 8 + 10)/5 = 30/5 = 6$
Ungrouped frequency data: The mean for ungrouped frequency data is obtained from the following formula:

$$\bar{x} = \frac{\sum f x}{N}$$

where x = the value of the individual class, f = the frequency of the individual class, and N = the sum of the frequencies (total frequency).

Example: Given the following frequency distribution, calculate the arithmetic mean.

Marks (x)                64   63   62   61   60   59
Number of students (f)    8   18   12    9    7    6

Solution:
First, $\sum f x = 64(8) + 63(18) + 62(12) + 61(9) + 60(7) + 59(6) = 3713$.
Second, $N = \sum f = 8 + 18 + 12 + 9 + 7 + 6 = 60$.
Finally, $\bar{x} = 3713/60 \approx 61.9$.
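The two mean formulas above (raw data and ungrouped frequency data) can be checked with a short Python sketch. The function names are illustrative choices, not something defined in the slides.

```python
def mean(values):
    """Arithmetic mean of raw (ungrouped) data: sum of values / count."""
    return sum(values) / len(values)


def frequency_mean(values, freqs):
    """Mean of ungrouped frequency data: sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / sum(freqs)


# Marks example: five subjects
print(mean([2, 4, 6, 8, 10]))  # 6.0

# Frequency distribution example: marks and number of students
marks = [64, 63, 62, 61, 60, 59]
students = [8, 18, 12, 9, 7, 6]
print(round(frequency_mean(marks, students), 1))  # 61.9
```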

Example: The following is the distribution of persons according to income group. Calculate the arithmetic mean for the grouped data.

Income, Birr (100)   0-10  10-20  20-30  30-40  40-50  50-60  60-70
Number of persons       6      8     10     12      7      4      3

Solution: First find the class mid-point, x = (L + U)/2, where L and U are the lower and upper limits of the class. Second, multiply each mid-point by its class frequency to get fx.

Solution: The mid-point of each class is (L + U)/2, for example (0 + 10)/2 = 5 and (10 + 20)/2 = 15, and each mid-point is multiplied by its frequency f.

Income (C.I.)   Number of persons (f)   Mid-point (x)     fx
0-10                      6                    5           30
10-20                     8                   15          120
20-30                    10                   25          250
30-40                    12                   35          420
40-50                     7                   45          315
50-60                     4                   55          220
60-70                     3                   65          195
Total                    50                              1550
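The grouped-data steps in the table above (mid-points, then fx, then the ratio of totals) can be sketched in Python as follows. The variable names are illustrative only.

```python
# Class intervals (lower, upper) and frequencies from the income table
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [6, 8, 10, 12, 7, 4, 3]

# Step 1: class mid-point x = (L + U) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Step 2: multiply each mid-point by its class frequency
fx = [f * x for f, x in zip(freqs, midpoints)]

# Step 3: mean = sum(fx) / sum(f)
grouped_mean = sum(fx) / sum(freqs)

print(sum(fx), sum(freqs), grouped_mean)  # 1550.0 50 31.0
```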

Mean = $\sum f x / N = 1550/50 = 31$

Merits and demerits of Arithmetic mean:
Merits:
- It is rigidly defined.
- It is easy to understand and easy to calculate.
- If the number of items is sufficiently large, it is more accurate and more reliable.
- It is a calculated value and is not based on its position in the series.
- It is possible to calculate even if some of the details of the data are lacking.
- It provides a good basis for comparison.

Demerits of Arithmetic mean:
- It cannot be obtained by inspection nor located through a frequency graph.
- It cannot be used in the study of qualitative phenomena that are not capable of numerical measurement, such as intelligence, beauty, or honesty.
- It can ignore any single item only at the risk of losing its accuracy.
- It is affected very much by extreme values.
- It cannot be calculated for open-end classes.
- It may lead to fallacious conclusions if the details of the data from which it is computed are not given.

Harmonic mean (H.M):
The harmonic mean is the reciprocal of the arithmetic average of the reciprocals of the given values. If $X_1, X_2, \dots, X_n$ are n observations,

$$\text{H.M.} = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$$

For a frequency distribution,

$$\text{H.M.} = \frac{N}{\sum f \left(\frac{1}{X}\right)}, \qquad N = \sum f$$

Example: From the given data, calculate the H.M.: 5, 10, 17, 24, 30.
First find 1/X for each value, for example 1/5 = 0.2000, 1/10 = 0.1000, 1/17 = 0.0588.

X        1/X
5        0.2000
10       0.1000
17       0.0588
24       0.0417
30       0.0333
Total    0.4338

$$\text{H.M.} = \frac{n}{\sum (1/X)} = \frac{5}{0.4338} = 11.526$$
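As a check on the worked example, here is a minimal Python sketch of the harmonic mean for raw data. The function name is an illustrative choice; the result differs from 11.526 only in the later decimals because the slide rounds each reciprocal to four places.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of the reciprocals."""
    return len(values) / sum(1 / x for x in values)


data = [5, 10, 17, 24, 30]
hm = harmonic_mean(data)

# Unrounded reciprocals give about 11.525; the slide's 11.526 uses 4-decimal reciprocals
print(round(hm, 2))  # 11.53
```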

Example: The marks secured by some students of a class are given below. Calculate the harmonic mean.

Marks              20   21   22   23   24   25
No. of students     4    2    7    1    3    1

First find 1/X for each mark, for example 1/20 = 0.0500, 1/21 = 0.0476, 1/22 = 0.0454, 1/23 = 0.0435.

Marks (X)   No. of students (f)   1/X      f(1/X)
20                   4            0.0500   0.2000
21                   2            0.0476   0.0952
22                   7            0.0454   0.3178
23                   1            0.0435   0.0435
24                   3            0.0417   0.1251
25                   1            0.0400   0.0400
Total               18                     0.8216

$$\text{H.M.} = \frac{N}{\sum f (1/X)} = \frac{18}{0.8216} \approx 21.9$$
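The frequency-distribution version can be sketched the same way: divide the total frequency by the sum of f/X. The function name below is hypothetical.

```python
def frequency_harmonic_mean(values, freqs):
    """Harmonic mean of a frequency distribution: sum(f) / sum(f / x)."""
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


marks = [20, 21, 22, 23, 24, 25]
students = [4, 2, 7, 1, 3, 1]

hm_marks = frequency_harmonic_mean(marks, students)
print(round(hm_marks, 1))  # 21.9
```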

Merits of H.M and Demerits of H.M:
Merits of H.M:
- It is rigidly defined.
- It is defined on all observations.
- It is amenable to further algebraic treatment.
- It is the most suitable average when it is desired to give greater weight to smaller observations and less weight to the larger ones.
Demerits of H.M:
- It is not easily understood.
- It is difficult to compute.
- It is only a summary figure and may not be an actual item in the series.

Geometric mean:
The geometric mean of a series containing n observations is the nth root of the product of the values. If $x_1, x_2, \dots, x_n$ are the observations, then

$$\text{G.M.} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \operatorname{antilog}\left(\frac{\sum \log x_i}{n}\right)$$

For grouped data, the geometric mean is computed as follows:

$$\text{G.M.} = \operatorname{antilog}\left(\frac{\sum f \log x}{N}\right), \qquad N = \sum f$$

Example: Calculate the geometric mean of the following series of monthly incomes of a batch of families: 180, 250, 490, 1400, 1050.

X        log X
180      2.2553
250      2.3979
490      2.6902
1400     3.1461
1050     3.0212
Total   13.5107

G.M. = antilog(Σ log X / n) = antilog(13.5107 / 5) = antilog(2.7021) = 503.6
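The log-table method above can be reproduced in Python with base-10 logarithms, where the antilog is simply 10 raised to the mean log. This is a sketch with illustrative variable names; the unrounded result is about 503.7, and the slide's 503.6 comes from rounding the mean log to 2.7021 first.

```python
import math

incomes = [180, 250, 490, 1400, 1050]

# Mean of the base-10 logarithms (the "Total / n" step in the table)
mean_log = sum(math.log10(x) for x in incomes) / len(incomes)

# Antilog: raise 10 to the mean log
gm = 10 ** mean_log

# Cross-check with the definition: nth root of the product of the values
gm_direct = math.prod(incomes) ** (1 / len(incomes))

print(round(mean_log, 4))  # 2.7021
```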

Merits of Geometric mean and Demerits of Geometric mean:
Merits of Geometric mean:
- It is rigidly defined.
- It is based on all items.
- It is very suitable for averaging ratios, rates and percentages.
- It is capable of further mathematical treatment.
- Unlike the arithmetic mean, it is not affected much by the presence of extreme values.
Demerits of Geometric mean:
- It cannot be used when the values are negative or if any of the observations is zero.
- It is difficult to calculate, particularly when the items are very large or when there is a frequency distribution.
- It brings out the ratio of change and not the absolute difference of change, as is the case with the arithmetic mean.
- The G.M. may not be an actual value of the series.

Positional Averages:
These averages are based on the position of the given observation in a series arranged in ascending or descending order. The magnitude or size of the values does not matter as it did in the case of the arithmetic mean. Because of this basic difference, the median and the mode are called positional measures of an average.

Median:
The median is the value of the variate that divides the group into two equal parts, one part comprising all values greater than the median and the other all values less than it. The median is defined as the value of the middle item, or the mean of the values of the two middle items, when the data are arranged in ascending or descending order of magnitude.
Ungrouped or raw data: If the n values are arranged in ascending or descending order of magnitude, the median is the middle value when n is odd. When n is even, the median is the mean of the two middle values. By the formula:

$$\text{Median (Md)} = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ item}$$

Example 1: When an odd number of values is given. Find the median of the following data: 25, 18, 27, 10, 8, 30, 42, 20, 53
Solution: Arranging the data in increasing order: 8, 10, 18, 20, 25, 27, 30, 42, 53
The middle value is the 5th item, so 25 is the median.
Using the formula, Median (Md) = (n + 1)/2 = (9 + 1)/2 = 10/2 = 5th item = 25

Example 2: When an even number of values is given. Find the median of the following data: 5, 8, 12, 30, 18, 10, 2, 22
Solution: Arranging the data in increasing order: 2, 5, 8, 10, 12, 18, 22, 30
Here the median is the mean of the two middle items, i.e. the mean of 10 and 12: (10 + 12)/2 = 11

Using the formula, Median (Md) = (n + 1)/2 = (8 + 1)/2 = (9/2)th item = 4.5th item
= 4th item + (1/2)(5th item - 4th item)
= 10 + (1/2)(12 - 10)
= 10 + (1/2) × 2
= 10 + 1 = 11

Grouped Data: In a grouped distribution, values are associated with frequencies. Grouping can be in the form of a discrete frequency distribution or a continuous frequency distribution. Whatever the type of distribution, cumulative frequencies have to be calculated to know the total number of items.
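The (n + 1)/2 position rule for raw data, including the half-way interpolation used when n is even, can be sketched in Python as below. The function name is an illustrative choice.

```python
def median_by_position(values):
    """Median of raw data using the (n + 1) / 2 position rule."""
    if not values:
        raise ValueError("median of empty data is undefined")
    data = sorted(values)
    n = len(data)
    position = (n + 1) / 2       # 1-based position of the median item
    k = int(position)            # whole part, e.g. 4 for position 4.5
    fraction = position - k      # 0 when n is odd, 0.5 when n is even
    if fraction == 0:
        return data[k - 1]       # convert 1-based position to 0-based index
    # kth item plus a fraction of the gap to the (k + 1)th item
    return data[k - 1] + fraction * (data[k] - data[k - 1])


print(median_by_position([25, 18, 27, 10, 8, 30, 42, 20, 53]))  # 25
print(median_by_position([5, 8, 12, 30, 18, 10, 2, 22]))        # 11.0
```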

In the case of a grouped series, the median is calculated by linear interpolation with the help of the following formula:

$$M = l_1 + \frac{l_2 - l_1}{f}\,(m - c)$$

where
M = the median
$l_1$ = the lower limit of the class in which the median lies
$l_2$ = the upper limit of the class in which the median lies
f = the frequency of the class in which the median lies
m = the position of the middle item, (n + 1)/2
c = the cumulative frequency of the class preceding the one in which the median lies.

Example: The following table gives the frequency distribution of 325 workers of a factory according to their average monthly income in a certain year. Calculate the median income.

Solution: Add a cumulative frequency (c.f.) column to the given distribution.

Income group (birr)   No. of workers (f)   Cumulative frequency (c.f.)
Below 100                      1                      1
100-150                       20                     21
150-200                       42                     63
200-250                       55                    118
250-300                       62                    180     (median class: l_1 = 250, l_2 = 300, f = 62)
300-350                       45                    225
350-400                       30                    255
400-450                       25                    280
450-500                       15                    295
500-550                       18                    313
550-600                       10                    323
600 and above                  2                    325
Total                        325

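First, find m = (325 + 1)/2 = 326/2 = 163.
Second, locate the class whose cumulative frequency first reaches 163. The median lies in the class interval 250-300 birr, so $l_1 = 250$, $l_2 = 300$, f = 62, and c = 118.

$$M = l_1 + \frac{l_2 - l_1}{f}\,(m - c) = 250 + \frac{300 - 250}{62}\,(163 - 118) = 250 + \frac{50 \times 45}{62} = 286.29 \text{ birr}$$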
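The interpolation steps for the grouped median can be sketched in Python as follows. The tuple layout and the use of `None` for the open-ended limits are assumptions made for this illustration; the sketch would fail if the median fell in an open-ended class.

```python
# (lower limit, upper limit, frequency); None marks an open-ended limit
classes = [
    (None, 100, 1), (100, 150, 20), (150, 200, 42), (200, 250, 55),
    (250, 300, 62), (300, 350, 45), (350, 400, 30), (400, 450, 25),
    (450, 500, 15), (500, 550, 18), (550, 600, 10), (600, None, 2),
]

n = sum(f for _, _, f in classes)
m = (n + 1) / 2          # position of the middle item

c = 0                    # cumulative frequency of the preceding classes
median = None
for l1, l2, f in classes:
    if c + f >= m:       # first class whose cumulative frequency reaches m
        median = l1 + (l2 - l1) / f * (m - c)
        break
    c += f

print(n, m, c, round(median, 2))  # 325 163.0 118 286.29
```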

Merits of Median and Demerits of Median:
Merits of Median:
- The median is not influenced by extreme values because it is a positional average.
- The median can be calculated for distributions with open-end intervals.
- The median can be located even if the data are incomplete.
- The median can be located even for qualitative factors such as ability or honesty.
Demerits of Median:
- A slight change in the series may bring a drastic change in the median value.
- For an even number of items or a continuous series, the median is an estimated value rather than a value in the series.
- It is not suitable for further mathematical treatment, except for its use in mean deviation.
- It does not take all the observations into account.

Mode:
The mode is the value in a distribution that occurs most frequently. It is an actual value, which has the highest concentration of items in and around it. According to Croxton and Cowden, “The mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values”. It shows the center of concentration of the frequency in and around a given value. Therefore, it is preferred where the purpose is to know the point of highest concentration. It is, thus, a positional measure.
It is very important in marketing studies, where a manager is interested in knowing the size that has the highest concentration of items. For example, in placing an order for shoes or ready-made garments, the modal size helps because this size and the sizes around it are in common demand.

Ungrouped or Raw Data: For ungrouped data or a series of individual observations, the mode is often found by mere inspection.
Example: 2, 7, 10, 15, 10, 17, 8, 10, 2
Mode = M0 = 10
In some cases the mode may be absent, while in others there may be more than one mode.
Example 1: 12, 10, 15, 24, 30 (no mode)
Example 2: 7, 10, 15, 12, 7, 14, 24, 10, 7, 20, 10 (the modes are 7 and 10)
Grouped Data: For a discrete distribution, find the highest frequency; the corresponding value of X is the mode.
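Finding the mode by inspection amounts to counting occurrences. The sketch below uses Python's `collections.Counter`; the function name is illustrative, and treating "every value occurs once" as "no mode" is an assumption chosen to match Example 1.

```python
from collections import Counter


def modes(values):
    """Return all modes in ascending order, or [] when no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: all values occur once, so there is no mode
    return sorted(v for v, count in counts.items() if count == highest)


print(modes([2, 7, 10, 15, 10, 17, 8, 10, 2]))            # [10]
print(modes([12, 10, 15, 24, 30]))                        # []
print(modes([7, 10, 15, 12, 7, 14, 24, 10, 7, 20, 10]))   # [7, 10]
```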

In the case of grouped data, the mode is determined by the following formula:

$$\text{Mode} = l + \frac{f_1 - f}{(f_1 - f) + (f_1 - f_2)} \times c$$

where
l = the lower limit of the class in which the mode lies
$f_1$ = the frequency of the class in which the mode lies
f = the frequency of the class preceding the modal class
$f_2$ = the frequency of the class succeeding the modal class
c = the class interval (width) of the modal class

Example: Calculate the mode for the following distribution.

Class interval    Frequency
0-50                   5
50-100                14
100-150               40
150-200               91
200-250              150
250-300               87
300-350               60
350-400               38
400 and above         15

Solution: The highest frequency is 150 and the corresponding class interval is 200-250, which is the modal class. Here l = 200, $f_1$ = 150, f = 91, $f_2$ = 87, c = 50.

Mode = 200 + [(150 - 91) / ((150 - 91) + (150 - 87))] × 50
Mode = 200 + (59/122) × 50
Mode = 200 + 24.18
Mode = 224.18
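The grouped-mode calculation above can be sketched in Python. The function and parameter names are illustrative; `f_prev` corresponds to f and `f_next` to f_2 in the slide's notation.

```python
def grouped_mode(l, f1, f_prev, f_next, c):
    """Mode of grouped data: l + (f1 - f) / ((f1 - f) + (f1 - f2)) * c."""
    return l + (f1 - f_prev) / ((f1 - f_prev) + (f1 - f_next)) * c


# Modal class 200-250: l = 200, f1 = 150, f = 91, f2 = 87, c = 50
mode_value = grouped_mode(l=200, f1=150, f_prev=91, f_next=87, c=50)
print(round(mode_value, 2))  # 224.18
```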

Merits of Mode and Demerits of Mode:
Merits of Mode:
- It is easy to calculate and, in some cases, it can be located by mere inspection.
- The mode is not at all affected by extreme values.
- It can be calculated for open-end classes.
- It is usually an actual value of an important part of the series.
- In some circumstances it is the best representative of the data.
Demerits of Mode:
- It is not based on all observations.
- It is not capable of further mathematical treatment.
- The mode is generally ill-defined; in some cases it is not possible to find a mode.
- Compared with the mean, the mode is affected to a great extent by sampling fluctuations.
- It is unsuitable in cases where the relative importance of items has to be considered.

Measures of Dispersion
Dispersion (also known as scatter, spread or variation) measures the extent to which the items vary from some central value. A measure of dispersion measures only the degree (i.e., the amount) of variation, not its direction. Measures of dispersion are also called averages of the second order because they give an average of the differences of the various items from an average.

What is the significance of measuring dispersion?
Measures of dispersion are calculated to serve the following purposes:
- To determine the reliability of an average: measuring variability shows to what extent the average is representative of the entire data.
- To facilitate comparison: measures of dispersion facilitate comparison of two or more distributions with regard to their variability.
- To facilitate control: measures of dispersion help determine the nature and cause of variation in order to control the variation itself.
- To facilitate the use of other statistical measures: measuring variability facilitates the use of other statistical measures such as correlation, regression, and statistical inference.

What are the properties of a good measure of dispersion?
Since a measure of dispersion is the average of the deviations of items from an average, it should also possess all the qualities of a good average. According to Yule and Kendall, the qualities of a good measure of dispersion are as follows:
- Simple to understand: it should be simple to understand.
- Easy to calculate: it should be easy to calculate.
- Rigidly defined: it should be rigidly defined. For the same data, all methods of computation should produce the same answer; different methods leading to different answers is not proper.
- Based on all items: it should be based on all items. When it is based on all items, it produces a more representative value.
- Amenable to further algebraic treatment: it should allow operations such as combining groups, calculating missing values, and adjusting for wrong entries without knowledge of the actual values of all items.

- Sampling stability: it should have sampling stability. This means that the average difference between the values obtained from samples and the corresponding population values should be the least. A measure of dispersion for which this holds is the best measure.
- Not unduly affected by extreme items: it should not be unduly affected by extreme items. Extreme items are often not true representatives of the data, so their presence should not affect the calculation to a large extent.
This is not a complete list of the properties of a good measure of dispersion, but these are the most important characteristics that a good measure of dispersion should possess.

What are the measures of dispersion?
A measure of dispersion may be either absolute or relative.
Absolute measure of dispersion: An absolute measure of dispersion is expressed in the same statistical unit in which the original data are given, such as kilograms, tonnes, kilometers, or birr. For example, when rainfall on different days is recorded in mm, any absolute measure of dispersion gives the variation in rainfall in mm. These measures are suitable for comparing the variability of two distributions whose variables are expressed in the same units and are of the same average size. They are not suitable for comparing the variability of two distributions whose variables are expressed in different units. The absolute measures of dispersion are: range, inter-quartile range, mean deviation, and standard deviation.

Relative measure of dispersion: A relative measure of dispersion is the ratio of a measure of absolute dispersion to an appropriate average or to selected items of the data. The same average should be used as the base as was used in computing the absolute dispersion. A relative measure of dispersion is also called a coefficient of dispersion because it is a pure number that is independent of the unit of measurement. The relative measures of dispersion are: coefficient of range, coefficient of quartile deviation, coefficient of mean deviation, and coefficient of standard deviation or coefficient of variation.

Range and Coefficient of Range
Range: This is the simplest possible measure of dispersion and is defined as the difference between the largest and smallest values of the variable included in the distribution. In symbols, Range = L - S, where L = largest value and S = smallest value.
Interpretation of range: If the averages of two distributions are almost the same, the distribution with the smaller range is said to have less dispersion and the distribution with the larger range is said to have more dispersion.
For individual observations and discrete series, L and S are easily identified. For continuous series, the following two methods are used.

Method 1: L = upper boundary of the highest class; S = lower boundary of the lowest class.
Method 2: L = mid value of the highest class; S = mid value of the lowest class.

Coefficient of Range:

$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

Example: Find the range and its coefficient for the following data: 7, 9, 6, 8, 11, 10, 4
Solution: L = 11, S = 4.
Range = L - S = 11 - 4 = 7
Coefficient of Range = (L - S)/(L + S) = (11 - 4)/(11 + 4) = 7/15 = 0.4667

Example 2: Calculate the range and its coefficient from the following distribution.

Size      60-63   63-66   66-69   69-72   72-75
Number        5      18      42      27       8

Solution: L = upper boundary of the highest class = 75
S = lower boundary of the lowest class = 60
Range = L - S = 75 - 60 = 15
Coefficient of Range = (L - S)/(L + S) = (75 - 60)/(75 + 60) = 15/135 = 0.1111
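Both range examples above follow the same two formulas, which can be sketched in Python as below. The function name is illustrative; for the continuous series the sketch uses Method 1 (class boundaries), as in the worked solution.

```python
def range_and_coefficient(largest, smallest):
    """Return (range, coefficient of range) for given L and S."""
    value_range = largest - smallest
    coefficient = value_range / (largest + smallest)
    return value_range, coefficient


# Example 1: individual observations
data = [7, 9, 6, 8, 11, 10, 4]
r1, c1 = range_and_coefficient(max(data), min(data))
print(r1, round(c1, 4))  # 7 0.4667

# Example 2: continuous series, L = 75 and S = 60 taken from the class boundaries
r2, c2 = range_and_coefficient(75, 60)
print(r2, round(c2, 4))  # 15 0.1111
```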

Merits and Demerits of Range:
Merits:
- It is simple to understand.
- It is easy to calculate.
- In certain types of problems, such as quality control, weather forecasts, and share price analysis, the range is most widely used.
Demerits:
- It is very much affected by extreme items.
- It is based on only two extreme observations.
- It cannot be calculated from open-end class intervals.
- It is not suitable for mathematical treatment.
- It is a very rarely used measure.

Standard Deviation and Coefficient of Variation:
Standard Deviation: Karl Pearson introduced the concept of standard deviation in 1893. It is the most important measure of dispersion and is widely used in many statistical formulas. The standard deviation is also called the root-mean-square deviation, because it is the square root of the mean of the squared deviations from the arithmetic mean. The square of the standard deviation is called the variance.

Definition: The standard deviation is defined as the positive square root of the arithmetic mean of the squares of the deviations of the given observations from their arithmetic mean. It is denoted by the Greek letter σ (sigma):

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

Example: Calculate the standard deviation from the following data: 14, 22, 9, 15, 20, 17, 12, 11
Solution: Take deviations from the actual mean. First find the mean: $\bar{X} = 120/8 = 15$.

Values (X)   X - X̄   (X - X̄)²
14             -1        1
22              7       49
9              -6       36
15              0        0
20              5       25
17              2        4
12             -3        9
11             -4       16
Total 120              140

$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{140}{8}} = \sqrt{17.5} \approx 4.18$$
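The deviation table can be recomputed directly from the eight values, which sum to 120 and so have a mean of 120/8 = 15. The Python sketch below follows the definition of σ as the root of the mean squared deviation (dividing by n, not n - 1); variable names are illustrative.

```python
import math

values = [14, 22, 9, 15, 20, 17, 12, 11]
n = len(values)

# Step 1: arithmetic mean
mean = sum(values) / n

# Step 2: squared deviations from the mean
squared_devs = [(x - mean) ** 2 for x in values]

# Step 3: variance is the mean of the squared deviations (divide by n)
variance = sum(squared_devs) / n

# Step 4: standard deviation is the positive square root of the variance
sigma = math.sqrt(variance)

print(mean, sum(squared_devs), round(sigma, 2))  # 15.0 140.0 4.18
```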

Calculation of standard deviation, discrete series: for values x with frequencies f and N = Σf,

$$\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{N}}$$

Merits and Demerits of Standard Deviation:
Merits:
- It is rigidly defined, its value is always definite, it is based on all the observations, and the actual signs of the deviations are used.
- As it is based on the arithmetic mean, it has all the merits of the arithmetic mean.
- It is the most important and widely used measure of dispersion.
- It is amenable to further algebraic treatment.
- It is less affected by fluctuations of sampling and hence stable.
- It is the basis for measuring the coefficient of correlation and for sampling.
Demerits:
- It is not easy to understand and it is difficult to calculate.
- It gives more weight to extreme values because the deviations are squared.
- As it is an absolute measure of variability, it cannot be used for the purpose of comparison.

Coefficient of Variation
The standard deviation is an absolute measure of dispersion. It is expressed in the units in which the original figures are collected and stated. The standard deviation of the heights of students cannot be compared with the standard deviation of their weights, as the two are expressed in different units, i.e., heights in centimeters and weights in kilograms. Therefore, the standard deviation must be converted into a relative measure of dispersion for the purpose of comparison. This relative measure is known as the coefficient of variation. The coefficient of variation is obtained by dividing the standard deviation by the mean and multiplying by 100. Symbolically,

$$\text{CV} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$

Example: In two factories A and B located in the same industrial area, the average weekly wages (in birr) and the standard deviations are as follows:

Factory   Average   Standard Deviation
A          34.5          5
B          28.5          4.5

For Factory A: CV = (5/34.5) × 100% = 14.5%
By the same formula, for Factory B: CV = (4.5/28.5) × 100% ≈ 15.8%, so weekly wages in Factory B are relatively more variable.

SET BY RAM WONDE FROM STBC………
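The coefficient of variation for the two factories in the example above can be computed with a short Python sketch. The function name is an illustrative choice, and the comparison of Factory B with Factory A is simply the same formula applied to the second row of the table.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = standard deviation / mean * 100 (a percentage)."""
    return std_dev / mean * 100


cv_a = coefficient_of_variation(5, 34.5)     # Factory A
cv_b = coefficient_of_variation(4.5, 28.5)   # Factory B

print(round(cv_a, 1), round(cv_b, 1))  # 14.5 15.8

# The factory with the larger CV has relatively more variable wages
more_variable = "A" if cv_a > cv_b else "B"
print(more_variable)  # B
```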