Business statistics and analytics

3,578 views 125 slides Oct 26, 2020

About This Presentation

Manual which covers extensively the concepts, methods, and techniques of business statistics for the students of management.


Slide Content

BIET – MBA Programme, Davangere

Prof. Vijay K S Business Statistics and Analytics

Business Statistics and
Analytics

-: Working Manual:-



Name of the student:
Section:
USN Number:


Course Objectives:
1. To make the students learn about the applications of statistical tools and techniques
in decision making.
2. To emphasize the need for statistics and decision models in solving business
problems.
3. To enhance the knowledge on descriptive and inferential statistics.
4. To familiarize the students with analytical package MS Excel.
5. To develop analytical skills in students in order to comprehend and practice data
analysis at different levels.

Syllabus
Unit 1: (12 Hours) Introduction to Statistics: Meaning and Definition, functions, scope and
limitations, Collection and presentation of data, frequency distribution, measures of
central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean,
Measures of dispersion: Range – Quartile Deviation – Mean Deviation – Standard Deviation
– Variance – Coefficient of Variation – Comparison of various measures of Dispersion

Unit 2: (8 Hours) Correlation and Regression: Scatter Diagram, Karl Pearson correlation,
Spearman’s Rank correlation (one way table only), simple and multiple regression
(problems on simple regression only)

Unit 3: (6 Hours) Probability Distribution: Concept and definition – Rules of probability –
Random variables – Concept of probability distribution – Theoretical probability
distributions: Binomial, Poisson, Normal and Exponential – Bayes’ theorem (No
derivation) (Problems only on Binomial, Poisson and Normal).

Unit 4: (10 Hours) Time Series Analysis: Introduction - Objectives Of Studying Time Series
Analysis - Variations In Time Series - Methods Of Estimating Trend: Freehand Method -
Moving Average Method - Semi-Average Method – Least Square Method. Methods of
Estimating Seasonal Index: Method Of Simple Averages - Ratio To Trend Method - Ratio
To Moving Average Method

Unit 5: (8 Hours) Linear Programming: structure, advantages, disadvantages, formulation
of LPP, solution using Graphical method.
Transportation problem: basic feasible solution using NWCM, LCM, and VAM; unbalanced,
restricted and maximization problems.

Unit 6: (8 Hours) Project Management: Introduction – Basic difference between PERT &
CPM – Network components and precedence relationships – Critical path analysis –
Project scheduling – Project time-cost trade off – Resource allocation, Concept of project
crashing.


Set aside the following assumptions:

- Business Statistics is only about numbers
- Only people who are good at maths can do well in Business Statistics
- Analytics and Business Statistics are difficult subjects to pass
- Advanced mathematical ability is required to learn Business Statistics


PRACTICAL COMPONENT: (Student-Centered Learning)
- Students are expected to have attended basic Excel classes
- Students should be able to relate the concepts to application scenarios in their
profession.
- Students should demonstrate the application of the techniques covered in this
course.


COURSE OUTCOMES:
- Facilitate objective solutions in business decision making under subjective
conditions.
- Demonstrate different statistical techniques in business/real-life situations.
- Understand the importance of probability in decision making
- Understand the need and application of analytics.
- Understand and apply various data analysis functions for business problems









Unit – 1
Introduction to Statistics

Measures of Central Tendency
&
Measures of Dispersion


Unit 1: (12 Hours)
Introduction to Statistics: Meaning and Definition, functions, scope and limitations,
Collection and presentation of data, frequency distribution
Measures of central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean,
Measures of dispersion: Range – Quartile Deviation – Mean Deviation – Standard
Deviation – Variance, Coefficient of Variation – Comparison of various measures of
Dispersion
Statistics:
Meaning and Definition:
In its simplest sense, statistics means facts expressed in numbers.
Example: The average score in maths is 45
Definition:
- “The collection, presentation, analysis and interpretation of numerical data.”
- The term statistics means either numerical statements or statistical methodology. When
used in the sense of statistical data, it refers to the quantitative aspects of things and
is a numerical description.
- The art and science of collecting, analysing, presenting and interpreting data.
Characteristics of Statistics:
By statistics we mean
- An aggregate of facts
- Affected to a marked extent by a multiplicity of causes
- Enumerated and expressed in terms of numbers
- Collected with a reasonable standard of accuracy
- Collected and placed in relation to each other
Statistical Methods:
It is a science which deals with the methods of collecting, classifying, presenting,
comparing and interpreting numerical data collected to throw some light on any sphere
of enquiry.
Types of Statistical Methods:
Descriptive Statistics: It consists of procedures used to summarize and describe the
characteristics of a set of data.
Inferential Statistics: It consists of procedures used to make inferences about population
characteristics on the basis of sample results.


Some Important terminologies:
- Data: A collection of observations on one or more variables of interest
- Population: A collection of all elements (units or variables) of interest
- Sample: A subset of the population
- Variable: A characteristic, number, or quantity that increases or decreases over
time, or takes different values in different situations.
Functions of Statistics:
- To collect and present facts in a systematic manner
- To help in the formulation and testing of hypotheses
- To facilitate the comparison of data
- To help in predicting future trends
- To help find the relationship between variables
- To simplify a mass of complex data
- To help formulate policies
Scope and Importance of statistics:
1. Statistics and planning: Statistics is indispensable to planning in the modern age,
which is termed “the age of planning”. Almost all over the world, governments are resorting
to planning for economic development.
2. Statistics and economics: Statistical data and techniques of statistical analysis have proved
immensely useful in solving economic problems such as wages, prices, time series
analysis and demand analysis.
3. Statistics and business: Statistics is an indispensable tool of production control.
Business executives are relying more and more on statistical techniques for studying the
needs and desires of their valued customers.
4. Statistics and industry: In industry, statistics is widely used in quality control. In
production engineering, statistical tools such as inspection plans and control charts are
used to find out whether the product conforms to the specifications or not.
5. Statistics and mathematics: Statistics and mathematics are intimately related; recent
advancements in statistical techniques are the outcome of wide applications of mathematics.
6. Statistics and modern science: In medical science, the statistical tools for collection,
presentation and analysis of observed facts relating to the causes and incidence of diseases and
the results of applying various drugs and medicines are of great importance.
7. Statistics, psychology and education: In education and psychology, statistics has found
wide application, such as determining the reliability and validity of a test,
factor analysis etc.


8. Statistics and war: In war, the theory of decision functions can be of great assistance to
military personnel in planning “maximum destruction with minimum effort.”
Statistics in business and management:
1. Marketing: Statistical analysis is frequently used in providing information for making
decisions in the field of marketing. It is necessary first to find out what can be sold and then
to evolve a suitable strategy so that the goods reach the ultimate consumer. A skilful
analysis of data on production, purchasing power, manpower, habits of competitors,
habits of consumers and transportation costs should be considered before any attempt is
made to establish a new market.
2. Production: In the field of production, statistical data and methods play a very important
role. The decisions about what to produce, how to produce, when to produce and for whom
to produce are based largely on statistical analysis.
3. Finance: Financial organizations, in discharging their finance function effectively,
depend very heavily on statistical analysis of facts and figures.
4. Banking: Banking institutions have found it increasingly necessary to establish research
departments within their organizations for the purpose of gathering and analysing information, not only
regarding their own business but also regarding the general economic situation and every
segment of business in which they may have an interest.
5. Investment: Statistics greatly assists investors in making clear and valid judgments in
their investment decisions, in selecting securities which are safe and have the best prospects
of yielding a good income.
6. Purchase: The purchase department, in discharging its function, makes use of
statistical data to frame suitable purchase policies such as what to buy, what quantity to
buy, when to buy, where to buy and from whom to buy.
7. Accounting: Statistical data are also employed in accounting. Particularly in the auditing
function, the technique of sampling and estimation is frequently used.
8. Control: The management control process combines statistical and accounting methods
in making the overall budget for the coming year, including sales, materials, labour and
other costs, net profits and capital requirements.
Limitations of Statistics:
- Does not study qualitative phenomena
- Does not deal with individual items
- Statistical results are true only on an average
- Statistical data should be uniform and homogeneous
- Statistical results depend on the accuracy of the data
- Statistical conclusions are not universally true
- Statistical results can be interpreted only by a person with sound knowledge of
statistics


Collection and presentation of data:
Statistical data: A set of information collected from a sample to draw general
conclusions about the population
Statistical data may be classified as
- Primary Data – Collected for the first time by the researcher
o Sources: Interview, observation, indirect oral investigation,
information from local agents and correspondents, mail
questionnaires, enumeration.
- Secondary Data – Already collected data
o Sources: Published statistics, publications of research institutes,
publications of business and financial institutes, newspapers and
periodicals, reports of various committees and commissions,
unpublished statistics
Presentation of Data: Classification is the process of arranging things or data in groups or
classes according to their resemblances and affinities, giving expression to the unity of
attributes that may subsist among a diversity of individuals
Some Important classifications
o Geographical (on the basis of area or region)
Example: Sales of the company
Region   Sales
North    450
South    310
East     281
West     114
o Chronological (on the basis of history, i.e. with respect to time)
Example: Sales reported by the departmental stores
Month    Sales (in lakhs)
Jan      45
Feb      31
March    28
April    11
o Qualitative (on the basis of character / attributes)
- Simple Classification: The classification is done into two classes
- Manifold Classification: The classification is based on more than
one attribute at a time
o Numerical, quantitative (on the basis of magnitude)
Marks    No. of Students
0-10     45
10-20    31
20-30    28
30-40    11


Frequency distribution:
A frequency distribution is a statistical table which shows the set of all distinct
values of the variable arranged in order of magnitude, either individually or in
groups, with their corresponding frequencies.
Classification of Frequency distribution:
- Series of individual observations
Items are listed one after the other
Roll No.   Marks Obtained
1          83
2          53
3          72
4          61
- Discrete (Ungrouped) Frequency distribution
Variates differ from each other by a definite amount
No. of Kids   Families
1             13
2             53
3             12
4             14
- Continuous Frequency distribution (Grouped frequency distribution)
Measurements are only approximations and are expressed in terms of
intervals with certain limits
Marks    Students
0-5      1
5-10     13
10-15    8
15-20    5

Some technical terms in formulating the frequency distribution
o Class Limits: The smallest and largest values in the class
o Class Interval (width): The difference between the upper and lower limits of a
class
Methods of Forming Class Intervals:
o Exclusive Method (overlapping limits)
Marks    Students
0-5      1
5-10     13
o Inclusive Method (non-overlapping limits)
Marks    Students
0-4      1
5-9      13
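
As a rough illustration of the exclusive method, the sketch below tallies raw marks into classes in which the lower limit is included and the upper limit is excluded. The marks and the helper name `frequency_table` are hypothetical and are not taken from the manual.

```python
from collections import Counter


def frequency_table(values, start, width, n_classes):
    """Count values into exclusive classes [lower, upper)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)  # class position of this value
        if 0 <= index < n_classes:
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


marks = [2, 7, 5, 12, 9, 18, 10, 14, 6, 3]  # hypothetical raw marks
table = frequency_table(marks, start=0, width=5, n_classes=4)
for lower, upper, f in table:
    print(f"{lower}-{upper}: {f}")
```

A mark of exactly 5 is counted in the class 5-10, not in 0-5, which is what "exclusive" means here.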


Presenting Data:
Some of the diagrammatic representations of data
o One-dimensional diagrams (Line and Bar)
o Two-dimensional diagrams (Rectangle, Square, Pie)
o Three-dimensional diagrams (Cube, Sphere, Cylinder)
o Pictogram
o Cartogram

Measures of Central Tendency
Central Tendency:
- It is also termed the average
- Averages are sometimes referred to as measures of location
- Central tendency is the middle point of a distribution
- It is used to describe the inherent (essential) characteristics of a frequency
distribution
- An average condenses a huge, unwieldy (awkward or heavy) set of numerical data
into a single numerical value that is representative of the entire distribution
- It gives us a bird’s-eye view of the huge mass of numerical data
- Averages are typical values around which the other items of the distribution
congregate
- These values lie between the two extreme observations of the
distribution and give us an idea about the concentration of the values in the
central part of the distribution
- They are very useful in
o Describing the distribution in a concise manner
o Comparative study of different distributions
o Computing other measures such as dispersion
“The tendency (behaviour) of numerical data to move towards a central value,
such as the Arithmetic Mean, Median, Mode, Geometric Mean or Harmonic Mean,
is called Central Tendency”
Example: The average score of a class in a particular subject is 65
- It implies that each student's score has contributed to the average of 65, and thus each
score is understood to move towards 65


Central Tendency / Central Location of the following curves
[Figure: frequency curves A, B and C]
- Curves A and C have the same central tendency (CT)
- The central tendency of curve B lies to the right of that of curves A and C
Dispersion: Dispersion is the spread of the data in a distribution, i.e. the extent to which
the observations are scattered
Dispersion of the following curves
[Figure: frequency curves A and B]
Here curve B has a wider spread (dispersion) than curve A



Various Measures of Central Tendency
I. Mean (Arithmetic Mean / Simple Mean, denoted as AM or $\bar{X}$)
II. Median (denoted as Md, also called the positional average)
III. Mode (denoted as Z or Mo)
IV. Geometric Mean (GM)
V. Harmonic Mean (HM)
Depending on the nature and purpose of the data analysis, we choose among the
measures of central tendency listed above



I. MEAN (ARITHMETIC MEAN / SIMPLE MEAN)

- Most of the time it is referred to as the average of some given set of values

- The arithmetic mean of a given set of observations is their sum divided by
the number of observations

- Mean = Sum of all values / Total number of values
Examples:
- Average winter temperature of New York City
- Average corn yield per acre of land

Note: The data can be in any one of the following forms (ungrouped or grouped data)
1. Raw data: Example: The heights of 6 plants in the garden are 6, 5, 3, 4, 2, 7
2. Discrete frequency distribution: A discrete variable is one whose set of possible
values is finite or countable. Discrete variables are frequently counting variables, like the
number of cars owned, number of kids in the family etc.
Example:
X       f
0       3
1       12
2       18
3       7
Total   40
Where X = Number of children, f = Number of families
3. Continuous frequency distribution
a. Mutually Exclusive Class Intervals
b. Mutually Inclusive Class Intervals
c. Open-Ended Class Intervals
a. Mutually Exclusive Class Intervals: Here the lower limit is included in
and the upper limit is excluded from the class interval.
CI        f
50 - 60   5
60 - 70   16
70 - 80   19
80 - 90   10
Total     50
CI = Class Interval
b. Mutually Inclusive Class Intervals: Here both the upper and lower limits
are included in the class interval
CI        f
50-59     4
60-69     17
70-79     20
80-89     8
90-99     1
Total     50
c. Open-Ended Class: The initial or final class interval is
indeterminate at its end
CI        f
Below 60  5
60 - 70   16
70 - 80   19
80 - 90   30
30

Some Points Regarding the Class Intervals
- To calculate the median or mode, mutually inclusive class
intervals must be converted to mutually exclusive class intervals.
- To calculate the Arithmetic Mean (AM), Geometric Mean (GM) and Harmonic Mean
(HM), conversion of mutually inclusive to exclusive intervals is not necessary
- For all calculations, open-ended CIs must be converted into mutually exclusive class
intervals.
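
The conversion from inclusive to exclusive class intervals is not spelled out here. A common textbook convention, assumed in this sketch, is to take half the gap between one class's upper limit and the next class's lower limit, subtract it from every lower limit and add it to every upper limit. The function name `to_exclusive` is hypothetical.

```python
def to_exclusive(classes):
    """Turn inclusive limits such as 50-59, 60-69 into class boundaries."""
    gap = classes[1][0] - classes[0][1]  # e.g. 60 - 59 = 1
    half = gap / 2                       # correction applied to both limits
    return [(lower - half, upper + half) for lower, upper in classes]


inclusive = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
exclusive = to_exclusive(inclusive)
for lower, upper in exclusive:
    print(f"{lower} - {upper}")
```

After the conversion each upper boundary equals the next lower boundary, so the classes are continuous and can be used in the median formula.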

- Calculating the mean from ungrouped data (raw data or individual observations)
Ungrouped Data or Raw Data:
- Here the sample size is small
- We add all the observations and divide by their number to calculate the mean
- This is not practical if there are, say, 5000 observations, i.e. a large number of
observations
In general, if X1, X2, X3, ..., Xn are the given “n” observations, then their Arithmetic
Mean, usually denoted as $\bar{X}$, is given by

$$\bar{X} = \frac{\sum X}{n}$$

Where:
$\sum X$ = sum of all the observations
n = number of observations

Example: The Arithmetic mean of 5, 8, 10, 15, 24 and 28 is
$$\bar{X} = \frac{5+8+10+15+24+28}{6} = \frac{90}{6} = 15$$
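
The same calculation can be checked in a few lines. This is only a sketch: the six observations sum to 90, so the mean is 90 / 6 = 15, and the variable names are illustrative.

```python
from statistics import mean

observations = [5, 8, 10, 15, 24, 28]

# Mean = sum of all values / total number of values
am = sum(observations) / len(observations)

print(am)                        # 15.0
print(am == mean(observations))  # cross-check with the standard library
```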
Calculating the mean from grouped data:
- This is used when the number of observations is large and the mean is difficult to
compute directly
- Here we have access only to the frequency distribution of the data, not to every individual
observation
o Discrete frequency distribution
o Continuous frequency distribution
- A frequency distribution consists of data that are grouped by classes. Each value of
the observation falls into one of the classes.
- To find the arithmetic mean of a continuous frequency distribution, we first
calculate the midpoint of each class
- Then multiply each midpoint by the frequency of observations in that class, sum
all these results, and divide the sum by the total number of observations in the
sample.
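o Discrete frequency distribution

$$\bar{X} = \frac{\sum fX}{N}$$

o Continuous frequency distribution

$$\bar{X} = \frac{\sum fX}{N}$$
Where “X” is the mid-value of each class

$$\bar{X} = A + \frac{\sum fd}{N}$$
(Shortcut method, with d = X - A)

$$\bar{X} = A + \frac{h \sum fd}{N}$$
(Step deviation method, with d = (X - A)/h)

Where:
f = frequency of each value or class
N = $\sum f$ = total number of observations
A = assumed mean
d = deviation of X from the assumed mean, as defined for each method above
h = common class width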
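
To see that the direct formula and the step deviation formula agree, the sketch below applies both to the mutually exclusive example table given earlier (50-60, 60-70, 70-80, 80-90 with frequencies 5, 16, 19, 10). The function names and the assumed mean A = 75 are choices made only for this illustration.

```python
import math

classes = [(50, 60), (60, 70), (70, 80), (80, 90)]  # exclusive class intervals
freqs = [5, 16, 19, 10]


def grouped_mean_direct(classes, freqs):
    """Direct method: sum(f * X) / N, with X the class mid-value."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)


def grouped_mean_step(classes, freqs, assumed_mean):
    """Step deviation method: A + h * sum(f * d) / N, with d = (X - A) / h."""
    h = classes[0][1] - classes[0][0]  # assumes every class has this width
    mids = [(lower + upper) / 2 for lower, upper in classes]
    d = [(x - assumed_mean) / h for x in mids]
    return assumed_mean + h * sum(f * di for f, di in zip(freqs, d)) / sum(freqs)


direct = grouped_mean_direct(classes, freqs)
step = grouped_mean_step(classes, freqs, assumed_mean=75)
print(direct, step)                # both approximately 71.8
print(math.isclose(direct, step))  # the two methods agree
```

Here the mid-values are 55, 65, 75, 85, so the sum of fX is 3590 and N = 50; with A = 75 the step deviations are -2, -1, 0, 1 and the sum of fd is -16.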


Problems:
Q.No-1.1: Calculate AM for the raw data; 22, 28, 26, 24, 26, 15, 08, 09, 32, 20
Q.No-1.2: Calculate AM for the following distribution
X 0 1 2 3 4
f 8 23 45 24 7
Note: It is a discrete frequency distribution
Q.No-1.3: Calculate AM for the following distribution
CI 0-10 10-20 20-30 30-40 40-50
F 3 14 31 13 4
Note: It is a continuous frequency distribution

Q.No-1.4: Calculate AM for the following distribution table
CI 0 - 9 10-19 20-29 30-39 40-49 50-59
F 3 15 38 14 10 5

Q.No-1.5: Calculate AM for the following distribution table
Marks Below 20 Below 30 Below 40 Below 50
Students 12 35 48 60
Note: The table gives cumulative (“below”) frequencies with an open-ended first class, so it
should first be converted into mutually exclusive class intervals with ordinary frequencies.


Q.No-1.6: Calculate AM for the following distribution table
Marks (above)     10   20   30   40   50   60
No. of students   60   48   35   18   8    2

Q.No-1.7: The following is the frequency distribution of the number of telephone calls
received in 245 successive one-minute intervals at an exchange
No. of calls: 0 1 2 3 4 5 6 7
Frequency: 14 21 25 43 51 40 39 12
Obtain the mean number of calls per minute
Q.No-1.8: The following table gives salary per month of 450 employees in a factory,
Find mean salary of the employees
Salary (,000) 0-5 5-10 10-15 15-20 20-25 25-30
Employees 80 120 100 60 50 40

Q.No-1.9: The average monthly balances of 600 customers are given as follows. Find the
mean from this data
Class (in Dollars)    Frequency
0 - 49.99             78
50.00 - 99.99         123
100.00 - 149.99       187
150.00 - 199.99       82
200.00 - 249.99       51
250.00 - 299.99       47
300.00 - 349.99       13
350.00 - 399.99       9
400.00 - 449.99       6
450.00 - 499.99       4

Q.No-1.10: From the following data find the AM
C.I        130-134  135-139  140-144  145-149  150-154  155-159  160-164
Employees  5        15       28       24       17       10       1

Q.No-1.11: Calculate the average number of days the workers are absent in a company
No. of days absent  Less than 5  5-10  10-15  15-20  20-25  25-30  30-35
No. of workers      29           224   465    582    634    644    650



Q.No-1.12: Calculate the average age of employees with the help of the following distribution
table
Age above (years)   20   25   30   35   40   45   50   55   60
f                   450  410  330  300  210  185  85   40   12

Q.No-1.13: Find the missing frequency from the following data, if $\bar{X}$ = 15.38
X 10 12 14 16 18 20
f 3 7 x 20 8 5

Q.No-1.14: Find the missing frequencies in the following data, given $\bar{X}$ = 50 and N = 120
C.I 0-20 20-40 40-60 60-80 80-100
f 17 f1 32 f2 19

- Step deviation method for grouped or continuous frequency distribution:
In the case of a grouped or continuous frequency distribution with class intervals of equal
magnitude, the calculations are further simplified by taking
$$\bar{X} = A + \frac{h \sum fd}{N}$$
where A is the assumed mean, h is the common class width, d = (X - A)/h is the step
deviation of each mid-value X, and N = $\sum f$.

Q.No-1.15: Calculate the mean for the following frequency distribution
Marks            0-10  10-20  20-30  30-40  40-50  50-60  60-70
No. of students  5     5      7      15     8      6      4
a) By the direct formula b) By the step deviation method
Properties of Arithmetic Mean
1. The algebraic sum of the deviations of the given set of observations from their
Arithmetic mean is “Zero”
In simple words, the sum of deviations taken from the Arithmetic mean is always
“Zero”
Mathematically $\sum (X - \bar{X}) = 0$
For a frequency distribution $\sum f(X - \bar{X}) = 0$
2. The sum of squared deviations taken from the AM is the least among such deviations
taken from any other measure of central tendency

Mathematically $\sum (X - \bar{X})^2$ is never greater than $\sum (X - M)^2$, $\sum (X - Z)^2$,
$\sum (X - GM)^2$ or $\sum (X - HM)^2$, where M is the median and Z is the mode



3. Mean of the combined series: If we know the sizes and means of two component
series, then we can find the mean of the resultant series obtained on combining
the given series.
If two groups of n1 and n2 observations have means $\bar{X}_1$ and $\bar{X}_2$ respectively, then the
mean of the combined group of size n1 + n2 is given by
$$\bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$
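
The three properties can be checked numerically. The sketch below reuses the observations 5, 8, 10, 15, 24, 28 from the earlier mean example; the two groups passed to the hypothetical helper `combined_mean` (40 items with mean 50 and 60 items with mean 60) are made-up values for illustration only.

```python
from statistics import mean, median

data = [5, 8, 10, 15, 24, 28]
am = mean(data)    # arithmetic mean of the observations
md = median(data)  # median, used here as a competing central value

# Property 1: deviations from the mean sum to zero
sum_dev = sum(x - am for x in data)

# Property 2: squared deviations are smallest when taken from the mean
ss_mean = sum((x - am) ** 2 for x in data)
ss_median = sum((x - md) ** 2 for x in data)


# Property 3: mean of the combined series
def combined_mean(n1, mean1, n2, mean2):
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


combined = combined_mean(40, 50, 60, 60)  # hypothetical groups

print(sum_dev)             # 0
print(ss_mean, ss_median)  # 424 461.5
print(combined)            # 56.0
```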

Merits and Demerits of Arithmetic Mean:

Merits
- It is rigidly defined
- It is easy to calculate and understand, i.e. the concept is familiar to most
people and intuitively clear
- It is based on all the observations
- Every data set has a mean. It is a measure that can be calculated, and it is
unique because every data set has one and only one mean
- It is suitable for further mathematical treatment
- Of all averages, the Arithmetic mean is affected least by fluctuations of
sampling, which means the Arithmetic mean is a stable average
- The mean is useful for performing statistical procedures such as
comparing the means of several data sets
Demerits
- The strongest drawback of the Arithmetic mean is that it is very much affected
by extreme observations. Two or three very large values of the variable may
disproportionately affect the value of the Arithmetic Mean
- The Arithmetic mean cannot be used in the case of open-end classes such as
less than 10 or more than 70, since for such classes we cannot determine the
midpoint / mid value. In such cases the Mode or Median may be used.
- It cannot be determined by inspection, nor can it be located graphically
- It cannot be used if we are dealing with qualitative data / characteristics
such as honesty or beauty
In the case of qualitative data, the Median is the only average used
- The Arithmetic Mean cannot be obtained if a single observation is missing,
lost or illegible, unless we drop it and compute the Arithmetic Mean
of the remaining values
- In extremely asymmetrical (skewed) distributions, the arithmetic mean is usually
not representative of the distribution and hence not suitable as a
measure of location or Measure of Central Tendency
- The Arithmetic mean may lead to wrong conclusions if the details of the data
from which it is obtained are not available.
- The Arithmetic mean may not be one of the values which the variable actually
takes, and is then termed a fictitious average.
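
The sensitivity of the mean to extreme observations is easy to see with a small made-up data set. In this sketch the five values and the single extreme value 500 are hypothetical.

```python
from statistics import mean, median

incomes = [20, 22, 24, 26, 28]   # hypothetical values
with_outlier = incomes + [500]   # one extreme observation added

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # 24 24
print(round(mean_after, 2))        # 103.33
print(median_after)                # 25.0
```

One extreme value moves the mean from 24 to about 103, a value that is not typical of any item, while the median only moves from 24 to 25.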

Weighted Arithmetic Mean
The usual AM gives equal importance to all items, but in many situations the mean should be
calculated according to the importance of the observations.
To make the computed average representative of the distribution, proper weightage
is given to the various items
Let W1, W2, W3, ..., Wn be the weights attached to the variable values X1, X2, X3, ..., Xn
respectively. Then the Weighted Arithmetic Mean, usually denoted by $\bar{X}_w$, is

$$\bar{X}_w = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \dots + W_n X_n}{W_1 + W_2 + W_3 + \dots + W_n}$$

$$\bar{X}_w = \frac{\sum WX}{\sum W}$$
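
A minimal sketch of the weighted mean as the sum of WX divided by the sum of W follows. The marks and weights are hypothetical, and `weighted_mean` is an illustrative helper name.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


marks = [60, 70, 80]   # hypothetical marks in three subjects
weights = [1, 2, 3]    # hypothetical importance of each subject

result = weighted_mean(marks, weights)  # (60 + 140 + 240) / 6
print(round(result, 2))                 # 73.33
```

The simple mean of the same marks is 70; the weighted mean is higher because the largest weight is attached to the highest mark.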

In the case of a frequency distribution, if f1, f2, f3, ..., fn are the frequencies of the variable
values X1, X2, X3, ..., Xn respectively, then the weighted arithmetic mean is given by

$$\bar{X}_w = \frac{W_1(f_1 X_1) + W_2(f_2 X_2) + W_3(f_3 X_3) + \dots + W_n(f_n X_n)}{W_1 f_1 + W_2 f_2 + W_3 f_3 + \dots + W_n f_n}$$

$$\bar{X}_w = \frac{\sum W(fX)}{\sum Wf}$$

Q.No-1.16: Calculate the Weighted Arithmetic Mean for the following distribution
Item Rice Wheat Sugar Jawar Oil Tea Salt
Price 40 30 33 35 25 250 15
Weight 1 0.5 0.2 0.5 0.25 0.1 0.05

Q.No-1.17: A candidate obtained the following percentages of marks in an examination:
English-60, Hindi-75, Mathematics-65, Physics-59, and Chemistry-55. Find the
candidate’s weighted arithmetic mean if weights of 1, 2, 1, 3, and 3 respectively are
allocated to the subjects.


Q.No-1.18: The mean annual salary of all employees in a company is Rs. 25,000. The mean
salaries of female and male employees are Rs. 27,000 and Rs. 17,000 respectively. Find the
percentage of male and female employees in the company.
Q.No-1.19: The mean monthly salary paid to 77 employees in a company was Rs. 78. The
mean salary of 32 of them was Rs. 75, and that of another 25 was Rs. 82. What was the mean
salary of the remaining employees?
Q.No-1.20: The average daily income of a group of 50 persons in a factory was calculated to be
Rs. 169. It was later found that one value was recorded as 134 instead of the correct value
143. Calculate the correct average income.
Q.No-1.21: Calculate the mean score of students using the weights assigned to the subjects Physics,
Chemistry, Maths, Biology, English and Hindi, respectively 3, 2, 3, 0, 1, 1, using the
following marks
Subjects Hindi English Physics Chemistry Maths Biology
Marks 56 70 72 62 80 69

Q.No-1.22: The numbers 3.2, 5.8, 7.9 and 4.5 have frequencies X, (X+2), (X-3) and (X+6)
respectively. If the arithmetic mean is 4.876, find the value of X.
Q.No-1.23: Marks secured by 50 students in a test paper are given below
30 45 48 55 39 25 31 12 18 21 54 59 51
33 43 44 10 38 19 26 41 35 37 41 46 33
51 37 58 58 17 19 23 26 29 38 57 36 35
44 43 27 19 43 22 31 47 34 31 15 35 32
Prepare frequency table with class interval 10-19, 20-29, 30-39…….., and calculate the
value of the Arithmetic Mean from the frequency table obtained.
II. MEDIAN
The median is another measure of central tendency, which locates the middlemost value in
a given set of data
The median is a measure of central tendency different from any of the means
The median is a single value from the data set that measures the central item in the data
The median is that value of the variable which divides the group into two equal parts, one part
comprising the values greater than the median and the other the values less than it
This single item is the middlemost or most central item in the set of numbers: half of the
items lie above this point and the other half lie below it
In contrast to the arithmetic mean, which is based on all the items of the distribution,
the median is only a positional average, i.e. its value depends on the position occupied by a
value in the frequency distribution.

- Calculation of Median from raw data or ungrouped data
To find the median of a data set, first arrange the data in ascending or descending order. If
the data set contains an odd number of items, the middle item of the array is the median.
If there is an even number of items, the median is the arithmetic mean of the two middle items
Median M = $\left(\frac{n+1}{2}\right)$th item in the arranged array
Where n is the number of items in the array.
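
The odd and even cases can be sketched as follows. The two small data sets are hypothetical, and `median_raw` is an illustrative helper name.

```python
from statistics import median


def median_raw(values):
    """Median of raw data: middle item, or mean of the two middle items."""
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("no observations")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                         # odd: the middle item
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: mean of two


odd_case = median_raw([7, 3, 9, 5, 11])  # sorted: 3, 5, 7, 9, 11
even_case = median_raw([7, 3, 9, 5])     # sorted: 3, 5, 7, 9

print(odd_case)   # 7
print(even_case)  # 6.0
print(odd_case == median([7, 3, 9, 5, 11]))  # agrees with the standard library
```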

• Calculation of Median from Discrete frequency distribution
In this case
Median $M$ = value of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation of the discrete frequency distribution
where $N$ is the number of items in the distribution, i.e. the sum of the frequencies.
• Calculation of Median from Continuous frequency distribution
$$M = L + \frac{\left(\frac{N}{2} - m\right) \times C}{f}$$
Where L = Lower limit of the median class
N = Total number of items
m = Cumulative frequency of the class preceding the median class
f = Frequency of the median class
C = Class width or magnitude of the median class
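The continuous-case formula can be traced step by step in code. The sketch below assumes classes of equal width given by their lower limits; the function name `grouped_median` is hypothetical. It uses the distribution of Q.No-2.4.

```python
def grouped_median(lower_limits, freqs, width):
    """Median of a continuous frequency distribution with equal class width."""
    total = sum(freqs)                   # N
    half = total / 2                     # N/2
    cum_before = 0                       # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= half:       # first class whose cumulative frequency reaches N/2
            return lower + (half - cum_before) * width / f
        cum_before += f
    raise ValueError("empty distribution")

# Distribution of Q.No-2.4: classes 0-10, 10-20, ..., 40-50
median_24 = grouped_median([0, 10, 20, 30, 40], [5, 12, 23, 12, 3], 10)
print(round(median_24, 2))
```

Here N = 55, so N/2 = 27.5 falls in the class 20-30, whose preceding cumulative frequency is 17 and whose own frequency is 23.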
Q.No-2.1: Find the Median of the observations 12, 15, 16, 82, 75
Q.No-2.2: Find the Median of the observations 12, 18, 13, 42, 63, 78
Q.No-2.3: Find the Median of the following distribution
X 10 20 30 40 50
f 3 8 13 9 7
Q.No-2.4: Find the Median of the following distribution
C.I 0-10 10-20 20-30 30- 40 40- 50
f 5 12 23 12 3


Q.No-2.5: Find the Median of the following distribution
C.I 10 - 14 15 - 19 20 - 24 25 – 29 30 – 34
f 2 5 8 4 1

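Note: to calculate Median
- Cumulative frequency is a must
- Class intervals should be of the exclusive (continuous) type; inclusive classes such as
10-14, 15-19 are first converted to class boundaries 9.5-14.5, 14.5-19.5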
Q.No-2.6: Calculate Mean and Median from the following distribution
C.I 10-20 20-30 30- 40 40- 50 50-60 60-70 70-80 80-90
f 4 12 40 41 27 13 9 4

Q.No-2.7: Find the Median for the following data
Height (C.I) 125-129 130-134 135-139 140-144 145-149
No. of Items (f) 2 5 8 4 1

Q.No-2.8: Find the median wage of labourers from the following
Wages Above 0 Above 10 Above 20 Above 30 Above 40 Above 50 Above 60 Above 70
No. of Labourers 650 500 425 375 300 275 250 100

Q.No-2.9: Find the missing frequency, if M=14
C.I 0-5 5-10 10-15 15-20 20-25 25-30
f 5 7 Q 8 6 4

Q.No-2.10: Calculate Median from the series
5 Men get less than Rs. 5
12 Men get less than Rs. 10
22 Men get less than Rs. 15
30 Men get less than Rs. 20
36 Men get less than Rs. 25
40 Men get less than Rs. 30
Q.No-2.11: Calculate the missing frequencies f1 and f2 from the following data, given Median = 46
and N = 230
Variables 10-20 20-30 30- 40 40- 50 50-60 60-70 70-80
f 12 30 f1 65 f2 25 19
Merits and Demerits of Median


Merits:
• Rigidly defined
• Easy to calculate, even for a non-mathematical person
• Since it is a positional average, it is not affected by extreme observations and is
useful in skewed distributions
• Can be computed while dealing with open-ended classes
• Can be located by simple inspection and even graphically
• It is the only average that can deal with qualitative characteristics which can be
ranked but not measured

3. MODE:

• Mode is a measure of central tendency that is different from the mean but
somewhat like the median
• The mode is the value that is repeated most often in the data set
• The mode is defined as the most popular (most frequent) value in the given data
• Mode is the value which occurs most frequently in a set of observations and
around which the other items of the set cluster most densely
• It is the value at the point around which the items tend to be most heavily
concentrated. It is regarded as the most typical of a series of values
• Mode is the value which has the greatest frequency density in its immediate
neighbourhood
• Mode is termed the fashionable value of the distribution
o Example: Average size of the shoe sold in a shop is 7
o Average Indian male is 5 feet 6 inches tall
Here the average refers to neither the mean nor the median but the mode, the most
frequent value in the distribution

• Mode is denoted by Mo or Z

• Calculation of “Mode”
o Raw data: Mode (Z) is the value that occurs most often
o Discrete frequency distribution: Mode (Z) is the value corresponding to the highest
frequency
o Continuous frequency distribution: Mode (Z) is located in the modal class as
$$Z = L + \frac{(f - f_1) \times C}{2f - f_1 - f_2}$$
Modal class = Class with the highest frequency
L = Lower limit of the modal class


f = Frequency of the modal class
f1 = Frequency of the class preceding the modal class
f2 = Frequency of the class succeeding the modal class
C = Class width of the modal class
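The interpolation for the continuous case can be sketched as follows. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch, not a rule stated in the manual. The data are those of Q.No-3.3.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode of a continuous frequency distribution with equal class width."""
    i = freqs.index(max(freqs))                      # position of the class with the highest frequency
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                # preceding class (assumed 0 if absent)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class (assumed 0 if absent)
    return lower_limits[i] + (f - f1) * width / (2 * f - f1 - f2)

# Distribution of Q.No-3.3: classes 0-10, 10-20, ..., 40-50
mode_33 = grouped_mode([0, 10, 20, 30, 40], [3, 8, 13, 10, 1], 10)
print(mode_33)
```

The class with the highest frequency is 20-30 (f = 13, f1 = 8, f2 = 10), so the mode is pulled slightly towards the heavier neighbouring class 30-40.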
Q.No-3.1: Find Mode 2, 6, 8, 12, 32, 25, 41, 63, 25
Q.No-3.2: Find the mode for following distribution table
X 0 1 2 3 4
f 7 13 25 10 3

Q.No-3.3: Find the mode for following distribution table
C.I 0-10 10-20 20-30 30-40 40-50
f 3 8 13 10 1

Q.No-3.4: Find the Mean, Median and Mode
CI 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
f 4 12 40 41 27 13 9 4

Merits and Demerits of Mode
Merits:
• Easy to calculate and understand; it can be found by mere inspection
• Not affected by extreme observations
• Convenient for open-ended classes
Demerits:
• Mode is not rigidly defined
• Mode is not suitable for further mathematical treatment
• Affected to a greater extent by fluctuations of sampling

Empirical Relationship between Mean ($\bar{X}$), Median (M) and Mode (Z) (Slightly Skewed)
A symmetrical distribution that contains only one mode always has the same value for the
mean, median and mode.
In the case of a positively or negatively skewed distribution, the median can be taken to
measure the central tendency, since the median is not as highly influenced by the frequency
of occurrence of a single value as the mode is, nor is it pulled by extreme values as the mean is.
Whenever the given distribution is slightly skewed, the mean ($\bar{X}$), median (M) and mode (Z)
show the following relationship:
$Z = 3M - 2\bar{X}$
i.e. Mode = 3 Median – 2 Mean
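As a quick numerical illustration of the empirical relationship, the sketch below uses made-up values (mean 20, median 22) that do not come from the manual; the function name is likewise hypothetical.

```python
def empirical_mode(mean, median):
    """Mode estimated from the empirical relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

# Hypothetical values chosen only to illustrate the arithmetic
z = empirical_mode(mean=20, median=22)
print(z)
```

The relation is approximate and is meant for slightly skewed distributions only.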
Q.No-3.5: Find Mean, Median and Mode using Empirical Relationship
Size in Inches 5 10 25 20 25 30 35
f 1 3 13 17 27 36 38

GEOMETRIC MEAN
- GM is the nth root of the product of the n quantities of the series. It is obtained by
multiplying the values of the items together and extracting the root of the
product corresponding to the number of items.
- Thus, the square root of the product of two items and the cube root of the
product of three items are their geometric means
- For positive values it is never larger than the arithmetic mean
- If there are zeros or negative numbers in the series, the geometric
mean cannot be used.
- Logarithms can be used to find the geometric mean, to reduce large
numbers and to save time
- Appropriate in situations where an average percentage rate of
change over a period of time is required.
- It is widely used in the construction of index numbers
Geometric Mean (GM) = $\sqrt[n]{X_1 X_2 X_3 X_4 \cdots X_n}$

When the number of items in the series is larger than 3, computing the GM directly is
difficult. To overcome this, the logarithm of each value is obtained. The logs
of all the values are added up and divided by the number of items. The antilog of the
quotient obtained is the required GM.
Geometric Mean (GM) = Antilog $\left[\frac{\log X_1 + \log X_2 + \log X_3 + \log X_4 + \cdots + \log X_n}{n}\right]$
= Antilog $\left[\frac{1}{n}\sum_{i=1}^{n} \log X_i\right]$
Geometric Mean (GM) for Continuous case
GM = Antilog $\left[\frac{\sum f \log X}{N}\right]$, where X is the mid-value of each class and N = $\sum f$
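The logarithm method can be sketched in Python, with `math.log10` playing the role of the log table and `10 ** x` the antilog. The function names are hypothetical; the first call uses the data of problem 4.1.

```python
import math

def geometric_mean(values):
    """GM as the antilog of the mean of the logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    log_sum = sum(math.log10(v) for v in values)
    return 10 ** (log_sum / len(values))          # antilog of (sum of logs / n)

def grouped_geometric_mean(mid_values, freqs):
    """GM of a frequency distribution: antilog of (sum of f * log X) / N."""
    total = sum(freqs)                            # N = sum of frequencies
    log_sum = sum(f * math.log10(x) for x, f in zip(mid_values, freqs))
    return 10 ** (log_sum / total)

gm_41 = geometric_mean([2, 4, 8])                 # data of problem 4.1
print(round(gm_41, 4))
```

For 2, 4, 8 the product is 64 and its cube root is 4, which the logarithm route reproduces up to floating-point rounding.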


Merits of GM
- It is based on all the observation in the series
- It is rigidly defined
- It is suited for averaging ratios and percentages
- It is less affected by extreme values
- It is useful for studying social and economic data
Demerits of GM
- It is not simple to understand
- It requires computational skill
- It cannot be computed if any items are zero or negative
- It has restricted applications
Problems:
4.1 Find the GM of the data 2, 4, 8
4.2 Find the GM of the data 2, 4, 8, 10 using logarithms
4.3 The annual rate of growth of output of a company in the last five years
is given below. Find the GM of the growth rate
Year Growth Rate Output at the end of the year
1998 5.0 105
1999 7.5 112.87
2000 2.5 115.69
2001 5.0 121.47
2002 10.0 133.61

4.4 Compared with the previous year, the overhead (OH) expenses went up by 32% in the year
2003, then increased by 40% in the next year and by 50% in the following year.
Calculate the average rate of increase in overhead expenses. Take the OH expenses of the base year as 100%
Year Growth Rate
2002 Base Year
2003 132
2004 140
2005 150

4.5 Consider the following time series of monthly sales of ABC Company for 4 months.
Find the average rate of change in sales per month
Month Sales
I 10,000
II 8,000
III 12,000
IV 15,000


4.6 Find the GM for the following data
Yield of Wheat in MT No. of Farms
1 – 10 3
11 – 20 16
21 – 30 26
31 – 40 31
41 – 50 16
51 – 60 8

Harmonic Mean
- It is the total number of items divided by the sum of the
reciprocals of the values of the variable
- It is a specialised average which solves problems involving variables
expressed as rates that vary with time
- Example: Speed in km/hr., min/day, Price/chapter
- Harmonic mean (HM) is suitable only when the time factor is variable and
the act being performed remains constant
HM = $\frac{N}{\sum \frac{1}{X}}$

Merits of Harmonic Mean
- It is based on all observation
- It is rigidly defined
- Suitable in case of series having wide dispersion
- It is suitable for further mathematical treatment
Demerits of Harmonic Mean
- It is not easy to compute
- Cannot be used when one of the items is zero
- It cannot represent distribution
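The harmonic mean formula above can be sketched as follows. The function name is hypothetical; the speeds and driving hours are those of problem 5.2, where the same distance is covered each day.

```python
def harmonic_mean(values):
    """HM = number of items / sum of reciprocals of the items."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when an item is zero")
    return len(values) / sum(1 / v for v in values)

speeds = [48, 40, 32]          # KMPH on days 1, 2, 3 (problem 5.2)
hours = [10, 12, 15]           # hours driven on each day

hm_speed = harmonic_mean(speeds)
# Arithmetic mean of the speeds weighted by hours driven = total distance / total time
weighted_mean_speed = sum(h * s for h, s in zip(hours, speeds)) / sum(hours)
print(round(hm_speed, 2), round(weighted_mean_speed, 2))
```

Because the distance is the same each day, the harmonic mean of the speeds coincides with total distance divided by total time (1440 km in 37 hours), while the simple arithmetic mean of 40 KMPH would overstate the average speed.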
Problems:
5.1: The daily income of 5 families in a very remote village is given below. Compute HM
Family Income (X)
1 85
2 90
3 70
4 50
5 60


5.2 A man travels by car for 3 days, covering 480 km each day. On the first day he
drives for 10 hrs at the rate of 48 KMPH, on the second day for 12 hrs at the rate of
40 KMPH, and on the 3rd day for 15 hrs at the rate of 32 KMPH. Compute the HM and the
weighted mean, and compare them.
5.3 Find the HM for the following data
Class Interval Frequency
0 – 10 5
10 – 20 15
20 – 30 25
30 – 40 8
40 - 50 7

MEASURES OF DISPERSION

• An average does not tell the full story. It is hardly a full representative of a mass
unless we know the manner in which the individual items scatter around it. A
further description of the series is necessary if we are to gauge how representative
the average is.
• The measure of central tendency must be supported and supplemented by some
other measure; one such measure is dispersion.
• The literal meaning of dispersion is “Scatteredness”
• We study dispersion to have an idea of the homogeneity (compactness) or
heterogeneity (scatter) of the distribution.
• Why is dispersion an important characteristic to understand and measure?
o It enables us to judge the reliability of our measure of central tendency
o It helps us tackle the problems associated with widely dispersed data
o We may wish to compare the dispersion of various samples
• Dispersion is the measure of the variation of the items
• It is a measure of the extent to which the individual items vary
• It is the degree of scatter or variation of the variable about a central value
• The degree to which numerical data tend to spread about an average value is called
the variation or dispersion of the data


- Curve A has the least dispersion or variability
- Curve B has less variability than curve C but more variability than curve A
- Curve C has more dispersion / variability than curve A and curve B


Objectives or Significance of the measures of dispersion
• To find the reliability of an average
• To control the variation of the data from the central value
• To compare two or more sets of data regarding their variability
• To obtain other statistical measures for further analysis of data
Characteristics of an ideal measure of dispersion
• It should be rigidly defined
• It should be easy to calculate and easy to understand
• It should be based on all the observations
• It should be amenable to further mathematical treatment
• It should be less affected by possible fluctuations of sampling
• It should not be much affected by extreme observations
Measures of Dispersion
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation

1. RANGE
Range is the crudest measure of dispersion; it is calculated as
R = High – Low
R = H – L
Its relative measure is called the co-efficient of range:
Co-efficient of Range = $\frac{H - L}{H + L}$

Example: Find range and its co-efficient 10, 13, 18, 8, 14, 16, 23, 25
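A minimal sketch of the two range measures, applied to the data of the example above; the function name is an assumption made for illustration.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) = (H - L, (H - L) / (H + L))."""
    high, low = max(values), min(values)
    return high - low, (high - low) / (high + low)

rng, coeff = range_and_coefficient([10, 13, 18, 8, 14, 16, 23, 25])
print(rng, round(coeff, 4))
```

Only the two extreme observations (25 and 8) enter the calculation, which is why the range is regarded as a crude measure.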


2. QUARTILE DEVIATION
This is a measure of dispersion which considers the deviation between the upper and lower
quartiles
Interquartile range = Q3 – Q1
Quartile Deviation QD = $\frac{Q_3 - Q_1}{2}$ (Semi-interquartile range)

Co-efficient of QD = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$

Example1: Find Quartile Deviation and its Co-efficient
13, 82, 65, 45, 58, 76, 18, 29, 34, 91
Example 2: Find semi-interquartile range and co-efficient of Quartile Deviation for the
following frequency distribution
Size 4-8 8-12 12-16 16-20 20-24 24-28 28-32 32-36 36-40
f 6 10 18 30 15 12 10 6 2
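For raw data, the quartile deviation can be sketched as below. Quartile conventions differ between textbooks and software; this sketch assumes the common classroom convention that the k-th quartile is the value of the k(n+1)/4-th ordered item, with linear interpolation between neighbouring items. The function name is hypothetical and the data are those of Example 1.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) as the k(n+1)/4-th ordered item,
    interpolating linearly when the position is fractional (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    pos = k * (n + 1) / 4                 # 1-based position, possibly fractional
    whole = int(pos)
    frac = pos - whole
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    return ordered[whole - 1] + frac * (ordered[whole] - ordered[whole - 1])

data = [13, 82, 65, 45, 58, 76, 18, 29, 34, 91]   # Example 1
q1, q3 = quartile(data, 1), quartile(data, 3)
qd = (q3 - q1) / 2                                # semi-interquartile range
coeff_qd = (q3 - q1) / (q3 + q1)
print(q1, q3, qd, round(coeff_qd, 4))
```

With n = 10 the positions are 2.75 and 8.25, so Q1 lies three quarters of the way from 18 to 29 and Q3 a quarter of the way from 76 to 82. Other quartile conventions give slightly different values.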

3. MEAN DEVIATION
It is a measure of dispersion which calculates the average absolute distance between each
observation and a central value ($\bar{X}$, M or Z)
If the mean deviation is calculated around $\bar{X}$, it is called the mean deviation about the mean.
Formulas
Mean Deviation = $\frac{\sum |X - A|}{N}$ or $\frac{\sum |d|}{N}$
Where A = Mean/Median/Mode
|d| = Mod d = |X – A|
In case of Frequency Distribution
Mean Deviation = $\frac{\sum f\,|X - A|}{N}$ or $\frac{\sum f\,|d|}{N}$

Relative Measure of Mean Deviation
Co-efficient of Mean Deviation = $\frac{\text{Mean Deviation}}{\text{Average about which it is calculated}}$
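The mean deviation and its coefficient can be sketched as follows. The function name is an assumption; the observations 18, 75, 56, 63, 36 are those used in Example 1 of this topic.

```python
def mean_deviation(values, centre):
    """Average absolute distance of the observations from a chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)

data = [18, 75, 56, 63, 36]
mean = sum(data) / len(data)            # centre A = arithmetic mean
md = mean_deviation(data, mean)
coeff_md = md / mean                    # relative measure
print(round(mean, 2), round(md, 2), round(coeff_md, 4))
```

Passing the median or the mode as `centre` gives the mean deviation about that average instead.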


Example 1: Find Mean Deviation about $\bar{X}$ for the following data
18, 75, 56, 63, 36
Example 2: Find Mean Deviation about Median
Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
Student 2 6 12 18 25 20 10 7

Example 3: Calculate M.D. about the Median and its coefficient
Size of Items 4 6 8 10 12 14 16
f 2 1 3 6 4 3 1


STANDARD DEVIATION & VARIANCE

The most comprehensive descriptions of dispersion are those that deal with the average
deviation from some measure of central tendency
Variance and standard deviation are the two measures of dispersion most used in
statistics. Both of these tell us an average distance of the observations in the data set
from the mean of the distribution
VARIANCE and STANDARD DEVIATION of POPULATION
Population Variance
• Every population has a variance, which is symbolised by $\sigma^2$ (Sigma Squared)
• To calculate the population variance, we divide the sum of the squared distances
between the mean and each item in the population by the total number of items in
the population
• By squaring each distance, we make each number positive and, at the same
time, assign more weight to large deviations (distance between the mean and the
value)
Population Standard Deviation
• Population standard deviation is denoted by $\sigma$ (Sigma)
• It is simply the square root of the population variance
• Standard deviation is the square root of the average of the squared distances of the
observations from the mean
• While the variance is expressed in the square of the units used in the data,
standard deviation is in the same units as those used in the data


Formula for Variance and Standard Deviation

Formula for Raw Data:
Variance = $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$

Standard Deviation = $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$

$\sigma^2$ = Population Variance
$\sigma$ = Population Standard Deviation
X = Observation
$\bar{X}$ = Mean
N = Number of observations in the population
$\sum$ = Sum of all the values

Grouped data:
Variance = $\sigma^2 = \frac{\sum f (X - \bar{X})^2}{N}$ or $\frac{\sum f X^2}{N} - \bar{X}^2$ or $\frac{\sum f X^2}{N} - \left(\frac{\sum f X}{N}\right)^2$

Standard Deviation = $\sigma = \sqrt{\frac{\sum f (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum f X^2}{N} - \bar{X}^2}$
or
$\sigma = \sqrt{\frac{\sum f X^2}{N} - \left(\frac{\sum f X}{N}\right)^2}$

Where f = Frequency of each class
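The direct and shortcut forms of the raw-data formulas can be compared numerically. This is an illustrative sketch; the variable names are assumptions and the eight observations are the X values listed in Example 1 of this topic. The standard library's `statistics.pstdev` (population standard deviation) is used only as a cross-check.

```python
import math
import statistics

data = [25, 36, 45, 65, 82, 93, 58, 70]
n = len(data)
mean = sum(data) / n

# Direct form: mean of squared deviations from the mean
variance_direct = sum((x - mean) ** 2 for x in data) / n
# Shortcut form: mean of squares minus square of the mean
variance_shortcut = sum(x * x for x in data) / n - mean ** 2

sd = math.sqrt(variance_direct)
print(round(variance_direct, 4), round(sd, 4), round(statistics.pstdev(data), 4))
```

Both forms divide by N, so they describe a population; a sample standard deviation would divide by N – 1 instead.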
Example 1: Calculate S.D for the following data
X 25 36 45 65 82 93 58 70

Example 2: Find S.D and Co-Efficient of SD for the following data
C.I 0-10 10-20 20-30 30-40 40-50 50-60 60-70
F 5 7 14 12 9 6 2


Example 3: Find Mean and SD for the following data
Age 10 20 30 40 50 60 70 80
No. of Persons 15 30 53 75 100 110 115 125
Example 4: Fifteen vessels were produced in one day and each vessel was tested to
determine its purity. The observed percentages of impurity are given below; calculate
the standard deviation.
0.04 0.14 0.17 0.19 0.22
0.06 0.14 0.17 0.21 0.24
0.12 0.15 0.18 0.21 0.25


CO-EFFICIENT OF VARIATION

If the co-efficient of variation (C.V) for a given data set is high, the data is said to be less
consistent; on the other hand, if the C.V is low, the variability in the data is less and the data
is more consistent.
C.V = $\frac{\sigma}{\bar{X}} \times 100$
Example 1: The run scored by two batsmen A & B in 10 innings are as follows
A 10 115 5 75 7 120 36 84 29 19
B 45 12 76 42 4 50 37 48 130 0
Find I) the better run scorer II) the more consistent batsman
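A comparison of this kind can be sketched with the standard library. The helper name is hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed, in line with the formulas of this unit. The scores are those of batsmen A and B in Example 1.

```python
import statistics

def coefficient_of_variation(values):
    """C.V = (population SD / mean) * 100."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

runs_a = [10, 115, 5, 75, 7, 120, 36, 84, 29, 19]
runs_b = [45, 12, 76, 42, 4, 50, 37, 48, 130, 0]

mean_a, mean_b = statistics.mean(runs_a), statistics.mean(runs_b)
cv_a, cv_b = coefficient_of_variation(runs_a), coefficient_of_variation(runs_b)
print(mean_a, mean_b, round(cv_a, 2), round(cv_b, 2))
```

The higher mean identifies the better scorer, while the lower C.V identifies the more consistent one; the two criteria need not point to the same batsman.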

Example 2: The lives of 2 models of refrigerator in a recent survey are shown below.
What is the average life of each model?
Which model has greater uniformity?
Life (years) 0-2 2-4 4-6 6-8 8-10 10-12
A 5 16 13 7 5 4
B 2 7 12 19 9 1

Example 3: Two brands of tyres are listed with the following results
A) Which brand of tyre has the greater average life?
B) Compare the variability and state which brand of tyre you would use
Life 20-25 25-30 30-35 35-40 40-45
X 1 22 64 10 3
Y 0 24 76 0 0


Standard Deviation for combined series
If n1 observations have mean $\bar{X}_1$ and SD $\sigma_1$, and n2 observations have mean $\bar{X}_2$ and SD
$\sigma_2$, then the combined SD of the (n1 + n2) observations is calculated as
$$\sigma = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$$
Where $d_1 = \bar{X} - \bar{X}_1$, $d_2 = \bar{X} - \bar{X}_2$
and $\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$

Example1: The Mean and SD of marks obtained by 2 groups of students consisting of 50
each are given below. Calculate S.D of all 100 students
Group Mean Standard Deviation n
1 60 8 50
2 55 7 50
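The combined-series formula can be sketched as a small function. The function name is hypothetical; the figures are those of the two groups of 50 students in Example 1.

```python
import math

def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and SD of two series from their sizes, means and SDs."""
    mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)      # combined mean
    d1, d2 = mean - mean1, mean - mean2               # deviations of group means
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return mean, math.sqrt(variance)

mean_all, sd_all = combined_mean_sd(50, 60, 8, 50, 55, 7)
print(mean_all, round(sd_all, 4))
```

The d terms add the spread between the two group means to the spread within each group, so the combined SD can exceed the average of the two group SDs.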

Example 2: Calculate the missing information from the following data
A B C Combined
Numbers 175 ? 225 500
SD ? 63 5.9 5.4
Mean 220 240 ? 235

Example 3: A shareholders research centre of India has conducted a research study on
price behaviour of 3 leading industries A, B, C. The results published in quarterly journal
are as follows
Shares Average Price Standard Deviation (SD) Current Selling Price (SP)
A 18.2 5.4 36.0
B 22.5 4.5 34.75
C 24.0 6.0 39.0
I. Which share, in your opinion, is more stable in value?
II. If you are the holder of all 3 shares, which one would you dispose of at present?
Why?
Example 4: The following is the record of goals scored by team A in the football season.
No. of Goals 0 1 2 3 4
Matches 1 9 7 5 3
For team B the average number of goals scored per match was 2.5 with S.D = 1.25 goals.
Find which team may be considered more consistent.
Note: Standard deviation is understood to be the best measure of dispersion as
1. It includes all observations in its calculation
a. It is more suitable than the other measures of dispersion
b. It is not much affected by extreme values
c. It is rigidly defined
2. S.D is used in finding the normal probabilities of a population
Some remarks on Standard Deviation
1. Mean deviation can be computed by taking the deviations from any average, i.e.
mean, median or mode, but standard deviation is always computed from the
arithmetic mean
2. The standard deviation of the variable X is denoted by $\sigma_X$. This notation is
useful when we have to deal with the standard deviations of two or more
variables
3. SD is always taken as the positive square root
4. Since S.D depends on the numerical values of the deviations, the value of $\sigma$ will
be greater if the values of X are scattered widely away from the mean
- A smaller value of $\sigma$ implies that the distribution is homogeneous
- A larger value of $\sigma$ implies that the distribution is heterogeneous
Mathematical Properties of Standard Deviation
1. Standard deviation is independent of change of origin but not of scale
2. Standard deviation is the minimum value of the root mean square deviation
3. S.D is never greater than the Range
4. S.D is suitable for further mathematical treatment
5. The S.D of the first n natural numbers 1, 2, 3, 4, ..., n is $\sqrt{\frac{n^2 - 1}{12}}$
6. The Empirical Rule:
a. For a symmetrical bell shaped distribution, we have approximately the
following properties

i. 68% of the observations lie in the range: Mean ± $\sigma$
ii. 95% of the observations lie in the range: Mean ± 2$\sigma$
iii. 99.7% of the observations lie in the range: Mean ± 3$\sigma$

7. The approximate relationship between Quartile Deviation (QD), Mean Deviation
(MD) and Standard Deviation ($\sigma$) is
QD = $\frac{2}{3}\sigma$, MD = $\frac{4}{5}\sigma$
QD : MD : SD :: 10 : 12 : 15
8. For any discrete distribution, the standard deviation is not less than the mean deviation
about the mean, i.e. SD ≥ Mean Deviation about mean
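Some of these properties can be checked numerically. The sketch below uses the first ten natural numbers, a choice made only for illustration, to check the formula for the S.D of the first n natural numbers, the behaviour under change of origin and scale, and the relation between S.D and mean deviation about the mean.

```python
import math
import statistics

n = 10
numbers = list(range(1, n + 1))                   # first n natural numbers

sd_direct = statistics.pstdev(numbers)            # population SD
sd_formula = math.sqrt((n ** 2 - 1) / 12)         # S.D of the first n natural numbers

mean = statistics.mean(numbers)
md_about_mean = sum(abs(x - mean) for x in numbers) / n

sd_shifted = statistics.pstdev([x + 100 for x in numbers])   # change of origin
sd_scaled = statistics.pstdev([3 * x for x in numbers])      # change of scale

print(round(sd_direct, 4), round(sd_formula, 4), md_about_mean,
      round(sd_shifted, 4), round(sd_scaled, 4))
```

Adding 100 to every value leaves the SD unchanged, whereas multiplying every value by 3 multiplies the SD by 3.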
Question from Previous Question Papers:
1. Which is a good measure of central tendency? Give any two reasons

2. The following distribution gives the pattern of overtime work done by 100 employees of
a company. Find the mean and median.
Overtime (Hrs) 10-15 15-20 20-25 25-30 30-35 35-40
No. of Employees 11 20 35 20 8 6

3. The following data give the prices X and Y of shares A and B respectively. Compute the
coefficient of variation of X and Y and state which is more stable in value.
Price of Shares A X 55 54 52 56 58 52 50 51 49
Price of Shares B Y 108 107 105 106 107 104 103 104 101


4. A sample of 50 cars each of 2 makes X and Y is taken and the average running life in years is
recorded

Life (No. of Years) No. of Cars (Make X) No. of Cars (Make Y)
0-5 8 6
5-10 12 10
10-15 17 20
15-20 10 12
20-25 3 2
a) Which of these two makes gives the higher average life?
b) Which of these makes has shown more consistent performance? Use standard
deviation.

5. Calculate mean for the following frequency distribution
I) By direct method
II) By step-deviation method
Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70
No. of Students 6 5 8 15 7 6 3

6. Find the value of Mean, Median and Mode from the data given below
Wt (in kg) 93-97 98-102 103-107 108-112 113-117 118-122 123-127 128-132
No. of Students 3 5 12 17 14 6 3 1

7. Determine the missing frequency of the class interval 15-20, the mean being 19 units
X 5-10 10-15 15-20 20-25 25-30
F 2 2 ? 4 4


8. Find the standard deviation for the following data
CI 0-10 10-20 20-30 30-40 40-50 50-60 60-70
F 6 14 10 8 1 3 8
9. Calculate the quartiles from the following data
CI 0-10 10-20 20-30 30-40 40-50
F 3 8 20 12 7

10. Compute Mean, Median and Mode from the data pertaining to marks scored by 80
students in statistics. The test is of 140 marks.
Marks more than 00 20 40 60 80 100 120
No of Students 80 76 50 28 18 09 01

11. Find the value of X and Y from the following distribution
Mid Values 15 25 35 45 55 65
Frequency 10 X 15 20 Y 11
Note: N =82 and Median = 41

12. Calculate Q1 and Q3 from the following Distribution
CI 0-10 10-20 20-30 30-40 40-50 50-60
F 3 8 20 12 7 3

13. From the prices of shares of X and Y below find out which is more stable in values.
X 35 54 52 53 56 58 52 50 51 49
Y 108 107 105 106 107 104 103 104 105 101

14. The following data gives the prices X and Y of shares A and B respectively. Compare the
co-efficient of variation of X and Y and state which share is more stable in value.
Share A 55 54 52 56 58 52 50 51 49
Share B 108 107 105 106 107 104 103 104 101




Unit – 2
Correlation and Regression


Unit - 2
Correlation and Regression: Scatter Diagram, Karl Pearson correlation, Spearman’s Rank
correlation (One way table only), Simple and multiple regression (Problems on simple
regression only)
Correlation:
Introduction:
Measures of central tendency and dispersion are confined to univariate distributions, i.e.
distributions involving only one variable. These measures are also used for the purpose of
comparison and analysis.
In some distributions or sets of data, each unit assumes two values; we call this a bivariate
distribution. For example, each individual in the distribution has two variables, such as height and
weight. If we measure more than two variables on each unit of the distribution, it is called a
“Multivariate distribution”.
Some examples of “Bivariate distribution”
- The series of marks of individuals in two subjects in an examination
- The series of sales revenue and advertising expenditure of different
companies in a particular year.
- Imports and exports of cotton from 1989 to 1994
- The series of ages of husbands and wives in a sample of selected married
couples
In a bivariate distribution, we may be interested to find if there is any relationship between
the two variables under study.
Correlation is a statistical tool which studies the relationship between two variables.
Correlation analysis involves various methods and techniques used for studying and
measuring the extent of the relationship between the two variables.
Variables are said to be correlated if a change in one variable results in a corresponding
change in the other variable.
Some definitions:
“When the relationship is of a quantitative nature, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in a brief formula is known
as correlation”
“Correlation is an analysis of the covariation between two or more variables”.


Types of correlation
A) Positive and Negative Correlation
B) Linear and Non-Linear Correlation

A) Positive and Negative Correlation
If the values of two variables deviate in the same direction, the correlation is said to be
“Positive or Direct” correlation
Example:
- Heights and weights
- The family income and expenditure on luxury goods
- Amount of rainfall and yield of crops
- Price and supply of commodities
If the values of two variables deviate in opposite directions, the correlation is said to be
“Negative or Inverse” correlation
Example:
- Price and demand of a commodity
- Volume and pressure of a perfect gas
- Sales of woollen garments and day temperature

B) Linear and Non-Linear Correlation
Correlation is linear if, corresponding to a unit change in one variable, there is a
constant change in the other variable over the entire range of values
In general, two variables x and y are said to be linearly related if there exists a
relationship of the form
Y = a + b x
- Here “b” is the slope of the straight line
- It is generally assumed that the relationship between the two variables under study
is linear
The relationship between two variables is said to be non-linear or curvilinear if,
corresponding to a unit change in one variable, the other variable does not change at a
constant rate but at a fluctuating rate.


Correlation and Causation
Correlation analysis enables us to have an idea about the degree and direction of the
relationship between the two variables under study
But it fails to reflect upon the cause and effect relationship between the two variables
In a bivariate distribution, if the variables have a cause and effect relationship, they are
bound to have a high degree of correlation between them. That means causation always
implies correlation.
However, the converse is not true, i.e. a fairly high degree of correlation
between two variables need not imply a cause and effect relationship between them.
A high degree of correlation between variables may be due to
- Mutual dependence
- Both the variables being influenced by the same external factors
- Pure chance
Methods of studying correlation:
Assuming that a linear relationship exists between the two variables or series,
the commonly used methods for studying the correlation between two variables are
I. Scatter diagram method
II. Karl Pearson’s Co-efficient of Correlation (Covariance method)
III. Two way frequency table (Bivariate correlation method)
IV. Ranks method or Spearman’s Rank Correlation
V. Concurrent Deviation Method

I. Scatter Diagram Method:
- It is one of the simplest methods of diagrammatic representation of a
bivariate distribution and provides us with one of the simplest tools for
ascertaining the correlation between two variables
- The “n” pairs of values of the two variables (for example heights and
weights) are plotted as dots. The diagram of dots so obtained is known as a “Scatter Diagram”
- From the scatter diagram, we can form a fairly good, though rough, idea about
the relationship between the two variables.

Some points regarding scatter diagram:
- If the points are very dense, i.e. very close to each other, a fairly good amount of
correlation may be expected between the two variables. On the other hand, if the
points are widely scattered, a poor correlation may be expected between them.
- If the points on the scatter diagram reveal a trend (either upward or
downward), the variables are said to be correlated; if no trend is
revealed, the variables are uncorrelated.
[Figures: scatter diagrams showing perfect positive correlation, perfect negative correlation, low degree of positive correlation, low degree of negative correlation, high degree of positive correlation, high degree of negative correlation, and no correlation]







Remarks:
- A scatter diagram provides a rough idea about the relationship between two variables.
It is not affected by extreme observations in the way the mathematical formulae
are. However, this method is not suitable if the number of
observations is fairly large
- It does not provide us with an exact measure of the extent of the relationship between the two
variables
- It provides only an approximate estimating line, or line of best fit, by the free hand
method


Karl Pearson Coefficient of Correlation:
This is also called the “Covariance Method” or “Product moment correlation co-efficient”
- A mathematical method of measuring the intensity or magnitude of the linear
relationship between two variable series
- It was suggested by Karl Pearson, and this method is the most widely used in
practice
- Karl Pearson’s measure, also known as the Pearsonian correlation coefficient between
two variables, i.e. series X and Y, is usually denoted by r(x, y) or rxy or r
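r = $\frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$

It is the ratio of the covariance between x and y, written as Cov(x, y), to
the product of the standard deviations of x and y
Here Cov(x, y) = $\frac{1}{N}\sum (x - \bar{x})(y - \bar{y})$
$\sigma_x = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$
$\sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{N}}$
If $\bar{x}$ and $\bar{y}$ come out to be integers (i.e. whole numbers), the following
formula is convenient to use
r = $\frac{\sum dx \, dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}}$, where $dx = x - \bar{x}$ and $dy = y - \bar{y}$
or
r = $\frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \cdot \sum (y - \bar{y})^2}}$

If $\bar{x}$ and $\bar{y}$ are fractions, the above formula is cumbersome to apply
and we should use the following formula

r = $\frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right] \times \left[N\sum y^2 - (\sum y)^2\right]}}$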
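The covariance form and the raw-sums form of Karl Pearson's coefficient can be compared in code. This is a sketch; both function names are hypothetical, and the price and demand figures are those of problem C-1 in this unit.

```python
import math

def pearson_r(xs, ys):
    """r = Cov(x, y) / (sigma_x * sigma_y), using deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

def pearson_r_raw(xs, ys):
    """Same coefficient from raw sums, convenient when the means are fractions."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))

price = [14, 16, 17, 18, 19, 20, 21, 22, 23]
demand = [84, 78, 70, 75, 66, 67, 62, 58, 60]
r_dev = pearson_r(price, demand)
r_raw = pearson_r_raw(price, demand)
print(round(r_dev, 4), round(r_raw, 4))
```

The two forms are algebraically identical, so they agree up to rounding; the negative sign reflects demand falling as price rises.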


Note:
1. Two variables are said to be correlated if there exists a cause and effect relationship
between them
Example:
- Yield and rainfall
- Production and price
- Demand and supply

2. Correlation is said to be positive if an increase in one variable results in an increase in the
other variable, or a decrease in one results in a decrease in the other.
Example: Demand and Production, Production and Supply, Income and Want

3. Correlation is said to be negative if an increase in one variable results in a decrease in the
other variable, or a decrease in one results in an increase in the other.
Example: Production and Price, Supply and Price

4. Correlation is said to be zero if the variables behave independently
Example: Decrease in tax rate and revenue

5. r lies between -1 and +1
• If r lies between 0 and 1, positive correlation exists
• If r is exactly 1, the correlation is perfect positive correlation
• If r lies between -1 and 0, negative correlation exists
• If r is -1, that implies perfect negative correlation
Problems:
C-1: Find Karl Pearson’s co-efficient of correlation for the following data
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60

C-2: Calculate KPCC for the following two variables
X 80 90 100 110 120 130 140 150 160
Y 15 15 16 19 17 18 16 18 14

C-3: Calculate Pearson’s Co-efficient of correlation for the following
X 6.9 8.5 5.8 8.6 9.6 8.0 9.7
Y 2.9 3.8 6.5 2.3 5.5 3.5 3.2


C-4: Calculate Karl Pearson’s Co-efficient of correlation between expenditure on
Advertising and sales from the data given below
Ad Expenses    39 65 62 90 82 75 25 98 36 78
Sales (Lakhs)  47 53 58 86 62 68 60 91 51 84

C-5: From the following table calculate the co-efficient of correlation by Karl Pearson’s
Method
X 6 2 10 4 8
Y 9 11 ? 8 7
Arithmetic mean of X and Y series are 6 & 8 respectively
C-6: Calculate the co-efficient of correlation between X and Y series from the following
data
                                       X      Y
No. of Pairs of Observation            15     15
Arithmetic Mean                        25     18
Standard Deviation                     3.01   3.03
Sum of Squared Deviations from Mean    136    138

C-7: The Co-efficient of correlation between X and Y is 0.48, the Co-variance of x, y is 36,
the variance of X is 16. Find the standard deviation of Y.

C-8: Given the following information
rxy = 0.8, $\sum xy = 60$, $\sigma_y = 2.5$ and $\sum x^2 = 90$, where x and y are the deviations from the
respective means. Find the number of items.

Properties of Correlation Co-efficient
1. Pearson’s Correlation Co-efficient cannot exceed 1 numerically. In other words, it
lies between -1 and +1
2. Correlation Co-efficient is independent of the change of origin and scale
3. Two Independent variables are uncorrelated but the converse is not true i.e.
uncorrelated variables need not necessarily be independent
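Properties 2 and 3 can be illustrated numerically. The sketch below is an illustration under stated assumptions: the function name is made up here, the origins (18 and 70) and scales (2 and 5) are arbitrary choices, and the x, y = x² series is a hypothetical example of dependent but uncorrelated variables.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    return numerator / sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Property 2: r is unchanged by a change of origin and (positive) scale.
price = [14, 16, 17, 18, 19, 20, 21, 22, 23]
demand = [84, 78, 70, 75, 66, 67, 62, 58, 60]
u = [(p - 18) / 2 for p in price]    # arbitrary origin 18, scale 2
v = [(d - 70) / 5 for d in demand]   # arbitrary origin 70, scale 5
r_original = pearson_r(price, demand)
r_coded = pearson_r(u, v)

# Property 3: y is completely determined by x, yet r is zero.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]
r_dependent = pearson_r(x, y)

print(round(r_original, 4), round(r_coded, 4), r_dependent)
```

The second case shows why r only measures linear relationship: y depends entirely on x, but the relationship is not linear, so r comes out as zero.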
Spearman’s Rank Correlation
Sometimes we come across statistical series in which the variables under consideration
are not capable of quantitative measurement but can be arranged in serial order. This
happens when we are dealing with qualitative characteristics (attributes) such as honesty,
beauty, character, morality etc.
In such cases Karl Pearson’s Co-efficient of correlation cannot be used as such:
variables can be measured, whereas attributes are difficult to quantify and can only be
ranked. Spearman’s rank correlation co-efficient is more appropriate here and
it is calculated by using the following formula
It is denoted by $\rho$ (Rho)
$$ \rho = 1 - \frac{6 \sum D^2}{N^3 - N} $$

D = Difference between ranks i.e. D = R1-R2
N = Number of observations under x and y
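A minimal sketch of the calculation, assuming no repeated values and giving rank 1 to the largest observation (ranking from the smallest gives the same result). The function names are illustrative only; the data are those of problem C-9 below.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes no repeated values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(D^2) / (N^3 - N), with D = R1 - R2."""
    n = len(xs)
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# Data of problem C-9
x = [12, 18, 32, 45, 21, 30]
y = [70, 68, 75, 95, 86, 12]

print(ranks_descending(x))   # ranks R1
print(ranks_descending(y))   # ranks R2
rho = spearman_rho(x, y)
print(round(rho, 4))
```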

Problems:
C-9: Find Spearman’s Rank correlation for the following data
X 12 18 32 45 21 30
Y 70 68 75 95 86 12

C-10: From the following data calculate co-efficient of Rank correlation
Ranks X 1 2 3 4 5 6 7 8 9 10 11 12
Y 12 9 6 10 3 5 4 7 8 2 11 1


C-11: Calculate the Spearman’s Co-efficient of rank for the following data
X 78 89 97 69 49 79 68 57
Y 125 137 156 112 107 136 123 108
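Note: If X or Y contains repeated observations (tied ranks), Spearman’s rank correlation
co-efficient is calculated as
$$ \rho = 1 - \frac{6 \left( \sum D^2 + CF \right)}{N^3 - N} $$

Where CF is the correction factor for the repeated observations and is given by
$$ CF = \frac{m^3 - m}{12} $$
“m” is the number of times an observation is repeated. Each repeated observation is given
the average of the ranks the tied items would otherwise occupy, and one such term is added
to CF for every group of repeated values in X and in Y.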
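The tie-corrected formula can be sketched as follows. This is an illustration, not a prescribed procedure: the function names are made up here, rank 1 is given to the largest value, and the data are those of problem C-12 below.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest value; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def correction_factor(values):
    """Sum of (m^3 - m) / 12 over every group of m repeated values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_rho_with_ties(xs, ys):
    n = len(xs)
    sum_d2 = sum((r1 - r2) ** 2
                 for r1, r2 in zip(average_ranks(xs), average_ranks(ys)))
    cf = correction_factor(xs) + correction_factor(ys)
    return 1 - 6 * (sum_d2 + cf) / (n ** 3 - n)

# Data of problem C-12
x = [78, 89, 89, 69, 59, 79, 68, 57]
y = [125, 137, 156, 112, 112, 112, 123, 108]
rho_tied = spearman_rho_with_ties(x, y)
print(average_ranks(x), average_ranks(y), round(rho_tied, 4))
```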


Problems:
C-12: Find the spearman’s rank correlation co-efficient
X 78 89 89 69 59 79 68 57
Y 125 137 156 112 112 112 123 108

C-13: Find Spearman’s Rank Correlation Co-efficient
X 12 18 32 18 25 24 25 40 38 22
Y 16 15 28 16 24 22 28 36 34 19

C-14: Find the Co-efficient of rank correlation between X and Y
X 30 38 28 27 28 23 30 33 28 35
Y 29 27 22 29 20 29 18 21 27 22

C-15: Calculate Co-efficient of rank correlation
X 15 20 28 12 40 60 20 80
Y 40 30 50 30 20 10 30 60

Regression Analysis:
Regression is the technique of estimating the value of one variable from the value of another
variable when the two variables are correlated.
I.e. if “x” and “y” are correlated, then estimating the values of “x” based on the values of “y”,
or estimating the values of “y” based on the values of “x”, is called Regression Analysis.
To estimate “x” values based on “y”, we use the regression equation of “x” on “y”, which is
given by
$$ (x - \bar{x}) = b_{xy} \, (y - \bar{y}) $$
Where bxy is the regression coefficient of x on y
$\bar{x}$, $\bar{y}$ are the means of x and y respectively
Similarly, to estimate “y” based on “x”, we use the regression equation of “y” on “x”
$$ (y - \bar{y}) = b_{yx} \, (x - \bar{x}) $$
Where byx is the regression coefficient of y on x
$\bar{x}$, $\bar{y}$ are the means of x and y respectively
And
$$ b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y} \quad \text{or} \quad b_{xy} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum y^2 - (\sum y)^2} $$
$$ b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} \quad \text{or} \quad b_{yx} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2} $$
R-1: Find two regression Equation for the following data
x 62 72 98 76 81 56 76 92 88 49
y 112 124 131 117 132 96 120 136 97 85

R-2: The following data relate to the experience (x) of 8 operators and their performance rating (y).
Calculate the regression line of performance rating on experience and estimate the
probable performance rating of an operator who has 15 years of experience.
R-3: Fit a least square line of the following
X 1 3 4 8 9 11 14
Y 1 2 4 5 7 8 9
a) Obtain Co-efficient of X on Y and Y on X
b) Find the co-efficient of correlation between X and Y
c) Find Y and X, when X=10 and Y=6 respectively
Note:
1. Co-efficient of correlation $r = \pm\sqrt{b_{xy} \cdot b_{yx}}$
2. bxy and byx never have opposite signs
3. r will be positive if bxy and byx are positive
4. r will be negative if bxy and byx are negative
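The regression coefficients, the two regression lines and the relation between r and the coefficients can be sketched on the R-3 data above. The function and variable names are illustrative assumptions, not part of the manual.

```python
from math import copysign, sqrt

def regression_coefficients(xs, ys):
    """Return (byx, bxy) computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)   # y on x
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)   # x on y
    return b_yx, b_xy

# Data of problem R-3
x = [1, 3, 4, 8, 9, 11, 14]
y = [1, 2, 4, 5, 7, 8, 9]
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
b_yx, b_xy = regression_coefficients(x, y)

def estimate_y(x_value):
    """Regression line of y on x: (y - mean_y) = byx (x - mean_x)."""
    return mean_y + b_yx * (x_value - mean_x)

def estimate_x(y_value):
    """Regression line of x on y: (x - mean_x) = bxy (y - mean_y)."""
    return mean_x + b_xy * (y_value - mean_y)

# r takes the common sign of the two regression coefficients.
r = copysign(sqrt(b_yx * b_xy), b_yx)

print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
print(round(estimate_y(10), 3), round(estimate_x(6), 4))
```

Note that the two lines are different equations: the line of y on x is used to estimate y, and the line of x on y to estimate x.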
R-4: The heights of fathers and sons are given in the following table. Find the two lines of
regression and estimate the expected average height of the son when the height of the
father is 67.5 inches
Height of father 65 66 67 67 68 69 71 73
Height of sons 67 68 64 68 72 70 69 70

R-5: The marks of 8th Standard students in mathematics and statistics are as follows.
Find the regression line of marks in statistics on marks in mathematics; also estimate the
marks of a 9th student in statistics if he has scored 90 in mathematics.
(X) Maths Scored 50 40 60 46 50 48 59 47
(Y) Statistics Scored 30 37 42 32 35 45 40 35

R-6: You are given the following information, with r = 0.66. Find
- The two regression equations
- The regression co-efficients
- The estimate of x when y = 100
      X    Y
AM    36   85
SD    11   8

R-7: Following is the information about advertisement expenditure and Sales
      Advertising Expenditure   Sales
AM    20                        120
SD    5                         25
r = 0.8
a) Calculate the two regression equations
b) Find the likely sales when advertisement expenses are 25 crores
c) What should be the advertisement budget when the sales target is 150 crores
R-8: Two regression equations are 2y-x-50 = 0 and 3y-2x-10 = 0
Find (i) r (ii) $\bar{x}$ and $\bar{y}$ (the point of intersection)
Questions from Previous Question Papers:

1. The following data give the test scores and sales made by nine sales men during
certain period:
Test Scores: 14 19 24 21 26 22 15 20 19
Sales (’00 Rs) 31 36 48 37 50 45 33 41 39
Find the regression equation and also estimate the most probable sales volume of a
salesman making a score of 28.

2. From the following data calculate the rank correlation coefficient after making
adjustment for tied ranks
X 48 33 40 9 16 16 65 24 16 57
Y 13 13 24 6 15 4 20 9 6 19

3. The Following data relate to age of employees and the number of days they reported
sick in a month. Calculate Karl Pearson’s co-efficient of correlation and interpret it.
Age (Years) 30 32 35 40 48 50 52 55 57 61
Sick Days 1 0 2 5 2 4 6 5 7 8


4. The following data give the experience of machine operators and their performance
ratings.
Operator               1  2  3  4  5  6  7  8
Experience (in Years)  16 12 18 4  3  10 5  12
Performance Ratings    87 88 89 68 78 80 75 83
Calculate the regression line of performance rating on experience and estimate the
probable performance rating if an operator has 7 years of experience

5. A company wants to assess the impact of R and D expenditure (in Rs. 1000/-)
on its annual profit (in Rs. 1000/-). The following table presents the
information for the last 8 years.
Year   R and D Expenditure   Annual Profit
2010   9                     45
2011   7                     42
2012   7                     40
2013   1                     60
2014   4                     30
2015   5                     34
2016   3                     25
2017   3                     20
Estimate the regression equations and predict the annual profit for the year
2020 for an allocated sum of Rs. 100,000/- as R and D expenditure.

6. The following table shows the ages (x) and blood pressure (y) of 8 persons.
X 52 63 45 36 72 65 47 25
y 62 53 51 25 19 43 60 33
Obtain the regression equation of y on x and find the expected blood pressure of a
person who is 49 years old.
7. The following data relate to the ages of husbands and wives:
Age of Husbands (Years) 25 28 30 32 35 36 38 39 42 55
Age of Wives (Years) 20 26 29 30 25 18 26 35 35 46
Find the regression equations and also find the most likely age of the husband when the wife’s age is 25
years
8. Calculate Karl Pearson’s coefficient of correlation between expenditure on
advertising and sales from the data given below:
Advertisement expenditure (Rs. ’000)  39 78 65 62 90 82 75 25 98 36
Sales (Rs. Lacs)                      47 84 53 58 86 62 68 60 91 51
Comment on the results


9. Consider the following data, obtain the regression equations:
X 6 2 10 4 8
Y 9 11 5 8 7

10. A research company summarized advertising expenditure and sales results
as follows:
        Adv. Exp. (Rs. in Crore)   Sales (Rs. in Crore)
Mean    20                         200
SD      18                         17


11. The following table gives the age of cars of a certain make and annual
maintenance costs. Estimate the maintenance cost of a 7 year old car.
Age of Cars (in Years)                  2  4  6  8
Maintenance cost (in Hundreds of Rs.)   10 20 25 30

12. A financial analyst wanted to find out whether inventory turnover influences
a company’s earnings per share (in %). A random sample of 7 companies
listed on the stock exchange was selected and the following data were recorded.
Company A B C D E F G
Inventory Turnover 4 5 7 8 6 3 5
Earning Per Share (%) 11 9 13 7 13 8 8



Unit – 3
Probability Distribution


Unit 3: (8 Hours)
Probability Distribution:
- Concept and definition – Rules of probability – Random variables – Concept of
probability distribution
- Theoretical probability distributions: Binomial, Poisson, Normal and Exponential –
Bayes’ theorem (no derivation) (Problems only on Binomial, Poisson and Normal)
Introduction:
The result or outcome of an experiment which is performed repeatedly under essentially
homogeneous and similar conditions falls into one of the following categories
- Unique or certain
o Here the results are predictable with certainty and they are known as
deterministic or predictable phenomena
Example: Boyle’s Law: Pressure × Volume = Constant
o Most of the physical and chemical sciences are deterministic in nature

- Uncertain or unpredictable
o Here the results cannot be predicted with certainty and they are known
as unpredictable or probabilistic phenomena.
Example: A sales manager’s sales target, life of an electric bulb
o These are frequently observed in Economics, Business and Social Sciences
A numerical measure of uncertainty is provided by a very important branch of
statistics called the “Theory of Probability”.
Here, using mathematics and statistics, we try to present the conditions under which we can
make sensible numerical statements about uncertainty and apply certain methods of
calculating numerical values of probabilities and expectations.

“Statistics is the science of decision making with calculated risks in the face of
uncertainty”

Many business decisions depend on variables which are not under the decision maker’s control,
and hence decisions often yield poor results.
Using mathematical models we can make better decisions, but only in a limited number of cases.
In the remaining circumstances we make use of probability, i.e. calculations to make predictions about
uncertain conditions or situations.
Probability is a measure of the chance associated with the occurrence of an event (which is
essential to making good business decisions)
Example: Demand for a newly launched product.


Important terminologies in Probability:
1. Experiments

The term experiment refers to an act which can be repeated under the same
given conditions.

Random experiment: An experiment is called a random experiment if, when
conducted repeatedly under essentially homogeneous conditions, the result is not
unique or certain but may be any one of the various possible
outcomes.
Or
An experiment having random outcomes
Or
An experiment whose results depend on chance
Example: Tossing a coin, rolling a die

2. Trial
Performing a random experiment is called a trial
Example: If a coin is tossed two times, that means two
trials

3. Event:
Outcome or combination of outcomes of an experiment are termed as events
Example: Tossing a coin – You may get H or T – These are events

4. Mutually Exclusive Events:

Two events are said to be mutually exclusive or incompatible when both cannot
happen simultaneously in a single trial, or in other words, when the occurrence of any
one of them rules out the occurrence of the other.

In other words, “if the happening of one event prevents the happening of the other
events, such events are called mutually exclusive events”

Example: Tossing a coin leads to two events, Head (H) or Tail (T)
If head turns up in tossing a coin, then head prevents tail from turning up
and vice-versa



5. Independent and Dependent Events:

Two or more events are said to be independent when the outcome of one doesn’t
affect, and is not affected by, the other.
Example: When a coin is tossed twice, getting a head in the first trial will
not affect the outcome of the next trial

Events are said to be dependent when the occurrence or non-occurrence of one event
in any one trial affects the probability of the other event in other trials
Example: Drawing a card without replacement.

6. Equally likely events:
Events are said to be equally likely when one doesn’t occur more often than the
others. This means none of them is expected to occur in preference of other.

In other words – equal chance of occurrence and importance for all the events to
occur

Example: When you roll a dice, occurrence of all the 6 faces i.e. 1, 2, 3, 4, 5, 6 are
equally likely

7. Simple and Compound Events:
In case of simple events we consider the probability of the happening or not
happening of single events

Compound events, we consider the joint occurrence of two or more events




8. Exhaustive Events:
Events are said to be exhaustive when their totality includes all the possible
outcomes of a random experiment.

In other words, the sum of their individual chances of occurrence is equal to 1

Example 1: When a die is rolled once, the possible outcomes are 1, 2, 3, 4, 5 and 6,
hence the exhaustive number of cases is 6
Example 2: If we roll two dice once, the exhaustive number of cases is 6² = 36
Similarly, rolling three dice leads to 6³ = 216 outcomes, and the sum of the
probabilities of occurrence of all these outcomes is 1


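[Figure: illustration of simple and compound events using a bag of 10 Red balls and 6 White balls.
Simple event: probability of drawing a red ball.
Compound event: probability of drawing 3 white balls in the first draw and 3 black balls in the second draw.]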



9. Complementary events:
Let there be two events A and B, A is called the complementary event of B (and
Vice versa), if A and B are mutually exclusive and exhaustive.
Example: When the dice is thrown, the occurrence of an even number and
odd number are complementary events.
Simultaneous occurrence of two events A and B is generally written as AB

Definition of Mathematical Probability

Let there be a random experiment with “N” outcomes which are mutually exclusive,
exhaustive and equally likely

Let there be an event “A”, and let “M” of the outcomes be favourable to the event “A”.
Then the probability of occurrence of “A” can be written as follows

$$ P(A) = \frac{M}{N} = \frac{\text{Outcomes favourable to “A”}}{\text{Total number of outcomes}} $$

Example: Rolling of a die once, S = {1, 2, 3, 4, 5, 6}

The total number of outcomes N = 6

- Probability of getting odd numbers
$$ P(\text{Odd Numbers}) = \frac{M}{N} = \frac{3}{6} = \frac{1}{2} = 50\% $$
That means the probability of getting an odd number is 50%

- Probability of getting even numbers
$$ P(\text{Even Numbers}) = \frac{M}{N} = \frac{3}{6} = \frac{1}{2} = 50\% $$
That means the probability of getting an even number is 50%
The event that “A” does not happen is called the complementary event of “A”,
denoted as $\bar{A}$ or $A^{c}$ or $A'$

$$ \text{i.e. } P(\bar{A}) = \frac{N - M}{N} = \frac{\text{Total outcomes} - \text{Favourable outcomes}}{\text{Total outcomes}} $$

$$ P(\bar{A}) = \frac{N}{N} - \frac{M}{N} = 1 - \frac{M}{N} = 1 - P(A) $$


$P(\bar{A}) + P(A) = 1$, i.e. P(Failure) + P(Success) = 1
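The definition P(A) = M/N and the complement rule can be sketched by simply counting outcomes. The helper name `probability` is an illustrative assumption; exact fractions are used so that the results match the hand calculation for the roll of a die.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(A) = M / N for equally likely outcomes.

    `event` is a function that returns True for outcomes favourable to A.
    """
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

die = [1, 2, 3, 4, 5, 6]   # sample space S for one roll of a die

p_odd = probability(lambda outcome: outcome % 2 == 1, die)
p_even = probability(lambda outcome: outcome % 2 == 0, die)
p_not_odd = 1 - p_odd      # complement rule: P(not A) = 1 - P(A)

print(p_odd, p_even, p_not_odd)
```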
Theorems of Probability or Rules of Probability
The two important theorems of probability
1. The addition theorem
2. The multiplication theorem

1. The addition Theorem of Probability:

- This is also called “Or” probability: P(A occurring) or P(B occurring)
- This is also denoted as follows
o P(A) or P(B)
o P(A) ∪ P(B)
o P(A or B)
o P(A ∪ B)
- Here “Or” means “add”

- The addition theorem states that if two events A and B are mutually exclusive, the
probability of the occurrence of either A or B is the sum of the individual
probabilities of A and B
So symbolically P(A or B) = P(A) + P(B)

- When the events are mutually exclusive, i.e. when the events are disjoint,
the probabilities of the individual events can simply be added

Mutually Exclusive events

P(A or B) = P(A ∪ B) = P(A) + P(B)

Similarly, for three or more mutually exclusive events, P(A or B or C) = P(A) + P(B) + P(C)

[Figure: Venn diagram of two non-overlapping events, Event A and Event B; A ∪ B denotes “A Union B”]




- When events are “not mutually exclusive”, there is a possibility of occurrence
of both the events, i.e. there is an overlap. The probability of the overlap is then
counted twice in P(A) + P(B), so it is subtracted once and the addition rule becomes

[Figure: Venn diagram of two overlapping events; the overlap is A ∩ B]
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
Or
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A and B)
Or
P(A or B) = P(A U B) = P(A) + P(B) – P(AB)
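Both forms of the addition theorem can be verified by counting outcomes for one roll of a die. The events chosen here (odd, even, greater than 3) and the helper `p` are illustrative assumptions.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a die

def p(event):
    """Probability of an event given as a set of equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

odd = {1, 3, 5}
even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

# Mutually exclusive events: no overlap, so the probabilities simply add.
p_odd_or_even = p(odd | even)
sum_exclusive = p(odd) + p(even)

# Not mutually exclusive: 4 and 6 are both even and greater than 3,
# so the overlap is subtracted once.
p_even_or_gt3 = p(even | greater_than_3)
sum_with_overlap = p(even) + p(greater_than_3) - p(even & greater_than_3)

print(p_odd_or_even, sum_exclusive)
print(p_even_or_gt3, sum_with_overlap)
```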

2. The Multiplication Theorem of Probability:

- This is also called “And” probability
- Denoted as P(A) and P(B) for events A and B
- Also denoted as
o P(A & B)
o P(A ∩ B)
o P(A) ∩ P(B)
- This theorem states that if two events A and B are independent, the probability
that they will both occur is equal to the product of their individual probabilities
If A and B are independent, i.e. event “A” does not affect event “B”
P(A ∩ B) = P(A) . P(B) or
P(A and B) = P(A) × P(B)


Similarly for three independent events
P(A ∩ B ∩ C) = P(A) × P(B) × P(C) or
P(A, B and C) = P(A) . P(B) . P(C)
- If two events A and B are dependent, i.e. the probability of B is assessed when A is
known to have occurred

Then

P(A ∩ B) = P(A) × P(B/A) ; P(A) ≠ 0

P(B ∩ A) = P(B) × P(A/B) ; P(B) ≠ 0

Or
$$ P(B/A) = \frac{P(AB)}{P(A)} = \frac{P(A \text{ and } B)}{P(A)} $$

$$ P(A/B) = \frac{P(AB)}{P(B)} = \frac{P(A \text{ and } B)}{P(B)} $$

The above probabilities P(B/A) and P(A/B) are called conditional probabilities
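The multiplication theorem for independent and for dependent events can be checked by listing the 36 outcomes of rolling two dice. The particular events used below are illustrative choices, not taken from the manual.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die)
sample_space = set(product(range(1, 7), repeat=2))

def p(event):
    return Fraction(len(event), len(sample_space))

# Independent events: the first die does not affect the second die.
first_even = {o for o in sample_space if o[0] % 2 == 0}
second_six = {o for o in sample_space if o[1] == 6}
p_joint_independent = p(first_even & second_six)
product_rule = p(first_even) * p(second_six)

# Dependent events: the sum depends on what the first die shows.
first_six = {o for o in sample_space if o[0] == 6}
sum_at_least_10 = {o for o in sample_space if sum(o) >= 10}
p_joint_dependent = p(first_six & sum_at_least_10)
p_b_given_a = p_joint_dependent / p(first_six)       # P(B/A) = P(AB) / P(A)

print(p_joint_independent, product_rule)
print(p_b_given_a, p(sum_at_least_10))
```

For the dependent pair, P(B/A) differs from P(B), so P(A ∩ B) has to be computed as P(A) × P(B/A) rather than P(A) × P(B).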

Problems:

1. Two unbiased coins are tossed once. Find the probability of getting
I) At least one head
II) At most one head
III) Two tails

2. An unbiased dice is rolled once. Find the probability of getting
I) Odd Numbers
II) Even Numbers

3. A pair of dice is rolled. Find the probability of getting the sum on the faces turning
up to be
I) 7
II) At-least 10
III) 12
IV) Neither 7 nor 10

4. A Bag contains 6 White, 4 Red and 8 Black marbles. 3 Marbles drawn randomly,
what is the probability that they are of
I) Same Colour
II) Different Colours


5. A Bag contain 7 White and 9 Black Marbles, 2 marbles are drawn randomly. What is
the probability that
I) They are of same colours?
II) They are of different colours?

6. A bag contains 5 Red, 3 white and 6 Green sticks. 3 sticks are drawn randomly. Find
the probability that
I) All are green II) 2 Red and 1 Green Sticks III) 3 White Sticks

7. 4 Cards are drawn from a pack of cards. Find the probability that 2 are Spades and 2
are Hearts.

8. From a pack of 52 cards, 4 are accidentally drawn. Find the chance that
I) They will consist of a Jack, a Queen, a King and an Ace
II) They are one from each suit
III) 2 of them are Red and 2 of them are Black

9. In a College, 60% of the students play football and 50% of them play basketball. If a
student is selected randomly from the college, what is the probability that
I) He plays basketball or Football
II) Students play neither sports

10. A person is known to hit the target in 3 out of 4 shots, whereas another person is
known to hit the target in 2 out of 3 shots. Find the probability that the target is hit.

11. A salesman is known to sell the product in 3 out of 5 attempts, another salesman in 2
out of 5 attempts. Find the probability that
I) No sale is effected when they both try to sell the product, i.e. both
are unable to sell
II) Either of them succeeds in selling the product

12. The odds against student “X” solving a B/S problem are 8:6 and the odds in favour of
student “Y” solving the B/S problem are 14:16
I) What is the chance that the problem will be solved if they both try
independently of each other?
II) What is the probability that neither solves the problem?

13. If the P(A) is 0.3, P(B) is 0.2 and P(C) = 0.1 and A B C are independent events. Find
the probability of occurrence of at-least one of the 3 events A, B and C

Bayes’ Theorem:
This theorem allows us to use new information to update the conditional probability of an
event. Bayes’ theorem in its simple form is given by

$$ P(A/B) = \frac{P(A \cap B)}{P(B)} $$
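A minimal sketch of how the simple form is used to update a probability. It relies on two steps that follow from the multiplication theorem rather than from the text above: P(A ∩ B) = P(A) × P(B/A), and P(B) obtained by adding P(A ∩ B) over all mutually exclusive and exhaustive causes A. The two machines, their shares of output and their defect rates are hypothetical figures chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical data: share of output from each machine, P(A)
prior = {"Machine 1": Fraction(3, 5), "Machine 2": Fraction(2, 5)}
# Hypothetical data: chance that an item is defective given the machine, P(B/A)
defect_rate = {"Machine 1": Fraction(1, 50), "Machine 2": Fraction(1, 20)}

# P(A and B) = P(A) x P(B/A) for each machine
joint = {machine: prior[machine] * defect_rate[machine] for machine in prior}

# P(B): total probability that an item is defective
p_defective = sum(joint.values())

# P(A/B) = P(A and B) / P(B): probability the defective item came from each machine
posterior = {machine: joint[machine] / p_defective for machine in joint}

for machine, value in posterior.items():
    print(machine, value)
```

Before seeing the defect, Machine 1 accounts for 3/5 of the items; after learning that the item is defective, its probability is revised downwards because its defect rate is lower.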


Random Variable
A random variable is a way of mapping the outcomes of a random process to numbers, i.e. a
way of quantifying the outcomes of a random experiment. If you have a random process like
flipping a coin, rolling a die or measuring the rain that might fall tomorrow, a random variable
assigns a number to each outcome of that process.
- A random variable is a function which takes real values that are determined by the
outcomes of the random experiment
- Random variables are denoted by the capital letters X, Y, Z
- The actual value which a random variable assumes for a particular outcome is not itself a random variable.
- The random variable is used for notation and for carrying out further mathematical operations on the
outcomes.

Example: A random experiment where three coins are tossed simultaneously; then the
outcomes are
S = {(H, T) and (H, T) and (H, T)}, which can also be denoted as follows
S = {(H, T) × (H, T) × (H, T)}
The total outcomes are as follows
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let us consider a variable “X” to quantify the outcomes of the above experiment. If “X” is
the number of heads obtained, then “X” takes any one of the values {0, 1, 2, 3}
Outcomes:    HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
Values of X: 3    2    2    1    2    1    1    0

Hence the random variable is a function which takes real values which are determined by the
outcomes of the random experiment.
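The three-coin example can be reproduced by enumerating the sample space and applying the mapping “number of heads” to every outcome. The variable names are illustrative; the last step also counts how often each value of X occurs among the eight equally likely outcomes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S = {H, T} x {H, T} x {H, T}
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# The random variable X maps each outcome to the number of heads in it.
X = {outcome: outcome.count("H") for outcome in sample_space}

# Share of the 8 equally likely outcomes that gives each value of X
counts = Counter(X.values())
distribution = {value: Fraction(count, len(sample_space))
                for value, count in sorted(counts.items())}

print(sample_space)
print([X[outcome] for outcome in sample_space])
print(distribution)
```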
Discrete and Continuous Random Variable:
If “X” assumes only a finite or countably infinite set of values, it is known as a Discrete
Random Variable
Example: Number of students in a college, marks obtained by the students in a test, number
of defective mangoes in a basket.
If “X” assumes an infinite and uncountable set of values, it is said to be a Continuous Random
Variable. Here we usually talk of the values in a particular interval and not at a point.
Example: Height or weight of students in a classroom
Generally a Discrete Random Variable represents counted data while a Continuous Random
Variable represents measured data.



Probability Distribution of a Discrete Random Variable
Let us consider a Discrete Random Variable “X” which can take the possible values x1, x2,
x3, ……., xn. With each value of the variable X, we associate a number
pi = P(X = xi); i = 1, 2, 3, ……., n
$$ \text{where } p_i = P(X = x_i) \ge 0 \text{ and } \sum p_i = p_1 + p_2 + p_3 + \dots + p_n = 1 $$
The function pi = P(X = xi), or p(x), is called the Probability Mass Function of the random
variable X, and the set of all possible ordered pairs {x, p(x)} is called the Probability
Distribution of the random variable X
The concept of probability distribution is analogous to that of frequency distribution. Just as a
frequency distribution tells us how the total frequency is distributed among different values (or
classes) of the variable, a probability distribution tells us how the total probability of 1 is
distributed among the various values which the random variable can take. It is usually represented
in the tabular form given below.
Probability Distribution of Random Variable X
x     p(x)
x1    p1
x2    p2
x3    p3
.     .
.     .
xn    pn

Probability Distribution of a Continuous Random Variable
This is represented in the form of a frequency polygon drawn by referring to the
grouped frequency distribution of a continuous variable. A frequency polygon gets
smoother and smoother as the sample size gets larger and the class intervals become more
numerous and narrower. Ultimately the frequency polygon becomes a smooth curve called the
density curve. The function that defines the curve is called the Probability Density
Function.
Concept of Probability Distribution
The probability distribution of a random variable may be
- A theoretical listing of outcomes and probabilities which can be obtained from a mathematical
model representing some phenomenon or process of interest
- A subjective listing of outcomes together with their subjective or contrived probabilities,
representing the degree of conviction of the decision maker as to the likelihood of the
possible outcomes.
- An empirical listing of outcomes and their observed relative frequencies
Here we are focusing on the theoretical listing of the outcomes and their probabilities.


Mathematical Expectations of Random Variable:
The expected value of “X”, denoted by E(X), is defined as
$$ E(X) = \sum \left[ x \times p(x) \right] $$
$$ E(X) = \bar{x} $$
Hence the mathematical expectation of a random variable is nothing but its
Arithmetic mean
Variance, Standard Deviation and Mean
$$ \text{Mean} = E(X) = \sum x \times p(x) $$
$$ \text{Variance} = \sigma_x^2 = E(X^2) - \left[ E(X) \right]^2 = \sum x^2 \times p(x) - \left[ \sum x \, p(x) \right]^2 $$
$$ \text{Standard Deviation: } \sigma_x = \sqrt{\sum x^2 \times p(x) - \left[ \sum x \, p(x) \right]^2} $$
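These three formulas can be sketched directly in code. The function name is an illustrative assumption; the distribution used is the one given in problem 7 below.

```python
from math import sqrt

def mean_variance_sd(values, probabilities):
    """Mean = sum(x p(x)); Variance = sum(x^2 p(x)) - [sum(x p(x))]^2."""
    mean = sum(x * p for x, p in zip(values, probabilities))
    mean_of_squares = sum(x * x * p for x, p in zip(values, probabilities))
    variance = mean_of_squares - mean ** 2
    return mean, variance, sqrt(variance)

# Probability distribution of problem 7
x = [4, 5, 6, 7]
p = [0.1, 0.3, 0.4, 0.2]

mean, variance, sd = mean_variance_sd(x, p)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```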


Problems:
1. A die is tossed twice. Getting “an odd number” is termed as success. Find the probability
distribution of the number of successes.
2. Two cards are drawn from a well shuffled deck of 52 cards
A. Successively with replacement
B. Simultaneously (successively without replacement)
Find the probability distribution of the number of aces in each case.
3. Obtain the probability distribution of X, the number of heads in three tosses of a coin (Or
simultaneous toss of three coins)
4. Two dice are rolled at random. Obtain the probability distribution of the sum of the
numbers on them.
5. Four bad Apples are mixed accidentally with 20 good apples. Obtain the probability
distribution of the number of bad apples in a draw of 2 apples at random.
6. A die is thrown at random. Find the expectation of the number on it.

7. A random variable “X” has the following probability distribution. Find Mean and
Variance
x 4 5 6 7
P(x) 0.1 0.3 0.4 0.2

8. A r.v. X has the following probability function
X -2 -1 0 1 2 3
P(X) 0.1 k 0.2 2k 0.3 k
Find k, the mean and the s.d. of X


Theoretical Probability Distribution
Theoretical probability distributions are functions of a random variable which
generate the probabilities for given values of that random variable. In other words, probability
distributions are ready-to-use formulae (functions) for calculating the probabilities of a known
variable.
Amongst theoretical or expected frequency distributions the following are popular
1. Binomial Distribution
2. Poisson Distribution
3. Normal Distribution
Binomial Probability Distribution:
- It is also known as the “Bernoulli Distribution”: a probability distribution expressing the
probability of one set of dichotomous alternatives, i.e. success or failure.
- Conditions or assumptions of Binomial Distribution
o n, the number of trials, is finite
o Each trial results in two mutually exclusive and exhaustive outcomes, termed as
success and failure
o Trials are independent
o p, the probability of success, is constant for each trial; then q = 1-p is the
probability of failure in any trial
- Bernoulli trial: A trial having only two outcomes
Example: Tossing a coin: H or T
Outcome of the game: Win or Lose
Business outcome: Success or failure
Let “x” be a binomial random variable with “n” trials and P(Success) = p; then the
probability of “x” successes is given by
$$ P(x) = {}^{n}C_{x} \cdot p^{x} \cdot q^{n-x} $$
Where x = Number of successes in “n” trials
n = Number of trials
p = Probability of success in a single trial
q = (1-p) = Probability of failure in a single trial
Here “n” and “p” are the parameters of the Binomial Distribution. They are the quantities in
the above formula that must be known: once the values of “n” and “p” are known, the
required probabilities can be found
Constants of Binomial Distribution
- Mean = np
- Variance = npq
- Standard Deviation = $\sqrt{npq}$
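The binomial formula and its constants can be sketched as below. The function name is an illustrative assumption; the figures are those of problem 14 (a fair coin tossed 5 times), and `math.comb` requires Python 3.8 or later.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# Problem 14: a fair coin is tossed 5 times, success = head
n, p = 5, 0.5

p_exactly_3 = binomial_pmf(3, n, p)
p_no_heads = binomial_pmf(0, n, p)
p_at_least_1 = 1 - p_no_heads                      # complement of "no heads"
p_at_most_3 = sum(binomial_pmf(x, n, p) for x in range(0, 4))

mean = n * p
variance = n * p * (1 - p)
sd = sqrt(variance)

print(p_exactly_3, p_at_least_1, p_no_heads, p_at_most_3)
print(mean, variance, round(sd, 4))
```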
Problems on Binomial Distribution
14. A fair coin is tossed 5 times, what is the probability of getting
I) Exactly three head
II) At-least 1 head
III) No Heads
IV) At most 3 heads

15. A salesman makes a sale of 4 out of 10 (40% success) customers he contacts. If four
customers are contacted today, what is the probability that he makes sales exactly
two?

16. 20% of the bolts manufactured by machine are defective. Find the probability that
there are

I) No defective
II) At most two defective
III) At-least 1 defective
IV) Exactly one defective bolt when 5 bolts are chosen randomly

17. The probability of a man hitting the target is 1/5. What is the probability that he hits the target
I) At-least once
II) At-least trice
If he aims 7 times at the target

18. In a hundred sets of 10 tosses of an unbiased coin, in how many sets should we expect
to get
I) 7 heads and 3 Tails
II) At-least 7 heads
Fitting up of distribution (Framing a probability Distribution)
- The process of obtaining expected frequency based on the theoretical probability
distribution is called fitting up of distribution.
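A sketch of fitting a binomial distribution, under the usual assumption (not spelled out above) that n is the largest value of X and that p is estimated from the observed mean using mean = np. The expected frequency for each x is then the total frequency multiplied by P(x). The data are those of problem 19 below; the variable names are illustrative.

```python
from math import comb

# Data of problem 19
x_values = [0, 1, 2, 3, 4, 5]
observed = [2, 10, 24, 38, 18, 8]

n = max(x_values)                       # number of trials (assumed)
total = sum(observed)                   # total frequency N
mean = sum(x * f for x, f in zip(x_values, observed)) / total
p = mean / n                            # from mean = np
q = 1 - p

# Expected frequency = N * nCx * p^x * q^(n - x)
expected = [total * comb(n, x) * p ** x * q ** (n - x) for x in x_values]

print(round(mean, 2), round(p, 3))
for x, f, e in zip(x_values, observed, expected):
    print(x, f, round(e, 1))
```

In practice the expected frequencies are rounded to whole numbers so that they add up to the total frequency.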

19. Fit a Binomial distribution for the following data
X 0 1 2 3 4 5
f 2 10 24 38 18 8

20. Fit Binomial distribution for the following data
X 0 1 2 3 4 5
f 2 10 48 114 72 40

21. Fit a binomial distribution for the following data
X 0 1 2 3 4 5 6 7
f 7 6 19 35 30 23 7 1



22. The mean and variance of a Binomial distribution are 12 and 5 respectively. Find the
parameters of the Binomial distribution.

23. The mean and standard deviation of a Binomial distribution are 4 and √3 respectively.
Find n, p and q.

24. Find the probability of success for a Binomial distribution if n = 6 and 4P(X=4) =
P(X=2)
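
Problems that give the mean and variance can be solved by dividing one constant by the other. A minimal sketch, using the values of Problem 23 and the relations mean = np and variance = npq stated earlier:

```python
mean = 4
variance = 3            # square of the standard deviation √3

q = variance / mean     # npq / np = q
p = 1 - q
n = mean / p            # from mean = n * p

print(n, p, q)
```

The same three steps apply to any pair of mean and variance, provided the variance is smaller than the mean.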
Poisson Probability Distribution
Poisson distribution may be expected in cases where the chance of any individual event being a
success is small. The distribution is used to describe the behaviour of rare events such as the
number of accidents on a road or the number of printing mistakes in a book.
It has been called “the Law of Improbable Events”.
Let “x” be a discrete random variable with mean λ, associated with a rare event. “x” is said
to follow the Poisson probability distribution if its probability mass function (PMF) is given by
$P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$

Where x = 0, 1, 2, 3, 4, ……, ∞
λ = Parameter of the Poisson distribution
Here λ is the mean of “x”, i.e. np, the average number of occurrences of the event.
The Poisson distribution is a discrete distribution with a single parameter λ. As λ increases,
the distribution shifts to the right.
All Poisson probability distributions are skewed to the right. This is the reason why the Poisson
distribution has been called the probability distribution of rare events.
Constants of the Poisson distribution
- The mean of Poisson distribution = λ
- The standard deviation = $\sqrt{\lambda}$
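The PMF can be evaluated directly. The sketch below assumes an illustrative mean of λ = 2 occurrences per interval; the function name `poisson_pmf` is introduced only for this example.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2  # illustrative mean: 2 occurrences per interval

p_none = poisson_pmf(0, lam)
p_at_least_one = 1 - p_none          # complement of "no occurrence"
p_exactly_three = poisson_pmf(3, lam)

std_dev = sqrt(lam)                  # standard deviation of the distribution

print(round(p_none, 4), round(p_at_least_one, 4), round(p_exactly_three, 4))
```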
Role of the Poisson distribution
Used for infrequently occurring events with respect to time, area, volume or similar units. Some
practical situations in which the Poisson distribution can be used are given below.
- Quality control statistics
- Biology, to count the number of bacteria
- Number of particles emitted from a radioactive substance
- Insurance – number of casualties
- Waiting time problems – number of incoming calls


Problems on Poisson distribution:
25. A skilled typist makes on average 2 typing mistakes per book of 100 pages. In a randomly
chosen book of 100 pages typed by the same typist, what is the probability that
I) There are no typing mistakes
II) There is at least one typing mistake
III) There are exactly 3 typing mistakes

26. On an expressway, the average number of car accidents in a week is 3. In a randomly
chosen week of the year, what is the probability that
I) There are no accidents reported
II) There are exactly 2 accidents in that week

27. In a world class manufacturing facility, the average number of occupational hazards
in a year is 2. Find the probability that in a given year of safety systems assessment
there are at most two occupational hazards.

28. On average, 2 flights crash due to technical problems in a year. Find the
probability that in a given year there is
I) No crash
II) At least one crash

29. The number of customers demanding information under the RTI Act in a government office
is 1.5 per week on average. Find the probability that in a given week of the year
I) There is no demand
II) There are 3 demands

30. Accidents occur on a particular stretch of highway at an average rate of 3 per week.
Assuming a Poisson distribution, find the probability of exactly two accidents in a given
week ($e^{-3} = 0.04979$)
Fitting a Poisson distribution
31. A systematic sample of 200 pages was taken from the manuscript typed by a typist, and
the observed frequency distribution of the typing mistakes per page was found to be as
follows. Fit a Poisson distribution.
Number of typing mistakes 0 1 2 3 4
Number of pages 122 60 15 2 1
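
Fitting a Poisson distribution follows the same pattern as the binomial case. A minimal sketch using the data of Problem 31, assuming λ is estimated by the sample mean; the variable names are illustrative.

```python
from math import exp, factorial

mistakes = [0, 1, 2, 3, 4]
pages = [122, 60, 15, 2, 1]           # observed frequencies from Problem 31

total = sum(pages)                    # N, total number of pages
lam = sum(x * f for x, f in zip(mistakes, pages)) / total   # sample mean estimates λ

# Expected frequency = N * P(x)
expected = [total * exp(-lam) * lam**x / factorial(x) for x in mistakes]

for x, f, e in zip(mistakes, pages, expected):
    print(x, f, round(e, 2))
```

Because the Poisson variable has no upper limit, the expected frequencies for 0 to 4 mistakes add up to slightly less than N.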

32. Calculate the mean and variance of a Poisson variable “X”
if P(X=4) = P(X=5)
33. For a Poisson variable, P(X=1) = P(X=2). Find the mean and P(X=0)
Poisson Approximation to Binomial distribution
- If “X” is a Binomial variable with “n” very large (>30) and “p” very small (<0.1), then X
approximately follows the Poisson distribution with λ = np
34. If 2% of the electric bulbs manufactured by a company are defective, find the probability
that in a sample of 200 bulbs
I) Less than 2 bulbs II) More than 3 bulbs are defective
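
How close the approximation is can be seen by computing both distributions side by side. The sketch below uses the figures of Problem 34 (n = 200, p = 0.02, so λ = np = 4); the helper names are introduced here for illustration only.

```python
from math import comb, exp, factorial

n, p = 200, 0.02          # sample size and defect rate from Problem 34
lam = n * p               # Poisson parameter λ = np

def binomial_pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

# I) Less than 2 defective bulbs: P(0) + P(1)
binom_less_than_2 = sum(binomial_pmf(x) for x in range(2))
poisson_less_than_2 = sum(poisson_pmf(x) for x in range(2))

# II) More than 3 defective bulbs: 1 - [P(0) + P(1) + P(2) + P(3)]
poisson_more_than_3 = 1 - sum(poisson_pmf(x) for x in range(4))

print(round(binom_less_than_2, 4), round(poisson_less_than_2, 4))
print(round(poisson_more_than_3, 4))
```

The exact binomial value and the Poisson value differ only in the third decimal place here, which is why the approximation is acceptable for large n and small p.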


35. On average, one in 400 chips is found to be defective. If chips are packed in boxes of 100,
what is the probability that any given box will contain
I) One defective
II) One or more defectives
III) Less than 2 defectives

Normal Probability Distribution
The normal distribution, also called the Normal Probability Distribution, is the most useful
theoretical distribution for continuous variables. Most of the data relating to economic and
business statistics, or even to the social and physical sciences, conform to this distribution.
Properties of the Normal Distribution
1. The normal curve is “bell shaped” and symmetrical in its appearance
2. The height of the normal curve is at its maximum at the mean. Hence the mean and mode
of the normal distribution coincide. Thus for a Normal Distribution the Mean, Median and
Mode are all equal.
3. There is one maximum point of the normal curve, which occurs at the mean
4. Since there is only one maximum point, the normal curve is uni-modal, i.e. it has only one
Mode
5. As distinguished from the Binomial and Poisson distributions, where the variable is discrete,
the variable distributed according to the normal curve is continuous.
6. The first and third quartiles are equidistant from the Median
7. The area under the normal curve is distributed as follows
a. Mean ± 1σ covers 68.27% of the area, with 34.135% lying on either side of the
Mean
b. Mean ± 2σ covers 95.45% of the area
c. Mean ± 3σ covers 99.73% of the area
8. The mean deviation is 4/5th, or more precisely 0.7979, of the standard deviation
Conditions for Normality
- The causal forces must be numerous and of approximately equal weight
- The forces must be the same over the universe from which the observations are drawn. This
is the condition of homogeneity
- The forces affecting events must be independent of one another
- The condition of symmetry
Normal Distribution Graph


Calculation of “Variables” in Normal Distribution
In order to calculate probabilities for a Normal variable “X”, we transform the Normal variable
“X” to the Standard Normal variable “Z” by using
$Z = \frac{x - \mu}{\sigma}$

Therefore x ∼ N(µ, σ) gets transformed to Z ∼ SND(µ = 0, σ = 1), with density
$f(Z) = \frac{1}{\sqrt{2\pi}} \, e^{-Z^{2}/2}$
The probability values corresponding to different values of Z are made available under normal
tables which are used to get probabilities for normal distribution.
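Instead of a printed table, the same probabilities can be read from a standard normal distribution in software. The following sketch uses Python's standard library `statistics.NormalDist` and, for illustration, the mean of 45 and SD of 10 from Problem 36 below; the helper `z_score` is a name introduced here.

```python
from statistics import NormalDist

mu, sigma = 45, 10            # marks distribution used in Problem 36
std_normal = NormalDist()     # Z with mean 0 and standard deviation 1

def z_score(x):
    """Transform a value of X to the standard normal variable Z."""
    return (x - mu) / sigma

p_less_60 = std_normal.cdf(z_score(60))
p_60_to_80 = std_normal.cdf(z_score(80)) - std_normal.cdf(z_score(60))
p_more_70 = 1 - std_normal.cdf(z_score(70))

print(round(p_less_60, 4), round(p_60_to_80, 4), round(p_more_70, 4))
```

Note that `cdf` gives the area to the left of Z, whereas many printed tables give the area between 0 and Z; the two differ by 0.5 for positive Z.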
Problems on Normal distribution:
36. The marks obtained by students in an examination follows normal distribution with
mean = 45 and SD =10. Calculate the probability that randomly chosen student has
scored
I) Less than 60 Marks
II) Between 60 and 80 Marks
III) Less than 40 Marks
IV) More than 70 Marks
V) Between 35 and 60 marks

37. The average daily sales of 500 branch offices was Rs. 1,50,000 with SD = Rs. 15,000. Assuming
the distribution to be normal, calculate how many branches have sales
I) Above 1,70,000
II) Between 1,20,000 and 1,40,000
III) Between 1,45,000 and 1,65,000

38. The monthly income of 500 workers follows a normal distribution with a mean
of Rs. 2000 and SD of Rs. 200. Estimate the number of workers with income
I) Exceeding Rs. 2300 per month
II) Between Rs. 1800 and Rs. 2300 per month
III) What is the lowest income among the 25% of workers in the highest income group?

39. A banking recruitment board conducts a qualifying exam for 1000 candidates, and the
scores of the candidates follow a normal distribution with a mean of 52 marks and SD
= 6
I) Find the number of candidates scoring between 40 and 55 marks
II) If the recruitment board wishes to recruit only the top 10% of scorers, what is
the cut-off mark?


Questions from Previous Year Question Papers:
1. A Merchant’s file of 20 accounts contains 6 delinquent and 14 non-delinquent accounts.
An auditor randomly selects 5 of these accounts for examination:
A) What is the probability that the auditor finds exactly 2 – delinquent accounts?
B) Find the expected number of delinquent accounts in the sample selected.

2. The mean and standard deviation of the wages of 6000 workers engaged in a factory are
Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate:
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 1100 and Rs. 1500
The relevant extract of the area table (under the normal curve from 0 to Z) is given below
Z 0.25 0.5 0.6 0.75 1.00 1.25 1.5
Area 0.0987 0.1915 0.2257 0.2734 0.3413 0.3944 0.4332

3. The mean and standard deviation of the wages of 1000 workers engaged in a factory are
Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 600 and Rs. 900

4. The probability that a pen manufactured by a company will be defective is 1/10. If 12
such pens are manufactured, using binomial distribution find the probability that.
a. Exactly two will be defective
b. At least 3 will be defective
c. At most 3 will be defective
5. A project yields an average cash flow of Rs. 500 Lakhs, with a standard deviation of Rs.
60 Lakhs. Calculate the following probabilities.
a. Cash flow will be more than Rs. 560 Lakhs
b. Cash flow will be less than Rs. 420 Lakhs

6. The incidence of an occupational disease in an industry is such that a worker has a 20
percent chance of suffering from it. What is the probability that out of six workers, 4 or
more will contract the disease?

7. Suppose a life insurance company insures the lives of 5000 persons aged 42. Studies
show that the probability that any 42-year-old person will die in a given year is 0.001.
Find the probability that the company will have to pay at least two claims during a
given year. What is the probability that the company will have to pay zero claims?

8. In a manufacturing organization with 5000 employees, the mean wage of workers
is Rs. 8000/- per month with a standard deviation of Rs. 2000/-. Assuming a normal
distribution, estimate:
a. Number of workers getting salary below Rs. 6000/-
b. Number of workers getting salary above Rs. 10,000/-
c. Number of workers getting salary between Rs. 7000/- and Rs. 9000/-


9. Mean and standard deviation of wages of 1000 workers engaged in a factory are
Rs. 1200 and 400 respectively. Assuming the distribution to be normal, estimate
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 600 and Rs. 900
The area under normal curve for different Z are given below
Z 0.5 0.75 1 1.5
Area 0.1915 0.2734 0.3413 0.4332

10. In a factory turning out fan blades, there is a small chance of 0.002 for any blade
to be defective. The blades are supplied in packets of 10. Use the Poisson distribution
to calculate the approximate number of packets containing no defective, one
defective and two defective blades respectively in a consignment of 10,000
packets.



Unit – 4
Time Series Analysis


Unit 4: (12 Hours)
Time Series Analysis: Introduction - Objectives Of Studying Time Series Analysis -
Variations In Time Series - Methods Of Estimating Trend: Freehand Method - Moving
Average Method - Semi-Average Method - Least Square Method. Methods of Estimating
Seasonal Index: Method of Simple Averages - Ratio to Trend Method - Ratio to Moving
Average Method
Introduction:
Forecasting is an important tool in any decision-making process. Forecasting is essential
in making many managerial decisions, such as deciding about the raw material required for
production, the investment required for equipment purchase, sales forecasts, human resource
requirement forecasts, etc.
A time series is a set of numerical values of some variable recorded at regular intervals over
time. The series is usually tabulated or graphed in a manner that readily conveys the behaviour
of the variable under study.
Example:
The exports of a cement company between 2007 and 2017
Year Export (Tonnes)
2007 2
2008 3
2009 6
2010 10
2011 8
2012 7
2013 12
2014 14
2015 14
2016 18
2017 19

[Figure: line graph of Export (Tonnes) by year, 2007 to 2017]
The above graph suggests that the series is time dependent. The management of the
company is interested in determining how the series depends on time and in
developing a means of predicting future levels with some degree of reliability.
Objective of Studying Time Series Analysis
1. The assumption underlying time series analysis is that the time series data
behaves in the future as it did in the past. Time series analysis is used to
detect the pattern underlying the data and to isolate the influencing factors, which in
turn are used to estimate the future accurately. Thus, time series analysis helps us to
cope with the uncertainty about the future.
2. To review and evaluate the progress made on plans that are based on time series
data. For example, the Finance Ministry of the Govt. of India (GOI) reviews the gross
domestic product (GDP) of the economy during the financial year and chalks out
strategies to further the growth.

Variations in Time Series / Components of a Time Series
In a typical time series there are four main components, which seem to be independent
of one another and seem to influence the time-series data.
An important step in analysing a time series is to consider the types of data patterns.
Time series data can contain some or all of the following elements. They are:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)

1. Trend (T) : The trend is the long term pattern of a time series. A trend can be
positive or negative depending on whether, the time series exhibits an increasing
long term pattern or a decreasing long term pattern. The rate of trend growth
usually varies over time.


2. Cyclical (C) : Time series data may show up and down movement around a given
trend. For example, a business cycle over the years shows an upward trend and touches
its peak, and then it may show a slump and hit the bottom. The pattern repeats, but
not at a regular interval of time. The duration of a cycle depends on the type of
business or industry.
In brief, an upward and downward oscillation about the trend line, of uncertain
duration and magnitude and with irregular swings, is called a cycle.

3. Seasonal (S): It is a special case of the cyclical component of a time series in which the
magnitude and duration of the cycle do not vary but recur at a regular interval
each year. Seasonality occurs when the time series exhibits regular variation
during the same periods (the same month or the same quarter) every year.


4. Irregular or Random (I): This type of variation is unpredictable. It is caused by
short term, unanticipated and non-recurring factors. It follows no specific pattern.


Methods of Estimating Trend:
These are also called the forecasting methods of Time Series Analysis
Some of them are
1. Freehand Method
2. Moving Average Method
3. Semi-average Method
4. Least-Square Method

1. Freehand Method:
It is an easy method of estimating the trend. The first step is to plot the values of the
time series on a graph and then draw a trend line through these points such that the
line reflects the long-term trend of the data. This method does not require any
rigorous mathematical calculations.
Here the forecast can be obtained simply by extending the trend line. A trend line
fitted by the freehand method should conform to the following conditions.
- The trend line should be smooth – a straight line or a mix of long gradual curves
- The sum of the vertical deviations of the observations above the trend line
should be equal to the sum of the vertical deviations of the observations below
the trend line.
- The sum of squares of the vertical deviations of the observations from the
trend line should be as small as possible.
- The trend line should bisect the cycles so that the area above the trend line is
equal to the area below the trend line, not only for the entire series but as
much as possible for each full cycle.

Limitations:
The method involves personal bias, is very subjective and needs judgement.
Example:
Fit a trend line to the following data by using the freehand method
Year 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 80 90 92 83 94 99 92 104


2. Moving Average Method:

It is a very simple and flexible method. As the name suggests, in this method we
calculate a series of averages of successive overlapping groups. The number of
values to be included in an average is determined by a constant called the Period. The
resulting averages smooth out the fluctuations and extreme values.

Two cases
- When the period is odd
- When the period is even
Example: The sales of a store for 11 years are given below. Find the 3-Year, 5-Year
and 7-Year moving averages
Year 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 7 13 19 25 31 37 43 49 55 61 67

Example: The sales of a store for 11 years are given below. Find the 4-Year and 6-Year
moving averages
Year 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 7 13 19 25 31 37 43 49 55 61 67
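
The two cases can be sketched as follows, using the store sales data above. The function names are introduced here for illustration, and the treatment of an even period (averaging two adjacent moving averages so that the result is centred on a year) is the usual textbook convention, stated here as an assumption since the manual only names the two cases.

```python
sales = [7, 13, 19, 25, 31, 37, 43, 49, 55, 61, 67]   # 2007 to 2017, in lakhs

def moving_average(values, period):
    """Averages of successive overlapping groups of the given period."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

def centred_moving_average(values, period):
    """For an even period: average each pair of adjacent moving averages."""
    ma = moving_average(values, period)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

three_year = moving_average(sales, 3)          # first value belongs to 2008
four_year = centred_moving_average(sales, 4)   # first value belongs to 2009

print(three_year)
print(four_year)
```

With an odd period each average falls against the middle year of its group; with an even period the centring step is needed for the same reason.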

3. Semi-average Method:
This method is used to estimate the trend line if a linear function can describe the data
sufficiently. The procedure is as follows
1. Divide the given time series into two segments, leaving out the middle period if the number
of time periods is odd. If the number of time periods is even, divide the time series into
two segments leaving the two time periods in the middle.
2. Find the average of the values of each segment and plot the two average points on the
graph against the middle time period of each segment.
3. Join the two points to plot the trend line. You can extend the trend line to predict the
value of a future time period.
4. The trend line equation is of the form $\hat{Y} = a + bx$. The intercept “a” and the slope “b”
can be found by using the expressions
$\text{Slope} = b = \frac{\Delta Y}{\Delta X} = \frac{\text{Difference between the average values of the two segments}}{\text{Difference between the mid periods of the two segments}}$

Intercept = a = Average value of the first segment at its mid-period


Example: The sales of a manufacturing firm from 2001 – 2011 are given below. Fit a trend
line by using the method of semi-averages and also estimate the sales for the year 2014.
Year 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Sales (in Crores) 103 105 114 112 116 120 116 122 126 127 124
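
The semi-average steps can be traced on this example. The sketch below assumes an odd number of periods (as here, where 2006 is left out) and measures time from the mid year of the first segment; the variable and function names are illustrative.

```python
years = list(range(2001, 2012))
sales = [103, 105, 114, 112, 116, 120, 116, 122, 126, 127, 124]   # Rs. crores

half = len(sales) // 2                  # 11 periods: the middle one (2006) is left out
first_avg = sum(sales[:half]) / half    # average of 2001 to 2005
second_avg = sum(sales[-half:]) / half  # average of 2007 to 2011
first_mid = years[:half][half // 2]     # mid year of the first segment (2003)
second_mid = years[-half:][half // 2]   # mid year of the second segment (2009)

slope = (second_avg - first_avg) / (second_mid - first_mid)

def trend(year):
    """Trend value, with the origin at the mid year of the first segment."""
    return first_avg + slope * (year - first_mid)

estimate_2014 = trend(2014)
print(first_avg, second_avg, round(slope, 4), round(estimate_2014, 2))
```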

4. Least-Square Method:
The trend may be linear or curvilinear. Let us consider values that can be described by
a straight line. These are called linear trends. The general equation for estimating a
straight line is $\hat{Y} = a + bx$.
Where $\hat{Y}$ = Estimated value of the dependent variable
a = y-intercept, i.e. the value of $\hat{Y}$ when x = 0
b = Slope of the trend line
x = independent variable, i.e. the time
When x is measured as the deviation from the middle time period, so that $\sum x = 0$, the
values of “a” and “b” can be found by
$a = \frac{\sum y}{n}$

$b = \frac{\sum xy}{\sum x^{2}}$

Example: The production of the firm over the years is given below
a) Fit a straight line trend using the method of least squares
b) Estimate the production figures for the year 2012
c) Use a graph to plot the actual and the estimated production

Year (X) 2005 2006 2007 2008 2009 2010 2011
Production (in ‘000s) 87 91 93 86 96 99 92
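
Parts (a) and (b) of this example can be sketched as below. The sketch assumes an odd number of years, so that deviations from the middle year (2008) sum to zero; the names used are illustrative.

```python
years = [2005, 2006, 2007, 2008, 2009, 2010, 2011]
production = [87, 91, 93, 86, 96, 99, 92]     # in '000s

n = len(years)
middle = years[n // 2]                         # 2008, valid because n is odd
x = [year - middle for year in years]          # deviations from the middle year

a = sum(production) / n                                        # a = Σy / n
b = sum(xi * yi for xi, yi in zip(x, production)) / sum(xi**2 for xi in x)  # b = Σxy / Σx²

def trend(year):
    """Estimated production from the fitted line a + b*x."""
    return a + b * (year - middle)

estimate_2012 = trend(2012)
print(a, round(b, 4), round(estimate_2012, 2))
```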



Methods of Estimating Seasonal Index
- Method of Simple Averages
- Ratio to trend method
- Ratio to moving average method

Method of simple averages
Example: The sales of lathes in the last three years is given below. Use the method of
simple averages to determine the seasonal index for each month.
Month Jan Feb Mar April May June July Aug Sep Oct Nov Dec
2009 16 17 19 19 24 24 21 29 30 34 34 39
2010 22 21 27 26 30 27 21 27 31 36 33 43
2011 28 28 38 39 39 33 33 37 41 50 44 56
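
The method of simple averages can be sketched on the lathe sales data above. The sketch assumes the usual definition (average for each month divided by the average of all monthly averages, times 100); the variable names are illustrative.

```python
months = ["Jan", "Feb", "Mar", "April", "May", "June",
          "July", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = {
    2009: [16, 17, 19, 19, 24, 24, 21, 29, 30, 34, 34, 39],
    2010: [22, 21, 27, 26, 30, 27, 21, 27, 31, 36, 33, 43],
    2011: [28, 28, 38, 39, 39, 33, 33, 37, 41, 50, 44, 56],
}

# Average sales for each month across the three years
monthly_avg = [sum(year[i] for year in sales.values()) / len(sales)
               for i in range(12)]
grand_avg = sum(monthly_avg) / 12

# Seasonal index = (monthly average / grand average) * 100
seasonal_index = {m: avg / grand_avg * 100 for m, avg in zip(months, monthly_avg)}

for month, index in seasonal_index.items():
    print(month, round(index, 2))
```

The twelve indices add up to 1200, so an index above 100 marks a month with above-average sales.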

Ratio to Trend Method
Example: The quarterly sales of a stationery store (Rs. in thousands) for five years, i.e.
2007 – 2011, are given below. Use the ratio to trend method to determine the seasonal indexes.
Ratio to moving Average method
Example: The quarterly sales for four years, from 2008 to 2011, are given below. Use the ratio
to moving average method to determine the seasonal indexes.
Quarterly sales (Rs. in thousands)
Year I II III IV
2008 77 62 56 61
2009 85 64 62 79
2010 91 73 67 86
2011 102 80 74 95

1. Fit a straight line trend by the semi-average method for the following data: (June 2010)
Year 1994 1995 1996 1997 1998 1999 2000 2001 2002
Sales (in ‘000) 45 50 60 55 60 65 70 80 85

2. Calculate the trend values by the method of moving averages, assuming a 4-year
cycle, from the following data relating to sugar production in India. Also plot the
actual and trend values on a graph. (June 2010)
Year 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988
Sugar prodn (in Lakh tons) 75 62 76 78 94 84 96 128 116 76 102 168



3. Compute the trend values by finding four-yearly moving averages for the
following time series. Also graph the observed values and the trend values.
Year 1988 1989 1990 1991 1992 1993 1994 1995 1996
Sales (in ‘000) 103 104 107 101 102 104 105 99 100

4. Fit a straight line trend for the following data by the method of least squares.
Estimate the value for 1996.
Year: 1989 1990 1991 1992 1993 1994
Value: 10 8 12 9 11 12

5. What is Time Series? Discuss various components of Time Series.

6. Fit a trend line to the following data by the method of semi-averages (draw a
graph)
Year Sales of Firm A (thousand units)
1990 102
1991 105
1992 114
1993 110
1994 108
1995 116
1996 112

7. Below are given the figures of production (in thousand kgs.) of casting in a
factory
Years 1990 1991 1992 1993 1994 1995 1996
Production 12 10 14 11 13 15 16


Questions from Previous Year Question Papers:
1. Below are given the figures of production of a sugar factory. Values are in
thousand quintals
Year 2011 2012 2013 2014 2015 2016 2017
Production 80 90 92 83 94 99 92

Fit a straight line trend and show the trend line on graph. Estimate production in
2020.

2. The data on prices (in Rs. per Kg) of a certain commodity during 2007 to
2011 are shown below. Compute the seasonal indexes by the average percentage
method.
Quarter 2007 2008 2009 2010 2011
1 45 48 49 52 60
2 54 56 63 65 70
3 72 63 70 75 84
4 60 56 65 72 66

3. From the following series of annual data, find the trend line by the method of
semi averages. Also estimate the value for 1999
Year 1990 1991 1992 1993 1994 1995 1996 1997 1998
Actual Value 170 231 261 267 278 302 299 298 340

4. The sales of a company in millions of rupees for the year 1994 -2001 are given
below
Year 1994 1995 1996 1997 1998 1999 2000 2001
Sales 550 560 555 585 540 525 545 585
a. Find the linear trend equation?
b. Estimate the sales for the year 1993?
c. Find the slope of the straight line trend?
d. Do the figures show a rising trend or a falling trend?

5. Taking deviations of the time variable, compute the trend values for the
following data by the method of least squares:
Days 1 2 3 4 5 6 7
Sales (Rs.) 20 30 40 20 50 60 80

6. With the help of following data, calculate the trend values by the method of least
squares and estimate the sales for the year 2011.
Years 2000 2001 2002 2003 2004 2005 2006
Sales (in Lakhs) 25 27 32 36 44 55 69


7. The following table relates to tourist arrivals (in millions) in India during 1994 to
2000.
Years 1994 1995 1996 1997 1998 1999 2000
Tourists Arrival 18 20 23 25 24 28 30
Fit a straight line trend by the method of least squares and estimate the number of
tourists that would arrive in the year 2004.

8. The sales of a company in millions of rupees for the years 1994-2001 are given
below:
Years: 1994 1995 1996 1997 1998 1999 2000 2001
Sales: 550 560 555 585 540 525 545 585
a. Find the linear trend equation
b. Estimate the sales for the year 1993

9. Calculate seasonal indices by the “ratio to moving averages” method from the
following data
Year 1st Quarter 2nd Quarter 3rd Quarter 4th Quarter
2005 68 62 61 63
2006 65 58 66 61
2007 68 63 63 67

10. Gross revenue data (Rs. In Million) for a travel agency for a 10 year period is as
follows

Years: 2000 01 02 03 04 05 06 07 08 09
Revenue: 3 6 10 8 7 12 14 14 18 19
Calculate a 3 year moving average for the revenue earned.




Unit – 5
Part A
Linear Programming


Unit 5: (8Hours)
Linear Programming: structure, advantages, disadvantages, formulation of LPP, solution
using Graphical method.

Linear Programming
Model formulation is very important for making decisions in a business environment
because it represents the essence of the business decision problem.
Here formulation means the process of converting the verbal description and numerical
data into mathematical expressions, which represent the relevant relationships among
- Decision factors / variables
- Objectives that the firm wants to achieve – Objective function
- Restrictions – on the use of resources
“Linear programming is a particular type of technique used for economic allocation of
scarce and limited resources, such as Labour, Materials, Machine, Time, Warehouse, Space,
Capital, Energy to several competing activities such as Products, Services, Jobs, New
equipment, Projects etc.”
Linear Programming is one of the optimization techniques used to optimise business
variables like Profit, Cost, Sales, and Waste with the available limited resources.
Linear Programming uses mathematical modelling with the help of “Linear Equations”
“Linear programming is one of the optimizing techniques used to maximize the profit or
minimize the cost of a given function using linear equations.”
Two words comprise Linear Programming = Linear + Programming
a. Linear – Means a linear relationship among the variables in the model; a change in
one leads to a proportionate change in the other
b. Programming – Modelling or solving a problem mathematically
Structure of Linear Programming Model
General Structure of LP Model:
The LP model consists of three components
1. Decision Variable / Courses of Action
2. Objective Function
3. Constraints

1. Decision Variable: To arrive at the optimal value of the objective function, we need to
evaluate the various alternatives, i.e. the various courses of action. If there is no
alternative, there is no need for LP
- These activities are denoted as X1, X2, X3, ……, Xn
- The value of each denotes the extent to which that activity is performed
- They are under the control of the decision maker
- They are interrelated in terms of limited resources
- All decision variables are continuous, controllable and non-negative
X1 ≥ 0, X2 ≥ 0, ……, Xn ≥ 0

2. Objective Function: Mathematical representation of the objective in terms of a
measurable quantity such as Profit, Cost, Revenue, Distance etc.
LPP aims to achieve the highest profit or lowest cost by utilizing the available
limited resources to the best possible extent

$\text{Optimize (Minimize or Maximize) } Z = c_{1}x_{1} + c_{2}x_{2} + c_{3}x_{3} + \dots + c_{n}x_{n}$

Z = Measure of performance variable

$x_{1}, x_{2}, x_{3}, x_{4}, \dots, x_{n}$ = Decision Variables

$c_{1}, c_{2}, c_{3}, \dots, c_{n}$ = Quantities (profit or cost contribution per unit of each decision variable)

The optimum value of the given objective function is obtained by the graphical
method or the simplex method.


3. Constraints:
- There are certain limitations (or constraints) on the use of resources,
e.g. Labour, Machine, Raw Materials, Space, Money etc., that limit the
degree to which the objective can be achieved
- Such constraints must be expressed as linear equalities or inequalities in
terms of the decision variables
- The solution of the LP model must satisfy these constraints
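
To see the three components working together, the sketch below solves a small two-variable problem by checking the corner points of the feasible region, which is the idea behind the graphical method. The problem itself (maximize Z = 3x1 + 5x2 subject to x1 ≤ 4, 2x2 ≤ 12, 3x1 + 2x2 ≤ 18) is a hypothetical example chosen for illustration and does not come from this manual; the function names are likewise illustrative.

```python
from itertools import combinations

# Hypothetical problem (not from the manual):
# Maximize Z = 3*x1 + 5*x2
objective = (3, 5)

# Each constraint (a1, a2, b) means a1*x1 + a2*x2 <= b
constraints = [
    (1, 0, 4),      # x1 <= 4
    (0, 2, 12),     # 2*x2 <= 12
    (3, 2, 18),     # 3*x1 + 2*x2 <= 18
    (-1, 0, 0),     # x1 >= 0, written as -x1 <= 0
    (0, -1, 0),     # x2 >= 0, written as -x2 <= 0
]

def intersection(c1, c2):
    """Point where the boundary lines of two constraints meet (None if parallel)."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def is_feasible(point, tol=1e-9):
    """True if the point satisfies every constraint."""
    x1, x2 = point
    return all(a1 * x1 + a2 * x2 <= b + tol for a1, a2, b in constraints)

def z_value(point):
    return objective[0] * point[0] + objective[1] * point[1]

# Corner points are the feasible intersections of pairs of boundary lines
corners = [p for c1, c2 in combinations(constraints, 2)
           if (p := intersection(c1, c2)) is not None and is_feasible(p)]

best = max(corners, key=z_value)
best_value = z_value(best)
print(best, best_value)
```

Checking only the corner points works because, for a linear objective over a bounded feasible region, an optimum always occurs at a corner.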


General Mathematical Model of Linear Programming Problem:
The General Linear Programming Problem / Model with “n” decision variables and “m”
constraints can be stated in the following form

Decision Variables: $x_{1}, x_{2}, x_{3}, x_{4}, \dots, x_{n}$ (one should find these values)

Objective Function: $Z = c_{1}x_{1} + c_{2}x_{2} + c_{3}x_{3} + \dots + c_{n}x_{n}$

Subject to the linear constraints
$a_{11}x_{1} + a_{12}x_{2} + a_{13}x_{3} + \dots + a_{1n}x_{n} \ (\le, =, \ge) \ b_{1}$
$a_{21}x_{1} + a_{22}x_{2} + a_{23}x_{3} + \dots + a_{2n}x_{n} \ (\le, =, \ge) \ b_{2}$
$\vdots$
$a_{m1}x_{1} + a_{m2}x_{2} + a_{m3}x_{3} + \dots + a_{mn}x_{n} \ (\le, =, \ge) \ b_{m}$
Such that $x_{1}, x_{2}, x_{3}, x_{4}, \dots, x_{n} \ge 0$ (Non-Negativity Constraints)

Where
$c_{1}, c_{2}, c_{3}, \dots, c_{n}$ => These are constants of profit / loss
$a_{11}, a_{12}, a_{13}, \dots, a_{mn}$ => Technical Constants
$b_{1}, b_{2}, b_{3}, \dots, b_{m}$ => Availability or Requirements


Assumption of Linear Programming
- Certainty
- Divisibility
- Additivity
- Linearity


Certainty: The LP model assumes that all parameters, such as availability of resources, profit
or cost contribution of a unit of a decision variable and consumption of resources by a unit
of a decision variable, are known and remain constant.
Divisibility (Continuity): The solution values of decision variables and resources are
assumed to take either whole number (integer) or fractional values.
Additivity: The value of the objective function for the given values of the decision variables
and the total sum of resources used must be equal to the sum of the contributions (profit
or cost) earned from each decision variable and the sum of the resources used by each
decision variable.
Example 1: Total profit earned by the sale of two products A and B = Sum
of profit earned separately from A and B
Example 2: Resources consumed by A and B = Sum of resources used for A
and B individually
Linearity / Proportionality: All relationships in the LP model (both objective function and
constraints) must be linear.
Example: If production of one unit of a product uses 5 hours of a particular
resource, then making 3 units of that product uses 3*5 = 15 hours of that resource
Advantages of Linear Programming
1. Helps in attaining optimum use of productive resources
2. Improves the quality of decisions, since they are more objective than subjective
3. Provides possible and practical solutions
4. Highlights bottlenecks in the production processes
5. Helps in re-evaluating a basic plan under changing
conditions

Limitations of Linear Programming / Disadvantages
1. Treats all relationships among decision variables as linear
2. There is no guarantee of getting an integer-valued solution
3. It doesn’t take into consideration the effect of time and uncertainty
4. Large scale problems in LP can be solved with the use of a computer, but such a
problem may first have to be fragmented into several small problems, each solved
separately
5. Parameters appearing in the model are assumed to be constant, but in real life they
are frequently neither known nor constant.
6. It deals with a single objective, whereas in real life situations we may come across
conflicting multi-objective problems. In such cases a goal programming model is
used instead of linear programming



Application Areas of Linear Programming
1. Agriculture Application:
- Efficient production patterns can be specified by the Linear Programming
Model under regional land resources and national demand constraints
- Applied in agriculture planning – resource allocation
2. Military Application
- Number of defence units that should be used in a given attack in order to
provide the required level of protection at the lowest possible cost
3. Production Management
- Product mix – The objective is to maximise the total contribution subject
to all constraints
- Production Planning – To manage the operating cost
- Assembly-Line Balancing – to reduce total elapsed time
- Blending problem – Find the minimum cost blend
- Trim Loss – Minimise trim loss
4. Financial Management
- Portfolio Selection: To find the allocation, which maximises the total
expected return or minimise the risk under certain limitation
- Portfolio Planning: Maximizing the profit margin from investment in
plant facility and equipment
5. Marketing Management
- Media selection: Maximise the effective exposure subject to limitations of
budget and specified exposure rates to different market segments
- Travelling salesman problem – To find the shortest route
- Physical distribution – Locating the manufacturing plants and
distribution centres
6. Personnel Management
- Staffing Problem: Optimal allocation of resources, i.e. manpower, to a
particular job to reduce the overtime cost
- Determination of equitable salaries
- Job Evaluation and selection – Identifying a suitable person for a
specified job
Other applications of linear programming lie in the areas of administration, education,
fleet utilization, awarding contracts, hospital administration and capital budgeting


Guideline or Steps for Linear Programming Model Formulation:
Steps in Linear Programming (LP) Model formulation
1. Identify Decision Variable (“n” number of Decision variables)
- How many?
- How much quantity of the decision variable?
- Which?
2. Formulating the Objective Function
- Identify whether the objective function is to be maximised or minimised
- Maximization: Profit, Revenue, Margin, Viewers
- Minimization: Cost, Time, Number of employees
- The objective function has to be written as follows
Z = C1X1 + C2X2 + C3X3 + …… + CnXn
3. Identify the Problem Data
- Here we need to provide the actual values of the coefficients attached to the
decision variables identified earlier. For this we use the information given in the
problem to determine those values
- These quantities constitute the problem data

4. Formulate the Constraints (“m” number of constraints)
- Express the constraints in terms of the requirement and availability of each
resource
- Convert the verbal expression of the constraints imposed by resource
availability into linear equalities or inequalities in terms of the decision variables
defined in step 1
A11X1 + A12X2 + A13X3 + …… + A1nXn (≤ / ≥ / =) b1
A21X1 + A22X2 + A23X3 + …… + A2nXn (≤ / ≥ / =) b2
.
.
.
Am1X1 + Am2X2 + Am3X3 + …… + AmnXn (≤ / ≥ / =) bm
Here the main aim is to translate a real-life problem into a mathematical
model

5. Non Negativity Constraints:
X1, X2, X3…….Xn ≥ 0
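The five formulation steps can be sketched in code. The following is a minimal illustration, not a solver: it only stores a formulated model and checks whether a trial solution satisfies it. The class name `LinearProgram`, its field names and the two-product data are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinearProgram:
    sense: str             # "max" or "min" (step 2)
    c: List[float]         # objective coefficients C1..Cn (steps 2 and 3)
    A: List[List[float]]   # technical coefficients, one row per constraint (step 4)
    ops: List[str]         # "<=", ">=" or "=" for each constraint (step 4)
    b: List[float]         # availability / requirement b1..bm (step 4)

    def objective(self, x):
        """Value of Z = C1X1 + ... + CnXn for the trial solution x."""
        return sum(cj * xj for cj, xj in zip(self.c, x))

    def is_feasible(self, x, tol=1e-9):
        """True if x satisfies non-negativity (step 5) and every constraint."""
        if any(xj < -tol for xj in x):
            return False
        for row, op, bi in zip(self.A, self.ops, self.b):
            lhs = sum(a * xj for a, xj in zip(row, x))
            if op == "<=" and lhs > bi + tol:
                return False
            if op == ">=" and lhs < bi - tol:
                return False
            if op == "=" and abs(lhs - bi) > tol:
                return False
        return True


# Hypothetical product mix: profits 40 and 30 per unit, two limited resources.
lp = LinearProgram(
    sense="max",
    c=[40, 30],
    A=[[2, 1], [1, 3]],
    ops=["<=", "<="],
    b=[100, 90],
)
print(lp.is_feasible([30, 20]), lp.objective([30, 20]))  # True 1800
print(lp.is_feasible([60, 0]))                           # False
```

Writing the model down as data in this way makes it easy to test candidate production plans before solving the problem graphically or by the simplex method.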






Problems:

1. Suppose that a company produces 2 products, A and B. Product A earns a profit
of Rs. 200 per unit and Product B earns a profit of Rs. 500 per unit. The company
uses 3 main resources: Labour, Power and Raw Materials.
Product A uses 2 Units of Labour
3 Units of Power
2 Units of Raw Materials
Product B uses 3 Units of Labour
4 Units of Power
3 Units of Raw Materials
For a day's production the company can use 200 units of Labour, 500 units of
Power and 1000 units of Raw Materials. Formulate the LPP.
2. A manufacturer produces 2 types of models, M1 and M2. Each model M1
requires 4 hours of grinding and 2 hours of polishing. Each model M2 requires
2 hours of grinding and 5 hours of polishing. The manufacturer has 2
grinders and 3 polishers. Each grinder works 40 hours a week and each
polisher works 60 hours a week. The profit on M1 is 30 per unit and the profit on M2
is 40 per unit.
Formulate the LPP.

3. A company manufactures 2 products, Fibre and Viscose. 1 Kg of Fibre fetches a
profit of Rs. 400 and 1 Kg of Viscose gives a profit of Rs. 500. These 2 products
use 3 basic resources: Labour, Power and Raw Materials. To produce 1 Kg of
Fibre it requires 1 man-day of Labour, 2 units of Power and 2 units of Raw
Materials. To produce 1 Kg of Viscose it requires 2 man-days of Labour, 3 units
of Power and 1 unit of Raw Materials. The resources are limited in
nature, such that the company can utilize 100 man-days of Labour, 500 units of
Power and 800 units of Raw Materials. Formulate the LPP

4. A paper mill produces 2 grades of paper, namely X and Y. There is a
restriction on the availability of raw materials such that it cannot produce more than
400 tonnes of grade X and 300 tonnes of grade Y in a week. It requires 0.2 and
0.4 hours to produce a tonne of products X and Y respectively, with
corresponding profits of Rs. 200 and Rs. 400 per tonne.
There are 160 production hours in a week. Formulate the above LPP.
5. A person requires 10, 12 and 12 units of chemicals A, B and C respectively for
his garden. A liquid product contains 5, 2 and 1 units of A, B and C
respectively per jar. A dry product contains 1, 2 and 4 units of A, B and C
respectively per carton. The liquid product is sold for Rs. 3/- per jar and the
dry product is sold for Rs. 2/- per carton. How many units of each product
should be purchased in order to minimise the cost and meet the
requirements? Formulate the LPP

6. A company produces two types of leather belts, “A” and “B”. A is of superior
quality to B. The respective profits are Rs. 10 and Rs. 15 per belt. The supply
of raw materials is sufficient for making 850 belts per day. For belt A, a special
type of buckle is required and 500 pieces are available per day. There are 700
buckles available for belt B per day. Belt A needs twice as much time as that
required for belt B, and the company can produce 500 belts if all of them are of
type A. Formulate the LPP.

7. A company manufactures 2 products, A and B. Each unit of B takes twice as long
to produce as a unit of A. If the company were to produce only A, it would have time to
produce 2000 units per day. The availability of raw materials is sufficient to
produce 1500 units per day of A and B combined. Product B uses special
ingredients, so only 600 units of B can be produced per day. If A costs Rs. 20
and B costs Rs. 50, formulate the LPP

8. An animal food company must produce 200 Kg of a mixture consisting of
ingredients X1 and X2 daily. X1 costs Rs. 3 per Kg and X2 costs Rs. 8 per Kg. Not more
than 80 Kg of X1 can be used and at least 60 Kg of X2 must be used. Formulate the
LP model to minimise the cost.

9. A firm produces 3 products A, B and C. It uses 2 types of raw materials, I and II,
of which 500 and 7500 units respectively are available. The raw material
requirements per unit of the products are given below
Requirements of Products
Raw Materials A B C
I 3 4 5
II 5 3 5
The labour time for each unit of product A is twice that of product B and 3
times that of product C. The entire labour force of the firm can produce the
equivalent of 3000 units.
The minimum demand for the 3 products is 600, 650 and 500 units respectively.
The ratio of the number of units produced must be equal to 2:3:4. Assuming
the profit per unit of A, B and C is 50, 50 and 80 respectively, formulate the LPP

10. A retired person wants to invest an amount of Rs. 30,000 in fixed income
securities. His broker recommends investing in 2 bonds. Bond A yields 7% and
bond B yields 10%. After some consideration he decides to invest at most
Rs. 12,000 in bond B and at least Rs. 6000 in bond A. He also wants the
amount invested in bond A to be at least equal to the amount invested in bond B.
Formulate the LP model to maximize the return on investment



11. A person has inherited Rs. 1,00,000 from his father-in-law that can be
invested in a combination of only 2 stock portfolios, with the maximum
investment allowed in either portfolio set at Rs. 75,000.
The first portfolio has an average return of 10% and the 2nd has 20%. In
terms of the risk factors associated with these portfolios, the first has a risk rating
of 4 (on a scale of 0-10) and the 2nd has 9. Since he wants to maximise his return,
he will not accept an average rate of return below 12% or a risk factor above
6. Formulate the LPP

12. A manufacturer employs 3 inputs, man hours, machine hours and cloth
material, to manufacture 2 types of dresses. A type A dress fetches him a profit
of Rs. 160 per piece, while type B fetches Rs. 180 per piece. The manufacturer has
enough man hours to manufacture 50 pieces of type A and 20 pieces of type B
dresses per day. The machine hours he possesses suffice for only 36 pieces of
type A and 24 pieces of type B. The cloth material available per day is limited but
sufficient for 30 pieces of either type of dress. Formulate the LPP

13. A company produces 2 types of hats. Each hat of the first type requires twice
as much labour time as the 2nd type. If all hats are of the 2nd type only, the company
can produce a total of 500 hats per day. The market limits daily sales of
the 1st and 2nd types to 150 and 250 hats. Assuming that the profits per hat are
Rs. 8 for the first type (A) and Rs. 15 for the second type (B), formulate the LPP.

14. An animal food company must produce 200 Kg of a mixture of ingredients
X1 and X2 daily. X1 costs Rs. 3 per Kg and X2 costs Rs. 8 per Kg. Not more than 80 Kg
of X1 can be used and at least 60 Kg of X2 must be used.
Formulate the LPP

Solution to LPP using Graphical Method
Once the given business situation is transformed into an LPP, it is solved using the
graphical method as follows
Step 1: Plot the constraint equations on the graph sheet and locate the
region common to all the constraints. It is called the feasible region
Step 2: Identify the extreme points bounding the feasible region
Step 3: If a solution to the given LPP exists, it lies at one of these extreme points.
Substitute the coordinates of each extreme point into the objective function.
The solution to the LPP is the extreme point for which the
value of the objective function is optimized
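The three steps above can be mimicked numerically for a two-variable problem. The sketch below is an illustration only: it intersects every pair of boundary lines, keeps the intersections that satisfy all constraints (the extreme points), and evaluates the objective at each. The function name `corner_points` and the example LPP (Max Z = 3x + 5y subject to x ≤ 4, 2y ≤ 12, 3x + 2y ≤ 18) are assumptions made for the example and are not taken from this manual. It assumes a bounded feasible region.

```python
from itertools import combinations


def corner_points(constraints, tol=1e-9):
    """Return the extreme points of a two-variable feasible region.

    constraints: list of (p, q, op, r) meaning p*x + q*y op r,
    with op either "<=" or ">=". Non-negativity (x, y >= 0) is added here.
    """
    def feasible(x, y):
        if x < -tol or y < -tol:
            return False
        for p, q, op, r in constraints:
            lhs = p * x + q * y
            if op == "<=" and lhs > r + tol:
                return False
            if op == ">=" and lhs < r - tol:
                return False
        return True

    # Step 1: boundary lines are the constraints taken as equations, plus both axes.
    lines = [(p, q, r) for p, q, _, r in constraints] + [(1, 0, 0), (0, 1, 0)]
    points = set()
    for (p1, q1, r1), (p2, q2, r2) in combinations(lines, 2):
        det = p1 * q2 - p2 * q1
        if abs(det) < tol:              # parallel lines never intersect
            continue
        x = (r1 * q2 - r2 * q1) / det   # Cramer's rule
        y = (p1 * r2 - p2 * r1) / det
        # Step 2: keep only intersections inside the feasible region.
        if feasible(x, y):
            points.add((round(x, 6) + 0.0, round(y, 6) + 0.0))
    return sorted(points)


# Hypothetical LPP: Max Z = 3x + 5y
constraints = [(1, 0, "<=", 4), (0, 2, "<=", 12), (3, 2, "<=", 18)]
points = corner_points(constraints)

# Step 3: evaluate the objective at every extreme point and pick the best.
best = max(points, key=lambda pt: 3 * pt[0] + 5 * pt[1])
best_z = 3 * best[0] + 5 * best[1]
print(points)
print(best, best_z)  # (2.0, 6.0) 36.0
```

For a minimization problem the same points are used with `min` in place of `max`.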


15. Solve the following LPP using graphical method
Max $Z = 8X_1 + 16X_2$
Subject to $X_1 + X_2 \le 200$
$X_2 \le 125$
$3X_1 + 6X_2 \le 900$
Where $X_1, X_2 \ge 0$
16. Find the solution to the following LPP graphically
Max $Z = 10X_1 + 8X_2$
Subject to $2X_1 + X_2 \le 20$
$X_1 + 3X_2 \le 30$
$X_1 - 2X_2 \ge -15$
Where $X_1, X_2 \ge 0$
17. Find the maximum and minimum values of the function
$Z = 8X_1 + 5X_2$
Subject to $3X_1 - 2X_2 \ge 6$
$-2X_1 + 7X_2 \ge 7$
$2X_1 - 3X_2 \le 6$
Where $X_1, X_2 \ge 0$

18. Solve the following LPP using Graphical Method
$Z = 10X_1 - 4X_2$
Subject to $2X_1 - 6X_2 \le 0$
$X_1 - 2X_2 \le 2$
$-3X_1 - 3X_2 \ge -24$
Where $X_1, X_2 \ge 0$

19. Solve the following LPP using Graphical method
Max $Z = 10X_1 + 15X_2$
Subject to $2X_1 + X_2 \le 26$
$2X_1 + 4X_2 \le 56$
$X_1 - X_2 \ge -5$
Where $X_1, X_2 \ge 0$

20. Solve the following LPP using graphical method
Max $Z = 0.07X + 0.1Y$
Subject to $X + Y \le 30{,}000$
$Y \le 12000$
$X \ge 6000$
$X - Y \ge 0$
Where $X, Y \ge 0$


21. Solve Graphically
Max $Z = 0.1X + 0.2Y$
Subject to $X + Y \le 100{,}000$
$Y \le 75000$
$X \ge 75000$
$-2X + 3Y \le 0$
$-2X + 8Y \ge 0$
Where $X, Y \ge 0$

22. Solve Graphically
Max $Z = -150X_1 - 100X_2 + 2{,}80{,}000$
Subject to $20 \le X_1 \le 60$
$70 \le X_2 \le 140$
$120 \le X_1 + X_2 \le 140$
Where $X_1, X_2 \ge 0$

Questions from Previous Year Question Papers:
1. A firm buys castings of P and Q type of parts and sells them as finished products after
machining, boring and polishing. The purchase costs for the castings are Rs. 3 and Rs. 4 each for
parts P and Q, and the selling prices are Rs. 8 and Rs. 10 respectively. The per hour capacity of
the machines used for machining, boring and polishing for the two products is given below:
Capacity per hour
            Part P   Part Q
Machining   30       50
Boring      30       45
Polishing   45       30
The running costs for machining, boring and polishing are Rs. 30, Rs. 22.50 and Rs. 22.50 per
hour respectively. Formulate the LPP to find the product mix that maximizes the profit.

2. Mr. X has Rs. 1,00,000 that can be invested in a combination of only two stock portfolios, with the
maximum investment allowed in either portfolio set at Rs. 75,000. The first portfolio has an
average return of 10%, whereas the second has 20%. In terms of the risk factors associated
with these portfolios, the first has a risk rating of 4 and the second has 9. Since he wants to
maximise his returns, he will not accept an average rate of return below 12% or a risk rating
above 6. How much should he invest in each portfolio? Formulate this as a linear programming
problem and solve it graphically.

3. Solve the following problem by using graphical method:
Minimize Z = 3X1 + 5X2
Subjected to -3X1 + 4X2 ≤ 12
2X1 + 3X2 ≥ 12
2X1 – X2 ≥ - 2
And X1 ≤ 4 ; X2 ≥ 2 ; X1, X2 ≥ 0


4. Solve the following LPP graphically
Maximum Z = 10X1 + 15X2
Subjected to 2X1 + X2 ≤ 26
2X1 + 4X2 ≤ 56
X1 – X2 ≥ - 5
X1, X2 ≥0


5. Solve the following LPP using Graphical method
Minimum Z = 20X1 + 10X2
Subjected to the constraints
X1 + 2X2 ≤ 40
3X1 + X2 ≥ 30
4X1 + 3X2 ≥ 60
Such that X1, X2 ≥ 0



Unit – 5
Part A
Transportation Problem


Transportation problem: basic feasible solution using NWCM, LCM and VAM; unbalanced,
restricted and maximization problems.

Transportation Problem
A Transportation Problem is a particular case of LPP used to minimise the transportation
cost involved in transporting goods from “m” different origins to “n” different
destinations under the existing supply and demand constraints.
It deals with transporting various amounts of a single homogeneous commodity, initially
stored at various origins, to different destinations in such a way that the total transportation
cost is minimum.
The cost of transporting one unit of the commodity from each source to each destination
is also known. The commodity is to be transported from various sources to different
destinations in such a way that the requirement of each destination is satisfied and at the
same time the total cost of transportation is minimized.
Example: Pepsi has manufacturing units in 4 cities in Karnataka and distributes to more than
40 distribution centres.
A typical transportation problem contains
• Inputs:
• Sources with availability (Supply)
• Destinations with requirements (Demand)
• Unit cost of transportation from various sources to destinations (Cost)
• Objective:
• To determine schedule of transportation to minimize total transportation cost.


A transportation problem can be stated mathematically as follows:
Let there be ‘m’ SOURCES and ‘n’ DESTINATIONS
Let $a_i$: The availability at the $i^{th}$ source
$b_j$: The requirement of the $j^{th}$ destination
$c_{ij}$: The cost of transporting one unit of commodity from the $i^{th}$ source to the
$j^{th}$ destination
$x_{ij}$: The quantity of the commodity transported from the $i^{th}$ source to the $j^{th}$
destination ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$)

Representation of General Transportation Problem
(each cell shows the allocation Xij with the unit cost cij in brackets)
                        Destinations
Origin / Sources        D1          D2          D3          ...   Dn          Supply / Availability
S1                      X11 (c11)   X12 (c12)   X13 (c13)   ...   X1n (c1n)   a1
S2                      X21 (c21)   X22 (c22)   X23 (c23)   ...   X2n (c2n)   a2
S3                      X31 (c31)   X32 (c32)   X33 (c33)   ...   X3n (c3n)   a3
.
.
Sm                      Xm1 (cm1)   Xm2 (cm2)   Xm3 (cm3)   ...   Xmn (cmn)   am
Demand / Requirements   b1          b2          b3          ...   bn          Σai = Σbj
The problem is to determine the values of xij such that total cost of transportation is
minimized.
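Using the notation defined above, the problem can be written compactly as a linear programme. This is a sketch under the assumption that total supply equals total demand (the balanced case), so that every supply and demand constraint holds as an equality:

```latex
\begin{aligned}
\text{Minimize } Z &= \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij}\, x_{ij} \\
\text{subject to } \sum_{j=1}^{n} x_{ij} &= a_i, \quad i = 1, 2, \ldots, m \\
\sum_{i=1}^{m} x_{ij} &= b_j, \quad j = 1, 2, \ldots, n \\
x_{ij} &\ge 0 \quad \text{for all } i, j
\end{aligned}
```

The first set of constraints says that each source ships exactly what it has available, and the second that each destination receives exactly what it requires.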
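We assume that the total quantity available is the same as the total requirement, i.e. Σai
= Σbj. Depending on whether this condition holds, transportation problems are classified as
• Balanced transportation problems (Σai = Σbj)
• Unbalanced transportation problems (Σai ≠ Σbj)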


Feasible Solution:
Any set of non-negative allocations which satisfies the row and column sums (Rim
Requirements) is called a feasible solution.
A feasible solution is called a basic feasible solution if the number of non-negative
allocations is equal to m+n-1, where “m” is the number of rows and “n” is the number of
columns in the transportation table.
Non-Degenerate Basic Feasible Solution:
Any basic feasible solution to a transportation problem containing “m” Origins and “n”
Destinations is said to be non-degenerate if it contains “m+n-1” occupied cells and each
allocation is in an independent position.
The allocations are said to be in independent positions if it is impossible to form a closed
path through them.
A closed path is formed by alternating horizontal and vertical lines such that all the corner
cells are occupied.
Degenerate Basic Feasible Solution:
If a basic feasible solution contains fewer than m+n-1 positive allocations (occupied cells), it is
said to be degenerate.
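The occupied-cell count in these definitions is easy to check mechanically. The sketch below is only a partial check: it counts positive allocations and compares them with m+n-1, but it does not test whether the allocations are in independent positions. The function name `classify_solution` and the sample allocation tables are hypothetical.

```python
def classify_solution(alloc):
    """Compare the number of occupied cells with m + n - 1.

    alloc is a list of rows; a cell is occupied when its allocation is > 0.
    Independence of positions (no closed path) is NOT checked here.
    """
    m, n = len(alloc), len(alloc[0])
    occupied = sum(1 for row in alloc for q in row if q > 0)
    required = m + n - 1
    if occupied == required:
        return "non-degenerate (if allocations are in independent positions)"
    if occupied < required:
        return "degenerate"
    return "not a basic solution (more than m+n-1 occupied cells)"


# Hypothetical 3 x 3 allocation tables.
full = [[10, 10, 0], [0, 25, 5], [0, 0, 50]]   # 5 occupied cells = 3 + 3 - 1
short = [[10, 10, 0], [0, 30, 0], [0, 0, 50]]  # only 4 occupied cells
print(classify_solution(full))
print(classify_solution(short))  # degenerate
```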
Solution to Transportation Problem.
Transportation problem can be solved in 3 Phases
I. To find the Initial Basic Feasible Solution (IBFS) using any of the following
methods
A. North West Corner method
B. Least Cost Method
C. Vogel’s Approximation Method

II. Test the IBFS for Optimality using Modified Differences (Modi) / UV Method

III. Optimise Solution using Stepping stone algorithm method


Initial Basic Feasible Solution (IBFS) by North West Corner Method
(NWCM):
Step 1: Locate the cell situated at the North West Corner of the given cost matrix and allocate
quantity “Xij” such that it is min (corresponding ai , bj )
Step 2: Locate the North West cell in the reduced cost matrix and allocate as before
(Step 1)
Step 3: Continue allocating until all supplies and demands are exhausted
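The three steps can be sketched as a short routine. This is a minimal illustration assuming a balanced problem; the function name `north_west_corner` and the 3 x 3 data are hypothetical and are not one of the exercises in this manual. Note that the method ignores the costs while allocating; they are used only to price the resulting solution.

```python
def north_west_corner(supply, demand):
    """Initial basic feasible solution by the North West Corner Method."""
    supply, demand = list(supply), list(demand)   # work on copies
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    i = j = 0                                     # start at the north-west cell
    while i < m and j < n:
        q = min(supply[i], demand[j])             # Step 1: Xij = min(ai, bj)
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1                                # row exhausted: move down
        else:
            j += 1                                # column satisfied: move right
    return alloc


# Hypothetical balanced problem (total supply = total demand = 100).
cost = [[4, 6, 8], [5, 3, 7], [6, 4, 2]]
supply = [20, 30, 50]
demand = [10, 35, 55]

allocation = north_west_corner(supply, demand)
total_cost = sum(cost[i][j] * allocation[i][j]
                 for i in range(3) for j in range(3))
print(allocation)   # [[10, 10, 0], [0, 25, 5], [0, 0, 50]]
print(total_cost)   # 310
```

The allocation has 5 occupied cells, which equals m+n-1 = 3+3-1.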

1: Find the IBFS by North West Corner Method

D1 D2 D3 D4 Supply
O1 5 3 1 2 30
O2 2 6 3 1 20
O3 6 3 1 5 40
O4 6 1 2 3 10
Demand 20 35 25 20 100

Note: If the IBFS or any solution to a Transportation Problem contains m+n-1 allocated cells,
then the solution is said to be non-degenerate (the IBFS can be further improved for
optimization), where m is the number of rows and n is the number of columns of the given TP.
2: Obtain IBFS by North West Corner Method
P Q R S Supply
A 12 10 12 13 500
B 7 11 8 14 300
C 6 16 11 7 200
Demand 180 150 350 320 1000


3: Obtain IBFS by North West Corner Method
P Q R S Supply
A 6 4 1 5 14
B 8 9 2 7 16
C 4 3 6 2 5
Demand 6 10 15 4

4: Obtain IBFS by North West Corner Method
A B C D Supply
P 19 30 50 10 7
Q 70 30 40 60 9
R 40 8 70 20 18
Demand 5 8 7 14

5: Obtain IBFS by North West Corner Method
X Y Z Supply
A 50 40 80 400
B 80 70 40 400
C 60 70 60 500
D 60 60 60 400
E 30 50 40 800
Demand 800 600 1100


6: Obtain IBFS by North West Corner Method
A B C Supply
P 2 7 4 5
Q 3 3 1 8
R 5 4 7 7
S 1 6 2 14
Demand 7 9 18
Initial Basic Feasible Solution (IBFS) by Least Cost Method (LCM)
Step 1: Locate the cell with the lowest cost and allocate to it the minimum of the
corresponding (ai , bj )
Step 2: Locate the next least cost cell in the reduced cost matrix and allocate as
before
Step 3: Continue the process till all the allocations are made
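A minimal sketch of these steps follows, assuming a balanced problem. Sorting all cells once by cost visits them in the same order as repeatedly picking the least cost cell of the reduced matrix, because exhausted rows and columns simply receive nothing. Ties are broken here by row and then column index, which is an arbitrary choice for the example. The function name `least_cost` and the data are hypothetical.

```python
def least_cost(cost, supply, demand):
    """Initial basic feasible solution by the Least Cost Method."""
    supply, demand = list(supply), list(demand)   # work on copies
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    # Visit cells from cheapest to dearest (ties: lower row, then lower column).
    cells = sorted((cost[i][j], i, j) for i in range(m) for j in range(n))
    for _, i, j in cells:
        q = min(supply[i], demand[j])             # allocate min(ai, bj)
        if q > 0:                                 # skip exhausted rows / columns
            alloc[i][j] = q
            supply[i] -= q
            demand[j] -= q
    return alloc


# Hypothetical balanced problem (total supply = total demand = 100).
cost = [[4, 6, 8], [5, 3, 7], [6, 4, 2]]
supply = [20, 30, 50]
demand = [10, 35, 55]

allocation = least_cost(cost, supply, demand)
total_cost = sum(cost[i][j] * allocation[i][j]
                 for i in range(3) for j in range(3))
print(allocation)   # [[10, 5, 5], [0, 30, 0], [0, 0, 50]]
print(total_cost)   # 300
```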
7: Find IBFS by Least Cost Method:
D1 D2 D3 D4 Supply
O1 1 2 3 4 6
O2 4 3 2 0 8
O3 0 2 2 1 10
Demand 4 6 8 6
8: Find IBFS by Least Cost Method:
P Q R S Supply
A 6 4 1 5 14
B 8 9 2 7 16
C 4 3 6 2 5
Demand 6 10 15 4
9: Find IBFS by Least Cost Method:
D1 D2 D3 D4 Supply
F1 19 30 50 10 7
F2 70 30 40 60 9
F3 40 8 70 20 18
Demand 5 8 7 14


10: Find IBFS by Least Cost Method:
D1 D2 D3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250

11: Find IBFS by North West Corner Method and Least Cost Method
D1 D2 D3 D4 Supply
F1 19 30 50 10 7
F2 70 30 40 60 9
F3 40 8 70 20 18
Demand 5 8 7 14 34

12: Find the IBFS by LCM Method and NWCM
W1 W2 W3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250 770

Initial Basic Feasible Solution by Vogel’s Approximation Method
Step 1: Calculate the penalty for each row and column as the difference
between the least cost and the next least cost in that row or column
Step 2: Find the row or column having the highest penalty. Locate the least cost cell in that
row or column and allocate min (ai, bj) to that cell
Step 3: Find a fresh set of penalties for the reduced cost matrix and allocate as in
Step 2
Step 4: Continue allocating for the further reduced cost matrices until all allocations are made
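The four steps can be sketched as follows, assuming a balanced problem. Two conventions in this sketch are assumptions not stated above: when a row or column has only one cost left, that cost itself is used as its penalty, and ties in the highest penalty are broken in favour of rows before columns and lower indices first. The function name `vogel` and the data are hypothetical.

```python
def vogel(cost, supply, demand):
    """Initial basic feasible solution by Vogel's Approximation Method."""
    supply, demand = list(supply), list(demand)   # work on copies
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    rows, cols = set(range(m)), set(range(n))     # rows / columns still open

    def penalty(costs):
        s = sorted(costs)
        # Assumption: a single remaining cost is used as its own penalty.
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        # Step 1: penalties for every open row and column.
        candidates = [(penalty([cost[i][j] for j in cols]), "row", i)
                      for i in sorted(rows)]
        candidates += [(penalty([cost[i][j] for i in rows]), "col", j)
                       for j in sorted(cols)]
        # Step 2: highest penalty, then the least cost cell in that line.
        _, kind, idx = max(candidates, key=lambda t: t[0])
        if kind == "row":
            i = idx
            j = min(sorted(cols), key=lambda col: cost[i][col])
        else:
            j = idx
            i = min(sorted(rows), key=lambda row: cost[row][j])
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        # Steps 3 and 4: reduce the matrix and repeat.
        if supply[i] == 0:
            rows.discard(i)
        else:
            cols.discard(j)
    return alloc


# Hypothetical balanced problem (total supply = total demand = 100).
cost = [[4, 6, 8], [5, 3, 7], [6, 4, 2]]
supply = [20, 30, 50]
demand = [10, 35, 55]

allocation = vogel(cost, supply, demand)
total_cost = sum(cost[i][j] * allocation[i][j]
                 for i in range(3) for j in range(3))
print(allocation)   # [[10, 5, 5], [0, 30, 0], [0, 0, 50]]
print(total_cost)   # 300
```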


13: Find the IBFS by Vogel’s Approximation Method
W1 W2 W3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250 770
14: Find IBFS by VAM
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 250 300 250

15: Find IBFS by VAM
D1 D2 D3 D4 Supply
A 11 13 17 14 250
B 16 18 14 10 300
C 21 24 13 10 400
Demand 200 225 275 250

16: A dairy firm has three plants located in a state. The daily milk production at each plant is as
follows:
Plant 1: 6 Million litres
Plant 2: 1 Million Litres
Plant 3: 10 Million Litres
Each day, the firm must fulfil the needs of its four distribution centres. Minimum
requirement at each centre is as follows
Distribution centres 1: 7 Million Litres
Distribution centres 2: 5 Million Litres
Distribution Centres 3: 3 Million Litres
Distribution Centres 4: 2 Million Litres
The costs, in hundreds of rupees, of shipping one million litres from each plant to each
distribution centre are given in the following table.
D1 D2 D3 D4
P1 2 3 11 7
P2 1 0 6 1
P3 5 8 15 9


17: Find the initial basic feasible solution for the following transportation problem by
VAM
D1 D2 D3 D4 Supply
O1 3 3 4 1 100
O2 4 2 4 2 125
O3 1 5 3 2 75
Demand 120 80 75 25 300
18: Find the initial solution to the following transportation problem using VAM
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14
19: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
S1 21 16 15 3 11
S2 17 18 14 23 13
S3 32 27 18 41 19
Demand 6 10 12 15

20: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
S1 1 2 1 4 30
S2 3 3 2 1 50
S3 4 2 5 9 20
Demand 20 40 30 10

21: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
O1 6 4 1 5 14
O2 8 9 2 7 16
O3 4 3 6 2 5
Demand 6 10 15 4

21: Determine an initial basic feasible solution to the following TP by using VAM
A B C D E F Available
O1 9 12 9 6 9 10 5
O2 7 3 7 7 5 5 6
O3 6 5 9 11 3 11 2
O4 6 8 11 2 2 10 9
Requirement 4 4 6 2 4 2


Unbalanced Transportation Problem:
A TP is said to be unbalanced if total demand is not equal to total supply ($\sum a_i \ne \sum b_j$). Such
a TP is solved by transforming it into a balanced TP by adding a dummy row or column with the
required supply / demand to make it balanced, i.e.
If total supply ($\sum a_i$) < total demand ($\sum b_j$), a dummy row with supply = ($\sum b_j - \sum a_i$)
is added.
If total supply ($\sum a_i$) > total demand ($\sum b_j$), a dummy column with demand = ($\sum a_i - \sum b_j$)
is added.
The TP is then solved using the known procedure
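The balancing rule can be sketched as a small helper. One detail below is an assumption that is not stated in the rule itself: the dummy row or column is given a unit cost of zero, which is the usual convention because nothing is actually shipped on a dummy route. The function name `balance` and the data are hypothetical.

```python
def balance(cost, supply, demand):
    """Return a balanced copy of (cost, supply, demand) using a dummy row/column."""
    cost = [row[:] for row in cost]               # work on copies
    supply, demand = list(supply), list(demand)
    total_supply, total_demand = sum(supply), sum(demand)
    if total_supply < total_demand:
        # Dummy row (source) supplies the shortfall; zero cost is an assumption.
        cost.append([0] * len(demand))
        supply.append(total_demand - total_supply)
    elif total_supply > total_demand:
        # Dummy column (destination) absorbs the surplus; zero cost is an assumption.
        for row in cost:
            row.append(0)
        demand.append(total_supply - total_demand)
    return cost, supply, demand


# Hypothetical unbalanced problem: supply 70 < demand 80.
b_cost, b_supply, b_demand = balance(
    [[4, 6, 8], [5, 3, 7]], [30, 40], [20, 25, 35]
)
print(b_cost)     # [[4, 6, 8], [5, 3, 7], [0, 0, 0]]
print(b_supply)   # [30, 40, 10]
print(b_demand)   # [20, 25, 35]
```

Any quantity allocated to the dummy row in the final solution represents demand that remains unmet; a quantity in a dummy column represents supply left unshipped.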
22: Solve the following TP
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 450 300 250

23: Solve the following TP
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 450 300 250

24: A company is spending Rs. 1200 on transportation of its units from 3 plants to 4
destination centres. The supply and demand of units, with the unit cost of transportation, are as
follows. What can be the maximum saving by optimal scheduling?
1 2 3 4 Supply
P1 20 30 50 17 7
P2 70 35 40 60 10
P3 40 12 60 25 18
Demand 5 8 7 15


25: Problem
• Holiday shipments of iPods to distribution centres
• Production at 3 facilities,
• A, supply 200k
• B, supply 350k
• C, supply 150k
• Distribute to 4 centers,
• N, demand 160k
• S, demand 140k
• E, demand 300k
• W, demand 200k
Total demand ≠ total supply. Obtain initial solution in the following transportation
problem by using VAM method
N S E W
A 16 13 22 17
B 14 13 19 15
C 9 20 23 10


Restricted Transportation Problem:
• Sometimes in a transportation problem some routes may not be available. This
could be due to a variety of reasons, like unfavourable weather conditions or a strike
on a particular route etc.
• In such a situation there is a restriction on the routes available for transportation.
• We assign a very large cost, represented by M, to each of the routes which are not
available.
• The effect of adding a large cost element is that such routes are
automatically eliminated in the final solution.
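The large cost M can be sketched in code as follows. How large M should be is an assumption here: this sketch takes 1000 times the largest genuine cost in the table, which is one arbitrary way to make it "very large". Blank cells are represented by `None`. The function name `apply_big_m` and the data are hypothetical.

```python
def apply_big_m(cost, prohibited=(), big_m=None):
    """Return a copy of the cost matrix with unavailable routes priced at M.

    cost may contain None for cells left blank in the table;
    prohibited is an iterable of (row, column) pairs that are not available.
    """
    if big_m is None:
        known = [c for row in cost for c in row if c is not None]
        big_m = 1000 * max(known)     # assumption: large relative to real costs
    blocked = set(prohibited)
    return [[big_m if (i, j) in blocked or c is None else c
             for j, c in enumerate(row)]
            for i, row in enumerate(cost)]


# Hypothetical table: route (0, 0) is blank and route (1, 2) is prohibited.
cost = [[None, 6, 9], [5, 4, 7]]
restricted = apply_big_m(cost, prohibited=[(1, 2)])
print(restricted)   # [[9000, 6, 9], [5, 4, 9000]]
```

The modified table can then be passed to any of the initial solution methods, which will avoid the M cells whenever a cheaper route exists.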
26: The XYZ Tobacco Company purchases tobacco and stores it in warehouses located in the following four
cities
C1 C2 C3 Supply
A 7 10 5
B 12 9 4
C 7 3 11
D 9 5 7
Demand 120 100 110
Because of railroad construction, shipments are temporarily prohibited from warehouse
at city A to company C1. i) Find the IBFS for XYZ tobacco Company
27. Solve the below transportation problem

Warehouse
Factory W1 W2 W3 Supply
F1 16 12 200
F2 14 8 18 160
F3 26 16 90
Demand 180 120 150 450

Maximization of Transportation Problem:
If the TP contains a profit matrix with an objective of maximization, it can be solved by
transforming the profit matrix into a cost matrix: every unit profit is subtracted from the
highest unit profit in the given profit matrix. The resulting matrix is then solved as a
minimization problem.
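The conversion from a profit matrix to an equivalent cost matrix can be sketched in a few lines. The function name `profit_to_cost` and the sample profits are hypothetical.

```python
def profit_to_cost(profit):
    """Convert a profit matrix into an equivalent cost (opportunity loss) matrix."""
    highest = max(max(row) for row in profit)       # highest unit profit
    return [[highest - p for p in row] for row in profit]


# Hypothetical profit matrix; the highest unit profit is 14.
profit = [[9, 14, 6], [11, 5, 8]]
cost = profit_to_cost(profit)
print(cost)   # [[5, 0, 8], [3, 9, 6]]
```

After an allocation has been found on the converted matrix, the total profit is computed from the original profit matrix, not from the converted costs.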
11. Solve for maximum profit
A B C D Supply
X 12 18 6 25 200
Y 8 7 10 18 500
Z 14 3 11 20 300
Demand 180 320 100 400


Questions from Previous Year Question Papers:
1. Solve the following transportation problem for Maximum profit. Only by initial basic
feasible solution
Per Unit Profit (Rs)
Market
Warehouse A B C D
X 12 18 6 25
Y 8 7 10 18
Z 14 3 11 20




2. Use North West corner method (NWCM) and least cost method (LCM) to find an initial basic
feasible solution to the transportation problem.
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14 34
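
As a rough sketch of how the North West Corner method proceeds on a balanced problem such as the one above: start at the top-left cell, allocate as much as possible, then move down when a supply is exhausted and right otherwise. The function name `north_west_corner` is an assumption; the least cost method is not shown.

```python
def north_west_corner(supply, demand):
    """Return an allocation matrix for a balanced transportation problem."""
    supply, demand = supply[:], demand[:]          # work on copies
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        qty = min(supply[i], demand[j])            # largest feasible allocation
        alloc[i][j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:
            i += 1                                 # row exhausted: move down
        else:
            j += 1                                 # column satisfied: move right
    return alloc

costs = [[19, 30, 50, 10], [70, 30, 40, 60], [40, 8, 70, 20]]
alloc = north_west_corner([7, 9, 18], [5, 8, 7, 14])
total_cost = sum(c * x for crow, arow in zip(costs, alloc) for c, x in zip(crow, arow))
```

The method ignores the unit costs when allocating, which is why its initial solution is usually costlier than the one given by LCM or VAM.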

3. Solve the following transportation problem for maximum profit
Warehouse Per Unit profit (Rs.) Market
A B C D
X 12 18 6 25
Y 8 7 10 18
Z 14 3 11 20
Availability of Ware Houses Demand in the market
X: 200 Units A: 180 Units
Y: 500 Units B: 320 Units
Z: 300 Units C: 100 Units
D: 400 Units
4. For the following transportation problem find the initial solution using
1) North West Corner method 2) Least cost method
From \ To I II III Supply
A 5 1 7 10
B 6 4 9 80
C 3 2 8 55
Demand 75 20 50
Available at warehouse Demand in the market
X: 200 Units A 180 Units
Y: 500 Units B 320 Units
Z: 300 Units C 100 Units
D 400 Units


5. A company has 3 factories S1, S2 and S3 with production capacities of 7, 9 and 18 units (in 100s)
per week of a product respectively. These units are to be shipped to four warehouses D1, D2,
D3 and D4 with requirements of 5, 8, 7 and 14 units (in 100s) per week respectively. The
transportation costs (in rupees) per unit between factories and warehouses are given below
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14
Find initial solution using VAM Method







Unit – 6

Project Management


Syllabus:
Project Management:
Introduction – Basic difference between PERT & CPM – Network components and
precedence relationship – Critical path analysis – Project scheduling – Project time-cost
trade-off – Resource allocation, basic concept of project crashing.

PERT & CPM
Project Management evolved as a new field with the development of two analytical
techniques for planning, scheduling and controlling projects. These are the Critical Path
Method (CPM) and the Program Evaluation and Review Technique (PERT).
Application of PERT & CPM Techniques
These methods have been applied to a wide variety of problems in industry and have
found acceptance even in government organizations.
These include:
- Construction of a dam or canal system in a region
- Construction of buildings / highways
- Maintenance of aeroplanes or oil refineries
- Space flight
- Cost control of a project using PERT / CPM
- Designing a prototype of a machine
- Development of supersonic planes
Basic Definitions:
Activity: Any individual operation that utilizes resources and has a beginning and an
end is called an activity. An arrow is commonly used to represent an activity, with its
head indicating the direction of progress in the project. Activities are usually classified into
the following 4 categories:
1. Predecessor Activity
Activities that must be completed immediately prior to the start of another activity are
called predecessor activities.
2. Successor Activity
Activities that cannot be started until one or more other activities are completed,
but immediately succeed them, are called successor activities.
3. Concurrent Activity
Activities that can be accomplished concurrently are known as concurrent
activities. It may be noted that an activity can be a predecessor or a successor to an
event, or it may be concurrent with one or more other activities.


4. Dummy Activity
An activity that does not consume any kind of resource but merely depicts
technological dependence is called a dummy activity.
A dummy activity is inserted in the network to clarify the
activity pattern in the following two situations:
- To make activities with common starting and finishing points distinguishable
- To identify and maintain the proper precedence relationship between activities
that are not connected by events.

Events: An event represents a point in time signifying the completion of some activities
and the beginning of new ones. It is usually represented by a circle “O” in a network,
and is also called a node or connector.
Events can be further classified into the following 3 categories:
- Merge Event
When more than one activity comes and joins an event, the event is known as a
merge event.
- Burst Event
When more than one activity leaves an event, the event is known as a burst event.
- Merge and Burst Event
An event may be a merge and a burst event at the same time: with respect to some
activities it is a merge event, and with respect to other activities it is a
burst event.


Basic difference between PERT and CPM

- CPM: Uses an activity-oriented network.
  PERT: Uses an event-oriented network.
- CPM: Durations of activities may be estimated with a fair degree of accuracy.
  PERT: Estimates of time for activities are not so accurate and definite.
- CPM: It is used extensively in construction projects.
  PERT: It is used mostly in research and development projects, particularly projects of a
  non-repetitive nature.
- CPM: A deterministic concept is used.
  PERT: A probabilistic model concept is used.
- CPM: Can control both time and cost when planning.
  PERT: Is basically a tool for planning.
- CPM: Cost optimization is given prime importance. The time for the completion of the
  project depends upon cost optimization. The cost is not directly proportional to time.
  Thus, cost is the controlling factor.
  PERT: It is assumed that cost varies directly with time. Attention is therefore given to
  minimizing the time so that minimum cost results. Thus, time is the controlling factor.

Rules to be followed during the construction of network
1. No single activity can be represented more than once in a network. The length of
an arrow has no significance.
2. The event numbered 1 is the start event and the event with the highest number is the
end event. Before an activity can be undertaken, all activities preceding it must
be completed. That is, the activities must follow a logical sequence (or
interrelationship).
3. In assigning numbers to events, there should not be any duplication of event
numbers in a network.
4. Dummy activities must be used only if necessary to reduce the complexity of
a network.
5. A network should have only one start event and one end event.
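
Some of the construction rules listed above lend themselves to a mechanical check. The sketch below tests a non-empty list of activities, each written as a (tail event, head event) pair, against rules 1, 2 and 5 only. The function name `check_network` is an assumption, and the first arc list is the network used in a later exercise of this unit.

```python
def check_network(arcs):
    """Return a list of rule violations for activities given as (tail, head) pairs."""
    problems = []
    if len(set(arcs)) != len(arcs):
        problems.append("an activity appears more than once")
    tails = {tail for tail, _ in arcs}
    heads = {head for _, head in arcs}
    events = tails | heads
    start_events = events - heads    # events with no incoming activity
    end_events = events - tails      # events with no outgoing activity
    if start_events != {1}:
        problems.append("event 1 must be the only start event")
    if end_events != {max(events)}:
        problems.append("the highest-numbered event must be the only end event")
    return problems

arcs = [(1, 2), (1, 3), (2, 6), (3, 4), (3, 5), (4, 6), (5, 6), (5, 7), (6, 7)]
ok = check_network(arcs)                           # no violations expected
broken = check_network([(1, 2), (1, 3), (2, 4)])   # events 3 and 4 both end the network
```

A repeated (tail, head) pair is also the situation in which a dummy activity is needed to keep two parallel activities distinguishable.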


Some conventions of network diagrams:
[Figures: conventions of network diagrams]

Errors in Construction of Network Diagram
[Figures: errors in construction of network diagram]



Dummy Activity
A Dummy activity is an imaginary activity. It does not exist in the Project activities. It
is used in the network diagram to show dependency relationship or connectivity
between two or more activities. It is represented by a dotted arrow.

Procedure for drawing a CPM network.
1. Specify the individual activities.
From the Work Breakdown Structure, a listing can be made of all the activities in
the project. This listing can be used as the basis for adding sequence and duration
information in later steps.

2. Determine the sequence of those activities.
Some activities are dependent upon the completion of others. A listing of the
immediate predecessors of each activity is useful for constructing the CPM
network diagram.

3. Draw a network diagram.
Once the activities and their sequencing have been defined, the CPM diagram can
be drawn. CPM originally was developed as an activity on node (AON) network,
but some project planners prefer to specify the activities on the arcs.

4. Estimate the completion time for each activity.
The time required to complete each activity can be estimated using past
experience or the estimates of knowledgeable persons. CPM is a deterministic
model that does not take into account variation in the completion time, so only
one number can be used for an activity’s time estimate.

5. Identify the critical path
The critical path is the longest-duration path through the network. The
significance of the critical path is that the activities that lie on it cannot be delayed
without delaying the project. Because of its impact on the entire project, critical
path analysis is an important aspect of project planning.
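
Step 5 can be sketched in code as a forward pass that records, for each activity, the predecessor that finishes last. The project data below is hypothetical and chosen only for illustration; the function name `critical_path` is likewise an assumption. The sketch expects every predecessor to be listed before the activities that depend on it and returns only one critical path if several tie.

```python
def critical_path(activities):
    """activities: {name: (duration, [predecessors])}, predecessors listed first."""
    earliest_finish = {}
    longest_pred = {}
    for name, (duration, preds) in activities.items():
        start = 0
        longest_pred[name] = None
        for p in preds:
            if earliest_finish[p] > start:      # latest-finishing predecessor
                start = earliest_finish[p]
                longest_pred[name] = p
        earliest_finish[name] = start + duration
    end = max(earliest_finish, key=earliest_finish.get)
    length = earliest_finish[end]
    path = []
    node = end
    while node is not None:                     # walk back along the longest chain
        path.append(node)
        node = longest_pred[node]
    path.reverse()
    return path, length

# Hypothetical project: durations in days.
project = {
    "A": (3, []),
    "B": (2, []),
    "C": (4, ["A"]),
    "D": (1, ["A", "B"]),
    "E": (2, ["C", "D"]),
}
path, length = critical_path(project)   # path == ["A", "C", "E"], length == 9
```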



Problems:

1. The following table gives the activities in a project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 6
B - 4
C A 3
D B 8
E C 14
F D 8
G E, F 9


2. The following table gives the activities in a project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 6
B - 2
C A 3
D A 4
E C, B 3


3. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 6
B - 4
C A 3
D B 8
E B 14
F C, D 8




4. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 6
B - 4
C A 3
D B 8
E B, C 14
F E, D 8




5. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 2
B A 4
C A 3
D B 6
E C, D 12
F C, D 6
G E 9
H G 3

6. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 2
B - 4
C - 3
D A 6
E C 12
F A 6
G D, B, E 9

7. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity  Predecessor Activity  Time (Days)
A - 3
B - 4
C A 3
D A 6
E B 8
G D, E 6
H D, E 9
I D, E 3
J C, G 2
K F, I 3


8. Draw a network corresponding to the following information
Activity 1 – 2 1 – 3 2 – 6 3 – 4 3 – 5 4 – 6 5 – 6 5 – 7 6 – 7
Duration 4 6 8 7 4 6 5 19 10

A) Draw the network diagram
B) Determine the critical path


9. Draw a network diagram for the following information.
- Obtain the respective time estimates and calculate Total Float (TF), Free Float (FF)
and Independent Float (IF)
- Identify the critical activities
Event Activity Duration
1 – 2 A 4
1 – 3 B 6
2 – 6 C 8
3 – 4 D 7
3 – 5 E 4
4 – 6 F 6
5 – 6 G 5
5 – 7 H 17
6 – 7 J 10


Note:
Total Float (TF) = Latest Start Time (Lst) – Earliest Start Time (Est) of the activity
Free Float (FF) = Total Float (TF) – Head Event Slack (HES)
Head Event Slack (HES) = Latest occurrence time – Earliest occurrence time of the head (ending) event
Independent Float (IF) = Free Float (FF) – Tail Event Slack (TES), taken as zero if negative
Tail Event Slack (TES) = Latest occurrence time – Earliest occurrence time of the tail (starting) event
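
The float relationships in the note above can be sketched as follows. Head and tail event slack are taken here as the latest minus the earliest occurrence time of the respective event, and a negative independent float is reported as zero. The small network is hypothetical, the function name `compute_floats` is an assumption, and the sketch assumes every activity runs from a lower-numbered to a higher-numbered event.

```python
def compute_floats(arcs):
    """arcs: list of (tail_event, head_event, duration) with tail < head."""
    events = {e for tail, head, _ in arcs for e in (tail, head)}
    earliest = {e: 0 for e in events}
    for tail, head, d in sorted(arcs):                       # forward pass
        earliest[head] = max(earliest[head], earliest[tail] + d)
    project_length = max(earliest.values())
    latest = {e: project_length for e in events}
    for tail, head, d in sorted(arcs, key=lambda a: a[1], reverse=True):  # backward pass
        latest[tail] = min(latest[tail], latest[head] - d)
    result = {}
    for tail, head, d in arcs:
        total = (latest[head] - d) - earliest[tail]           # Lst - Est
        free = total - (latest[head] - earliest[head])        # TF - head event slack
        independent = max(0, free - (latest[tail] - earliest[tail]))  # FF - tail event slack
        result[(tail, head)] = {"TF": total, "FF": free, "IF": independent}
    return result, project_length

# Hypothetical network: (tail, head, duration in days).
floats, length = compute_floats([(1, 2, 3), (1, 3, 2), (2, 4, 4), (3, 4, 1)])
critical = [arc for arc, f in floats.items() if f["TF"] == 0]   # [(1, 2), (2, 4)]
```

Activities with zero total float form the critical path, which is the basis for the exercises that ask for the critical path "using total float".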
10. A small project consists of 7 activities with the following information
Activity  Preceding Activity  Duration
A - 4
B - 6
C - 8
D A, B 7
E A, B 4
F C, D, E 6
G C, D, E 5


11. The project has the following characteristics

- Construct the network diagram
- Calculate all time estimates and find the length of the project and the critical path activities
using total float

Events Activity Time
1 – 2 A 2
1 – 4 B 2
1 – 7 C 1
2 – 3 D 4
3 – 6 E 1
4 – 5 F 5
4 – 8 G 8
5 – 6 H 4
6 – 9 I 3
7 – 8 J 3
8 – 9 K 5


12. Draw a network diagram from the following information

A < D, E ; B, D < F ; C < G ; B, D < H ; F, G < I

Construct the network diagram, find the critical path (CP) using total float (TF), and also find the total project length.

Task A B C D E F G H I
Time 23 8 29 16 24 18 19 4 10


13. With the following information calculate Total Float and Project duration

Activity A B C D E F G H I J
Preceding Activity - - AB B A C E F D F G H I
Duration 2 3 4 1 5 3 2 7 6 3

14. The following table gives activities of a network with time estimates
- Draw a network diagram
- Calculate the time duration of the project
- Calculate the variance of the critical path
- Find the probability that the project will be completed in 41 days
Estimated duration:
Events  to (Optimistic Time)  tm (Most Likely Time)  tp (Pessimistic Time)
1 – 2 3 6 15
1 – 6 2 5 14
2 – 3 6 12 30
2 – 4 2 5 8
3 – 5 5 11 17
4 – 5 3 6 15
6 – 7 3 9 27
5 – 8 1 4 7
7 – 8 4 19 28
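
For problems of this type, the standard PERT formulas (not stated in this section) are: expected time te = (to + 4 tm + tp) / 6 and variance = ((tp – to) / 6)^2 for each activity, with the completion probability read from the normal distribution using the expected length and variance of the critical path. A minimal sketch follows; the function names are assumptions, the first example uses activity 1 – 2 from the table above, and the numbers passed to `completion_probability` are hypothetical.

```python
import math

def expected_time(to, tm, tp):
    """PERT expected activity time from the three time estimates."""
    return (to + 4 * tm + tp) / 6

def variance(to, tp):
    """PERT activity variance."""
    return ((tp - to) / 6) ** 2

def completion_probability(deadline, expected_length, path_variance):
    """P(project finishes by the deadline), assuming a normal distribution."""
    z = (deadline - expected_length) / math.sqrt(path_variance)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF at z

te_12 = expected_time(3, 6, 15)    # activity 1 - 2: 7.0
var_12 = variance(3, 15)           # activity 1 - 2: 4.0

# Hypothetical critical path: expected length 20, variance 4, deadline 22 (z = 1).
p = completion_probability(deadline=22, expected_length=20, path_variance=4)
```

The expected project length is the sum of te along the critical path, and the project variance is the sum of the variances of the critical activities only.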


Questions from Previous Year Question Papers:
1. Given the following information on a small project: A is the first activity of the project and
precedes activities B and C. Activity D succeeds both B and C, whereas only C is required to
start activity E. D precedes F while G succeeds E. H is the last activity of the project and succeeds
F and G. Draw a network diagram based on this information.

2. Draw a network diagram corresponding to the following information:
Activity 1-2 1-3 2-6 3-4 3-5 4-6 5-6 5-7 6-7
Duration 4 6 8 7 4 6 5 19 10
A) Draw a network diagram
B) Obtain early and late start time and completion times
C) Determine the critical path.

3. A Small project is composed of 7 activities whose time estimates are listed in the table below in
weeks.
Activity Optimistic Time Pessimistic Time Most Likely Time
1-2 1 7 1
1-3 1 7 4
1-4 2 8 2
2-5 1 1 1
3-5 2 14 5
4-6 2 8 5
5-6 3 15 6
a. Draw the network and find the expected project length
b. What is the probability that the project will be completed at least 4 weeks earlier than
the expected time?
4. Tasks A, B, C, ..., H, I constitute a project. The precedence relationships are:
A<D; A<E; B<F; D<F; C<G; C<H; F<I; G<I.
Draw a network diagram to represent the project and find the critical path when time in days of
each task is:
Task A B C D E F G H I
Time 8 10 8 10 16 17 18 14 9
Identify critical path with the help of EST, EFT, LST and LFT



- A project consists of nine activities whose time estimates (in weeks) and other
characteristics are given below.
Time estimates (weeks):
Activity  Preceding Activity/ies  Most Optimistic  Most Likely  Most Pessimistic
A - 2 4 6
B - 6 6 6
C - 6 12 24
D A 2 5 8
E A 11 14 23
F B, D 8 10 12
G B, D 3 6 9
H C, F 9 15 27
I E 4 10 16
A) Show the PERT network for the project
B) Identify the critical activities and find the expected project completion time and its
variance
C) If the project is required to be completed by December 31 of a given year and the
manager wants to be 95% sure of meeting the deadline, when should he start the
project work? Given P (0<Z<1.645) = 0.45

- The following table gives the activities in a construction project
Activity  Immediate Predecessor  Time (Days)
A - 4
B - 6
C - 2
D A 5
E C 2
F A 7
G D, B, E 4

- A small project is composed of 7 activities whose time estimates are given below
Time estimates (weeks):
Activity  Most Optimistic  Most Likely  Most Pessimistic
1-2 1 1 7
1-3 1 4 7
1-4 2 2 8
2-5 1 1 1
3-5 2 5 14
4-6 2 5 8
5-6 3 6 15
a) Draw the project network diagram.
b) Find the expected duration and variance of each activity. What is the expected length
and project standard deviation?
c) Calculate the probability of completing the project by 13 days


- Draw a network corresponding to the following information
a) Draw the network
b) Obtain early and late start time and completion times
c) Determine the critical path
d) Determine the total float
Activity 1-2 1-3 2-6 3-4 3-5 4-6 5-6 5-7 6-7
Duration 4 6 8 7 4 6 5 19 10