Frequency distributions
A simple and effective way of summarizing categorical data.
Constructed by counting the number of observations that fall into each category (level) of the
variable.
E.g. birth weight with levels ‘Very low’, ‘Low’, ‘Normal’ and ‘Big’.
The frequency distribution for newborns is obtained simply by counting the number of newborns
in each birth weight category.
Tables
Example 1: The blood types of 30 patients were recorded as follows:
A AB B B A O O AB AB B O A A B B
AB A O AB B AB AB O A AB AB O A O A
Construct a frequency table for these data.
Type Frequency
A 8
B 6
AB 9
O 7
Total 30
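The tally above can be reproduced programmatically. The following is a minimal Python sketch, assuming the observations are typed in exactly as listed in Example 1; the variable names are illustrative and not part of the original notes.

```python
from collections import Counter

# Blood types of the 30 patients, in the order listed in Example 1.
observations = (
    "A AB B B A O O AB AB B O A A B B "
    "AB A O AB B AB AB O A AB AB O A O A"
).split()

# Counter tallies how many times each category occurs.
counts = Counter(observations)

# Print the table in a fixed category order, followed by the total.
for blood_type in ["A", "B", "AB", "O"]:
    print(f"{blood_type:<6}{counts[blood_type]:>3}")
print(f"{'Total':<6}{sum(counts.values()):>3}")
```

The category order is fixed by hand because blood type is a nominal variable with no natural ordering.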
Exercise: Construct the frequency table for the birth weights
of the following 14 newborns:
Very low, Low, Very low, Normal, Big, Big,
Big, Very low, Very low, Low, Normal,
Normal, Low, Big
Weight Frequency
Very Low
Low
Normal
Big
Total
Relative Frequency
It is the proportion (or percentage) of observations in each category.
The distribution of these proportions is called the relative
frequency distribution of the variable.
The relative frequencies sum to 1 (or 100%).
Cumulative frequency
It is the number of observations in the category
plus the observations in all categories below it
(the categories must therefore be ordered).
Cumulative relative frequency
It is the proportion of observations in the category
plus the proportions in all categories below it.
It is obtained by dividing the cumulative frequency
by the total number of observations.
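As a hedged illustration of these definitions, the sketch below computes the frequency, relative frequency, cumulative frequency and cumulative relative frequency for the 14 birth-weight observations listed earlier in this section. The ordering of the levels (smallest first) and all variable names are assumptions made for the example.

```python
from collections import Counter
from itertools import accumulate

# The 14 birth-weight observations listed earlier in this section.
observations = [
    "Very low", "Low", "Very low", "Normal", "Big", "Big",
    "Big", "Very low", "Very low", "Low", "Normal",
    "Normal", "Low", "Big",
]
# Categories in their natural order, smallest first (assumed ordering).
levels = ["Very low", "Low", "Normal", "Big"]

counts = Counter(observations)
freq = [counts[level] for level in levels]
n = sum(freq)

rel_freq = [f / n for f in freq]          # proportion in each category
cum_freq = list(accumulate(freq))         # running total of frequencies
cum_rel_freq = [c / n for c in cum_freq]  # cumulative frequency / n

for row in zip(levels, freq, rel_freq, cum_freq, cum_rel_freq):
    print("{:<9}{:>3}{:>8.3f}{:>4}{:>8.3f}".format(*row))
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency equals 1, which is a quick check on any hand-built table.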
Guidelines for constructing tables
Keep them simple
Limit the number of variables to three or fewer
All tables should be self-explanatory
Include a clear title telling what, when and where
Clearly label the rows and columns
State clearly the unit of measurement used
Explain codes and abbreviations in a footnote
Show totals
If the data are not original, indicate the source in a footnote.
Table 1. Distribution of birth weight of newborns
between 1976 and 1996 at Arba Minch town, Ethiopia.
BWT        Freq.   Cum. freq.   Rel. freq. (%)   Cum. rel. freq. (%)
Very low   43      43           0.4              0.4
Low        793     836          8.0              8.4
Normal     8870    9706         88.9             97.3
Big        268     9974         2.7              100
Total      9974                 100
BWT = birth weight
Simple Frequency Distribution (one variable table)
Primary and secondary cases of syphilis morbidity by age, Ethiopia, 1989
Age group (years)   Number of cases   RF (percent)
0-14                230               0.5
15-19               4378              9.9
20-24               10405             23.6
25-29               9610              21.8
30-34               8648              19.6
35-44               6901              15.7
45-54               2631              6.0
>54                 1278              2.9
Total               44081             100
Two Variable Table
Primary and secondary cases of syphilis morbidity by age and sex, 1989
                    Number of cases
Age group (years)   Male    Female   Total
0-14                40      190      230
15-19               1710    2668     4378
20-24               5120    5285     10405
25-29               5304    4306     9610
30-34               5537    3111     8648
35-44               5004    1897     6901
45-54               2144    487      2631
>54                 1147    131      1278
Total               26006   18075    44081
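A two-variable table is built by counting each combination of the two variables and then adding the row and column totals. The sketch below shows one way to do this in Python. The records are hypothetical, invented only for illustration, and are not the syphilis data above; the variable names are likewise assumptions.

```python
from collections import Counter

# Hypothetical records (age group, sex), invented only for illustration.
records = [
    ("15-19", "Male"), ("15-19", "Female"), ("15-19", "Female"),
    ("20-24", "Male"), ("20-24", "Male"), ("20-24", "Female"),
    ("25-29", "Male"), ("25-29", "Female"),
]
age_groups = ["15-19", "20-24", "25-29"]
sexes = ["Male", "Female"]

# Count each (age group, sex) combination: these are the table cells.
cells = Counter(records)

# Row totals (per age group) and column totals (per sex) are the margins.
row_totals = {a: sum(cells[(a, s)] for s in sexes) for a in age_groups}
col_totals = {s: sum(cells[(a, s)] for a in age_groups) for s in sexes}
grand_total = sum(cells.values())

print(f"{'Age group':<10}{'Male':>6}{'Female':>8}{'Total':>7}")
for a in age_groups:
    print(f"{a:<10}{cells[(a, 'Male')]:>6}{cells[(a, 'Female')]:>8}{row_totals[a]:>7}")
print(f"{'Total':<10}{col_totals['Male']:>6}{col_totals['Female']:>8}{grand_total:>7}")
```

The row totals and the column totals must both add up to the same grand total, which is a useful consistency check on any published two-variable table.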
Tables can also be used to present three or more variables.
Variable        Frequency (n)   Percent
Sex
  Male
  Female
Age (yrs)
  15-19
  20-24
  25-29
Religion
  Christian
  Muslim
Occupation
  Student
  Farmer
  Merchant
2) Describing quantitative variables:
Frequency distribution table
Graphs
◦Histograms
◦Frequency polygons
◦Cumulative frequency curve
Numerical summary measures
Measures of central location (central tendency)
◦Mean
◦Median
◦Mode
Measures of dispersion (variability)
◦Range
◦Inter-quartile range
◦Variance and standard deviation
Frequency Distribution
Sturges' rule:
A guide for determining the number of classes (K)
is Sturges' formula, given by:
K = 1 + 3.322 (log n)
W = (L - S) / K
◦Where K = number of class intervals
◦n = number of observations
◦W = width of the class interval
◦L = the largest value
◦S = the smallest value
◦log = logarithm to base 10
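The two formulas can be sketched in Python as follows. The function names are hypothetical, and the rounding convention (K to the nearest whole number, W rounded up so that the classes cover the whole range) is an assumption consistent with the worked examples in this section, not a fixed rule. The input values match the leisure-time example (n = 40, L = 38, S = 10).

```python
import math

def sturges_classes(n):
    """Suggested number of classes: K = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

def class_width(largest, smallest, k):
    """Class width: W = (L - S) / K."""
    return (largest - smallest) / k

# Values matching the leisure-time example: n = 40, L = 38, S = 10.
n, largest, smallest = 40, 38, 10
k_raw = sturges_classes(n)                   # about 6.32
k = round(k_raw)                             # nearest whole number of classes
w_raw = class_width(largest, smallest, k)    # about 4.67
w = math.ceil(w_raw)                         # round up to cover the range
print(f"K = {k_raw:.2f} -> {k}, W = {w_raw:.2f} -> {w}")
```

Sturges' rule is only a guide; the analyst may choose a slightly different K or W if that gives more convenient class limits.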
Example 1
The ages at first marriage of a group of women were recorded as follows:
17 23 20 21 19 20 28 15 13 18 17 14 20 15 14 22 14 21 20 15 19
15 16 17 16 14 19 14 19 25 18 19 19 19 22 15 21 17 16 18 15 12 19
21 12 16 16 19 17 16 20 17 19 19 19 16 15 14 17 15 15 19 21 27 22
26 16 12 18 18 15 18 15 23 15 16 13 14 17 14 18 21 17 15 15 22 20
18 15 18 14 14 24
Construct a frequency distribution for the data.
Solution:
K = 1 + 3.322 (log n) = 1 + 3.322 (log 92) = 7.52 ≈ 7
W = (L - S) / K = (28 - 12) / 7 = 16 / 7 = 2.3, rounded up to 3
K = 7 and W = 3
Agefm   Mi   Frequency   RF      CF   RCF
12-14   13   14          14/92   14   14/92
15-17   16   33          33/92   47   47/92
18-20   19   28          28/92   75   75/92
21-23   22   12          12/92   87   87/92
24-26   25   3           3/92    90   90/92
27-29   28   2           2/92    92   92/92
Total        92
Where Agefm = age at first marriage
Mi = class mark
RF = relative frequency
CF = cumulative frequency
RCF = relative cumulative frequency
Example 2:
Leisure time (hours) per week for 40 college students:
23 24 18 14 20 36 24 26 23 21 16 15 19 20 22 14 13
10 19 27 29 22 38 28 34 32 23 19 21 31 16 28 19 18
12 27 15 21 25 16
Calculate K, W, F (frequency), CF (cumulative frequency),
UCL (upper class limit), LCB (lower class boundary) and CM (class mark)
K = 1 + 3.322 (log 40) = 6.32 ≈ 6
Maximum value = 38, Minimum value = 10
Width = (38 - 10)/6 = 4.67 ≈ 5
Time      Frequency   Relative    Cumulative   Cumulative
(hours)               frequency   frequency    relative frequency
10-14     5           0.125       5            0.125
15-19     11          0.275       16           0.400
20-24     12          0.300       28           0.700
25-29     7           0.175       35           0.875
30-34     3           0.075       38           0.950
35-39     2           0.050       40           1.00
Total     40          1.00
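The table can be rebuilt from the raw data with a short script. This is a minimal sketch, assuming integer-valued data, K = 6 and W = 5 as calculated above, and the smallest observation as the first lower class limit. The half-unit class boundaries and all variable names are assumptions made for the example.

```python
from itertools import accumulate

# Leisure time (hours per week) for the 40 students in Example 2.
hours = [
    23, 24, 18, 14, 20, 36, 24, 26, 23, 21, 16, 15, 19, 20, 22, 14, 13,
    10, 19, 27, 29, 22, 38, 28, 34, 32, 23, 19, 21, 31, 16, 28, 19, 18,
    12, 27, 15, 21, 25, 16,
]
width, n_classes = 5, 6          # W = 5 and K = 6 from the calculation above
start = min(hours)               # first lower class limit (10)

# Class limits for integer data: 10-14, 15-19, ..., 35-39.
limits = [(start + i * width, start + (i + 1) * width - 1)
          for i in range(n_classes)]

freq = [sum(lo <= h <= hi for h in hours) for lo, hi in limits]
n = len(hours)
rel_freq = [f / n for f in freq]
cum_freq = list(accumulate(freq))
cum_rel_freq = [c / n for c in cum_freq]

# Class boundaries extend half a unit beyond the limits; the class mark
# is the midpoint of each class.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]
class_marks = [(lo + hi) / 2 for lo, hi in limits]

for (lo, hi), f, rf, cf, crf in zip(limits, freq, rel_freq, cum_freq, cum_rel_freq):
    print(f"{lo}-{hi:<4}{f:>4}{rf:>8.3f}{cf:>4}{crf:>8.3f}")
```

The same lists also give the remaining quantities asked for in the example: the upper class limits are the second element of each pair in `limits`, the lower class boundaries are the first element of each pair in `boundaries`, and `class_marks` holds the class marks.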
Diagrammatic Representation
Pictorial representations of numerical data
Importance of diagrammatic representation
1. Diagrams are more attractive than mere figures.
2. They give a quick overall impression of the data.
3. They are easier to remember than mere figures.
4. They facilitate comparison.
5. They help in understanding patterns and trends.
Specific types of graphs include:
Qualitative data (nominal, ordinal data)
◦Bar graph
◦Pie chart
Quantitative data
◦Histogram
◦Frequency polygon
◦Cumulative frequency curve
◦Box plot
◦Line graph
1. Bar charts (or graphs)
In a bar chart the various categories into which the
observations fall are represented along the horizontal
axis, and a vertical bar is drawn above each category
such that the height of the bar represents either the
frequency or the relative frequency of observations
within the category.
The vertical axis should always start from 0, but the
horizontal axis can start from anywhere.
The bars should be of equal width and should be
separated from one another so as not to imply
continuity.
Label both axes clearly.
Types of Bar Chart
There are different types of bar chart,
depending on the objective and the type of
information that we want to present.
1. Simple Bar Chart
2. Grouped/Multiple Bar Chart
3. Stacked/Complex Bar Chart
2. Multiple bar chart:
In this type of chart the component figures are
shown as separate bars adjoining each other.
The height of each bar represents the actual
value of the component figure.
It depicts the distributional pattern of more than
one variable.
Fig. TT immunization status by marital status of women
15-49 years, Asendabo town, 1996
[Figure: multiple bar chart. Horizontal axis: marital status (Married, Single, Divorced, Widowed). Vertical axis: number of women (0 to 350). Bars: Immunized, Not immunized.]
3. Pie chart
Shows the relative frequency for each category by dividing
a circle into sectors.
The angle of each sector is proportional to the relative
frequency of its category (angle = relative frequency × 360°).
Used for a single categorical variable
Use percentage distributions
Example: Distribution of deaths for females, in
England and Wales, 1989.
Cause of death          No. of deaths
Circulatory system      100 000
Neoplasm                70 000
Respiratory system      30 000
Injury and poisoning    6 000
Digestive system        10 000
Others                  20 000
Total                   236 000
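To draw the pie chart, each count is converted to a percentage and to a sector angle. The sketch below is a minimal illustration using the counts from the table above; the variable names are assumptions.

```python
# Number of deaths by cause, as listed in the table above.
deaths = {
    "Circulatory system": 100_000,
    "Neoplasm": 70_000,
    "Respiratory system": 30_000,
    "Injury and poisoning": 6_000,
    "Digestive system": 10_000,
    "Others": 20_000,
}
total = sum(deaths.values())

# Each sector's share of the circle equals the category's relative frequency.
percent = {cause: 100 * count / total for cause, count in deaths.items()}
angle = {cause: 360 * count / total for cause, count in deaths.items()}

for cause in deaths:
    print(f"{cause:<22}{percent[cause]:>5.0f}%{angle[cause]:>8.1f} degrees")
```

The angles always add up to 360 degrees, just as the percentages add up to 100.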
Fig. Distribution of cause of death for females, in England and Wales, 1989
[Figure: pie chart. Circulatory system 42%, Neoplasm 30%, Respiratory system 13%, Injury and poisoning 3%, Digestive system 4%, Others 8%.]
S.No Color Frequency
1 Red 30
2 Green 20
3 White 10
4 Black 40
Construct a bar chart and a pie chart for these data.
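One possible way to approach this exercise is sketched below using only the Python standard library. The pie chart is described by its sector angles, and the bar chart is drawn as plain text with horizontal bars, which is a simplification of the vertical bars described earlier. The function name `text_bar_chart` and the scale of one symbol per two observations are hypothetical choices.

```python
# Frequencies from the color table above.
colors = {"Red": 30, "Green": 20, "White": 10, "Black": 40}
total = sum(colors.values())

# Pie chart: sector angle = relative frequency x 360 degrees.
angles = {name: 360 * count / total for name, count in colors.items()}

def text_bar_chart(data, per_symbol=2):
    """Draw one text bar per category; bar length is proportional to frequency."""
    lines = []
    for name, count in data.items():
        lines.append(f"{name:<6}| {'#' * (count // per_symbol)} {count}")
    return "\n".join(lines)

print(text_bar_chart(colors))
for name, a in angles.items():
    print(f"{name}: {a:.0f} degrees")
```

The bars are kept separate and of equal thickness, and each bar's length starts from zero, in line with the bar chart guidelines given earlier.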
Fig 1. Pie chart for color of materials in AT, 2022
Fig 2. Bar chart for color of materials in AT, 2022
4. Histogram (quantitative continuous data)
Histograms are frequency distributions with
continuous class intervals that have been turned into
graphs.
Given a set of numerical data, we can obtain an
impression of the shape of its distribution by
constructing a histogram.
Constructed by choosing a set of non-overlapping class
intervals and counting the number of observations that
fall in each class.
How to construct a histogram
1)The horizontal axis is a continuous scale running
from one extreme end of the distribution to the
other.
It should be labeled with the name of the variable
and the units of measurement.
2)For each class in the distribution a vertical rectangle
is drawn with:
i.its base on the horizontal axis, extending from one
class boundary of the class to the other, so that
there is never any gap between the histogram
rectangles;
ii.its width determined by the width of the class
interval.
It is necessary that the class intervals be non-
overlapping so that each observation falls in one
and only one interval.
Bars are drawn over the intervals
The area of each bar is proportional to the
frequency of observations in the interval
Example: Distribution of the age of women at the time of marriage
Age group   15-19   20-24   25-29   30-34   35-39   40-44   45-49
Number      11      36      28      13      7       3       2
Fig. Age of women at the time of marriage
[Figure: histogram. Horizontal axis: age group, with class boundaries 14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5. Vertical axis: number of women (0 to 40).]
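The construction rules can be checked numerically. The sketch below is a minimal illustration using the class limits and frequencies of the age-at-marriage example. It assumes ages are recorded in whole years, so the class boundaries lie half a year beyond the class limits; the variable names are illustrative.

```python
# Class limits and frequencies from the age-at-marriage example.
limits = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49)]
freq = [11, 36, 28, 13, 7, 3, 2]

# Class boundaries close the gaps between consecutive classes:
# half a unit below the lower limit and half a unit above the upper limit.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]
widths = [upper - lower for lower, upper in boundaries]

# Bar area must be proportional to frequency, so height = frequency / width.
# With equal widths the heights are simply proportional to the frequencies.
heights = [f / w for f, w in zip(freq, widths)]

for (lower, upper), f, h in zip(boundaries, freq, heights):
    print(f"{lower}-{upper}: frequency {f}, height {h:.1f} per year of age")
```

Because every class here has the same width, plotting the raw frequencies as heights gives a histogram of the same shape. Dividing by the width only matters when the class widths differ.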
Two problems with histograms
1. They are somewhat difficult to construct.
2. The actual values within the respective groups
are lost and difficult to reconstruct.