Lect 2 Data organization and Presentation-2013.ppt

latidasalegn4 0 views 43 slides Oct 08, 2025

About This Presentation

Read this book about health


Slide Content

Methods of Data Organization and Presentation
Direslgne M (MPH, Ass. Professor)

1. Describing categorical variables
Tables
◦Frequency
◦Relative frequency
◦Cumulative frequencies
Charts
◦Bar charts
◦Pie charts

Frequency distributions
Simple and effective way of summarizing categorical data
Done by counting the number of observations falling into each of the categories or levels of the
variables.
E.g. birth weight with levels ‘Very low’, ‘Low’, ‘Normal’ and ‘Big’.
The frequency distribution for newborns is obtained simply by counting the number of newborns
in each birth weight category.

Tables
Example 1: The blood types of 30 patients were recorded as follows:
A AB B B A O O AB AB B O A A B B
AB A O AB B AB AB O A AB AB O A O A
Construct a frequency table for these data.
Type Frequency
A 8
B 6
AB 9
O 7
Total 30
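
The counting step behind this table can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not part of the original slides; the variable names are assumptions chosen for readability.

```python
from collections import Counter

# Blood types of the 30 patients from Example 1, in the order given.
blood_types = (
    "A AB B B A O O AB AB B O A A B B "
    "AB A O AB B AB AB O A AB AB O A O A"
).split()

# Count how many observations fall into each category.
frequency = Counter(blood_types)

print("Type  Frequency")
for blood_type in ("A", "B", "AB", "O"):
    print(f"{blood_type:<5} {frequency[blood_type]}")
print(f"Total {sum(frequency.values())}")
```

The printed counts should agree with the frequency column of the table above.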

Very low, Low, Very low, Normal, Big, Big, Big, Very low, Very low, Low, Normal, Normal, Low, Big
Weight Frequency
Very Low
Low
Normal
Big
Total

Relative Frequency
It is the proportion (or percentage) of observations in each category.
The distribution of proportions is called the relative frequency distribution of the variable.
The relative frequencies sum to 1 (or 100%).
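
As a small worked illustration, the relative frequencies for the blood-type counts of Example 1 (A = 8, B = 6, AB = 9, O = 7) could be computed as follows. The dictionary layout and names are assumptions made for this sketch only.

```python
# Frequencies of the four blood types from Example 1.
frequency = {"A": 8, "B": 6, "AB": 9, "O": 7}
total = sum(frequency.values())

# Relative frequency = category count / total number of observations.
relative = {category: count / total for category, count in frequency.items()}

for category, proportion in relative.items():
    print(f"{category:<3} {proportion:.3f}  ({proportion:.1%})")
print(f"Sum {sum(relative.values()):.3f}")
```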

Cumulative frequency
It is the number of observations in the category plus the observations in all categories below it, so it applies only when the categories can be ordered.
Cumulative relative frequency
It is the proportion of observations in the category plus the observations in all categories below it.
It is obtained by dividing the cumulative frequency by the total number of observations.

Guidelines for constructing tables
Keep them simple
Limit the number of variables to three or fewer
All tables should be self-explanatory
Include a clear title telling what, when and where
Clearly label the rows and columns
State clearly the unit of measurement used
Explain codes and abbreviations in a footnote
Show totals
If the data are not original, indicate the source in a footnote.

Table 1. Distribution of birth weight of newborns between 1976-1996 at Arba Minch town, Ethiopia.
BWT        Freq.   Cum. freq.   Rel. freq. (%)   Cum. rel. freq. (%)
Very low      43           43              0.4                   0.4
Low          793          836              8.0                   8.4
Normal      8870         9706             88.9                  97.3
Big          268         9974              2.7                 100
Total       9974                         100
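
The derived columns of Table 1 can be reproduced from the frequency column alone. The sketch below is one possible way to do it in Python; the list names are illustrative assumptions, and the rounding to one decimal place mirrors the table.

```python
from itertools import accumulate

# Birth-weight categories in increasing order, with the frequencies of Table 1.
categories = ["Very low", "Low", "Normal", "Big"]
frequency = [43, 793, 8870, 268]
total = sum(frequency)

# Running totals give the cumulative frequencies.
cumulative = list(accumulate(frequency))

# Express both as percentages of the total, rounded to one decimal place.
relative_pct = [round(100 * f / total, 1) for f in frequency]
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

for row in zip(categories, frequency, cumulative, relative_pct, cumulative_pct):
    print("{:<9}{:>6}{:>7}{:>7}{:>7}".format(*row))
```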

Simple Frequency Distribution (one variable table)
Primary and secondary cases of syphilis morbidity by age, Ethiopia, 1989
Age group (years)   Number of cases   RF (percent)
0-14                      230              0.5
15-19                    4378              9.9
20-24                   10405             23.6
25-29                    9610             21.8
30-34                    8648             19.6
35-44                    6901             15.7
45-54                    2631              6.0
>54                      1278              2.9
Total                   44081            100

Two Variable Table
Primary and secondary cases of syphilis morbidity by age and sex, 1989
Number of cases
Age group (years)    Male   Female   Total
0-14                   40      190     230
15-19                1710     2668    4378
20-24                5120     5285   10405
25-29                5304     4306    9610
30-34                5537     3111    8648
35-44                5004     1897    6901
45-54                2144      487    2631
>54                  1147      131    1278
Total               26006    18075   44081

Tables can also be used to present three or more variables.
Variable        Frequency (n)   Percent
Sex
  Male
  Female
Age (yrs)
  15-19
  20-24
  25-29
Religion
  Christian
  Muslim
Occupation
  Student
  Farmer
  Merchant

2. Describing quantitative variables
Frequency distribution table
Graphs
◦Histograms
◦Frequency polygons
◦Cumulative frequency curve
Numerical summary measures
Measures of central location or central tendency
◦Mean
◦Median
◦Mode
Measures of dispersion or variability
◦Range
◦Inter-quartile range
◦Variance and standard deviation

Frequency Distribution

Sturges' rule:
A guide for determining the number of classes (K) is Sturges' formula, given by:
K = 1 + 3.322 (log n)
W = (L - S) / K
◦Where K = number of class intervals
◦n = number of observations
◦log = logarithm to base 10
◦W = width of the class interval
◦L = the largest value
◦S = the smallest value
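
The two formulas can be sketched as small Python helpers. Note the assumptions: the function names are hypothetical, truncating K to a whole number is a choice made here to match the worked examples in these slides (other texts round differently), and the width is rounded up so that the classes cover the whole range.

```python
import math

def sturges_classes(n):
    """Number of classes K = 1 + 3.322 * log10(n), truncated to an integer.

    Truncation is an assumption chosen to match the worked examples in
    these slides; other texts round to the nearest integer or round up.
    """
    return math.floor(1 + 3.322 * math.log10(n))

def class_width(largest, smallest, k):
    """Class width W = (L - S) / K, rounded up to a whole number."""
    return math.ceil((largest - smallest) / k)

k = sturges_classes(40)          # 40 observations
w = class_width(38, 10, k)       # largest value 38, smallest value 10
print(k, w)
```

The values n = 40, L = 38 and S = 10 are those of the leisure-time example (Example 2) in these slides.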

Example 1
The age at first marriage of women was recorded as follows:
17 23 20 21 19 20 28 15 13 18 17 14 20 15 14 22 14 21 20 15 19
15 16 17 16 14 19 14 19 25 18 19 19 19 22 15 21 17 16 18 15 12 19
21 12 16 16 19 17 16 20 17 19 19 19 16 15 14 17 15 15 19 21 27 22
26 16 12 18 18 15 18 15 23 15 16 13 14 17 14 18 21 17 15 15 22 20
18 15 18 14 14 24
Construct a frequency distribution for the data.
Solution:
K = 1 + 3.322 (log n) = 1 + 3.322 (log 92) = 7.52
W = (L - S) / K = (28 - 12) / 7 = 16/7 = 2.3
K = 7 and W = 3

Age at first marriage   Mi   Frequency   RF      CF   RCF
12-14                   13   14          14/92   14   14/92
15-17                   16   33          33/92   47   47/92
18-20                   19   28          28/92   75   75/92
21-23                   22   12          12/92   87   87/92
24-26                   25    3           3/92   90   90/92
27-29                   28    2           2/92   92   92/92
Total                        92
Where Mi refers to the class mark
RF = relative frequency
CF = cumulative frequency
RCF = relative cumulative frequency

Example 2:
Leisure time (hours) per week for 40 college students:
23 24 18 14 20 36 24 26 23 21 16 15 19 20 22 14 13
10 19 27 29 22 38 28 34 32 23 19 21 31 16 28 19 18
12 27 15 21 25 16
Calculate K, W, F, CF, UCL, LCB, CM (number of classes, class width, frequency, cumulative frequency, upper class limit, lower class boundary, class mark)
K = 1 + 3.322 (log40) = 6.32 ≈ 6
Maximum value = 38, Minimum value = 10
Width = (38-10)/6 = 4.67 ≈ 5

Time (hours)   Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
10-14           5          0.125                 5                     0.125
15-19          11          0.275                16                     0.400
20-24          12          0.300                28                     0.700
25-29           7          0.175                35                     0.875
30-34           3          0.075                38                     0.950
35-39           2          0.050                40                     1.00
Total          40          1.00

Time (hours)   True limits (class boundaries)   Midpoint   Frequency
10-14           9.5 - 14.5                      12          5
15-19          14.5 - 19.5                      17         11
20-24          19.5 - 24.5                      22         12
25-29          24.5 - 29.5                      27          7
30-34          29.5 - 34.5                      32          3
35-39          34.5 - 39.5                      37          2
Total                                                      40
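
The tables for the 40 leisure-time values can be rebuilt programmatically. The following is a minimal sketch assuming integer-valued data, so that class boundaries lie 0.5 below and above the class limits; the function name and the dictionary keys are hypothetical and chosen only for this illustration.

```python
# Leisure time (hours) per week for the 40 college students in Example 2.
hours = [
    23, 24, 18, 14, 20, 36, 24, 26, 23, 21, 16, 15, 19, 20, 22, 14, 13,
    10, 19, 27, 29, 22, 38, 28, 34, 32, 23, 19, 21, 31, 16, 28, 19, 18,
    12, 27, 15, 21, 25, 16,
]

def frequency_distribution(values, start, width, k):
    """Return one row per class for integer-valued data.

    Each row holds the class limits, class boundaries, midpoint,
    frequency, relative frequency, cumulative frequency and
    cumulative relative frequency.
    """
    n = len(values)
    rows, cumulative = [], 0
    for i in range(k):
        lower = start + i * width          # lower class limit
        upper = lower + width - 1          # upper class limit
        freq = sum(lower <= v <= upper for v in values)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "f": freq,
            "rf": freq / n,
            "cf": cumulative,
            "rcf": cumulative / n,
        })
    return rows

table = frequency_distribution(hours, start=10, width=5, k=6)
for row in table:
    print(row)
```

Calling the same function on the 80-student data with start=10, width=4 and k=8 should give the classes 10-13 through 38-41 used in the next example.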

Leisure time (hours) per week for 80 college students:
23 24 18 14 20 36 24 26 23 21 16 15 19 20 22 14 13
10 19 27 29 22 38 28 34 32 23 19 21 31 16 28 19 18
12 27 15 21 25 16 23 24 18 14 20 36 24 26 23 21 16
15 19 20 22 14 13 10 19 27 29 22 38 28 34 32 23 19
21 31 16 28 19 18 12 27 15 21 25 16
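Calculate K, W, F, RF, CF, RCF, UCL, UCB, LCB, CM
K = 1 + 3.322 (log80) = 7.32 ≈ 7
Maximum value = 38, Minimum value = 10
Width = (38-10)/7 = 28/7 = 4
With W = 4 and the first class starting at 10, seven classes reach only 37, so an eighth class (38-41) is needed to include the maximum value of 38.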


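Class   F    Mi     UCL   LCB    RF      CF   RCF     UCB
10-13    6   11.5   13     9.5   0.075    6   0.075   13.5
14-17   14   15.5   17    13.5   0.175   20   0.25    17.5
18-21   22   19.5   21    17.5   0.275   42   0.525   21.5
22-25   16   23.5   25    21.5   0.2     58   0.725   25.5
26-29   12   27.5   29    25.5   0.15    70   0.875   29.5
30-33    4   31.5   33    29.5   0.05    74   0.925   33.5
34-37    4   35.5   37    33.5   0.05    78   0.975   37.5
38-41    2   39.5   41    37.5   0.025   80   1       41.5
Total   80                       1
F = frequency, Mi = class mark, UCL = upper class limit, LCB = lower class boundary, RF = relative frequency, CF = cumulative frequency, RCF = relative cumulative frequency, UCB = upper class boundary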

Diagrammatic Representation
Pictorial representations of numerical data
Importance of diagrammatic representation
1. Diagrams are more attractive than mere figures.
2. They give a quick overall impression of the data.
3. They are easier to remember than mere figures.
4. They facilitate comparison.
5. They help in understanding patterns and trends.

Specific types of graphs include:
Qualitative data (nominal, ordinal data)
◦Bar graph
◦Pie chart
Quantitative data
◦Histogram
◦Frequency polygon
◦Cumulative frequency curve
◦Box plot
◦Line graph

1. Bar charts (or graphs)
In a bar chart, the categories into which the observations fall are shown along the horizontal axis, and a vertical bar is drawn above each category so that its height represents either the frequency or the relative frequency of observations in that category.
The vertical axis should always start from 0, but the horizontal axis can start anywhere.
The bars should be of equal width and should be separated from one another so as not to imply continuity.
Label both axes clearly.

Types of Bar chart
There are different types of bar chart, depending on the objective and the type of information that we want to present:
1. Simple Bar Chart
2. Grouped/Multiple Bar Chart
3. Stacked/Complex Bar Chart


2. Multiple bar chart:
In this type of chart the component figures are
shown as separate bars adjoining each other.
The height of each bar represents the actual
value of the component figure.
It depicts the distributional pattern of more than one variable.

Fig. TT immunization status by marital status of women 15-49 years, Asendabo town, 1996
[Multiple bar chart. Horizontal axis: marital status (Married, Single, Divorced, Widowed). Vertical axis: number of women (0 to 350). Bars: Immunized, Not immunized.]


3. Pie chart
Shows the relative frequency for each category by dividing a circle into sectors.
The angle of each sector is proportional to the relative frequency (angle = relative frequency × 360°).
Used for a single categorical variable
Use percentage distributions
Example: Distribution of deaths for females, in England and Wales, 1989.
Cause of death          No. of deaths
Circulatory system      100 000
Neoplasm                 70 000
Respiratory system       30 000
Injury and poisoning      6 000
Digestive system         10 000
Others                   20 000
Total                   236 000

Distribution of cause of death for females, in England and Wales, 1989
[Pie chart]
Circulatory system 42%
Neoplasm 30%
Respiratory system 13%
Injury and poisoning 3%
Digestive system 4%
Others 8%
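
The percentages on the pie chart, and the sector angles needed to draw it, follow directly from the death counts. The sketch below is illustrative only; the dictionary and variable names are assumptions, not part of the slides.

```python
# Number of deaths by cause, females, England and Wales, 1989.
deaths = {
    "Circulatory system": 100_000,
    "Neoplasm": 70_000,
    "Respiratory system": 30_000,
    "Injury and poisoning": 6_000,
    "Digestive system": 10_000,
    "Others": 20_000,
}
total = sum(deaths.values())

# Each sector's angle is its relative frequency times 360 degrees.
percent = {cause: 100 * n / total for cause, n in deaths.items()}
angle = {cause: 360 * n / total for cause, n in deaths.items()}

for cause in deaths:
    print(f"{cause:<22}{percent[cause]:>5.0f}%{angle[cause]:>8.1f} degrees")
```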

S.No Color Frequency
1 Red 30
2 Green 20
3 White 10
4 Black 40
Construct a bar chart and a pie chart for these data.

Fig 1. Pie chart for color of materials in AT, 2022

Fig 2. Bar chart for color of materials in AT, 2022
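
Since the two figures are images, a rough text-only stand-in may help. The sketch below draws a horizontal text bar chart for the color frequencies and computes the pie-chart shares; the helper name text_bar_chart and the scale of one "#" per two observations are assumptions made for this illustration.

```python
# Color frequencies from the exercise table.
colors = {"Red": 30, "Green": 20, "White": 10, "Black": 40}
total = sum(colors.values())

def text_bar_chart(frequency, scale=2):
    """Return one line per category; bar length = frequency // scale."""
    lines = []
    for category, count in frequency.items():
        bar = "#" * (count // scale)
        lines.append(f"{category:<6}| {bar} {count}")
    return lines

# Bar chart: equal-width bars, separated, starting from zero.
print("\n".join(text_bar_chart(colors)))

# Pie chart: share of the circle (in percent) for each color.
shares = {color: 100 * count / total for color, count in colors.items()}
print(shares)
```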

4. Histogram (quantitative continuous data)
Histograms are frequency distributions with continuous class intervals that have been turned into graphs.
Given a set of numerical data, we can obtain an impression of the shape of its distribution by constructing a histogram.
A histogram is constructed by choosing a set of non-overlapping class intervals and counting the number of observations that fall in each class.

How to construct a histogram
1) The horizontal axis is a continuous scale running from one extreme end of the distribution to the other.
It should be labeled with the name of the variable and the units of measurement.
2) For each class in the distribution a vertical rectangle is drawn with:
i. its base on the horizontal axis, extending from one class boundary of the class to the other, so that there is never any gap between the histogram rectangles;
ii. the width of its base determined by the width of the class interval.

It is necessary that the class intervals be non-overlapping so that each observation falls in one and only one interval.
Bars are drawn over the intervals.
The area of each bar is proportional to the frequency of observations in the interval.

Example: Distribution of the age of women at the time of marriage
Age group   15-19   20-24   25-29   30-34   35-39   40-44   45-49
Number      11      36      28      13      7       3       2
[Histogram: Age of women at the time of marriage. Horizontal axis: age group, by class boundaries 14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5. Vertical axis: number of women (0 to 40).]
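
The construction rules for this histogram can be sketched numerically: each bar runs between class boundaries, so adjacent bars touch, and its height is chosen so that its area equals the class frequency. The code below is an illustration under those assumptions, using the age-at-marriage frequencies from the example; the names are hypothetical.

```python
# Age at marriage: class limits and number of women in each class.
classes = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49)]
frequency = [11, 36, 28, 13, 7, 3, 2]

bars = []
for (lower, upper), count in zip(classes, frequency):
    left, right = lower - 0.5, upper + 0.5   # class boundaries: no gaps between bars
    width = right - left
    height = count / width                   # frequency density, so area = frequency
    bars.append((left, right, height))

# Rough text rendering: one "#" per woman, because every class has width 5.
for left, right, height in bars:
    print(f"{left:>4}-{right:<5} {'#' * round(height * 5)}")
```

Because all classes here have the same width, drawing the bars with height equal to the frequency, as in the figure, gives the same shape.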

Two problems with histograms
1. They are somewhat difficult to construct
2. The actual values within the respective groups are lost and difficult to reconstruct
