2 Methods of Data Presentation print.pdf

nafyadguta095 56 views 123 slides Jun 13, 2024

About This Presentation

Biostatistics


Slide Content

Methods of Data Presentation
•The data collected in a survey or other empirical inquiry are called raw data.
•Having collected and edited the data, the next important step is to organize it.
•The process of arranging data into classes or categories according to similarities is technically called classification.
12/5/2023 1

Cont…
•Classification is a preliminary step that prepares the ground for proper presentation of data.
•The main purpose of classification is to divide the data into homogeneous groups or classes.

Cont..
•The classification of data is generally done on the basis of the following:
–Geographical,
–Chronological,
–Qualitative or
–Quantitative.

Cont…
1) In geographical classification, data are
arranged according to places, areas or
regions.

Cont…
2) In chronological classification, data are arranged according to time, i.e., weekly, monthly, quarterly, half-yearly, annually, etc.

Cont…
3) In qualitative classification, the data are arranged according to attributes like sex, marital status, educational standard, stage or intensity of disease, etc.

Cont…
4) In quantitative classification, the data are arranged according to a characteristic that has been measured, like height, weight, income of persons, vitamin content of a substance, etc.

Cont…
•Once data are collected, they can be presented as:
–Frequency distribution (table)
–Diagram
–Graph
•Distribution is a position, arrangement, or frequency of occurrence (as of the members of a group) over an area or throughout a space or unit of time.
•Frequency is the number of times a certain value or class of values occurs.

Cont…
•Frequency distribution: the organization of raw data in table form with classes and frequencies.
•A frequency distribution may be categorical, ungrouped or grouped.

Cont…
The reasons for constructing a frequency distribution:
1. To organize the data in a meaningful, intelligible way.
2. To enable the reader to determine the nature or shape of the distribution.

Cont…
3. To facilitate computational procedures for measures of average and spread.
4. To enable the researcher to draw charts and graphs for the presentation of data.
5. To enable the reader to make comparisons between different data sets.

Categorical frequency Distribution:
•Used for data that can be placed into specific categories, such as nominal or ordinal data.
Example 2.1:
–A social worker collected the following data on marital status for 25 persons,
•where
–M = Married, S = Single, W = Widowed and D = Divorced


Cont…
•Solution:
–Since the data are categorical, discrete classes can be used.
–There are four types of marital status: M, S, D, and W.
–These types will be used as the classes for the distribution.

Cont…
Step 1: Tally the data and place the result in
column (2).
Step 2: Count the tally and place the result in
column (3).

Cont…
Step 3: Find the percentage of values in each class using (f/n) × 100%,
where
f = frequency of the class,
n = total number of values.
•Percentages are not normally part of a frequency distribution, but they can be added since they are used in certain types of diagrams, such as pie charts.

Cont…
Step 4: Find the totals for columns (3) and (4).
•Combining all the steps, one can construct the following frequency distribution.

Cont…
Class (1)   Tally (2)   Frequency (3)   Percent (4)
M           /////       5               20
S           ///// //    7               28
D           ///// //    7               28
W           ///// /     6               24
Total                   25              100
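The four steps above can be sketched in Python. This is a minimal illustration rather than part of the original slides: the 25 raw responses are not reproduced in this text, so the list `responses` is rebuilt from the class frequencies in the table (an assumption about the data, not its original order), and the function name `categorical_distribution` is hypothetical.

```python
from collections import Counter

def categorical_distribution(values):
    """Return (class, frequency, percent) rows for categorical data."""
    counts = Counter(values)          # Steps 1-2: tally and count each class
    n = len(values)                   # total number of values
    return [(label, f, 100 * f / n) for label, f in counts.items()]  # Step 3

# Assumed raw data, rebuilt from the table's frequencies (M=5, S=7, D=7, W=6).
responses = ["M"] * 5 + ["S"] * 7 + ["D"] * 7 + ["W"] * 6

rows = categorical_distribution(responses)
for label, f, pct in rows:
    print(f"{label}  {f:2d}  {pct:5.1f}")

# Step 4: column totals for frequency and percent.
total_f = sum(f for _, f, _ in rows)
total_pct = sum(pct for _, _, pct in rows)
print("Total", total_f, total_pct)
```

The percent column should always total 100 (up to rounding), which is a quick check on the counts.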

Ungrouped frequency Distribution:
•Is a table of all the potential raw score values that could possibly occur in the data, along with the number of times each actually occurred.
•Is often constructed for a small data set or for data on a discrete variable.

Cont…
Constructing ungrouped frequency distribution:
• First find the smallest and largest raw score in
the collected data.
• Arrange the data in order of magnitude and
count the frequency.
• To facilitate counting one may include a column
of tallies.

Cont…
•Example:
•The following data represent the number of new births per month at 20 hospitals.
•Construct an ungrouped frequency distribution.
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85

Cont…
Solution:
Step 1: Find the range,
Range = Max – Min = 90 – 60 = 30.
Step 2: Make a table as shown.
Step 3: Tally the data.
Step 4: Compute the frequency.

Cont…
No. of new births   Tally   Frequency
60                  //      2
62                  /       1
63                  /       1
65                  /       1
70                  ////    4
74                  /       1
75                  /       1
76                  //      2
80                  ///     3
85                  ///     3
90                  /       1
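The same tally can be checked with a short Python sketch. It uses the 20 values listed in the example; the variable names are illustrative only.

```python
from collections import Counter

births = [80, 76, 90, 85, 80,
          70, 60, 62, 70, 85,
          65, 60, 63, 74, 75,
          76, 70, 70, 80, 85]

data_range = max(births) - min(births)    # Step 1: range = 90 - 60
counts = Counter(births)                   # Steps 3-4: tally and count

# Step 2: one table row per distinct value, in order of magnitude.
for value in sorted(counts):
    print(f"{value:3d}  {'/' * counts[value]:<5}  {counts[value]}")
```

Counting by program is a useful guard against tally slips when the same value is scattered across several rows of the raw data.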

Grouped frequency Distribution:
•A frequency distribution in which several numbers are grouped into one class.
•When the range of the data is large, the data must be grouped into classes that are more than one unit in width.

Cont…
Definition of some common terms
Class limits: separate one class in a grouped frequency distribution from another.
The limits could actually appear in the data, and there are gaps between the upper limit of one class and the lower limit of the next.

Cont…
Unit of measurement (U): the distance between two possible consecutive measures.
It is usually taken as 1, 0.1, 0.01, 0.001, ...
Class boundaries: separate one class in a grouped frequency distribution from another.
–The boundaries have one more decimal place than the raw data and therefore do not appear in the data.
–There is no gap between the upper boundary of one class and the lower boundary of the next class.

Cont…
Class width: the difference between the upper and lower class boundaries of any class.
✓It is also the difference between the lower limits of any two consecutive classes, or the difference between any two consecutive class marks.

Cont…
Class mark (midpoint): the average of the lower and upper class limits, or the average of the lower and upper class boundaries.
Cumulative frequency: the number of observations less than or equal to (or greater than or equal to) a specific value.

Cont…
•Cumulative frequency above: the total frequency of all values greater than or equal to the lower class boundary of a given class.
•Cumulative frequency below: the total frequency of all values less than or equal to the upper class boundary of a given class.

Cont…
•Cumulative frequency distribution (CFD): the tabular arrangement of class intervals together with their corresponding cumulative frequencies.
It can be of the "more than" or "less than" type, depending on the type of cumulative frequency used.

Cont…
•Relative frequency (rf): the frequency divided by the total frequency. Multiplied by 100, this gives the percent of values falling in that class.
•Relative cumulative frequency (rcf): the cumulative frequency divided by the total frequency. Multiplied by 100, it gives the percent of the values that are less than the upper class boundary (or more than the lower class boundary).

Cont…
Guidelines for classes
–There should be between 5 and 20 classes.
–The class width should be an odd number.
–The classes must be mutually exclusive.
–The classes must be all-inclusive or exhaustive.

Cont…
–The classes must be continuous. There are no gaps in a frequency distribution. Classes that have no values in them must be included (unless they are the first or last classes, which are dropped).

Creating a Grouped Frequency Distribution
1. Find the largest and smallest values
2. Compute the range = maximum – minimum.
3. Select the number of classes desired.

Cont…
This is usually between 5 and 20, or use one of:
- K = 1 + 3.322 log(n or N), with log to base 10 (Sturges' rule), or
- the smallest K such that 2^K > N or n, or
- personal judgment,
where
- K is the number of classes desired and
- n or N is the total number of observations.
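The two rules for choosing the number of classes can be sketched as follows. This is an illustrative sketch: the function names are hypothetical, the logarithm is taken to base 10, and the second rule is read as "the smallest K with 2^K greater than n", which is an interpretation of the slide rather than a confirmed statement of it.

```python
import math

def sturges_classes(n):
    """Number of classes by Sturges' rule, K = 1 + 3.322 log10(n), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))

def power_of_two_classes(n):
    """Smallest K such that 2**K > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

k_sturges = sturges_classes(20)        # 1 + 3.322 * 1.301 is about 5.32, rounded up
k_power = power_of_two_classes(20)     # 2**4 = 16 is not enough, 2**5 = 32 is
print(k_sturges, k_power)
```

The two rules need not agree; either result is a starting point that can be adjusted by judgment within the 5 to 20 guideline.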

Cont…
4. Find the class width by dividing the range by the number of classes and rounding up.
*You must round up, not off.
*Normally 3.2 would round to 3, but in rounding up it becomes 4.

Cont…
5. Pick a suitable starting point less than or equal to the minimum value.
o The starting point is called the lower limit of the first class.
o Continue to add the class width to this lower limit to get the rest of the lower limits.

Cont…
▪To find the upper limit of the first class, subtract U (one unit of measurement) from the lower limit of the second class.
▪Then continue to add the class width to this upper limit to find the rest of the upper limits.

Cont…
➢Find the boundaries by subtracting 0.5 U from the lower limits and adding 0.5 U to the upper limits.
➢The boundaries are also half-way between the upper limit of one class and the lower limit of the next class.

Cont…
6. Tally the data.
7. Find the frequencies.
▪Find the cumulative frequencies.
❖If necessary, find the relative frequencies
and/or relative cumulative frequencies.

Cont…
Example:
•Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:
Step 1: Find the highest and the lowest values: H = 39, L = 6
Step 2: Find the range: R = H – L = 39 – 6 = 33

Cont…
Step 3: Select the number of classes desired using Sturges' formula:
K = 1 + 3.32 log(20) = 5.32 ≈ 6 (rounding up)
Step 4: Find the class width:
W = R/K = 33/6 = 5.5 ≈ 6 (rounding up)

Cont…
Step 5: Select the starting point; let it be the minimum observation.
6, 12, 18, 24, 30, 36 are the lower class limits.
Step 6: Find the upper class limits.
E.g. the first upper class limit = 12 – U = 12 – 1 = 11
11, 17, 23, 29, 35, 41 are the upper class limits.

Cont…
•So combining step 5 and step 6, one can
construct the following classes.
Class limits
6 –11
12 –17
18 –23
24 –29
30 –35
36 –41

Cont…
Step 7: Find the class boundaries.
E.g. lower class boundary of the first class = 6 – U/2 = 5.5
Upper class boundary of the first class = 11 + U/2 = 11.5
* Then continue adding the class width to both boundaries to obtain the remaining boundaries.

Cont…
•By doing so one can obtain the following
classes.
Class boundary
5.5 –11.5
11.5 –17.5
17.5 –23.5
23.5 –29.5
29.5 –35.5
35.5 –41.5

Cont…
Step 8: Tally the data.
Step 9: Write the numeric values for the tallies in the
frequency column.
Step 10: Find cumulative frequency.
Step 11: Find relative frequency or/and relative
cumulative frequency.
The complete frequency distribution follows:

Cont…
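CL      CB          CM     Tally     Freq   LCF   GCF   RF     LRCF
6–11    5.5–11.5    8.5    //        2      2     20    0.10   0.10
12–17   11.5–17.5   14.5   //        2      4     18    0.10   0.20
18–23   17.5–23.5   20.5   ///// //  7      11    16    0.35   0.55
24–29   23.5–29.5   26.5   ////      4      15    9     0.20   0.75
30–35   29.5–35.5   32.5   ///       3      18    5     0.15   0.90
36–41   35.5–41.5   38.5   //        2      20    2     0.10   1.00
CL = class limits, CB = class boundaries, CM = class mark, LCF = less-than cumulative frequency, GCF = greater-than cumulative frequency, RF = relative frequency, LRCF = less-than relative cumulative frequency.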
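Steps 1 to 11 of this worked example can be reproduced with a short Python sketch. It assumes the 20 observations of the example, a unit of measurement U = 1 and the minimum observation as the starting point, as in the solution; the variable names are illustrative.

```python
import math
from itertools import accumulate

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]

U = 1                                               # unit of measurement
k = math.ceil(1 + 3.32 * math.log10(len(data)))     # Step 3: number of classes
width = math.ceil((max(data) - min(data)) / k)      # Step 4: class width, rounded up

lowers = [min(data) + i * width for i in range(k)]  # Step 5: lower class limits
uppers = [lo + width - U for lo in lowers]          # Step 6: upper class limits

# Steps 8-9: count the values falling within each pair of class limits.
freqs = [sum(lo <= x <= up for x in data) for lo, up in zip(lowers, uppers)]
lcf = list(accumulate(freqs))                       # less-than cumulative frequency
gcf = list(accumulate(freqs[::-1]))[::-1]           # greater-than cumulative frequency

for lo, up, f, less, more in zip(lowers, uppers, freqs, lcf, gcf):
    boundaries = (lo - U / 2, up + U / 2)           # Step 7: class boundaries
    mark = (lo + up) / 2                            # class mark (midpoint)
    print(lo, up, boundaries, mark, f, less, more, f / len(data), less / len(data))
```

Because the classes are mutually exclusive and exhaustive, the frequencies must sum to the number of observations, which is an easy consistency check.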


DIAGRAMMATIC AND GRAPHICAL REPRESENTATION OF DATA

Cont…
•An appropriately drawn graph allows readers to rapidly obtain an overall grasp of the data presented.
•The relationship between numbers of various magnitudes can usually be seen more quickly and easily from a graph than from a table.

Construction of graphs
•The choice of a particular form among the different possibilities will depend on personal choice and/or the type of data.
–Bar charts and pie charts are commonly used for qualitative or quantitative discrete data.
–Histograms and frequency polygons are used for quantitative continuous data.

Cont…
•General rules that are commonly accepted about the construction of graphs:
•Every graph should be self-explanatory and as simple as possible.
–Titles are usually placed below the graph and should answer the questions: What? Where? When? How classified?

Cont…
–Legends or keys should be used to differentiate variables if more than one is shown.
–The axis labels should be placed to read from the left side and from the bottom.
–The units into which the scale is divided should be clearly indicated.
–The numerical scale representing frequency must start at zero, or a break in the line should be shown.

Specific types of graphs include:
•For nominal and ordinal data:
–Bar graph
–Pie chart
•For quantitative data:
–Histogram
–Stem-and-leaf plot
–Box-Whisker plot
–Scatter plot
–Line graph
•Others

1. Bar charts
•Categories are listed on the horizontal axis (X-axis)
•Frequencies or relative frequencies are represented
on the Y-axis (ordinate)
•The height of each bar is proportional to the
frequency or relative frequency of observations in
that category
•There are three types of bar charts

Cont.…
a. Simple bar chart:-used to represent a single
variable.

Method of constructing bar chart
•All the bars must have equal width.
•The bars are not joined together (leave space between bars).
•The different bars should be separated by equal distances.
•All the bars should rest on the same line, called the base.
•Label both axes clearly.

Cont…
•Example: Construct a bar chart for the following data.
•Distribution of patients in hospital by source of their referral.


B. Sub-divided bar chart (component)
•If there are different quantities forming the sub-divisions of the totals, simple bars may be sub-divided in the ratio of the various sub-divisions to exhibit the relationship of the parts to the whole.
•The order in which the components are shown in a "bar" is followed in all bars used in the diagram.

Example: Plasmodium species distribution for confirmed malaria cases, Zeway, 2003
[Sub-divided bar chart: percent (0 to 100) by month (August, October, December 2003); components: P. falciparum, P. vivax, mixed]

C. Multiple bar graph
•Bar charts can be used to represent the relationships among more than two variables.
•The following figure shows the relationship between children's reports of breathlessness and cigarette smoking by themselves and their parents.

Prevalence of self-reported breathlessness among school children, 1998
[Multiple bar chart: breathlessness, per cent (0 to 35), by parents' smoking (neither, one, both); bars: child never smoked, child smoked occasionally, child smoked one/week or more]
We can see quickly from the graph that the prevalence of the symptoms increases both with the child's smoking and with that of their parents.

2. Pie chart
•Shows the relative frequency for each category by dividing a circle into sectors, the angles of which are proportional to the relative frequency.
•Used for a single categorical variable
•Uses percentage distributions
Relative frequency    Size of wedge, in degrees
0.5                   50% of 360 = 180 degrees
0.25                  25% of 360 = 90 degrees
P                     (P × 100)% of 360 degrees

Cont.…
Steps to construct a pie chart
•Construct a frequency table.
•Change the frequencies into percentages (P).
•Change the percentages into degrees, where degrees = (percentage/100) × 360°.
•Draw a circle and divide it accordingly.

•Example: Distribution of deaths for females, in England and Wales, 1989
Cause of death          No. of deaths
Circulatory system      100 000
Neoplasm                70 000
Respiratory system      30 000
Injury and poisoning    6 000
Digestive system        10 000
Others                  20 000
Total                   236 000

Distribution of cause of death for females, in England and Wales, 1989
[Pie chart: circulatory system 42%, neoplasms 30%, respiratory system 13%, injury and poisoning 3%, digestive system 4%, others 8%]
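The conversion from frequencies to percentages and wedge angles can be sketched in Python for the deaths example. The dictionary and variable names are illustrative; only the counts come from the example table.

```python
deaths = {
    "Circulatory system": 100_000,
    "Neoplasm": 70_000,
    "Respiratory system": 30_000,
    "Injury and poisoning": 6_000,
    "Digestive system": 10_000,
    "Others": 20_000,
}

total = sum(deaths.values())
# Percentage of each category, and its wedge angle out of 360 degrees.
percents = {cause: 100 * count / total for cause, count in deaths.items()}
angles = {cause: 360 * count / total for cause, count in deaths.items()}

for cause in deaths:
    print(f"{cause:22s} {percents[cause]:5.1f}%  {angles[cause]:6.1f} degrees")
```

The angles should add up to 360 degrees, just as the percentages add up to 100.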

3. Histogram
•Histograms are frequency distributions with continuous class intervals that have been turned into graphs.
•To construct a histogram, we draw the interval boundaries on a horizontal line and the frequencies on a vertical line.
•Non-overlapping intervals that cover all of the data values must be used.

Cont.…
•Bars are drawn over the intervals in such a way that the areas of the bars are all proportional in the same way to their interval frequencies.
•The area of each bar is proportional to the frequency of observations in the interval.

Example: Distribution of the age of women at the time of marriage.
Age group   15-19   20-24   25-29   30-34   35-39   40-44   45-49
Number      11      36      28      13      7       3       2


•Two problems with histograms
1. They are somewhat difficult to construct.
2. The actual values within the respective groups are lost and difficult to reconstruct.
The other graphic display (stem-and-leaf plot) overcomes these problems.

4. Stem-and-Leaf Plot
•A quick way to organize data to give a visual impression similar to a histogram while retaining much more detail on the data.
•Similar to a histogram, it serves the same purpose and reveals the presence or absence of symmetry.

Cont…
•Are most effective with relatively small data sets
•Are not suitable for reports and other communications, but
•Help researchers to understand the nature of their data.

Example
•43, 28, 34, 61, 77, 82, 22, 47, 49, 51, 29, 36, 66, 72, 41
Stem | Leaf
2 | 2 8 9
3 | 4 6
4 | 1 3 7 9
5 | 1
6 | 1 6
7 | 2 7
8 | 2
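A stem-and-leaf plot like the one above can be generated with a few lines of Python. This is a sketch under the assumption that the data are non-negative integers and that the leaf is the last digit; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)    # e.g. 483 -> stem 48, leaf 3
        plot[stem].append(leaf)
    # Include every stem from smallest to largest, even those with no leaves.
    return {s: plot.get(s, []) for s in range(min(plot), max(plot) + 1)}

data = [43, 28, 34, 61, 77, 82, 22, 47, 49, 51, 29, 36, 66, 72, 41]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

Keeping empty stems in the output preserves the shape of the distribution, in the same way that empty classes are kept in a frequency distribution.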

Steps to construct Stem-and-Leaf Plots
1. Separate each data point into stem and leaf components
•Stem = consists of one or more of the initial digits of the measurement
•Leaf = consists of the rightmost digit
The stem of the number 483, for example, is 48 and the leaf is 3.
2. Write the smallest stem in the data set in the upper left-hand corner of the plot

3. Write the second stem (first stem + 1) below the first stem
4. Continue with the remaining stems until you reach the largest stem in the data set
5. Draw a vertical bar to the right of the column of stems
6. For each number in the data set, find the appropriate stem and write the leaf to the right of the vertical bar

Example: 3031, 3101, 3265, 3260, 3245, 3200, 3248, 3323, 3314, 3484, 3541, 3649 (birth weight in g)
Stem   Leaf              Number
30     31                1
31     01                1
32     65 60 45 00 48    5
33     23 14             2
34     84                1
35     41                1
36     49                1

What do you see???

5. Frequency polygon
•A frequency distribution can be portrayed graphically in yet another way by means of a frequency polygon.
•To draw a frequency polygon we connect the mid-points of the tops of the cells of the histogram by straight lines.

Cont…
•The total area under the frequency polygon is equal to the area under the histogram.
•Useful when comparing two or more frequency distributions by drawing them on the same diagram.

•Frequency polygon for the ages of 2087 mothers with <5 children, Adami Tulu, 2003
[Figure: frequency polygon of mothers' age (15.0 to 55.0) against frequency (0 to 700); Std. Dev = 6.13, Mean = 27.6, N = 2087]

•It can also be drawn without erecting rectangles, by joining the top midpoints of the intervals representing the frequency of the classes, as follows:
•Age of women at the time of marriage
[Figure: frequency polygon of no. of women (0 to 40) against age, plotted at class midpoints 12, 17, 22, 27, 32, 37, 42, 47]
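The points of such a frequency polygon can be computed directly from the class limits. The sketch below uses the age-at-marriage example (age groups 15-19 to 45-49); closing the polygon with an empty class on either side is a common convention and is assumed here, and the variable names are illustrative.

```python
# Age groups (class limits) and number of women, from the age-at-marriage example.
classes = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49)]
freqs = [11, 36, 28, 13, 7, 3, 2]

width = classes[1][0] - classes[0][0]               # difference of consecutive lower limits
midpoints = [(lo + up) / 2 for lo, up in classes]   # class marks

# Close the polygon with an empty class on either side, so it meets the axis.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + freqs + [0]

for x, y in zip(xs, ys):
    print(x, y)
```

The resulting (x, y) pairs can be passed to any plotting tool and joined by straight lines.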

8. Ogive Curve (The Cumulative Frequency Polygon)
•Sometimes it may be necessary to know the number of items whose values are more or less than a certain amount.
•We may, for example, be interested to know the no. of patients whose weight is < 50 kg or > 60 kg.

Cont…
•To get this information it is necessary to change the form of the frequency distribution from a 'simple' to a 'cumulative' distribution.
•An ogive curve turns a cumulative frequency distribution into a graph.
•Ogives are much more common than frequency polygons.

Cumulative Frequency and Cum. Rel. Freq. of Age of 25 ICU Patients
Age interval   Frequency   Relative frequency (%)   Cumulative frequency   Cumulative rel. freq. (%)
10-19          3           12                       3                      12
20-29          1           4                        4                      16
30-39          3           12                       7                      28
40-49          0           0                        7                      28
50-59          6           24                       13                     52
60-69          1           4                        14                     56
70-79          9           36                       23                     92
80-89          2           8                        25                     100
Total          25          100
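The cumulative columns of this table follow mechanically from the frequency column, as the following Python sketch shows. It uses the ICU age frequencies from the table; the variable names are illustrative.

```python
from itertools import accumulate

intervals = ["10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79", "80-89"]
freqs = [3, 1, 3, 0, 6, 1, 9, 2]

n = sum(freqs)
rel = [100 * f / n for f in freqs]       # relative frequency (%)
cum = list(accumulate(freqs))            # cumulative frequency
cum_rel = [100 * c / n for c in cum]     # cumulative relative frequency (%)

for row in zip(intervals, freqs, rel, cum, cum_rel):
    print(*row)
```

Plotting `cum` (or `cum_rel`) against the upper class boundaries gives the "less than" ogive.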

•Cumulative frequency of 25 ICU patients

Example: Heart rate of patients admitted to hospital Y, 1998
Heart rate    No. of patients   Cum. freq., less than method (LM)   Cum. freq., more than method (MM)
54.5-59.5     1                 1                                   54
59.5-64.5     5                 6                                   53
64.5-69.5     3                 9                                   48
69.5-74.5     5                 14                                  45
74.5-79.5     11                25                                  40
79.5-84.5     16                41                                  29
84.5-89.5     5                 46                                  13
89.5-94.5     5                 51                                  8
94.5-99.5     2                 53                                  3
99.5-104.5    1                 54                                  1

Heart rate of patients admitted to hospital Y, 1998
[Figure: less-than (LM) and more-than (MM) ogives; cumulative frequency (0 to 60) against heart rate (54.5 to 104.5)]

Percentiles (Quartiles)
•Suppose that 50% of a cohort survived at least 4 years.
•This also means that 50% survived at most 4 years.
•We say 4 years is the median.
•The median is also called the 50th percentile.
•We write: P50 = 4 years.

Similarly we could speak of other percentiles:
–P0: the minimum.
–P25: 25% of the sample values are less than or equal to this value. 1st quartile. P25 means 25th percentile.
–P50: 50% of the sample values are less than or equal to this value. 2nd quartile.
–P75: 75% of the sample values are less than or equal to this value. 3rd quartile.
–P100: the maximum.

It is possible to estimate the values of percentiles from a
cumulative frequency polygon.
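Reading a percentile off a "less than" ogive amounts to linear interpolation along the segment that crosses the required cumulative frequency. The sketch below applies this to the heart-rate example; the function name `percentile_from_ogive` is hypothetical, and locating the percentile at cumulative frequency p × n is an assumption about how the graph is read, not a rule stated in the slides.

```python
from itertools import accumulate

# Class boundaries and frequencies from the heart-rate example.
boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5, 104.5]
freqs = [1, 5, 3, 5, 11, 16, 5, 5, 2, 1]

# Less-than ogive: cumulative frequency reached at each class boundary.
cum = [0] + list(accumulate(freqs))

def percentile_from_ogive(p):
    """Read the value at cumulative proportion p (0 < p <= 1) off the ogive."""
    target = p * cum[-1]
    for i in range(1, len(cum)):
        if cum[i] >= target:
            # Linear interpolation along the segment that crosses the target.
            fraction = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
            return boundaries[i - 1] + fraction * (boundaries[i] - boundaries[i - 1])

median = percentile_from_ogive(0.50)
print(median)
```

Here the median falls in the 79.5-84.5 class, since the cumulative frequency passes half of 54 between those two boundaries.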

9. Box and Whisker Plot
•It is another way to display information when the objective is to illustrate certain locations (skewness) in the distribution.
•Can be used to display a set of discrete or continuous observations using a single vertical axis; only certain summaries of the data are shown.
•First the percentiles (or quartiles) of the data set must be defined.

•A box is drawn with the top of the box at the third quartile (75%) and the bottom at the first quartile (25%).
•The location of the mid-point (50%) of the distribution is indicated with a horizontal line in the box.
•Finally, straight lines, or whiskers, are drawn from the centre of the top of the box to the largest observation and from the centre of the bottom of the box to the smallest observation.

•Position of a percentile = p(n+1), where p = the required percentile expressed as a proportion
•Arrange the numbers in ascending order
A. 1st quartile = 0.25(n+1)th observation
B. 2nd quartile = 0.5(n+1)th observation
C. 3rd quartile = 0.75(n+1)th observation
D. 20th percentile = 0.2(n+1)th observation
E. 15th percentile = 0.15(n+1)th observation


•The Pth percentile is a value such that P% of the observations are less than or equal to it and the remaining (100 – P)% are greater than or equal to it.
•With p = P/100, the Pth percentile is:
–The observation at position p(n+1), if p(n+1) is an integer.
–The average of the kth and (k+1)th observations if p(n+1) is not an integer, where k is the largest integer less than p(n+1).
•If p(n+1) = 3.6, take the average of the 3rd and 4th observations.

•Given a sample of size n = 60, find the 10th percentile of the data set.
=> p(n+1) = 0.10(60+1) = 6.1
= average of the 6th and 7th observations
–10% of the observations are less than or equal to this value and 90% of them are greater than or equal to it.
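The p(n+1) rule can be written as a small Python function. This is a sketch of the rule as stated in these slides (other textbooks and software use different interpolation conventions); the function name `percentile` and the sample values 1 to 60 are hypothetical, chosen only so that n = 60 as in the example.

```python
import math

def percentile(values, p):
    """Percentile by the p(n+1) rule, with p given as a proportion (0 < p < 1)."""
    ordered = sorted(values)                  # arrange in ascending order
    n = len(ordered)
    position = p * (n + 1)
    if position < 1 or position > n:
        raise ValueError("p(n+1) falls outside the observations")
    nearest = round(position)
    if math.isclose(position, nearest, abs_tol=1e-9):
        return ordered[nearest - 1]           # integer position: that observation
    k = math.floor(position)                  # largest integer less than p(n+1)
    return (ordered[k - 1] + ordered[k]) / 2  # average of the kth and (k+1)th

sample = list(range(1, 61))                   # hypothetical data, n = 60
p10 = percentile(sample, 0.10)                # position 6.1: average of 6th and 7th values
print(p10)
```

For very small samples or extreme percentiles, p(n+1) can fall below 1 or above n; the sketch raises an error in that case rather than guessing a value.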


How can the lower quartile, median and upper quartile be used to judge the symmetry of a distribution?
1. If the distribution is symmetric, then the upper and lower quartiles should be approximately equally spaced from the median.

Cont…
2. If the upper quartile is farther from the median than the lower quartile, then the distribution is positively skewed.
3. If the lower quartile is farther from the median than the upper quartile, then the distribution is negatively skewed.


Box plots are useful for comparing two or more groups of observations.

Outlying values
•The lines coming out of the box are called the "whiskers".
•The ends of the "whiskers" are called "adjacent values" (the largest and smallest non-outlying values).
•Upper "adjacent value" = the largest value that is less than or equal to P75 + 1.5 × (P75 – P25).
•Lower "adjacent value" = the smallest value that is greater than or equal to P25 – 1.5 × (P75 – P25).

•The box plot is then completed:
–Draw a vertical bar from the upper quartile to the largest non-outlying value in the sample
–Draw a vertical bar from the lower quartile to the smallest non-outlying value in the sample

Cont…
–Outliers are displayed as dots (or small circles) and are defined by:
Values greater than 75th percentile + 1.5 × IQR
Values smaller than 25th percentile − 1.5 × IQR
–Any values that are outside the IQR but are not outliers are marked by the whiskers on the plot.
–IQR = P75 – P25
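The quantities needed to draw a box plot can be computed as in the sketch below. It assumes the p(n+1) rule for the quartiles and at least three observations; the function names are hypothetical, and the data are the 15 values of the earlier stem-and-leaf example plus one made-up extreme value (200) added purely to produce an outlier.

```python
import math

def percentile(ordered, p):
    """p(n+1) rule on sorted data: exact observation, or average of two neighbours."""
    position = p * (len(ordered) + 1)
    nearest = round(position)
    if math.isclose(position, nearest, abs_tol=1e-9):
        return ordered[nearest - 1]
    k = math.floor(position)
    return (ordered[k - 1] + ordered[k]) / 2

def box_plot_summary(values):
    """Quartiles, adjacent values (whisker ends) and outliers for a box plot."""
    ordered = sorted(values)
    q1, q2, q3 = (percentile(ordered, p) for p in (0.25, 0.50, 0.75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr,
            "lower_adjacent": inside[0], "upper_adjacent": inside[-1],
            "outliers": outliers}

# Stem-and-leaf example data, plus one made-up extreme value (200).
data = [43, 28, 34, 61, 77, 82, 22, 47, 49, 51, 29, 36, 66, 72, 41, 200]
summary = box_plot_summary(data)
print(summary)
```

The whiskers end at the adjacent values rather than at the fences themselves, so the made-up value of 200 is drawn as a separate dot.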


•Number of cigarettes smoked per day was measured just before each subject attempted to quit smoking.

10. Scatter plot
•Most studies in medicine involve measuring more than one characteristic, and graphs displaying the relationship between two characteristics are common in the literature.
•When both variables are qualitative, we can use a multiple bar graph.
•When one of the characteristics is qualitative and the other is quantitative, the data can be displayed in box and whisker plots.

•For two quantitative variables we use bivariate plots (also called scatter plots or scatter diagrams).
•In the study on percentage saturation of bile, information was collected on the age of each patient to see whether a relationship existed between the two measures.

•A scatter diagram is constructed by drawing X- and Y-axes.
•Each point or dot (•) represents a pair of values measured for a single study subject.

•The graph suggests the possibility of a positive relationship between age and percentage saturation of bile in women.

11. Line graph
•Useful for assessing the trend of a particular situation over time
•Helps in monitoring the trend of epidemics.
•The time, in weeks, months or years, is marked along the horizontal axis, and
•Values of the quantity being studied are marked on the vertical axis.

Cont…
•Values for each category are connected by a continuous line.
•Sometimes two or more graphs are drawn on the same graph using the same scale, so that the plotted graphs are comparable.

No. of microscopically confirmed malaria cases by species and month at Zeway malaria control unit, 2003
[Line graph: no. of confirmed malaria cases (0 to 2100) by month (Jan to Dec); lines: positive, P. falciparum, P. vivax]

•A line graph can also be used to depict the relationship between two continuous variables, like a scatter diagram.
•The following graph shows the level of zidovudine (AZT) in the blood of AIDS patients at several times after administration of the drug, for patients with normal fat absorption and with fat malabsorption.

Response to administration of zidovudine in two groups of AIDS patients in hospital X, 1999
[Line graph: blood zidovudine concentration (0 to 8) against time since administration (10 to 360 min.); lines: fat malabsorption, normal fat absorption]

Exercise
Suppose a sample of 38 female university students was asked
their weights in pounds. This was actually done, with the
following results:
130 108 135 120 97 110 105 120
130 112 123 117 170 124 130 120
120 133 87 130 160 128 115 120
110 135 115 127 102 130 100
89 135 87 135 115 110 125

Cont…
(a) Suppose we want 9 class intervals. Find CW.
(b) Construct a frequency distribution.
(c) Construct the corresponding histogram.
(d) Construct a frequency polygon and cumulative
frequency polygon
(e) Construct a stem and leaf plot

Remember:
A graph is a tool.
It is not artwork to hang above your sofa!
It is more important that it is easy to correctly interpret than it is that it is pretty!

Exercise
1. Which pictorial presentation of data is appropriate to present the association between blood cholesterol level and heart rate?
2. If the investigator is interested in seeing the distribution of birth weight with regard to sex of the newborn, which method is better to present the data?
3. If a value represents the 95th percentile, this means: ______________________________

Cont…
4. In numerical data, the value for Q3 can never be smaller than the value for Q1.
5. The slope of an ogive curve can never be negative if we use the "more than" cumulative frequency method.