Descriptive Statistics, Biostatistics Course

abelyegon7 7 views 112 slides Oct 25, 2025

About This Presentation

Descriptive statistics in Biostatistics involves methods for summarizing and presenting biological and health-related data. It includes calculating measures like the mean, median, mode, and standard deviation, and creating graphs and tables to easily understand the characteristics and distribution o...


Slide Content

Descriptive Statistics
Measures of Central Tendency And Dispersion

Descriptive statistics
Concerned with exploring, visualizing, and summarizing data in the initial stages of
data analysis, without fitting the data to any model

Because no models are involved, descriptive statistics cannot be used to test hypotheses or to make
testable predictions
Nevertheless, they are a very important part of analysis and can reveal many interesting
features in the data
Descriptive statistics are used throughout data analysis in a number of different
ways. Simply stated, they refer to means, ranges, and numbers of valid cases of one
variable
First, descriptive statistics are important in data cleaning
Second, a typical use of descriptive analysis is to produce a situation analysis or a
snapshot of the situation under study

Descriptive Statistics
•Standard Descriptive Statistics analysis for Continuous Variables

Descriptive statistics are summary measures used to describe and
understand the main features of numerical (continuous) data

They help nurses and researchers to summarize, interpret, and
communicate data about measurements such as blood pressure, weight, or
temperature

Continuous variables can take any value within a range, so descriptive
statistics help simplify and summarize large amounts of such data
Statistic | Description | Nursing Example
Mean | Sum of all values ÷ number of observations | Average systolic blood pressure of patients in a ward
Median | Middle value when data are arranged in order | Median weight of newborn babies
Mode | Most frequently occurring value | Most common body temperature recorded among patients

Measures of Dispersion (Variability)
Statistic | Description | Nursing Example
Range | Difference between the highest and lowest values | Range of patients’ pulse rates in ICU
Variance (s²) | Average of squared deviations from the mean | Variability in blood glucose levels
Standard Deviation (SD) | Square root of variance; shows how much data deviate from the mean | SD of systolic BP among diabetic patients
Interquartile Range (IQR) | Range between 25th and 75th percentiles | Spread of birth weights after removing extreme low/high values
•These describe how spread out or consistent the data are around the mean
•A small SD means data are closely clustered around the mean; a large SD
shows wide variability
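
The measures in the two tables above can be computed with Python's standard `statistics` module. This is a minimal sketch: the blood pressure readings are hypothetical values invented for illustration, and the quartiles use the module's default interpolation method, which may differ slightly from other textbook conventions.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg) for ten patients.
systolic_bp = [118, 122, 125, 130, 130, 132, 135, 140, 145, 160]

# Measures of central tendency
mean_bp = statistics.mean(systolic_bp)
median_bp = statistics.median(systolic_bp)
mode_bp = statistics.mode(systolic_bp)

# Measures of dispersion
value_range = max(systolic_bp) - min(systolic_bp)
variance = statistics.variance(systolic_bp)  # sample variance (divides by n - 1)
sd = statistics.stdev(systolic_bp)           # square root of the sample variance
q1, _, q3 = statistics.quantiles(systolic_bp, n=4)
iqr = q3 - q1

print(f"Mean={mean_bp:.1f}  Median={median_bp}  Mode={mode_bp}")
print(f"Range={value_range}  Variance={variance:.1f}  SD={sd:.1f}  IQR={iqr:.1f}")
```

The single high reading (160) pulls the mean above the median, which is the kind of sensitivity the later slides on outliers describe.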

Measures of Distribution Shape
Statistic | Description | Interpretation in Nursing Data
Skewness | Indicates symmetry of data | Positive skew: longer right tail (e.g., hospital stay length)
Kurtosis | Indicates "peakedness" or flatness of data | High kurtosis: many values near the mean (narrow peak) together with heavier tails
These describe the pattern or symmetry of the data

Other Useful Descriptive Measures
Statistic | Purpose | Example
Minimum and Maximum | Show the range of observed values | Minimum and maximum temperatures recorded in patients
Percentiles / Quartiles | Divide data into equal parts | 50th percentile = median blood pressure
Confidence Interval (CI) | Gives range within which the true mean likely falls | Mean hemoglobin ± 95% CI

Graphical Summaries: To visualize continuous data:
Graph Type | Purpose | Example
Histogram | Shows frequency distribution | Distribution of BMI among students
Box-and-Whisker Plot | Shows median, IQR, and outliers | Comparing patient temperature distributions between wards
Line Graph | Shows changes over time | Daily average pulse rate trends

Descriptive Statistics
Category | Statistic | Purpose
Central Tendency | Mean, Median, Mode | Show the “average” value
Dispersion | Range, Variance, SD, IQR | Show variability/spread
Distribution Shape | Skewness, Kurtosis | Show data symmetry and shape
Position | Percentiles, Quartiles | Show relative standing
Graphical Tools | Histogram, Box plot, Line graph | Visual presentation

Standard Descriptive Statistics for Discrete Variables
A discrete variable is a quantitative variable that represents countable values
(usually whole numbers)

Examples include the number of patients, number of doses, or number of
infections

Since these variables are counted rather than measured, their descriptive statistics focus
on frequencies and proportions rather than continuous measurements

Frequency Distribution

This is the most common descriptive statistic for discrete variables

It shows how often each value occurs; the most frequent value is the mode
Relative Frequency (Proportion or Percentage)

Shows the proportion of each category relative to the total

Measures of Central Tendency

Even though discrete data are countable, we can still describe their average (typical) values
Measures of Dispersion (Variability)

These show how much the counts vary from the average
Measures of Proportion / Rate

Used when expressing discrete events relative to a population or total count

Graphical Representations

Discrete variables are best presented using graphs for count data

Frequency Distribution
Frequency tables summarize counts and percentages of occurrences
Table 1: Frequency Distribution Table for Pregnant Mothers Attending Antenatal Care on Their First Visit at Karatina University Clinic, January 2025
Trimester (first visit) | Frequency (number)
1st Trimester | 33
2nd Trimester | 33
3rd Trimester | 34
Total | 100

Relative Frequency (Proportion or Percentage)
Relative frequency is a measure that shows how often a particular value or
category occurs in relation to the total number of observations

It expresses each category’s frequency as a fraction (proportion) or
percentage of the total.
Relative frequency tells you “what fraction or percentage of the total” each
group represents
Table 2: Relative Frequency Distribution for Pregnant Mothers Attending Antenatal Care on Their First Visit at Karatina University Clinic, January 2025
Trimester | Frequency (f) | Relative Frequency (f/N) | Percentage (%)
First Trimester | 33 | 0.33 | 33.0%
Second Trimester | 33 | 0.33 | 33.0%
Third Trimester | 34 | 0.34 | 34.0%
Total | N=100 | 1.00 | 100.0%
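
The relative frequency table can be reproduced from raw records with a few lines of Python. This is a small sketch: the list `visits` is reconstructed from the counts in Table 1, and the variable names are illustrative only.

```python
from collections import Counter

# Rebuild the 100 first-visit records from the counts reported in Table 1.
visits = ["First"] * 33 + ["Second"] * 33 + ["Third"] * 34

counts = Counter(visits)       # frequency (f) of each trimester
total = sum(counts.values())   # N

# For each trimester: (f, relative frequency f/N, percentage)
table = {
    trimester: (f, f / total, 100 * f / total)
    for trimester, f in counts.items()
}

for trimester, (f, rel, pct) in table.items():
    print(f"{trimester} Trimester  f={f}  f/N={rel:.2f}  {pct:.1f}%")
print(f"Total  N={total}")
```

The relative frequencies always sum to 1 (and the percentages to 100%), which is a quick check that no record was dropped or double counted.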

Measures of Proportion and Rate
These measures describe how often an event occurs in relation to a total
population, and they are very important in nursing, public health, and
epidemiology — for monitoring disease, health service use, and outcomes
A proportion is a type of ratio in which the numerator is part of the
denominator
It tells us what fraction of the total population has a particular characteristic
A proportion shows “how many out of how many” have something
If 20 out of 100 ANC mothers tested positive for anemia:
Proportion=20/100=0.20=20%
Interpretation: 20% of the mothers had anemia at their first ANC visit

Measures of Proportion and Rate
Rate

A rate measures how quickly or how often an event occurs in a defined
population over a specific time period

It reflects both the number of cases and the time component

A rate adds the element of time to a proportion

Measures of Proportion and Rate

Ratio

A ratio compares two separate quantities that are not
necessarily related
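
The difference between a proportion, a rate, and a ratio is easiest to see side by side. In this sketch the anaemia figures come from the ANC example above, while the malaria and patient counts are hypothetical numbers chosen only for illustration.

```python
def proportion(cases: int, total: int) -> float:
    """Part of a whole: the numerator is included in the denominator."""
    return cases / total


def rate(events: int, population: int, per: int = 1000) -> float:
    """Events in a defined population over a stated time period, per `per` people."""
    return events / population * per


def ratio(a: float, b: float) -> float:
    """Comparison of two separate quantities (neither contains the other)."""
    return a / b


# From the slides: 20 of 100 ANC mothers tested positive for anaemia.
anaemia_proportion = proportion(20, 100)

# Hypothetical: 45 new malaria cases in one year in a population of 15,000.
malaria_rate = rate(45, 15_000, per=1000)

# Hypothetical: 60 female and 40 male patients on a ward.
female_to_male = ratio(60, 40)

print(f"Proportion with anaemia: {anaemia_proportion:.0%}")
print(f"Malaria rate: {malaria_rate:.1f} per 1,000 population per year")
print(f"Female-to-male ratio: {female_to_male:.1f} to 1")
```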

Standard Descriptive Statistics for Categorical
Variables
Categorical variables are variables that represent qualities or
characteristics, rather than numerical quantities
They group individuals or observations into categories (e.g., gender, blood
group, marital status, type of delivery)

These statistics summarize how frequently each category
occurs in a dataset, allowing nurses and researchers to understand patterns and distributions
in qualitative data

Categorical data are summarized by counts, proportions, and
percentages, not by means or standard deviations
For nominal data, the mode is the only applicable measure of central tendency
Graphical tools (bar/pie charts) make data easy to interpret in clinical and public
health settings

Outlier in data
An outlier is a data value that is unusually high or low compared to the
rest of the dataset
It lies far away from the main cluster of data points and can distort
statistical results such as the mean and standard deviation
An outlier is a value that “doesn’t fit in” with the rest of your data
Dataset | Possible Outlier | Explanation
Hemoglobin levels (g/dL) of mothers: 7.5, 9.0, 10.2, 10.5, 10.7, 11.0, 16.8 | 16.8 | Extremely high; may be due to lab error or unusual condition
Weight of newborns (kg): 1.8, 2.4, 2.6, 2.8, 3.0, 6.5 | 6.5 | Abnormally high; possible data entry error or exceptional case
Body temperature (°C): 36.8, 37.0, 37.2, 37.1, 39.5 | 39.5 | Possible fever outlier

Outlier

Causes of Outliers

Measurement error (e.g., faulty instrument, lab mistake)

Data entry error (e.g., typing 65 instead of 6.5)

Sampling error (e.g., selecting non-representative cases)

Natural variation (e.g., a genuinely extreme but valid value)

Changes in conditions (e.g., acute illness temporarily affecting lab values)
In Nursing Practice
Outliers are important because they may:
Indicate measurement or recording errors (lab mistake)
Reflect serious patient conditions (very high BP, low Hb)
Affect clinical decisions and data interpretation

Effect of Outliers on Statistics
Statistic | Effect of Outliers
Mean | Greatly affected (pulled toward the outlier)
Median | Hardly affected
Mode | Not affected
Standard Deviation | Increases (appears more variable)
Range | Greatly affected
IQR | Not affected much
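
The effects listed in the table can be checked on the haemoglobin dataset from the outlier slide by summarizing it with and without the value 16.8. This is an illustrative sketch; `summarize` is a helper name introduced here, and the quartiles use the default method of Python's `statistics.quantiles`.

```python
import statistics

# Haemoglobin levels (g/dL) from the outlier example; 16.8 is the suspected outlier.
hb = [7.5, 9.0, 10.2, 10.5, 10.7, 11.0, 16.8]
hb_without = [x for x in hb if x != 16.8]


def summarize(values):
    """Return the summary measures discussed in the table."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


with_outlier = summarize(hb)
without_outlier = summarize(hb_without)

for key in with_outlier:
    print(f"{key:>6}: with={with_outlier[key]:.2f}  without={without_outlier[key]:.2f}")
```

On this dataset the mean, SD, and range move substantially when the outlier is dropped, while the median and IQR change only slightly.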

How to Handle Outliers
Action | When to Use | Explanation
Verify | Always first step | Check for data entry or measurement errors
Retain | If it is a true valid observation | Include but report its influence
Remove | If clearly due to error or not representative | Exclude from analysis
Use robust statistics | When data have true but extreme values | Use median and IQR instead of mean and SD
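
Before verifying a value, it first has to be flagged. The slides do not prescribe a detection rule, so the sketch below assumes the widely used 1.5 × IQR fence (values below Q1 − 1.5·IQR or above Q3 + 1.5·IQR); the function name `iqr_outliers` and the multiplier `k` are illustrative choices, and a flagged value still needs the "Verify" step above.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed convention, k = 1.5)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


# Datasets from the outlier slide
hb = [7.5, 9.0, 10.2, 10.5, 10.7, 11.0, 16.8]     # haemoglobin, g/dL
birth_weights = [1.8, 2.4, 2.6, 2.8, 3.0, 6.5]    # newborn weight, kg

print("Haemoglobin flagged:", iqr_outliers(hb))
print("Birth weight flagged:", iqr_outliers(birth_weights))
```

With very small samples the quartiles themselves are unstable, so a fence rule like this is only a screening aid, not a decision to remove data.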

Arithmetic Mean

The mean is the sum of all observations divided by the
total number of observations
Characteristic | Description
Mathematically defined | Based on a fixed formula, making it reliable and exact
Uses all observations | Every data point contributes to the mean value
Sensitive to outliers | A single extreme value can greatly affect it
Most commonly used | Preferred for continuous and symmetrical data
Unique | For any dataset, there is only one mean

Importance of the Mean in Biostatistics
Reason | Explanation
1. Represents the central tendency | The mean gives the overall “average” of the data, a single value summarizing the entire dataset
2. Basis for further statistical analysis | Many advanced methods (e.g., standard deviation, variance, t-tests, ANOVA, regression) are based on the mean
3. Useful in comparing groups | Helps compare averages between populations (e.g., mean BP between males and females)
4. Sensitive indicator of change | Changes in the mean can reflect effects of interventions or treatments
5. Supports evidence-based decisions | Used in health planning and evaluation (e.g., average hospital stay, average birth weight)
6. Describes normal (symmetrical) distributions | Mean coincides with median and mode in normal data; useful for interpreting normal curves
7. Facilitates quality control | Helps track changes in laboratory or clinical measurements over time

Applications of Mean in Nursing and Health Research
Application Area | Example / Use
Hospital administration | Mean hospital stay (e.g., 4.3 days) to evaluate efficiency
Maternal and child health | Mean birth weight of newborns to monitor nutrition
Clinical practice | Mean haemoglobin level of antenatal mothers to assess anaemia prevalence
Public health | Mean age of malaria cases to identify target populations
Nursing education | Mean test scores to measure class performance
Quality assurance | Mean laboratory values to detect systematic errors
Research & evaluation | Mean blood pressure before and after intervention to assess effectiveness

Advantages & Limitations of the Mean
Advantage | Description
Simple and easy to calculate | Works with small or large datasets
Uses all data values | More accurate measure than mode or median
Suitable for further statistical testing | Mean forms the base for many inferential analyses
Has mathematical properties | Used in regression, variance, and hypothesis testing
Limitation | Description
Affected by extreme values | Outliers can distort the average
Not suitable for categorical data | Can’t be used with non-numeric variables like blood group
Misleading for skewed data | Median may be better if data are not symmetrical

Median
The median is the middle value when all data
are arranged in order
Characteristics

Divides the data into two equal halves

Not affected by extreme values

Best measure of central tendency for skewed data

Often used for income, hospital stay, weight, etc

Steps to Calculate the Median
Step 1: Arrange the Data

Order all the observations from smallest to largest (ascending order).

Example:
3, 5, 7, 8, 10, 12, 15

Step 2: Determine the Total Number of Observations (n)

Count how many data points you have. Example: n=7
Step 3: Identify Whether n is Odd or Even

If n is odd, median = middle value

If n is even, median = average of the two middle values
Step 4: Locate the Median Position

Use this formula to find the position:

Median position = (n + 1) / 2
Case 1: n is Odd

Example: 3, 5, 7, 8, 10, 12, 15
n = 7
Median position = (7 + 1) / 2 = 4th value

Median = 8
Case 2: n is Even
Example: 3, 5, 7, 8, 10, 12
n = 6
Median position = (6 + 1) / 2 = 3.5 → between the 3rd and 4th values
→ Median = (7 + 8) / 2 = 7.5
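
The four steps can be written as a short function. This is a sketch of the procedure as described on the slides; the name `median_by_steps` is illustrative, and in practice `statistics.median` gives the same result.

```python
def median_by_steps(values):
    """Median following the four steps on the slides."""
    ordered = sorted(values)        # Step 1: arrange the data in ascending order
    n = len(ordered)                # Step 2: count the observations
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")

    position = (n + 1) / 2          # Step 4: 1-based median position

    if n % 2 == 1:                  # Step 3: n is odd -> single middle value
        return ordered[int(position) - 1]

    # n is even -> position falls between two values; average them
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_steps([3, 5, 7, 8, 10, 12, 15]))  # odd case from the slides
print(median_by_steps([3, 5, 7, 8, 10, 12]))      # even case from the slides
```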

Applications of Median in Biostatistics and Nursing
Area of Application | Example / Use
Hospital statistics | Median hospital stay to represent typical patient recovery time
Maternal and child health | Median age of first ANC visit to identify common timing of care-seeking
Public health research | Median income or median household size to describe socioeconomic data
Clinical studies | Median haemoglobin levels to describe a typical patient group
Epidemiology | Median incubation period of a disease in outbreak investigations
Quality assurance | Median laboratory turnaround time to assess service efficiency
Nursing education | Median exam score to represent central performance level of a class
Patient satisfaction | Median satisfaction score (1–5 scale) for service evaluation

Importance of Median in Biostatistics
Reason / Importance | Explanation
1. Not affected by outliers | Unlike the mean, the median is stable even when extreme values (outliers) exist. For example, a few long hospital stays won’t distort the median length of stay
2. Best for skewed data | When data are not normally distributed (e.g., income, hospital stay), the median gives a better representation of central tendency than the mean
3. Suitable for ordinal data | The median can be used for ranked (ordered) data, such as pain scores or satisfaction ratings
4. Divides population into two equal parts | Helps to describe distributions, especially when comparing lower and upper halves (e.g., percentiles, quartiles)
5. Useful in non-parametric statistics | Median is used in non-parametric tests that do not assume normal distribution (e.g., Mann-Whitney U test)
6. Easy to interpret | Median provides a clear and intuitive understanding of a “typical” or “central” observation
7. Stable in small samples | Median gives a consistent summary even for small sample sizes, unlike the mean which may fluctuate
8. Describes real data situations | In health data where extreme values are common (e.g., hospital costs, length of illness), the median reflects the real-world midpoint

Advantages and Limitations of Median
Advantage | Explanation
Not influenced by outliers | Extreme values have little or no effect
Appropriate for skewed data | Represents center of non-symmetrical distributions
Suitable for ordinal data | Can be used for ranked observations
Simple interpretation | Easy to explain to non-statistical audiences
Divides data into two halves | Useful in percentiles and quartile calculations
Limitation | Description
Not based on all observations | Only considers middle values, ignoring others
Difficult with open-ended classes | Requires grouped data with clear boundaries
Not suitable for further computation | Cannot be used in advanced mathematical analyses like SD or variance
Less stable with small samples | Median may shift slightly if dataset changes

Mode

The mode is the most frequently occurring value in a dataset
It is the observation or class interval that appears most often

The Mode tells you what is most common or what occurs most frequently in your
data
Characteristic | Description
Represents typical value | It identifies the most common or popular observation
Easiest to determine | Can be found just by inspecting the data (especially for categorical data)
Not affected by extreme values (outliers) | Mode remains stable even if a few extreme values are added
Can be used with all data types | Works with nominal, ordinal, discrete, and continuous data
May have one, two, or more modes | Data can be unimodal, bimodal, or multimodal

Types of Mode
Type | Meaning | Example
Unimodal | One value occurs most often | Mode = 36 weeks (ANC visit week)
Bimodal | Two values occur most often | Modes = 28 and 36 weeks
Multimodal | More than two frequent values | Multiple peaks in frequency
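
Python's `statistics.multimode` returns every value tied for the highest frequency, which maps directly onto the three types in the table. The gestational-week lists below are hypothetical, and `classify_modality` is an illustrative helper rather than a standard function.

```python
from statistics import multimode


def classify_modality(values):
    """Return the mode(s) and a label based on how many values tie for most frequent."""
    modes = multimode(values)
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    # Caveat: if every value occurs exactly once, all values are returned as
    # "modes"; in that case the data are better described as having no mode.
    return modes, label


# Hypothetical gestational weeks at ANC visit
one_peak = [28, 32, 36, 36, 36, 40]
two_peaks = [28, 28, 32, 36, 36, 40]

print(classify_modality(one_peak))
print(classify_modality(two_peaks))
```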

Importance of Mode in Biostatistics
Reason | Explanation / Importance
1. Represents the most typical case | In healthcare data, Mode shows the most common observation (e.g., most frequent age group, diagnosis, or lab value)
2. Easy interpretation | Especially helpful for non-numerical (categorical) data such as blood group, gender, or disease type.
3. Not affected by outliers | Useful when data have extreme values that distort the mean
4. Best for nominal data | Only measure of central tendency applicable for nominal variables (e.g., most common blood type)
5. Useful for health planning | Helps identify most common health needs, disease patterns, or age group utilization in hospitals
6. Aids in communication | Describes findings in a simple, non-technical way that is easy for health staff and policy makers to understand
7. Supports decision-making | Identifies target groups for interventions (e.g., most common infection age group)

Applications of Mode in Nursing and Biostatistics
Application Area | Example / Use of Mode
Public health planning | Mode of age for malaria cases helps plan preventive campaigns
Clinical research | Mode of haemoglobin levels identifies the most common patient category
Hospital statistics | Mode of hospital stay duration shows typical discharge time
Maternal and child health | Mode of parity (number of children) shows most common fertility pattern
Epidemiological data | Mode of disease type or blood group helps understand disease distribution
Patient satisfaction surveys | Most common rating helps gauge public opinion

Range
The range is the difference between the highest and lowest values
Range = Maximum − Minimum

Simple to calculate but affected by outliers

Gives a quick sense of spread of the data

Does not show how data are distributed between the extremes

Example

Range = 45 − 18 = 27 years
Health Scenario | Use of Range
Clinical trials | To describe variability in drug response between patients
Epidemiology reports | To summarize age range of infected individuals
Public health reporting | To indicate range of coverage or service utilization (e.g., immunization rates)
Hospital management | To evaluate variation in patient waiting times or bed occupancy

Applications of Range in Health Setup
Area of Application | Example / Use of Range | Purpose / Interpretation
1. Clinical laboratory results | Range of haemoglobin levels among pregnant women (e.g., 7.8 – 14.2 g/dL) | Shows how much patients differ in haemoglobin levels; helps detect anaemia extremes.
2. Hospital stay duration | Range of hospital stay = 2 to 21 days | Indicates variation in recovery time; helps plan bed capacity and resource use.
3. Body temperature monitoring | Range of temperatures among patients = 36.4°C – 39.8°C | Detects presence of fever spikes and variation in patient condition.
4. Blood pressure readings | Range of systolic BP = 100 – 180 mmHg | Shows spread in patient blood pressure; may indicate hypertensive cases.
5. Maternal health | Range of gestational ages at first ANC = 8 – 30 weeks | Helps identify delays in seeking care and plan targeted health education.
6. Child growth monitoring | Range of birth weights = 2.1 – 4.3 kg | Reveals variation in newborn health; identifies underweight or macrosomic infants.
7. Epidemiological data | Range of ages in malaria cases = 1 – 70 years | Helps describe affected population groups for intervention planning.
8. Laboratory quality control | Range of control values for glucose = 4.8 – 5.2 mmol/L | Detects measurement consistency and test precision.
9. Nursing education | Range of student exam scores = 42 – 89% | Describes variation in academic performance.
10. Community health surveys | Range of household incomes = KSh 3,000 – 60,000 | Illustrates socioeconomic disparity in a community.

Advantages and Limitations of Range
Advantage | Explanation
Simple to calculate and interpret | Only two values (max and min) are needed.
Quick overview of variability | Gives a rough idea of how spread out data are.
Useful for preliminary analysis | Helps identify data spread before computing advanced statistics.
Good for small datasets | Especially when full data analysis tools are unavailable.
Limitation | Explanation
Affected by outliers | One extremely high or low value can distort the range
Does not show distribution | Doesn’t tell how values are spread within the range
Not reliable for large datasets | Because it only depends on two extreme values

Interquartile Range (IQR)
The IQR is the range of the middle 50% of the data — the difference
between the 75th percentile (Q3) and the 25th percentile (Q1)
IQR = Q3 − Q1
Characteristics

Not affected by outliers

Represents the spread around the median

Used when data are skewed or not normally distributed

Example

Q1 = 22 years, Q3 = 31 years → IQR = 31 − 22 = 9 years
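
To see the range and the IQR computed from raw values, the sketch below uses a hypothetical list of seven ages chosen so that the results match the figures quoted on the slides (range 45 − 18 = 27 years; Q1 = 22, Q3 = 31). Quartile conventions vary between textbooks and software; this uses the default method of `statistics.quantiles`.

```python
import statistics

# Hypothetical ages (years), chosen to reproduce the slide examples.
ages = [18, 22, 24, 26, 29, 31, 45]

value_range = max(ages) - min(ages)            # Range = Maximum - Minimum
q1, median, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1                                  # IQR = Q3 - Q1

print(f"Range = {max(ages)} - {min(ages)} = {value_range} years")
print(f"Q1 = {q1:g}, Median = {median:g}, Q3 = {q3:g}")
print(f"IQR = {q3:g} - {q1:g} = {iqr:g} years")
```

The range is driven entirely by the oldest person (45), whereas the IQR ignores both extremes and describes only the middle half of the group.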

Standard Deviation (SD)
The SD measures the typical distance of data points from the mean; it is the square root of the variance
Characteristics

Shows how spread out data are around the mean

Small SD: data close to the mean (less variable)

Large SD: data widely spread (more variable)

Reported together with the mean (not the median)

Best for normally distributed data (Mean = Median = Mode)
The Standard Deviation (SD) is a key tool in the health sector
for evaluating variability, consistency, and reliability in data
Whether in clinical laboratory quality control, public health
research, or nursing education, SD helps healthcare
professionals make accurate, evidence-based decisions by
understanding how much variation exists in the data
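
The calculation behind the SD can be spelled out step by step. This sketch computes the sample SD (dividing by n − 1) on hypothetical fasting glucose values and compares it with the standard library's `statistics.stdev`; the function name `sample_sd` is illustrative.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation: square root of the sample variance."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    variance = sum(squared_deviations) / (n - 1)   # sample variance, s²
    return math.sqrt(variance)


# Hypothetical fasting blood glucose values (mmol/L)
glucose = [5.1, 6.8, 7.2, 7.9, 9.0]

print(f"Mean = {statistics.mean(glucose):.2f} mmol/L")
print(f"SD (step by step) = {sample_sd(glucose):.3f} mmol/L")
print(f"SD (statistics.stdev) = {statistics.stdev(glucose):.3f} mmol/L")
```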

Importance of Standard Deviation in the Health Sector
Importance | Explanation / Use
1. Measures consistency and reliability | SD helps determine how consistent clinical measurements are, for example, variation in blood glucose levels after fasting among diabetic patients.
2. Helps in quality control of laboratory tests | In clinical laboratories, SD is used to monitor precision of tests (e.g., control samples in haemoglobin estimation). A low SD means reliable results.
3. Evaluates treatment effectiveness | In drug trials, SD shows how consistently patients respond to treatment. A smaller SD means most patients benefited similarly.
4. Assesses population variability | SD is used in epidemiology to measure how much variables like BMI, blood pressure, or cholesterol vary in a population.
5. Aids in risk assessment | Health risk factors (e.g., blood sugar, cholesterol) with high SD indicate greater variability and potential risk in the population.
6. Used in biostatistical analysis | SD is crucial for computing confidence intervals, t-tests, and hypothesis testing in medical research.
7. Monitors performance in healthcare delivery | Used to check variability in patient waiting times, hospital stay durations, or satisfaction scores; helps identify inefficiencies.
8. Nursing education and evaluation | In nursing schools, SD helps measure variability in students’ exam scores, showing performance consistency.
9. Clinical decision-making | Helps interpret normal versus abnormal findings. For example, “normal” lab ranges are often defined as the mean ± 2 SD.
10. Supports evidence-based practice | Research studies report mean ± SD to describe data accurately, guiding clinical practice and policy formulation.

Examples of Application in Health Setup
Scenario | Application of SD | Interpretation
1. Blood pressure among adults | Mean = 130 mmHg, SD = 5 mmHg | BP readings are fairly consistent (low variation).
2. Fasting blood sugar in diabetics | Mean = 7.2 mmol/L, SD = 2.5 mmol/L | High variation; indicates some patients have poor control.
3. Length of hospital stay | Mean = 4.0 days, SD = 1.0 days | Most patients stay between 3–5 days.
4. Birth weight | Mean = 3.2 kg, SD = 0.5 kg | About 68% of newborns weigh between 2.7 and 3.7 kg (within ±1 SD), assuming a normal distribution.
5. Nursing student exam scores | Mean = 72%, SD = 12% | Large SD indicates varied performance levels among students.

Interpretation of SD in Normal Distribution

In a normal (bell-shaped) distribution:

68% of values fall within ±1 SD of the mean

95% of values fall within ±2 SD

99.7% of values fall within ±3 SD

Example:
If mean systolic BP = 120 mmHg and SD = 10 mmHg,
then:

68% of people have BP between 110–130 mmHg

95% have BP between 100–140 mmHg
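
The 68–95–99.7 figures can be verified numerically for the blood pressure example. This sketch assumes systolic BP is exactly normally distributed with mean 120 and SD 10, as in the example, and uses `statistics.NormalDist` from the standard library; `interval_and_coverage` is an illustrative helper name.

```python
from statistics import NormalDist

# Normal model from the example: mean 120 mmHg, SD 10 mmHg
bp = NormalDist(mu=120, sigma=10)


def interval_and_coverage(dist, k):
    """Return (low, high, share of values) for mean ± k standard deviations."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    coverage = dist.cdf(high) - dist.cdf(low)
    return low, high, coverage


for k in (1, 2, 3):
    low, high, coverage = interval_and_coverage(bp, k)
    print(f"±{k} SD: {low:.0f}–{high:.0f} mmHg covers {coverage:.1%} of values")
```

The quoted percentages are rounded: ±2 SD covers slightly more than 95% under a normal model, and none of the figures apply if the data are strongly skewed.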

Advantages & Limitations of SD in Health Statistics
Advantage | Explanation
Comprehensive measure | Uses all data points for accuracy
Supports inferential statistics | Essential for t-tests, regression, and confidence intervals
Describes reliability | Indicates precision and consistency in measurements
Applicable in research | Widely used in medical and nursing studies
Objective and precise | Mathematically defined and less affected by small changes
Limitation | Explanation
Sensitive to outliers | Extreme values can inflate the SD
Requires normal distribution | Interpretation assumes data are approximately normal
Difficult to understand intuitively | More complex than range or IQR for non-statisticians

Comparison: Mode vs Mean vs Median
Measure | When Useful | Data Type | Effect of Outliers | Example in Nursing
Mean | For symmetric, quantitative data | Interval/Ratio | Affected | Average body temperature
Median | For skewed distributions | Ordinal/Continuous | Not affected | Median income of nurses
Mode | For nominal or typical values | Nominal/Ordinal | Not affected | Most common blood group

Summary Table
Measure | Type | Purpose | Sensitive to Outliers? | Used When Data Are | Reported With
Mean | Central tendency | Average value | Yes | Normal (symmetric) | Standard deviation
Median | Central tendency | Middle value | No | Skewed | IQR or Range
Mode | Central tendency | Most common value | No | Any | Alone or with others
Range | Dispersion | Spread between extremes | Yes | Any | Alone or with median
IQR | Dispersion | Spread of middle 50% | No | Skewed | With median
SD | Dispersion | Variation around mean | Yes | Normal | With mean

Reporting Measures of Central Tendency and Dispersion Together
Type of Data Distribution | Measures Reported | Example Reporting Format
Normal (Symmetrical) | Mean ± Standard Deviation (SD) | Mean age = 27.4 ± 4.5 years
Skewed (Asymmetrical) | Median (IQR) or Median (Range) | Median age = 26 (22–31) years
Categorical (Qualitative) | Mode, Frequency, Percentage | Most common trimester: 3rd (34%)
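
The two numeric reporting formats can be produced by a small helper. In this sketch the caller decides whether the data are skewed (the function does not test for normality), the ages are hypothetical values chosen to reproduce the "Median age = 26 (22–31) years" example, and the name `report` is illustrative.

```python
import statistics


def report(values, skewed, label="Value", unit=""):
    """Format a summary as 'Mean = m ± sd' or 'Median = m (Q1–Q3)'."""
    if skewed:
        q1, median, q3 = statistics.quantiles(values, n=4)
        text = f"Median {label} = {median:g} ({q1:g}–{q3:g}) {unit}"
    else:
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        text = f"Mean {label} = {mean:.1f} ± {sd:.1f} {unit}"
    return text.strip()


# Hypothetical ages (years); the value 45 makes the distribution right-skewed.
ages = [18, 22, 24, 26, 29, 31, 45]

print(report(ages, skewed=True, label="age", unit="years"))
print(report(ages, skewed=False, label="age", unit="years"))
```

Printing both formats for the same data shows why the choice matters: the mean is pulled above the median by the single older person.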

Skewness (Skewedness) in Biostatistics
Skewness describes the asymmetry (lack of symmetry) in
the distribution of data values around the mean
If the data values are symmetrically distributed, the left
and right sides of the graph look similar

No skewness
(Normal distribution)

If one tail is longer or stretched out, the data are skewed
Skewness tells us which side the “tail” of the graph
points to

When the Mean Is Greatest (CT)
Scenario | Description | Distribution Type | Order of Central Tendency
Length of hospital stay | Most patients stay 2–4 days, but a few stay 15–20 days | Positively skewed (right-skewed) | Mean > Median > Mode
Cost of medical treatment | Most treatments are inexpensive, but some are very costly | Positively skewed | Mean > Median > Mode
Income of nurses | Most nurses earn moderate salaries, but a few in management earn much more | Positively skewed | Mean > Median > Mode
•This pattern indicates a positively skewed (right-skewed) distribution
•In a positively skewed distribution, the tail of the curve extends to the
right (toward higher values)
•A few very high (large) values pull the mean upward, while most of the
data are concentrated on the left (lower) side
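
The ordering Mean > Median > Mode can be checked on a small right-skewed dataset. The hospital stay values below are hypothetical, and the skewness formula used is the simple moment coefficient (third central moment divided by the 1.5 power of the second), which is one of several definitions in use; `moment_skewness` is an illustrative name.

```python
import statistics


def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (positive = right-skewed)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical lengths of hospital stay (days): most short, a few very long.
stay_days = [2, 2, 2, 3, 3, 4, 4, 15, 20]

mean_stay = statistics.mean(stay_days)
median_stay = statistics.median(stay_days)
mode_stay = statistics.mode(stay_days)
skew = moment_skewness(stay_days)

print(f"Mean = {mean_stay:.1f}, Median = {median_stay}, Mode = {mode_stay}")
print(f"Skewness = {skew:.2f} (positive means a longer right tail)")
```

The ordering is a useful rule of thumb for unimodal data such as these, but it is not guaranteed for every skewed dataset.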

When the Mode Is Greatest (CT)
This pattern represents a negatively skewed (left-skewed)
distribution
In a negatively skewed distribution, the tail of the curve
extends to the left (toward lower values)
A few very low (small) values pull the mean downward, while
most of the data are concentrated on the right (higher) side
Therefore, the Mode is the greatest, and the Mean is the
smallest measure of central tendency
Order of Central Tendency (Left-Skewed Data):

Mean<Median<Mode

Example in Health Context
Scenario | Description | Distribution Type | Order of Central Tendency
Age at death in a hospital | Most patients are elderly, but a few die young (early deaths pull mean down) | Negatively skewed (left-skewed) | Mean < Median < Mode
Age at retirement of nurses | Most nurses retire around 60, but a few retire early due to illness | Negatively skewed | Mean < Median < Mode
Patient satisfaction scores | Most patients rate services very high, but a few give very low ratings | Negatively skewed | Mean < Median < Mode
Blood pressure levels | Most patients have high-normal BP, but a few with very low BP reduce the mean | Negatively skewed | Mean < Median < Mode

Graphical Presentations
Graphical presentation refers to displaying data visually using charts,
graphs, or diagrams

It helps to summarize, interpret, and compare data easily — making trends,
patterns, and variations clear at a glance
Graphs turn numbers into pictures that tell a story about your data
Importance | Explanation
Makes data easy to understand | Visuals simplify complex numerical information
Highlights trends and patterns | Shows changes over time (e.g., disease cases per month)
Aids quick comparison | Between groups, hospitals, or interventions.
Enhances decision-making | Visual summaries help health managers make informed choices
Useful in reports and research | Commonly used in nursing research, epidemiological studies, and health statistics reports
Engages audience | Visual data attract more attention in presentations and publications

Types of Graphical Presentations in
Descriptive Statistics
Bar Chart

Used for: Categorical or discrete data

Structure: Bars of equal width, separated by spaces

Orientation: Vertical or horizontal

Example in Health:
Number of pregnant mothers attending ANC by trimester
Cases of malaria by county

Interpretation: Height of the bar represents frequency

Pie Chart

Used for: Categorical data showing
proportion or percentage of each
category in a whole

Structure: Circle divided into slices
(each slice = a category)

Example in Health:
Proportion of blood donors by
blood group
Distribution of patients by diagnosis

Interpretation: Larger slices
represent higher proportions

Histogram

Used for: Continuous (quantitative) data divided into class
intervals

Structure: Bars touch each other (no gaps)

Shows: Frequency distribution.

Example in Health:

Distribution of patients’ ages

Haemoglobin levels among pregnant women.

Interpretation: Shape shows skewness or normality of data

Line Graph

Used for: Data that change
over time (time-series
data)

Structure: Points
connected by lines

Example in Health:

Monthly malaria cases from
January to December.

Yearly trend in HIV
prevalence

Interpretation: Shows
upward or downward trends


Scatter Diagram (Scatter Plot)

Used for: Showing relationship
(correlation) between two continuous
variables

Structure: Points plotted in X–Y plane

Example in Health:

Relationship between rainfall intensity
and malaria cases

Relationship between BMI and fasting
blood sugar

Interpretation: Direction of the dots
shows type of correlation:

Upward = Positive correlation

Downward = Negative correlation

Random = No correlation
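
The direction read off a scatter diagram can also be summarized as a number. The sketch below computes Pearson's correlation coefficient r, which the slides do not define, so treat it as an added assumption: r near +1 matches an upward pattern and r near -1 a downward one. All values are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical patient data, invented for this example only.
bmi = [20, 22, 25, 28, 31, 35]
fasting_sugar = [4.6, 4.9, 5.3, 5.8, 6.4, 7.1]   # rises with BMI
exercise_hours = [9, 8, 6, 5, 3, 1]              # falls as BMI rises

r_up = pearson_r(bmi, fasting_sugar)     # positive: dots trend upward
r_down = pearson_r(bmi, exercise_hours)  # negative: dots trend downward
print(round(r_up, 2), round(r_down, 2))
```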

Box and Whisker Plot (Box Plot)

Used for: Summarizing distribution using
median, quartiles, and outliers.

Structure: A box (IQR) with lines
(“whiskers”) and points (outliers).

Example in Health:

Distribution of hospital stay durations.

Variation in patient body temperature.

Interpretation: Quickly shows spread,
skewness, and outliers
Interpretation:
•The red circles represent outliers (e.g.,
unusually long hospital stays of 30 and 35
days).
•The box captures the central 50% (IQR) of
data.
•The median line inside the box shows the
middle stay duration.
•The whiskers extend to the smallest and
largest non-outlier values
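
The parts of a box plot can be computed directly. This is a minimal sketch with hypothetical lengths of stay (only the 30 and 35 day values echo the outliers mentioned above). The 1.5 x IQR fence rule used to flag outliers is a common convention assumed here; the slides do not state which rule was used.

```python
from statistics import quantiles

# Hypothetical lengths of hospital stay in days.
stays = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 30, 35]

# Quartiles by the "inclusive" method (one of several conventions).
q1, med, q3 = quantiles(stays, n=4, method="inclusive")
iqr = q3 - q1  # the box covers the central 50% of the data

# Assumed outlier rule: points beyond 1.5 x IQR from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in stays if x < lower_fence or x > upper_fence]
inside = [x for x in stays if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(q1, med, q3, iqr, outliers, whisker_low, whisker_high)
```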

Types of Graphical Presentations in
Descriptive Statistics
Type of Graph | Type of Data | Main Use | Example in Health Setup
Bar Chart | Categorical / Discrete | Compare categories | ANC attendance by trimester
Pie Chart | Categorical | Show proportions | Distribution of blood groups
Histogram | Continuous | Show frequency distribution | Age of patients
Frequency Polygon | Continuous | Compare two distributions | Male vs female patients' ages
Line Graph | Time-series | Show trends over time | Monthly malaria cases
Scatter Plot | Continuous | Show correlation | BMI vs blood sugar
Box Plot | Continuous | Show spread and outliers | Length of hospital stay
Pictogram | Categorical | Visual education | Births per month

Statistical software
•Interactive packages
Such as SPSS and Excel, which allow the user to
perform many standard statistical operations at
the click of a mouse
•They are easy to use and useful for applying standard
methods
•The user should only perform operations that
they understand; otherwise, applying all sorts
of statistical methods may lead to poor
interpretations

Descriptive Statistics

Measures of Central Tendency
Descriptive statistics (from a frequency distribution)
1. Measures of central tendency
a. mode - most frequent class (of frequency
distribution)
b. median (ordinal or ratio/interval data) - middle
class
c. mean (ratio/interval data) = "average"; calculation:
Σx/n

1. Mode = can be used for any kind of data, but it is the
only measure of central tendency for nominal or
qualitative data.
Formula: value that occurs most often, or the
category or interval with the highest frequency.

Example for Nominal Variables:

Religion | Frequency | cf | Rel. Freq. | % | Cum %
Catholic | 17 | 17 | .41 | 41 | 41
Protestant | 4 | 21 | .10 | 10 | 51
Jewish | 2 | 23 | .05 | 5 | 56
Muslim | 1 | 24 | .02 | 2 | 58
Other | 9 | 33 | .22 | 22 | 80
None | 8 | 41 | .20 | 20 | 100
Total | 41 | | 1.00 | 100% |
Central Tendency: MODE = largest category = Catholic
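
For nominal data the mode is simply the category with the highest count. A minimal sketch using the frequencies from the religion table (the variable names are illustrative):

```python
from collections import Counter

# Frequencies taken from the religion table above.
religion_counts = Counter({
    "Catholic": 17, "Protestant": 4, "Jewish": 2,
    "Muslim": 1, "Other": 9, "None": 8,
})

total = sum(religion_counts.values())

# most_common(1) returns the single (category, frequency) pair with the largest count.
modal_category, modal_freq = religion_counts.most_common(1)[0]

# Relative frequency of each category, rounded to two decimals.
relative_freq = {k: round(v / total, 2) for k, v in religion_counts.items()}

print(total, modal_category, modal_freq)
print(relative_freq)
```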

Central Tendency (cont.)
2. Median = exact centre or middle of
ordered data. The 50th percentile.
Formula:
Array the data (arrange the values in order).
When the sample size is even, the median falls halfway
between the two middle numbers.
To calculate: find the (n/2)th and ((n/2)+1)th values, add them, and divide
the total by 2 to find the exact median.
When the sample size is odd, the median is the exact
middle value, the ((n+1)/2)th

Example for Raw Data:
Suppose you have the following set of test
scores:
66, 89, 41, 98, 76, 77, 69, 60, 60, 66, 69, 66,
98, 52, 74, 66, 89, 95, 66, 69

1. Array data:

98 98 95 89 89 77 76 74 69 69
69 66 66 66 66 66 60 60 52 41
N = 20 (N is even)

To calculate:
- find the two middle positions: (n/2) and (n/2)+1
- add together the two middle numbers
- divide the total by 2
First middle number: (20/2) = the 10th number
2nd middle number: (20/2)+1 = the 11th number

Look at data:
the middle numbers are 69 and 69
The median would be (69+69)/2 = 69
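
The positional rule can be written as a short function. This is a sketch of the steps above applied to the same test scores; the function name is illustrative.

```python
from statistics import median

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66, 69, 66,
          98, 52, 74, 66, 89, 95, 66, 69]

def median_by_position(values):
    """Median via the (n/2) and (n/2)+1 positions (even n) or (n+1)/2 (odd n)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 0:
        # 1-based positions n/2 and n/2 + 1 are 0-based indexes n/2 - 1 and n/2.
        return (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    return ordered[(n + 1) // 2 - 1]

result = median_by_position(scores)
print(result, median(scores))  # the standard library gives the same value
```

Sorting in ascending or descending order gives the same middle values, so the direction of the array does not matter.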

Median for grouped data:
Find real limits (+/- .5) for class intervals first.
Find interval that contains the median
(using above method)
Formula:
$Md = \text{real lower limit} + \frac{N(0.50) - cf_{\text{below}}}{f} \times i$
(where i is the interval width, cf below is the cumulative frequency of the interval
below the one containing the median and where f is the frequency of the
interval containing the median.)

Example for Grouped Data:
Create frequency table:
Score | Tally | f | cum f | Real limits
41-50 | / | 1 | 1 | 40.5-50.5
51-60 | /// | 3 | 4 | 50.5-60.5
61-70 | //////// | 8 | 12 | 60.5-70.5
71-80 | /// | 3 | 15 | 70.5-80.5
81-90 | // | 2 | 17 | 80.5-90.5
91-100 | /// | 3 | 20 | 90.5-100.5

Example (cont)
From above, we know that the median is the
average of the 10th and 11th cases.
We can see from the table that these cases
would lie in the 61-70 interval.
Frequency of this interval: f = 8
Interval width: i = 10
The real lower limit of this interval is 60.5
The cum f of the interval below is 4
N = 20

Example (cont)
Substitute into formula:
$Md = \text{real lower limit} + \frac{N(0.50) - cf_{\text{below}}}{f} \times i$
$= 60.5 + \frac{(20(0.50) - 4) \times 10}{8}$
$= 60.5 + 7.5 = 68$
The median for the grouped data is 68.
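
The grouped-median formula can be sketched as a function that locates the median interval and interpolates within it. The function name and argument layout are illustrative; the real limits are taken as the stated limits plus or minus 0.5, as in the table.

```python
def grouped_median(intervals, freqs):
    """Median from a frequency table.

    intervals: list of (lower, upper) stated class limits, in ascending order
    freqs: frequency of each interval
    """
    n = sum(freqs)
    half = n * 0.50
    cum_below = 0  # cumulative frequency of the intervals below the current one
    for (lower, upper), f in zip(intervals, freqs):
        if cum_below + f >= half:
            real_lower = lower - 0.5
            width = (upper + 0.5) - real_lower
            return real_lower + (half - cum_below) / f * width
        cum_below += f
    raise ValueError("frequency table is empty")

intervals = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]
freqs = [1, 3, 8, 3, 2, 3]

md = grouped_median(intervals, freqs)
print(md)
```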

Properties of median:
- for numerical data at interval or ordinal level
- "balance point"
- not affected by outliers
- median is appropriate when distribution is
highly skewed.

Mean for Raw Data
The mean is the sum of measurements /
number of subjects

Formula: $\bar{X} = \sum X_i / N$
Data (from above):
66, 89, 41, 98, 76, 77, 69, 60, 60, 66, 69, 66,
98, 52, 74, 66, 89, 95, 66, 69

Example for Mean
Formula: $\bar{X} = \sum X_i / N$
= 1446 / 20
= 72.3
The mean for these test scores is 72.3
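
The same arithmetic in code, as a quick check of the sum and the mean for the test scores:

```python
scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66, 69, 66,
          98, 52, 74, 66, 89, 95, 66, 69]

total = sum(scores)      # sum of all measurements
n = len(scores)          # number of subjects
mean_score = total / n   # sum divided by N

print(total, n, mean_score)
```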

Grouped Data—Class Intervals
Some guidelines when using uniform class-
intervals:
Decide on an appropriate number of class-
interval groupings:
Optimum number depends on the range of values and
the size of the data set.
Large data sets can support a large number of class
groupings and
small data sets can support fewer class groupings.
To start, try creating class-intervals that are of equal and
convenient length (e.g., 10-year age intervals).
Normally, 3 to 12 such class-intervals are sufficient.

Determine the class interval width. This can be
determined with the formula:
Interval width = (Max - Min) / no. of class groupings
e.g., to create 4 class groupings for a data set with a
maximum of 52 and minimum of 5, the
class interval width = (52 - 5) / 4 = 11.75, which for
the current purpose can be "rounded" down to 10
or rounded up to 15.
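
A small sketch of the width calculation. Rounding to a multiple of 5 is an assumption made here to reproduce the choice between 10 and 15; the slides only say the width is rounded to a convenient value.

```python
import math

def class_width(minimum, maximum, n_classes):
    """Raw interval width: (max - min) / number of class groupings."""
    return (maximum - minimum) / n_classes

raw_width = class_width(5, 52, 4)

# Assumption: "convenient" widths are multiples of 5.
step = 5
rounded_down = math.floor(raw_width / step) * step
rounded_up = math.ceil(raw_width / step) * step

print(raw_width, rounded_down, rounded_up)
```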


Set endpoint conventions.

If an observation falls on the boundary between two class
intervals, we need to know in which class interval it will be
counted.

The two choices are to: (a) include the left boundary and
exclude the right boundary or (b) include the right
boundary and exclude the left boundary.
When faced with this choice, we will use option (a). For
example, when considering the 15-unit class-interval of
15 to 30, we will exclude the right boundary of 30, so
that the interval is really 15 to 29.99....
For convenience, this may be written 15-29.

Count and tabulate: Once boundaries are
established, data are tabulated in the usual manner.
Here’s a frequency table for the data {21, 42, 5, 11,
30, 50, 28, 27, 24, 52} using 15-year class-intervals:
Range   Tally  Freq.  Rel.Freq  Cum.Freq
------  -----  -----  --------  --------
0-14    //     2      20%       20%
15-29   ////   4      40%       60%
30-44   //     2      20%       80%
45-59   //     2      20%       100%
------------------------------------------
TOTAL          10     100%      --
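
The tabulation can be reproduced in a few lines. This sketch applies endpoint convention (a): the left boundary is included and the right boundary excluded, so an age of 30 is counted in 30-44. Variable names are illustrative.

```python
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
width = 15

# One counter per class interval [start, start + width).
counts = {(start, start + width): 0 for start in range(0, 60, width)}
for age in ages:
    start = (age // width) * width  # left boundary included, right excluded
    counts[(start, start + width)] += 1

n = len(ages)
cumulative = 0
rows = []
for (lo, hi), freq in counts.items():
    cumulative += freq
    # label, frequency, relative frequency (%), cumulative frequency (%)
    rows.append((f"{lo}-{hi - 1}", freq, 100 * freq / n, 100 * cumulative / n))

for row in rows:
    print(row)
```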


We can also use non-uniform class intervals for the
frequency table if it suits the purpose. For example,
here's a data set with ages grouped into intervals
according to school age:

AGE RANGE, YRS (SCHOOL AGE) | Freq RelFreq CumFreq
----------------------------------------
3 - 4 (PRE) | 11 1.7% 1.7%
5 - 11 (ELEM) | 469 71.7% 73.4%
12 - 13 (MIDDLE)| 100 15.3% 88.7%
14 - 19 (HIGH) | 74 11.3% 100.0%
-----------------------------------------
Total | 654 100.0%

Mean for Grouped Data:
To calculate the mean for grouped data, you
need a frequency table that includes a
column for the midpoints (m), the squared
midpoints, and the product f(m).
Formula: $\bar{X} = \frac{\sum (fm)}{N}$

Frequency table:
Score | f | m | f(m)
41-50 | 1 | 45.5 | 45.5
51-60 | 3 | 55.5 | 166.5
61-70 | 8 | 65.5 | 524
71-80 | 3 | 75.5 | 226.5
81-90 | 2 | 85.5 | 171
91-100 | 3 | 95.5 | 286.5
N = 20, Σ(fm) = 1420

Calculating Mean for Grouped Data:
Formula: $\bar{X} = \frac{\sum (fm)}{N}$
= 1420 / 20
= 71
The mean for the grouped data is 71.
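
A sketch of the grouped-mean calculation, building the midpoint and f(m) columns from the class limits and frequencies in the table:

```python
intervals = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]
freqs = [1, 3, 8, 3, 2, 3]

# Midpoint of each class interval.
midpoints = [(lo + hi) / 2 for lo, hi in intervals]

# Product of frequency and midpoint for each interval.
fm = [f * m for f, m in zip(freqs, midpoints)]

n = sum(freqs)
grouped_mean = sum(fm) / n

print(midpoints)
print(sum(fm), n, grouped_mean)
```

The grouped mean (71) differs slightly from the raw-data mean (72.3) because every score in an interval is treated as if it sat at the midpoint.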

Properties of the Mean:
- only for numerical data at interval/ratio level
- "balance point"
- can be affected by outliers (skewed distribution)
- tail becomes elongated and the mean is pulled in
direction of outlier.
Example…
no outlier:
$30000, 30000, 35000, 25000, 30000 then mean = $30000
but if outlier is present, then:
$130000, 30000, 35000, 25000, 30000 then mean = $50000
(the mean is pulled up or down in the direction of the outlier)
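
The salary example can be checked in code. The sketch also prints the median, which stays put when the outlier is introduced:

```python
from statistics import mean, median

without_outlier = [30000, 30000, 35000, 25000, 30000]
with_outlier = [130000, 30000, 35000, 25000, 30000]

mean_before = mean(without_outlier)
mean_after = mean(with_outlier)      # pulled up by the single large value
median_before = median(without_outlier)
median_after = median(with_outlier)  # unchanged by the outlier

print(mean_before, mean_after)
print(median_before, median_after)
```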

NOTE:
When distribution is symmetric,
mean = median = mode
For skewed, mean will lie in direction of skew.
i.e. skewed to right,
mean > median (positive skew)
skewed to left, median > mean (negative skew)

Symbols for statistics (sample) and
parameters (population)
 | Parameter | Statistic
Mean | $\mu = \sum x / N$ | $\bar{x} = \sum x / n$ (= "x-bar")
Variance | $\sigma^2 = \sum (x - \mu)^2 / N$ | $s^2 = \sum (x - \bar{x})^2 / (n - 1)$
Standard Deviation | $\sigma = \sqrt{\sigma^2}$ | $s = \sqrt{s^2}$ (= "SD")

2. Dispersion
Describe the amount that each observation is likely
to vary from the mean/median
–maximum, minimum (range): sensitive to extreme
values
–interquartile range:
break ordered data into four equal sections
(quartiles, Q1, Q2, Q3); middle 50% of observations
(Q3 – Q1; difference between 25th and 75th percentiles)
–sum of squares (SS): $\sum (x - \bar{x})^2$
–variance: SS/N for a population, SS/(n - 1) for a sample
–standard deviation: $\sqrt{\text{variance}}$

Measures of Dispersion
Describe how variable the data are.
i.e. how spread out around the mean
Also called measures of variation

2.1. Range (for numerical data)
Range = difference between largest and
smallest observations
e.g. if data are Kshs. 130000, 35000, 30000,
30000, 30000, 30000, 25000, 25000
then range = 130000 - 25000 = Kshs. 105000

2.2. Interquartile Range (Q):
- This is the difference between the 75th and the 25th
percentiles (the middle 50%)
- Gives better idea than range of what the middle of
the distribution looks like.
Formula: $Q = Q_3 - Q_1$ (where $Q_3$ is the case at position N x .75
and $Q_1$ is the case at position N x .25, counting from the smallest value)
Using above data (N = 8): $Q = Q_3 - Q_1$ = (6th case - 2nd case)
= Kshs. 30000 - 25000 = Kshs. 5000
The interquartile range (Q) is Kshs. 5000.
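
A sketch of the range and the interquartile range using the positional rule given here (positions N x .25 and N x .75 in the ascending order). This simple rule assumes those positions are whole numbers; statistical packages use interpolation rules that can give slightly different quartiles.

```python
incomes = [130000, 35000, 30000, 30000, 30000, 30000, 25000, 25000]

ordered = sorted(incomes)  # ascending order
n = len(ordered)

data_range = ordered[-1] - ordered[0]  # largest minus smallest

q1_pos = int(n * 0.25)  # 2nd case
q3_pos = int(n * 0.75)  # 6th case
q1 = ordered[q1_pos - 1]  # convert 1-based position to 0-based index
q3 = ordered[q3_pos - 1]
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```

The range (105000) is dominated by the single extreme value, while the interquartile range (5000) describes the middle of the distribution.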

2.3. Variance and Standard Deviation:
For raw data at the interval/ratio level.
Most common measure of variation.
The numerator in the formula is known as
the sum of squares, and the denominator is
either the population size N or the sample
size n-1
The variance is denoted by $s^2$ and the
standard deviation, which is the square root
of the variance, by s

Definitional Formula for Variance and
Standard Deviation:
Variance: $s^2 = \sum (X_i - \bar{X})^2 / N$
S.D.: $s = \sqrt{\sum (X_i - \bar{X})^2 / N}$

Working formula (the one you use) for s.d. is:
$s = \frac{1}{N} \sqrt{N \sum X_i^2 - (\sum X_i)^2}$

Example for S and $S^2$:
Data: 66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
69, 66, 98, 52, 74, 66, 89, 95, 66, 69
1. Find $\sum X_i^2$: Square each $X_i$ and find total.
2. Find $(\sum X_i)^2$: Find total of all $X_i$ and square.
3. Substitute above and N into formula for S.
4. For $S^2$, simply square S.
S = 14.76, $S^2$ = 217.91
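
The worked example can be verified in code. This sketch computes the standard deviation with N in the denominator, which is what reproduces S = 14.76, using the definitional formula, the working formula, and an algebraically equivalent shortcut, $\sqrt{\sum X_i^2 / N - \bar{X}^2}$.

```python
from math import sqrt

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]

n = len(scores)
sum_x = sum(scores)                   # total of all X
sum_x2 = sum(x * x for x in scores)   # total of squared X
mean_x = sum_x / n

# Definitional formula: average squared deviation from the mean.
sd_definitional = sqrt(sum((x - mean_x) ** 2 for x in scores) / n)

# Working formula: (1/N) * sqrt(N * sum(X^2) - (sum X)^2).
sd_working = sqrt(n * sum_x2 - sum_x ** 2) / n

# Equivalent shortcut: sqrt(sum(X^2)/N - mean^2).
sd_shortcut = sqrt(sum_x2 / n - mean_x ** 2)

print(sum_x2, sum_x ** 2)
print(round(sd_working, 2), round(sd_working ** 2, 2))
```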

Another working formula for the standard
deviation:
Note that the definitional formula for s.d. is
not practical for use with data when N>10.
The working formulae should be used instead.
All three formulae give exactly the same result.
$S = \sqrt{\frac{\sum X_i^2}{N} - \bar{X}^2}$


Properties of s:
- always greater than or equal to 0
- the greater the variation about the mean,
the greater s is
- n-1 corrects for bias when using sample data:
s tends to underestimate the population s.d. so
to correct for this, we use n-1. The larger the
sample size, the smaller the difference this
correction makes. When calculating the s.d.
for the whole population, use N in the
denominator.

NOTE:
σ, N and Mu (µ) denote population
parameters
s, n, x-bar ($\bar{x}$) denote sample statistics

Remember the Rounding Rules!
Always use as many decimal places as your
calculator can handle.
Round your final answer to 2 decimal places,
rounding to nearest number.
Engineer's Rule: When the last digit is exactly 5
(followed by 0's), round the digit before the
last digit to the nearest EVEN number.
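
The round-half-to-even rule is available in Python's decimal module. In this sketch the values are passed as text so that an exact trailing 5 is preserved (binary floating point cannot store most decimals exactly); the helper name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_final(value_text, places="0.01"):
    """Round a decimal string to 2 places, ties going to the even digit."""
    return Decimal(value_text).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)

a = round_final("2.345")     # tie: 4 is already even, so it stays 2.34
b = round_final("2.355")     # tie: 5 is odd, so it goes up to 2.36
c = round_final("217.9107")  # not a tie: ordinary rounding to 217.91

print(a, b, c)
```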

Calculating descriptive univariate statistics
personal calculator - assignments:
calculate descriptive stats for: 13.4, 13.8, 14.2,
17.0, 15.3, 15.8, 14.9, 12.3, 16.2, 16.4

$\bar{x} = 14.9$, SD = 1.49 [$\sigma = 1.41$], $s^2 = 2.22$ [$\sigma^2 = 2.00$]
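
These answers can be checked with the standard library, which offers both versions: stdev and variance divide by n-1 (sample), while pstdev and pvariance divide by N (population).

```python
from statistics import mean, stdev, pstdev, variance, pvariance

values = [13.4, 13.8, 14.2, 17.0, 15.3, 15.8, 14.9, 12.3, 16.2, 16.4]

x_bar = mean(values)
s = stdev(values)           # sample SD, denominator n - 1
sigma = pstdev(values)      # population SD, denominator N
s2 = variance(values)       # sample variance
sigma2 = pvariance(values)  # population variance

print(round(x_bar, 1), round(s, 2), round(sigma, 2))
print(round(s2, 2), round(sigma2, 2))
```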

2.4. Coefficient of variation (CV)
CV expresses SD as a percent of the mean
a. CV = (SD / $\bar{x}$) * 100
b. Used to compare relative variation in one variable between groups
with different means
c. example:
 | mean | SD | CV
Group 1 | 14.2 | 2.5 | 17.6
Group 2 | 7.2 | 1.8 | 25.0
Note that group 2 is relatively more variable despite a greater SD in group
1.
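
The two CV values in the example follow directly from the formula. A minimal sketch (the function name is illustrative):

```python
def coefficient_of_variation(sd, mean):
    """SD expressed as a percentage of the mean."""
    return sd / mean * 100

cv_group1 = coefficient_of_variation(2.5, 14.2)
cv_group2 = coefficient_of_variation(1.8, 7.2)

print(round(cv_group1, 1), round(cv_group2, 1))
print(cv_group2 > cv_group1)  # group 2 is relatively more variable
```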


Calculation of standard deviation
Lead concentration
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$ | $x^2$
0.1 | -1.4 | 1.96 | 0.01
0.4 | -1.1 | 1.21 | 0.16
0.6 | -0.9 | 0.81 | 0.36
0.8 | -0.7 | 0.49 | 0.64
1.1 | -0.4 | 0.16 | 1.21
1.2 | -0.3 | 0.09 | 1.44
1.3 | -0.2 | 0.04 | 1.69
1.5 | 0 | 0 | 2.25
1.7 | 0.2 | 0.04 | 2.89
1.9 | 0.4 | 0.16 | 3.61
1.9 | 0.4 | 0.16 | 3.61
2 | 0.5 | 0.25 | 4
2.2 | 0.7 | 0.49 | 4.84
2.6 | 1.1 | 1.21 | 6.76
3.2 | 1.7 | 2.89 | 10.24
Total 22.5 | 0 | 9.96 | 43.71

Calculation of the standard deviation from quantitative
discrete data

(1) Number of visits to or by doctor | (2) Number of children | (3) Col (2) x Col (1) | (4) Col (1) squared | (5) Col (2) x Col (4)
0 | 3 | | 0 |
1 | 9 | | 1 |
2 | 29 | | 4 |
3 | 48 | | 9 |
4 | 39 | | 16 |
5 | 18 | | 25 |
6 | 5 | | 36 |
7 | 2 | | 49 |
Total | 152 | | |
Mean number of visits =
Calculate the sd
Calculate 68% confidence interval
Calculate 95% confidence interval

Exercises

In the campaign against smallpox a doctor inquired into the number of
times 150 people aged 16 and over in a Kenyan village had been
vaccinated. He obtained the following figures:

never, 12 people; once, 24; twice, 42; three times, 38; four times,
30; five times, 4.

What is the mean number of times those people had been
vaccinated and what is the standard deviation?

Exercise 1.

Obtain the mean and standard deviation of the data in and an
approximate 95% range.

Exercise 2.

Which points are outside the range mean - 2SD to mean + 2SD?

What proportion of the data lies outside?

Exercise 3: Home runs in Baseball

Home runs per season by Ruth and Aaron, seasons with at least 350 at bats

Ruth: 29, 54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22
Aaron: 13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29, 44, 38, 47, 34, 40, 20, 12

Stem-and-Leaf Plots
1. Sketch a Stem-and-Leaf
Plot of Home run Data
2. What to look for.
(a) Center: Median
(b) Shape: Symmetric or Skewed
(c) Gaps and Outliers
3. What do we see here?
Sketch Three Distribution Shapes
Calculate the mean, Median and Standard Deviation

Exercise 4: Stemplot
The stem-and-leaf plot (stemplot) is an excellent way to begin an
analysis. Consider this small data set
containing 10 AGE values:

21 42 05 11 30 50 28 27 24 52
To construct a stemplot, start by dividing each value into a stem
component and leaf component. For
these data, digits in the tens-place become stem components and
digits in the units-place become leaf
components. For example, "21" has a stem component of 2 and leaf
component of 1.

Stem-values are listed in numerical order to form an
axis. Vertical lines may be drawn to outline the
stem:
0|
1|
2|
3|
4|
5|
×10
An axis-multiplier is included to allow the reader to
decipher the magnitude of values. Here, the
multiplier (×10) is used to show that a stem value of 5
represents 50 and not 5, for instance.

The value of each leaf is plotted in its
appropriate location. For example, 21 is
plotted as:
0|
1|
2|1
3|

4|
5|
×10

Exercise 5
The following pollution levels were found in
river water samples:
{1.4, 1.7, 1.8, 1.9, 2.2, 2.2, 2.3, 2.4, 2.6, 2.6,
2.7, 2.8, 2.9, 3.0, 3.0, 3.0, 3.1, 3.2, 3.3, 3.4,
3.4, 3.5, 3.6, 3.7, 3.8}.

What to Look for in a Distribution
Shape
Location and Spread

Exercise 6
The following data, obtained in an air pollution study, are 80 determinations of the average daily emission of
sulphur oxides (in tonnes) from an industrial plant.
The maximum standard set for sulphur oxide emission is that 27 tonnes must not be recorded at more than
random occurrences.
It is suspected that the plant is in violation of the standard.
The data are to be used to determine whether action should be taken against the company responsible for
the industrial plant.
Draw a histogram, Calculate the mean, median and the standard deviation
15.8 26.4 17.2 11.2 23.9 24.8 18.7 13.9 9.0 13.2 22.7 9.8 22.2 14.7 17.5 26.1 12.8
28.6 17.6 23.7 26.8 22.7 18.0 20.5 11.0 20.9 15.5 19.4 1.7 19.7 19.1 15.2 22.0 26.6
20.4 21.4 19.2 21.6 16.9 19.0 18.5 23.0 24.6 20.1 16.2 18.0 17.7 13.5 23.5 14.5 14.4
29.6 19.4 17.1 20.8 24.3 22.5 24.6 18.4 18.1 8.3 21.9 12.3 22.3 13.3 11.8 19.3 20.0
25.7 31.8 25.4 10.5 15.9 27.5 18.1 17.9 9.4 24.1 20.1 28.5

Exercise 7
The ages at which the adolescent growth spurt began in a sample of
35 boys and 40 girls who transferred to secondary school were:
Boys: 16.0, 14.9, 14.1, 14.8, 14.4, 14.0, 13.6, 14.6, 16.1, 13.2, 13.2,
14.6, 15.3, 14.4, 14.8, 15.9, 14.7, 14.5, 14.6, 13.5, 15.1, 13.5, 15.0,
15.2, 15.4, 15.9, 13.7, 14.9, 14.1, 15.4, 14.4, 13.8, 15.3, 14.7, 14.8
Girls: 12.2, 13.7, 13.3, 12.3, 12.5, 12.9, 14.1, 11.8, 12.8, 12.9, 11.6,
14.3, 12.3, 11.6, 13.1, 12.6, 11.7, 13.5, 11.9, 11.6, 13.4, 12.4, 12.6,
13.7, 12.1, 13.5, 12.5, 13.4, 13.1, 13.3, 13.5, 14.7, 12.7, 12.7, 12.0,
11.4, 13.5, 12.4, 12.1, 12.1
Make a comparison between boys and girls of the age of onset of
the growth spurt using histograms and boxplots, comparing shape,
spread and location.

Exercise from the First Lecture
A. Statistics ____
B. Parameter ____
C. All inclusive ____
D. Discrete ____
E. Mutually Exclusive ____
F. Zero ____
G. Continuous ____
H. Inferential Statistics ____
I. Arithmetic Mean ____
J. Primary Data ____
1. A place for every outcome
2. Do not contain the same outcome

3. The use of sample statistics to
draw conclusions concerning the
population
4. A numerical characteristic of a
sample

5. Only finite values can exist on the
axis
6. Published by the original collector
7. Severely affected by a few
extreme values
8. Measurement may assume any
value associated with uninterrupted scale
9. A numerical characteristic of a
population
10. Sum of the deviations around the
mean

Exercise from the First Lecture

A. Statistics _4_
B. Parameter _9_
C. All inclusive _1_
D. Discrete _5_
E. Mutually Exclusive _2_
F. Zero _10_
G. Continuous _8_
H. Inferential Statistics _3_
I. Arithmetic Mean _7_
J. Primary Data _6_