BIOSTATICS & RESEARCH METHODOLOGY UNIT-1.pdf

1,082 views 31 slides Jan 19, 2025

About This Presentation

Description for PPT on Biostatistics and Research Methodology:
This presentation provides a comprehensive overview of Biostatistics and Research Methodology, emphasizing their critical role in scientific research and evidence-based decision-making. It highlights the following key areas:
Introduction...


Slide Content

BIOSTATISTICS
MS. KRUTIKA R. CHANNE

COURSE CONTENT
Unit-1 (Introduction, Measures of Central Tendency, Measures of Dispersion &
Correlation)
Unit-2 (Regression, Probability & Parametric Test)
Unit-3 (Non Parametric Test, Introduction to Research, Graph, Designing the
Methodology)
Unit-4 (Regression Modeling, Introduction to Practical Components of Industrial
and Clinical Trials Problems)
Unit-5 (Design and Analysis of Experiments- Factorial Design & Response Surface
Methodology)

UNIT-1
Introduction to Statistics, Biostatistics, Frequency Distribution
Known as the ‘Science of Kings’
Used to keep records (population, births, deaths, livestock)
Served administrative purposes
Derived from the Latin word ‘status’, the Italian word ‘statista’ or the
German word ‘statistik’, each meaning a political state; hence the name STATISTICS
Widely used in Physics, Economics, Social Science, Medical Science,
etc.

Meaning of Statistics
Plural Sense (Numerical statements or Facts)
Singular Sense (Refer to subject of study)
HORACE SECRIST- “Aggregate of facts affected to a marked extent by a
multiplicity of causes, numerically expressed, enumerated or estimated
according to a reasonable standard of accuracy, collected in a
systematic manner for a predetermined purpose & placed in relation to
each other”.
CROXTON & COWDEN- “Statistics is the science which deals with
collection, classification, tabulation, presentation, analysis, testing &
interpretation of data.”

STATISTICAL DATA
A single figure is not called statistics, e.g. "sales of tablets in shop A are 500";
BUT "sales of tablets in three shops A, B, C are 600, 500, 900" is statistical data.
If the phenomenon is of a qualitative nature, it should be numerically
expressed.
Figures should be enumerated or estimated with a reasonable standard of
accuracy.
DATA GRAPHICS
Collected data should be represented in a simple manner, so that 2 or more
figures can be compared, e.g. with diagrams & graphs. Such diagrams and graphs are called Data
Graphics.

VARIABLES
Information is collected on various characteristics. These
characteristics are called VARIABLES.
Qualitative Variables                 | Quantitative Variables
Cannot be measured in any unit        | Can be measured in some unit
Ex. Gender, Education, Colour         | Ex. Length, Breadth, Volume, Weight, Height
Discrete Variables                    | Continuous Variables
Gaps or interruptions in the values   | Any value within the relevant interval
Ex. Size of Variable                  | Ex. Height

Collection of Data
1st step in statistics; data are required for any problem or research question.
E.g. is a drug useful in curing the disease or not?
Types of Sources- Primary Source; Secondary Source
Primary Source- First-hand source
1. Observation Method- Investigator records from his own observation
2. Questionnaire Method- Questions are asked for inquiry; objective
3. Interview Method- Information is collected in the form of an interview
Secondary Source
Information is collected from published literature or published
reports

Classification of DATA
Data are collected from a primary or secondary source
Raw data make it difficult to answer the question
Need to classify the data
Preliminary classification
Need to classify (single variable)
Observations are sorted into groups and counted within each group. This process is called
Frequency Distribution

FREQUENCY DISTRIBUTION
Frequency- Number of times an observation (figure) is repeated
Class- A group of values is called a class, e.g. 0-10
Class Limits- The minimum value is the lower class limit; the maximum value is the upper class limit, e.g. 5 and 50 in the class 5-50
Class Boundaries- In continuous classes the upper limit of one class equals the lower
limit of the next. In 10-19, 20-29 the upper limit (19) and the next lower limit (20) are not the same, so
subtract 0.5 from each lower limit & add 0.5 to each upper limit, e.g. 9.5-19.5, 19.5-29.5. These continuous limits
are called class boundaries.

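Class Interval- The difference between the class boundaries of a class is called the
class interval, e.g. 10 for the classes 10-20, 20-30
Class Frequency- Number of observations lying in a particular
class
Tally Marks- A stroke (|) is put for each observation for systematic counting
Frequency Distribution- A table of class limits &
corresponding frequencies is called a frequency distribution (f.d.)
Cumulative Frequency- Running total of the frequencies up to and including a class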

Preparation of Frequency Distribution
Ex. 1 The following data give the number of children in some families. Prepare an ungrouped frequency
distribution. Also find the cumulative frequency. No. 0,1,2,1,8,8,8,2,8,3,4,5,2,0,0,1,5,4,1,3,5,3,6,6,5,2,1,3,7,1,3,2,1,0,5,6.
X (No. of children)   Tally Marks   Frequency
0                     ||||          4
1                     ||||/ ||      7
2                     ||||/         5
3                     ||||/         5
4                     ||            2
5                     ||||/         5
6                     |||           3
7                     |             1
8                     ||||          4
TOTAL                 -             36
(||||/ denotes a bundle of five: four strokes crossed by a fifth)

CUMULATIVE FREQUENCY
X (No. of children)   Frequency   C.F.
0                     4           4
1                     7           11
2                     5           16
3                     5           21
4                     2           23
5                     5           28
6                     3           31
7                     1           32
8                     4           36
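
The tallying and running totals of Ex. 1 can be checked with a short script. This is only a sketch using Python's standard library; the names `children` and `frequency_table` are illustrative and do not come from the slides.

```python
from collections import Counter
from itertools import accumulate

# Number of children per family, copied from Ex. 1
children = [0, 1, 2, 1, 8, 8, 8, 2, 8, 3, 4, 5, 2, 0, 0, 1, 5, 4,
            1, 3, 5, 3, 6, 6, 5, 2, 1, 3, 7, 1, 3, 2, 1, 0, 5, 6]


def frequency_table(values):
    """Return rows of (value, frequency, cumulative frequency), sorted by value."""
    counts = Counter(values)               # frequency of each distinct value
    xs = sorted(counts)
    freqs = [counts[x] for x in xs]
    cum_freqs = list(accumulate(freqs))    # running total of the frequencies
    return list(zip(xs, freqs, cum_freqs))


table = frequency_table(children)
for x, f, cf in table:
    print(f"{x:>2} {f:>3} {cf:>3}")
```

The last cumulative frequency equals the total number of observations (36), which is a quick check that no observation was missed.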

Ex.2 The following data gives ages of patients in years. Prepare a
grouped frequency distribution.
Ages in years- 20, 63, 11, 65, 89, 11, 54, 57, 21, 86, 25, 67, 35, 59,
55, 25, 10, 19, 32, 38, 39, 67, 22, 17, 67, 12, 43, 28, 25, 40, 67, 16,
32, 65, 23, 47, 38.
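
One possible way to group the ages of Ex. 2 is sketched below. The slide does not fix the classes, so the class width of 10 and the starting point of 10 are assumptions, as is the convention that each class includes its lower limit and excludes its upper limit. The function name `grouped_frequency` is illustrative.

```python
# Ages in years, copied from Ex. 2
ages = [20, 63, 11, 65, 89, 11, 54, 57, 21, 86, 25, 67, 35, 59,
        55, 25, 10, 19, 32, 38, 39, 67, 22, 17, 67, 12, 43, 28, 25, 40, 67, 16,
        32, 65, 23, 47, 38]


def grouped_frequency(values, start, width):
    """Count values in classes [start, start+width), [start+width, start+2*width), ...

    Assumes every value is >= start.
    """
    if min(values) < start:
        raise ValueError("start must not exceed the smallest observation")
    n_classes = (max(values) - start) // width + 1
    freqs = [0] * n_classes
    for v in values:
        freqs[(v - start) // width] += 1   # index of the class containing v
    return [(start + k * width, start + (k + 1) * width, f)
            for k, f in enumerate(freqs)]


age_table = grouped_frequency(ages, start=10, width=10)   # assumed classes 10-20, 20-30, ...
for lower, upper, f in age_table:
    print(f"{lower}-{upper}: {f}")
```

A different width or starting point gives a different but equally valid table; the frequencies must always add up to the 37 observations.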

DATA GRAPHICS
Data we collect & classify are presented in a REPORT
Graphs draw attention to the figures & help compare two or more sets of observations
Classified frequency data, i.e. grouped frequency data, are presented
in the form of a graph
1. Histogram
2. Frequency Polygon
3. Cumulative Frequency Polygon

Pie chart: A circle graph that shows the relationship of parts to a whole
Bar graph: A graph that uses bars to represent data, which can be vertical or horizontal
Line graph: A graph that shows points plotted on a graph, connected to form a line
Column chart: A graph that's ideal for presenting chronological data
Scatter plot: A graph that shows the relationship between items based on two different variables and data sets
Area graph: A graph that shows changes over time, similar to a line chart
Bubble chart: A graph that's similar to a scatter plot
Gauge chart: A graph that shows whether data values fit on a scale of acceptable to not acceptable
Stacked Venn chart: A graph that shows overlapping relationships between multiple data sets
Mosaic plot: A graph that uses a grid of rectangles to show how categorical variables interact with each other
Gantt chart: A bar chart that illustrates a project schedule
Flowchart: A graph that displays a schematic process, often used by companies to depict the stages of a project

Ex. For the following data draw a histogram, frequency polygon & cumulative
frequency polygon.
Class limit   20-40   40-60   60-80   80-100   100-120   120-140
Frequency     8       15      23      18       9         4
Soln- Histogram & Frequency Polygon
Class limit   Frequency (f)   Cumulative f (cf)
20-40         8               8
40-60         15              23
60-80         23              46
80-100        18              64
100-120       9               73
120-140       4               77
Total         77              -
[Figure: Histogram & Frequency Polygon. x-axis: Class Limit; y-axis: Frequency. Scale- x-axis: 1 unit = 20 units; y-axis: 1 unit = 6 units]
[Figure: Cumulative Frequency Polygon through the cumulative frequencies 8, 23, 46, 64, 73, 77. Scale- x-axis: 1 unit = 20 units; y-axis: 1 unit = 20 units]
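
The points to be plotted for the two polygons can be worked out as follows. This sketch assumes the usual conventions, which the slide does not state: the frequency polygon is drawn through (class mid-point, frequency), and the cumulative frequency polygon is the "less than" type drawn through (upper class limit, cumulative frequency). All variable names are illustrative.

```python
from itertools import accumulate

# Classes and frequencies from the example
classes = [(20, 40), (40, 60), (60, 80), (80, 100), (100, 120), (120, 140)]
freqs = [8, 15, 23, 18, 9, 4]

# Frequency polygon: frequency plotted against the class mid-point
midpoints = [(lower + upper) / 2 for lower, upper in classes]
polygon_points = list(zip(midpoints, freqs))

# Cumulative frequency polygon ("less than" type, an assumed convention):
# cumulative frequency plotted against the upper class limit
cum_freqs = list(accumulate(freqs))
ogive_points = list(zip([upper for _, upper in classes], cum_freqs))

print(polygon_points)
print(ogive_points)
```

The histogram itself uses the same frequencies as bar heights over each class, so no extra calculation is needed for it.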

Measures of Central Tendency
Statistical data may be in any form: individual observations, a discrete series or a continuous series.
To compare two or more series of observations, we must form a representative
figure for the data.
With such a figure we may compare two or more series, or describe the whole data.
To find such a representative figure, we look at the data and note a general
tendency of the figures.
The figures tend to gather at the centre, i.e. about a particular value.
This tendency is called Central Tendency.
The figure about which all the remaining figures gather is the representative figure.
It is also called a Measure of Central Tendency.

Requisites of a good measure of central tendency:
(1) It should be simple to understand.
(2) It should be easy to calculate.
(3) It should be rigidly defined.
(4) It should be based on each and every observation.
(5) It should have sampling stability.
(6) It should not be affected by extreme values.
(7) It should be capable of further algebraic treatment.

Different measures of central tendency.
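ARITHMETIC MEAN
A.M. = Sum of all observations / Total number of observations
∑ = Summation
∑ X = Sum of all observations
N = Total number of observations
Individual data: $\bar{X} = \frac{\sum X}{N}$
Discrete series: $\bar{X} = \frac{\sum fX}{N}$, where $N = \sum f$
Grouped data: $\bar{X} = A + \frac{\sum f d'}{N} \times C$
where A = assumed mean
d = X - A (X = class mid-point for grouped data)
d' = d/C; C = common factor
N = ∑ f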

Ex. The following data gives weight of tablets in mg. Calculate average weight of a tablet.
Weight of tablets (in mg) 25, 28, 25, 30, 29, 24, 23, 27, 28, 25.
Sol.: This is an individual type of data, hence
$\bar{X} = \frac{\sum X}{N}$
= (25 + 28 + 25 + 30 + 29 + 24 + 23 + 27 + 28 + 25) / 10
= 264 / 10
= 26.4 mg
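
The same calculation can be reproduced in a few lines. This is a minimal sketch; the function name `arithmetic_mean` is illustrative.

```python
# Weights of tablets in mg, copied from the example
weights_mg = [25, 28, 25, 30, 29, 24, 23, 27, 28, 25]


def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)


mean_weight = arithmetic_mean(weights_mg)
print(sum(weights_mg), len(weights_mg), mean_weight)   # 264 10 26.4
```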
1. Calculate the mean from the data showing marks of students in a class in a test: 40, 530, 355, 758,
758,898,897, 234.
2. A total of 25 patients admitted to a hospital are tested for levels of blood sugar, (mg/dl) and the
results obtained were as follows:
87, 71, 83, 67, 85, 77, 69, 76, 65, 85, 85, 54, 70, 68, 80, 73, 78, 68, 85, 73, 81, 78, 81, 77, 75
Find the mean (mg/dl) of the above data.

Ex: The following data gives age in years in cases of child deaths. Find the average age.
Age in years    0    1    2    3    4    5
No. of deaths   42   55   32   22   15   6
Sol.: This is discrete data (a discrete series). Hence age in years is called 'X' and no. of
deaths is the frequency 'f'.
Age (X)   No. of deaths (f)   f * X
0         42
1         55
2         32
3         22
4         15
5         6
TOTAL     N = ∑ f             ∑ fX = ?
$\bar{X} = \frac{\sum fX}{N}$
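
For a discrete series, the f * X column and its total can be filled in mechanically. The sketch below applies the formula to the child-death data; the variable names are illustrative.

```python
# Discrete series from the example: age X with frequency f (number of deaths)
ages = [0, 1, 2, 3, 4, 5]
deaths = [42, 55, 32, 22, 15, 6]

fx = [f * x for x, f in zip(ages, deaths)]   # the f * X column
n = sum(deaths)                              # N = sum of f
sum_fx = sum(fx)                             # sum of f * X
mean_age = sum_fx / n

print(fx)                                    # [0, 55, 64, 66, 60, 30]
print(n, sum_fx, round(mean_age, 2))         # 172 275 1.6
```

The average age at death is therefore about 1.6 years.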

From the following data calculate the mean.
Class limit   0-30   30-60   60-90   90-120   120-150   150-180
Frequency     8      13      22      27       18        7
Class limit   Frequency (f)   Mid-point (m.p.)   d = m.p. - A   d' = d/C   f d'
Total         N = ∑ f         -                  -              -          ∑ f d'
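
The step-deviation table can be filled in with the sketch below. The assumed mean A = 105 (the mid-point of 90-120) and the common factor C = 30 (the class width) are choices made here for illustration; the slide does not specify them, and any other assumed mean gives the same answer. The function name is hypothetical.

```python
classes = [(0, 30), (30, 60), (60, 90), (90, 120), (120, 150), (150, 180)]
freqs = [8, 13, 22, 27, 18, 7]


def step_deviation_mean(classes, freqs, assumed_mean, common_factor):
    """Mean = A + (sum of f*d' / N) * C, with d' = (mid-point - A) / C."""
    n = sum(freqs)
    sum_fd = 0.0
    for (lower, upper), f in zip(classes, freqs):
        mid_point = (lower + upper) / 2
        d_prime = (mid_point - assumed_mean) / common_factor
        sum_fd += f * d_prime
    return assumed_mean + (sum_fd / n) * common_factor


# A = 105 and C = 30 are assumptions, not values given on the slide
mean = step_deviation_mean(classes, freqs, assumed_mean=105, common_factor=30)

# Cross-check with the direct formula: sum of f * mid-point / N
direct = sum(f * (lo + hi) / 2 for (lo, hi), f in zip(classes, freqs)) / sum(freqs)
print(round(mean, 2), round(direct, 2))
```

Both routes agree, which shows that the assumed mean only simplifies the arithmetic and does not affect the result.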

Different measures of central tendency.
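MEDIAN
Median = value of the ((N+1)/2)th observation in the ordered data, when N is odd
Median = average of the (N/2)th & (N/2 + 1)th observations, when N is even
For grouped data: $\text{Median} = l_1 + \frac{N/2 - \text{C.F.}}{f} \times i$
l1 = lower limit of the median class
C.F. = cumulative frequency of the class before the median class
f = frequency of the median class
i = class interval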
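
A sketch of both median rules is given below. It assumes the standard reading of the grouped formula: l1 is the lower limit of the median class, C.F. is the cumulative frequency before that class, f is its frequency and i its width. The function names are illustrative, and the grouped data are the 0-30 ... 150-180 classes used elsewhere in these slides.

```python
def median_raw(values):
    """Median of individual observations."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the ((n+1)/2)th observation
    return (ordered[mid - 1] + ordered[mid]) / 2   # average of the two middle ones


def median_grouped(classes, freqs):
    """Median = l1 + ((N/2 - C.F.) / f) * i for a grouped frequency distribution."""
    n = sum(freqs)
    cum_before = 0                                 # cumulative frequency before the class
    for (lower, upper), f in zip(classes, freqs):
        if cum_before + f >= n / 2:                # first class reaching N/2 is the median class
            return lower + (n / 2 - cum_before) / f * (upper - lower)
        cum_before += f


classes = [(0, 30), (30, 60), (60, 90), (90, 120), (120, 150), (150, 180)]
freqs = [8, 13, 22, 27, 18, 7]

print(median_raw([35, 40, 45, 20, 30]))            # 35
print(round(median_grouped(classes, freqs), 2))    # 95.0
```

Here N/2 = 47.5 falls in the class 90-120, whose preceding cumulative frequency is 43, so the median is 90 + (4.5/27) * 30 = 95.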

Different measures of central tendency.
MODE
The most frequent observation, i.e. the observation with the highest frequency, is called the mode
For grouped data: $\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$
l1 = lower limit of the modal class
f1 = frequency of the modal class
f0 = frequency of the preceding class
f2 = frequency of the following class
i = class interval
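
The grouped-data mode formula can be tried out with the sketch below. It takes the modal class to be the class with the highest frequency and treats a missing neighbouring class as having frequency 0, which is an assumption for the edge cases. The function name is illustrative.

```python
def mode_grouped(classes, freqs):
    """Mode = l1 + (f1 - f0) / (2*f1 - f0 - f2) * i for a grouped distribution."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])    # index of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                     # preceding class (0 if none)
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0        # following class (0 if none)
    lower, upper = classes[k]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)


classes = [(0, 30), (30, 60), (60, 90), (90, 120), (120, 150), (150, 180)]
freqs = [8, 13, 22, 27, 18, 7]

mode = mode_grouped(classes, freqs)
print(round(mode, 2))   # 100.71
```

The modal class is 90-120 (f1 = 27, f0 = 22, f2 = 18), giving 90 + (5/14) * 30, about 100.71.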

Dispersion or variation of data
Scattering of DATA
Types of Measures- Absolute & Relative; relative measures are expressed as ratios called
coefficients.
Measures of Dispersion-
1. Range R = L - S, where L = largest observation, S = smallest observation
Coefficient of Range = (L - S) / (L + S) * 100
Range is calculated for a single series
Ex. The following data gives variations in a patient's blood pressure over 8
hours, with one observation taken per hour. Find the range and its coefficient.
Hour             1     2     3     4     5     6     7     8
Blood Pressure   105   145   124   120   140   110   120   123
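
The range and its coefficient for the blood pressure readings can be computed as in this short sketch; the variable names are illustrative.

```python
# Hourly blood pressure readings from the example
blood_pressure = [105, 145, 124, 120, 140, 110, 120, 123]

largest = max(blood_pressure)
smallest = min(blood_pressure)

value_range = largest - smallest                                    # R = L - S
coeff_range = (largest - smallest) / (largest + smallest) * 100     # (L - S) / (L + S) * 100

print(value_range, round(coeff_range, 2))   # 40 16.0
```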

2. Mean Absolute Deviation (M.D.)
M.D. = ∑ |D| / N
|D| = absolute value (modulus) of D
D = X - Mean, X - Median or X - Mode
Coefficient of M.D. = M.D./Mean * 100 or M.D./Median * 100 or M.D./Mode * 100
For grouped data M.D. = ∑ f |D| / N
Ex. For the given data calculate the mean deviation from the mean: X = 35, 40, 45, 20, 30
Mean = $\bar{X} = \frac{\sum X}{N}$
X       D = X - $\bar{X}$   |D|
Total   -                   ∑ |D|
M.D. = ?, Coefficient of M.D. = ?
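
The D and |D| columns for this example can be generated as follows. This is a sketch of the mean deviation taken from the mean; the function name `mean_deviation` is illustrative, and passing the median or mode as `centre` would give the other variants.

```python
x = [35, 40, 45, 20, 30]


def mean_deviation(values, centre):
    """Average of the absolute deviations |X - centre|."""
    return sum(abs(v - centre) for v in values) / len(values)


mean = sum(x) / len(x)                       # 34.0
abs_dev = [abs(v - mean) for v in x]         # the |D| column
md = mean_deviation(x, mean)                 # sum of |D| / N
coeff_md = md / mean * 100                   # coefficient of M.D. about the mean

print(mean, abs_dev, md, round(coeff_md, 2))
```

With a mean of 34, the absolute deviations are 1, 6, 11, 14 and 4, so M.D. = 36/5 = 7.2 and its coefficient is about 21.18.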

Ex. From the following data calculate mean deviation from mode
Find mean deviation from median and its coefficient
X   3   6    9    12   15   18   21
F   5   15   21   32   18   12   5
Class limits   0-10   10-20   20-30   30-40   40-50
Frequency      8      15      22      15      8

STANDARD DEVIATION
$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ OR $\sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$
If we assume a mean, then the formula is
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$
d = X - A
A = Assumed Mean
When N is small, we use the formula $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
For grouped data or a discrete series, $\sigma = \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2} \times C$
where d' = d/C, d = M.P. - A
C = Common factor
A = Assumed Mean
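
The three ungrouped formulas should give the same value, which the sketch below checks on the series X = 35, 40, 45, 20, 30 used earlier in these slides. The assumed mean A = 40 is an arbitrary choice for illustration, and the cross-check against Python's `statistics` module is an addition, not part of the slides.

```python
import math
import statistics

x = [35, 40, 45, 20, 30]
n = len(x)
mean = sum(x) / n

# Definition: square root of the mean squared deviation from the mean
sigma_definition = math.sqrt(sum((v - mean) ** 2 for v in x) / n)

# Shortcut: square root of (sum of X^2 / N - mean^2)
sigma_shortcut = math.sqrt(sum(v * v for v in x) / n - mean ** 2)

# Assumed-mean form with d = X - A (A = 40 is an assumption for illustration)
assumed_mean = 40
d = [v - assumed_mean for v in x]
sigma_assumed = math.sqrt(sum(t * t for t in d) / n - (sum(d) / n) ** 2)

# Small-sample form: divide by N - 1 instead of N
s_sample = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))

print(round(sigma_definition, 3), round(sigma_shortcut, 3), round(sigma_assumed, 3))
print(round(s_sample, 3))
print(round(statistics.pstdev(x), 3), round(statistics.stdev(x), 3))
```

The first three values agree (about 8.602), while the N - 1 version is slightly larger (about 9.618), as expected for a small sample.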

Coefficient of Variation (C.V.)
This is a relative measure of dispersion
$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$
If the C.V. is smaller, we can say that the figures are more consistent.
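
To illustrate the consistency rule, the sketch below compares two series that appear earlier in these slides: the tablet weights and the series X = 35, 40, 45, 20, 30. It uses the population standard deviation (divisor N); the function name is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """C.V. = sigma / mean * 100, using the population standard deviation."""
    return statistics.pstdev(values) / statistics.mean(values) * 100


tablet_weights = [25, 28, 25, 30, 29, 24, 23, 27, 28, 25]
series_x = [35, 40, 45, 20, 30]

cv_tablets = coefficient_of_variation(tablet_weights)
cv_x = coefficient_of_variation(series_x)

print(round(cv_tablets, 2), round(cv_x, 2))   # 8.33 25.3
```

The tablet weights have the smaller C.V., so they are the more consistent of the two series.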

Questions
Difference Between Statistics and Biostatistics ?
Describe Dispersion and Range
Enumerate different graphs used for representing qualitative data
Calculate Mean, Median, Mode (10m)
Write a short note on Historical Design (5m)
Class limit   0-30   30-60   60-90   90-120   120-150   150-180
Frequency     8      13      22      27       18        7