PranavKulkarni187769
0 views
39 slides
Oct 21, 2025
About This Presentation
BIOSTATISTICS
Size: 2.03 MB
Language: en
Added: Oct 21, 2025
Slides: 39 pages
Slide Content
BIOSTATISTICS PART 2 DR PRANAV KULKARNI Dept of Periodontology
DEFINITION, STANDARD DEVIATION, VARIANCE, SAMPLING AND SAMPLE DESIGNS, METHODS OF SAMPLING, TESTS OF SIGNIFICANCE. PREVIOUSLY COVERED: DEFINITION, COMMON STATISTICAL TERMS, DATA, TYPES OF DATA, PRESENTATION OF DATA, MEASURES OF CENTRAL TENDENCY, USES OF DATA, USES OF STATISTICS
A statistic or datum is a measured or counted fact or piece of information stated as a figure, such as the height of one person or the birth of a baby. Biostatistics is the art and science of collection, compilation, presentation, analysis and logical interpretation of biological data affected by a multiplicity of factors. It is the term used when the tools of statistics are applied to data derived from biological sciences such as medicine or dentistry.
Standard deviation is a measure of the variation or dispersion of a group of values around the arithmetic mean. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.
$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$, where $x$ is each value, $\bar{x}$ is the mean and $n$ is the number of observations.
Variance is nothing else but the square of the standard deviation and is denoted by $\sigma^2$.
Day of reporting with complaint    No. of patients reported (x)    |x - x̄|²
1                                  10                              32.49
2                                  25                              86.49
3                                  35                              372.49
4                                  5                               114.49
5                                  10                              32.49
6                                  10                              32.49
7                                  15                              0.49
Total                              110                             671.43
Mean = 110/7 = 15.7
Variance = 671.43/6 = 111.9
SD = $\sqrt{671.43/6}$ = 10.6
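The arithmetic for this example can be checked with a short Python sketch. It uses only the seven daily patient counts from the slide and the n - 1 formula given earlier; the variable names are illustrative.

```python
import math
import statistics

# Patients reporting on each of the 7 days (counts taken from the slide).
patients = [10, 25, 35, 5, 10, 10, 15]

n = len(patients)
mean = sum(patients) / n

# Squared deviation of every observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in patients]

variance = sum(squared_deviations) / (n - 1)  # sample variance
sd = math.sqrt(variance)                      # sample standard deviation

# Cross-check against the standard library's sample estimators.
assert math.isclose(variance, statistics.variance(patients))
assert math.isclose(sd, statistics.stdev(patients))

print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.2f}")
```

Dividing by n - 1 rather than n gives the sample estimate, which is the form used in the formula for the standard deviation.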
Normal distribution or Gaussian curve. Properties: bell shaped; symmetrical; height is maximum at the mean; mean = median = mode; the maximum number of observations lies at the mean and decreases on either side. The relation between mean and standard deviation forms the basis of tests of significance.
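Following are some definitions of the coefficient of dispersion:
i) $\dfrac{\text{Mean deviation}}{\text{Median}} \times 100$
ii) $\dfrac{\text{Mean deviation}}{\text{Mean}} \times 100$
iii) $\dfrac{\text{Standard deviation}}{\text{Mean}} \times 100$
The third is the most commonly used and is known as the coefficient of variation (CV).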
When the variability of two series is compared, the series with the greater CD is said to have more variation, and the series with the lower CD is said to be more homogeneous.
$\mathrm{CD} = 100 \times \dfrac{\mathrm{SD}}{\mathrm{Mean}}$
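As a rough illustration of comparing two series by their coefficient of dispersion, the sketch below computes 100 x SD / mean for each. Both series are invented for the example, and the function name is hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Return 100 * SD / mean, using the sample (n - 1) standard deviation."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical series, invented for illustration only.
series_a = [10, 12, 11, 13, 14]
series_b = [5, 20, 9, 30, 16]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

# The series with the larger value shows more variation;
# the one with the smaller value is the more homogeneous.
more_variable = "A" if cv_a > cv_b else "B"
print(f"CV(A)={cv_a:.1f}%  CV(B)={cv_b:.1f}%  more variable: {more_variable}")
```

Because the measure is a ratio, it allows series with different means or units to be compared on the same scale.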
Skewness means lack of symmetry; it indicates whether the curve is turned more to one side than to the other. Skewness is positive if the curve is more elongated to the right, i.e. if the mean is more than the mode. Skewness is negative if the curve is more elongated to the left, i.e. if the mean is less than the mode.
Formula for skewness: Karl Pearson's coefficient of skewness $= \dfrac{\text{Mean} - \text{Mode}}{\text{SD}}$. The coefficient of skewness is 0 for a symmetrical distribution. If the coefficient is positive, the distribution is positively skewed; if it is negative, the distribution is negatively skewed.
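A minimal sketch of Karl Pearson's coefficient, (mean - mode) / SD, applied to the seven daily patient counts used in the variance example. The function name is illustrative, and the sample (n - 1) standard deviation is assumed.

```python
import statistics

def pearson_skewness(values):
    """Karl Pearson's coefficient of skewness: (mean - mode) / SD."""
    mean = statistics.mean(values)
    mode = statistics.mode(values)   # most frequent value
    sd = statistics.stdev(values)    # sample (n - 1) standard deviation
    return (mean - mode) / sd

# Daily patient counts from the variance example.
patients = [10, 25, 35, 5, 10, 10, 15]
skew = pearson_skewness(patients)

# A small symmetrical series (invented) gives a coefficient of zero.
symmetric_skew = pearson_skewness([1, 2, 2, 3])

direction = "positive" if skew > 0 else "negative" if skew < 0 else "none"
print(f"skewness={skew:.2f} ({direction})")
```

Here the mean exceeds the mode, so the coefficient is positive and the distribution has its longer tail to the right.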
A sample is a part of a population; the population is also called the 'universe', 'reference' or 'parent' population. Sampling is the process or technique of selecting a sample of appropriate characteristics and adequate size. The sampling frame is the total of the elements of the survey population, redefined according to certain specifications. It consists of sampling units, which are the individual entities that form the focus of the study.
Advantages of sampling: Reduces the cost of investigation, the time required and the number of personnel involved. Allows thorough investigation of the units of observation. Provides adequate and in-depth coverage of the sample units.
A. Purposive selection: the sample is chosen to represent the population as a whole, but there is great temptation to deliberately or purposively select the individuals. B. Random selection: the units are selected at random, so that all the characteristics of the population are reflected in the sample.
1. Simple random sampling: Sample units are individuals from the community or group. The sampling frame is the community or group from which the sample will be drawn. In everyday use the word 'random' suggests haphazard, which is a misnomer in the sampling context: here it means that every unit has an equal chance of being selected. The basic procedure is: prepare a sampling frame; decide on the size of the sample; select the required number of units. The sample can be drawn by using a random number table or the lottery method.
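The three-step procedure (prepare a frame, fix the sample size, select the units) can be sketched in Python as follows. The patient identifiers and the function name are hypothetical, and `random.sample` stands in for the random number table or lottery.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Select sample_size distinct units, each with an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)

# Step 1: prepare a sampling frame (hypothetical patient identifiers).
frame = [f"patient_{i:03d}" for i in range(1, 101)]

# Step 2: decide on the size of the sample.
size = 10

# Step 3: select the required number of units without replacement.
sample = simple_random_sample(frame, size, seed=42)
print(sample)
```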
2. Stratified random sampling: The population to be sampled is subdivided into groups known as strata, such that each group is homogeneous in its characteristics. A simple random sample is then chosen from each stratum. It is used when the population is heterogeneous. This method ensures more representativeness, provides greater accuracy and can concentrate on a wider geographical area. Limitation: care should be taken, while dividing the population into strata, to ensure homogeneity within each stratum.
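A minimal sketch of stratified random sampling, assuming the same sampling fraction is taken from every stratum (proportional allocation, which the slide does not specify). The population, strata and function name are invented for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)

    # Divide the population into strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    # Take a simple random sample within each stratum (at least one unit).
    return {
        name: rng.sample(members, max(1, round(len(members) * fraction)))
        for name, members in strata.items()
    }

# Hypothetical population: 60 adults and 40 children.
population = [("adult", i) for i in range(60)] + [("child", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0],
                           fraction=0.1, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```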
3. Systematic random sampling: The sample is formed by selecting one unit at random and then selecting additional units at evenly spaced intervals until the required sample size has been reached. It is used when a complete list of the population is available.
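Systematic sampling can be sketched as a random start followed by a fixed step through the list. Taking the interval as frame size divided by sample size is an assumption of this sketch, and the frame is hypothetical.

```python
import random

def systematic_sample(sampling_frame, sample_size, seed=None):
    """Pick a random starting unit, then every k-th unit after it."""
    if not 0 < sample_size <= len(sampling_frame):
        raise ValueError("sample_size must be between 1 and the frame size")
    rng = random.Random(seed)
    interval = len(sampling_frame) // sample_size  # sampling interval k
    start = rng.randrange(interval)                # random start within the first interval
    return [sampling_frame[start + i * interval] for i in range(sample_size)]

# Hypothetical complete list of 100 numbered households.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=5)
print(sample)
```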
4. Cluster sampling: Used when the population forms natural groups or clusters such as villages, wards, blocks or children in a school. The sampling units are the clusters and the sampling frame is the group of clusters. The clusters are first selected by simple random sampling, and then all the units in each of the selected clusters are surveyed. The method is simple and less expensive than simple random sampling. Example: to find out the prevalence of fluorosis in a rural community.
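In code, the two steps (choose clusters at random, then survey every unit in the chosen clusters) might look like the sketch below. The village names and residents are invented.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # clusters are the sampling units
    return {name: clusters[name] for name in chosen}

# Hypothetical sampling frame: villages and their residents.
villages = {
    "village_A": ["a1", "a2", "a3"],
    "village_B": ["b1", "b2"],
    "village_C": ["c1", "c2", "c3", "c4"],
    "village_D": ["d1", "d2", "d3"],
}

surveyed = cluster_sample(villages, 2, seed=7)
print(surveyed)
```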
5. Multiphase sampling: Part of the information is collected from the whole sample and part from a sub-sample. This method may be adopted when there is interest in a specific disease. A survey by this procedure is less costly, less laborious and more purposeful.
Multistage sampling: The first stage is to select the groups or clusters. Subsamples are then taken in as many subsequent stages as necessary to obtain the desired sample size. E.g.: 1st stage, choice of states within countries; 2nd stage, choice of towns within each state; 3rd stage, choice of neighbourhoods within each town.
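The staged selection of states, then towns, then neighbourhoods can be sketched with nested random draws. The nested dictionary, the counts drawn at each stage and the function name are all hypothetical.

```python
import random

def multistage_sample(states, n_states, n_towns, n_neighbourhoods, seed=None):
    """Sample states, then towns within each state, then neighbourhoods within each town."""
    rng = random.Random(seed)
    selection = {}
    for state in rng.sample(sorted(states), n_states):              # 1st stage
        towns = states[state]
        selection[state] = {}
        for town in rng.sample(sorted(towns), n_towns):             # 2nd stage
            selection[state][town] = rng.sample(towns[town], n_neighbourhoods)  # 3rd stage
    return selection

# Hypothetical country: 5 states, 3 towns per state, 4 neighbourhoods per town.
country = {
    f"state_{s}": {
        f"town_{s}{t}": [f"nbhd_{s}{t}{k}" for k in range(4)]
        for t in range(3)
    }
    for s in range(5)
}

selected = multistage_sample(country, n_states=2, n_towns=2,
                             n_neighbourhoods=2, seed=3)
print(selected)
```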
Non-probability sampling methods: 1. Accidental / incidental / convenience sampling: the sample is selected according to the convenience of the situation for the examiner. 2. Judgment / purposive / deliberate sampling: a non-representative subset of some larger population, constructed to serve a very specific need or purpose.
3. Snowball sampling: A snowball sample (chain referral sampling) is a subset of a purposive sample. It is so named because one picks up the sample along the way, analogous to a snowball accumulating snow. Snowball samples are particularly useful in hard-to-track populations, such as drug users, people engaged in illegal behaviour, homeless people, etc.
4. Quota sampling: The general composition of the sample is decided in advance. The only requirement is that the right number of people be somehow found to fill these quotas. This is generally done to ensure the inclusion of a particular segment of the population.
Tests of significance. Parametric tests: population constants such as the mean and variance are used, and the data tend to follow one assumed or established distribution such as the normal, binomial or Poisson. Non-parametric tests: no population constants are used, the data do not follow any specific distribution and no assumptions are made about it. E.g., to classify good, better and best you allocate an arbitrary number to each category.
Classification of tests
Parametric tests: t-test (paired/unpaired); ANOVA.
Non-parametric tests: Chi-square test; Fisher's test; Mann-Whitney U test; Wilcoxon signed rank test; McNemar's test; Kruskal-Wallis test.
Chi-square test: Developed by Karl Pearson. Used to determine whether there is a statistically significant difference between observed values and expected values in one or more categories. It is applied when the sample size is > 30 and when dealing with 2 or more groups of subjects.
Steps: Test the null hypothesis. The $\chi^2$ statistic is then calculated:
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
where O = observed frequency and E = expected frequency.
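As a worked illustration of the sum of (O - E)² / E, the sketch below computes the statistic for a 2 x 2 table. The counts are invented, and deriving the expected frequencies from the row and column totals is the usual contingency-table convention, assumed here rather than stated on the slide.

```python
def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(table, expected)
        for o, e in zip(observed_row, expected_row)
    )

# Hypothetical observed counts: rows = two treatment groups,
# columns = improved / not improved.
observed = [[30, 10],
            [20, 20]]

chi2 = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square={chi2:.3f}, df={degrees_of_freedom}")
```

The statistic is then compared with the tabulated critical value for the same degrees of freedom (3.841 for 1 degree of freedom at the 5% level).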
t-test: Designed by W. S. Gosset. Applied to test the difference between two means. Criteria for applying the t-test: random samples; quantitative data; sample size < 30; variable normally distributed.
Unpaired t-test: used when the subjects in two groups each give an individual value, to test for the difference between the groups. Paired t-test: used when each individual gives a pair of observations, to test for the difference within the pairs of values.
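The difference between the two forms shows up in how the t statistic is built. The sketch below uses the standard textbook formulas (pooled variance for the unpaired test, the mean of the within-pair differences for the paired test); the formulas, the data and the function names are assumptions of this sketch and do not appear on the slide.

```python
import math
import statistics

def unpaired_t(group_1, group_2):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(group_1), len(group_2)
    pooled_variance = (
        (n1 - 1) * statistics.variance(group_1)
        + (n2 - 1) * statistics.variance(group_2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (statistics.mean(group_1) - statistics.mean(group_2)) / standard_error

def paired_t(first, second):
    """t statistic on the within-pair differences."""
    differences = [a - b for a, b in zip(first, second)]
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return statistics.mean(differences) / standard_error

# Hypothetical scores, invented for illustration.
before = [5, 6, 7, 8, 9]
after = [4, 4, 6, 6, 8]

t_paired = paired_t(before, after)      # same 5 subjects measured twice
t_unpaired = unpaired_t(before, after)  # treated as 2 independent groups
print(f"paired t={t_paired:.2f}, unpaired t={t_unpaired:.2f}")
```

On the same numbers the paired statistic is much larger, because pairing removes the variation between subjects from the error term.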
ANOVA (analysis of variance): Compares more than two samples drawn from corresponding normal populations. E.g., to check whether different agents used for subgingival irrigation have an effect on the decrease in microbial load, use 3 groups (chlorhexidine, saline, povidone iodine). If the difference between their means is significant, the different agents do have different effects on the decrease in microbial load. ANOVA is the test used to assess this difference in means.
ONE-WAY CLASSIFICATION: Suppose there are three different preparations of a tonic, say A, B and C, which are to be compared for their effectiveness in controlling anaemia. Three groups of 7 patients each receive one preparation, and one control group receives no tonic at all. The three preparations and the group receiving none (called the control) are referred to as 'treatments' in ANOVA (likewise fertilisers, drugs, etc. are also called treatments).
The response of patients in the 4 treatment groups will vary from one patient to another and from one treatment to another. Part of the variation may be due to pure chance, which is called error.
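Splitting the total variation into a between-treatment part and a chance (error) part leads to the F ratio. The sketch below uses the standard one-way ANOVA computation, which the slides do not spell out; the response values are invented and only three small groups are shown to keep the arithmetic visible.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group, or error, mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation from one treatment to another.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation from one patient to another within a treatment (error).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical responses for three treatment groups.
treatment_groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

f_ratio = one_way_anova_f(treatment_groups)
print(f"F={f_ratio:.2f}")
```

A large F means the variation between treatments is large relative to the chance variation within them.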
TWO-WAY ANOVA CLASSIFICATION: Two-way analysis of variance is used when there is a need to study the impact of two factors on variations in a specific variable. E.g.: i) the effect of age and sex on variations in height (it is known that in children height increases with increasing age, and that males are taller than females); ii) the effect of protein levels and calories on the gain in body weight.
The assumptions made in this type of ANOVA are: i) the subjects must be chosen at random; ii) the variable under study must be normally distributed (i.e. coefficient of skewness = 0); iii) the variances of the comparable groups are approximately equal (homogeneous); iv) there is no interaction between the two factors.
Z test: Used to test the significance of the difference in means for large samples (> 30). Prerequisites: the sample must be randomly selected; quantitative data; sample size > 30; variable normally distributed.
$Z = \dfrac{\text{Observation} - \text{Mean}}{\text{Standard deviation}}$
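A small sketch of the standardised deviate (observation minus mean, divided by standard deviation), together with the usual large-sample statistic for the difference between two means. The two-sample formula is the standard textbook form and is an assumption here, not taken from the slide; all figures are invented.

```python
import math

def z_score(observation, mean, standard_deviation):
    """Standard normal deviate: (observation - mean) / standard deviation."""
    return (observation - mean) / standard_deviation

def two_sample_z(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference in means divided by its standard error (large samples)."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error

# Hypothetical single observation against a known mean and SD.
z_single = z_score(130, 120, 10)

# Hypothetical summary statistics for two groups of 100 subjects each.
z_difference = two_sample_z(120, 10, 100, 116, 12, 100)

# |z| > 1.96 corresponds to significance at the 5% level (two-sided).
significant = abs(z_difference) > 1.96
print(f"z_single={z_single:.2f}, z_difference={z_difference:.2f}, significant={significant}")
```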