TESTS OF SIGNIFICANCE
Tests of significance are techniques for judging how far the difference between the estimates of different samples is due to sampling variation.
Standard error (S.E) of Mean = S.D/√n
Standard error (S.E) of Proportion = √(pq/n), where p is the sample proportion and q = 1 – p
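The two standard error formulas can be sketched in a few lines of Python. The function names and the figures used (an SD of 12 in a sample of 36, a proportion of 0.4 in 100 subjects) are hypothetical and chosen only for illustration.

```python
import math

def se_mean(sd, n):
    """Standard error of a mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

# Hypothetical figures, not taken from the slides.
print(se_mean(12, 36))                    # 2.0
print(round(se_proportion(0.4, 100), 3))  # 0.049
```

Both quantities shrink as n grows, which is why larger samples give more stable estimates.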
Tests of significance:
Can be broadly classified into 2 types
1. Parametric tests (or) standard tests of hypothesis
2. Non-parametric tests (or) distribution-free tests of hypothesis
PARAMETRIC TESTS:
A parametric test is a statistical test that makes assumptions about the parameters of the population distribution(s) from which one's data is drawn.
When to use a parametric test?
Subjects should be randomly selected
Data should be normally distributed
Homogeneity of variances
The important parametric tests are:
1) z-test
2) t-test
3) ANOVA
4) Pearson correlation coefficient
Z - Test:
This is one of the most frequently used tests in research studies.
The Z-test is based on the normal probability distribution and is used for judging the significance of several statistical measures, particularly the mean.
The Z-test is used when the sample size is greater than 30. It is a test of significance for large samples.
Z = (observation – mean) / SD
Prerequisites to apply z- test
Sample must be selected randomly
Data must be quantitative
Variable is assumed to follow normal distribution in the population
Sample size must be greater than 30. If the SD of the population is known, the Z-test can be applied even when the sample size is less than 30
2) t-test
• For samples smaller than 30, the Z value does not follow the normal distribution
• Hence the Z-test will not give the correct level of significance
• In such cases Student's t-test is used
• It was devised by W. S. Gosset, whose pen name was "Student", so it is also called Student's test.
There are two types of Student's t-test
1. Unpaired t test
2. Paired t test
Criteria for applying t- test
1. Random samples
2. Quantitative data
3. Variable normally distributed
4. Sample size less than 30
Unpaired t-test:
• Applied to unpaired data of independent observations made on individuals of 2 separate groups or samples drawn from the population.
• Tests whether the difference between the 2 means is real or could be due to sampling variability.
Paired t - test:
• It is applied to paired data of observation from one sample only (observation before and after taking a drug)
Examples:
1. Pulse rate before and after exertion
2. Plaque scores before and after using oral hygiene aid
3) ANOVA ( Analysis of Variance):
• Investigations may not always be confined to comparison of 2 samples only
• In such cases where more than 2 samples are used ANOVA can be used.
• Also when measurements are influenced by several factors playing their role e.g. factors affecting retention of a denture, ANOVA can be used.
Indications:
To compare more than two sample means
Types:
1. One-way ANOVA
2. Two-way ANOVA
3. Multi-way ANOVA
Pearson’s correlation
Size: 1.66 MB
Language: en
Added: Oct 25, 2022
Slides: 78 pages
Slide Content
TESTS OF SIGNIFICANCE PARAMETRIC TESTS By Dr. Lasya
CONTENTS Introduction History Data Measures of Central tendency Measures of Dispersion Normal Distribution Hypothesis and types of errors
Parametric tests – 1) Z- test 2) t- test 3) ANOVA 4) Pearson’s correlation coefficient Conclusion References
INTRODUCTION
Statistics: the science of compiling, classifying and tabulating numerical data and expressing the results in a mathematical or graphical form.
Biostatistics: the branch of statistics concerned with mathematical facts and data relating to biological events.
Application and uses of Biostatistics
In Physiology and Anatomy:
1) To define the limits of normality for variables such as height, weight or blood pressure in a population.
2) Variation beyond natural limits may be pathological, i.e. abnormal due to the play of certain external factors.
3) To find the difference between the means and proportions of normal values at two places or in different periods.
In Pharmacology : 1) To find the action of the drug 2) To compare the action of two different drugs or two successive dosages of the same drug 3) To find the relative potency of a new drug with respect to a standard drug
In Medicine:
1) To compare the efficacy of a particular drug, operation or line of treatment
2) To find an association between two attributes such as cancer and smoking
3) To identify signs and symptoms of a disease
In Community Medicine and Public Health:
1) To test the usefulness of sera and vaccines in the field
2) In epidemiological studies, the role of causative factors is statistically tested.
HISTORY In 1925, Ronald Fisher advanced the idea of statistical hypothesis testing, which he called "tests of significance", in his publication Statistical Methods for Research Workers. He suggested a probability of one in twenty (0.05) as a convenient cutoff level to reject the null hypothesis. In 1933, Jerzy Neyman and Egon Pearson called this cutoff the significance level, which they named α.
These tests are mathematical methods by which the probability of an observed difference occurring by chance is found. The difference may be between the means or proportions of the sample and the universe, or between the estimates of experimental and control groups.
DATA A collective recording of observations either numeric or otherwise is called data Understanding the data is crucial in biostatistics, since the type of data determines the selection of appropriate test of significance.
DATA
Qualitative: Nominal, Ordinal, Dichotomous
Quantitative: Discrete, Continuous
Qualitative Data or Categorical Data
This data exists in mutually exclusive categories. It deals with attributes or qualities of sampling units.
Nominal Data: Categorical variables that have neither a measurement scale nor a direction.
Examples: Recording of blood groups, hair color, marital status
Reasons for extraction of teeth: 1) Caries 2) Periodontitis 3) Therapeutic 4) Others
Ordinal (ranked) data: Characterized in terms of more than two categories that have a clearly implied direction, but the data is not measured on a measurement scale.
Examples: Severity of pain perceived by the patient: 1) No pain 2) Mild pain 3) Moderate pain 4) Severe pain
Most popular persons on social media
Best books of 2019
Dichotomous data (Binary Variable): The variable can have only 2 values.
Examples: Gender: Male / Female
Exam results: Pass / Fail
Do you have caries: Yes / No
Quantitative / Numerical Data Observations follow a direction and are quantified on a scale of measurement Continuous data not only show the position of different observations relative to each other but also show the extent to which one observation differs from another It enables the investigators to make more detailed inferences than do nominal / ordinal data
Discrete Data: When the variable under observation takes only fixed values like whole numbers, the data is discrete Example : DMFT score (0-32) Number of students in a class
Continuous Data: When the variable can take any value in a given range, including decimal or fractional values, the data is continuous.
Example: BMI, height, B.P, arch length, mesio-distal width of erupted teeth
Depending upon the source, data can be divided into primary data and secondary data.
Primary Data : Obtained directly from the source It is first hand information Data obtained by means of questionnaire, interviews or clinical experiments Secondary Data : Obtained from pre-existing records It is second hand information Data obtained from government and hospital records
Measures of Central tendency
It is the central value around which the other values are distributed. Also known as statistical averages.
A measure of central tendency should satisfy the following properties. It should
1) Be easy to understand and compute
2) Be based on each and every item in the series
3) Not be affected by extreme observations
4) Have sampling stability
Mean – mathematical estimate Median – positional estimate Mode – based on frequency
1) Mean / Arithmetic Mean / Arithmetic Average
Obtained by adding all the individual observations and dividing by the total number of observations.
Mean = Σxi / n
Eg: No. of decayed teeth in a group of 10 children aged 5 years are 2, 2, 4, 1, 3, 0, 5, 2, 3, 4
Mean = (2+2+4+1+3+0+5+2+3+4) / 10 = 26 / 10 = 2.6
2) Median When all the observations of a variable are arranged in either ascending or descending order, the middle observation is known as Median.
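Eg: No. of visits to a dentist by 10 patients in one year: 13, 8, 4, 3, 5, 2, 8, 1, 7, 4
First arrange them in order: 1, 2, 3, 4, 4, 5, 7, 8, 8, 13
With an even number of observations, the median is the average of the two middle values: Median = (4 + 5) / 2 = 4.5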
3) Mode
Mode or modal value is the value in a series of observations that occurs with the greatest frequency.
Eg: Age at eruption of the canine is 6, 6, 5, 7, 8, 6, 7, 5
Mode = 6
When the mode is ill-defined: Mode = 3 × Median – 2 × Mean
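The three worked examples can be checked with Python's standard statistics module. The helper empirical_mode is a hypothetical name for the ill-defined-mode relation quoted on the slide.

```python
from statistics import mean, median, mode

decayed_teeth = [2, 2, 4, 1, 3, 0, 5, 2, 3, 4]      # mean example
dentist_visits = [13, 8, 4, 3, 5, 2, 8, 1, 7, 4]    # median example
eruption_age = [6, 6, 5, 7, 8, 6, 7, 5]             # mode example

print(mean(decayed_teeth))     # 2.6
print(median(dentist_visits))  # 4.5
print(mode(eruption_age))      # 6

def empirical_mode(values):
    """Approximate mode from the relation Mode = 3 x Median - 2 x Mean."""
    return 3 * median(values) - 2 * mean(values)

print(empirical_mode(eruption_age))  # 5.5
```

The empirical relation gives only an approximation (5.5 here against an observed mode of 6), so it is best kept for series where no single value clearly dominates.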
Measures of Dispersion/ Measures of variability/ Measures of variation or scatter Dispersion is the degree of spread / variation of the variable about a central value Uses : Determine reliability of an average Serve as a basis of control of variability Comparison of 2 or more series Facilitate further statistical analysis
i) Range Difference between maximum and minimum values Simplest method Gives no information about the values that lie between the extreme values Subjected to fluctuations from sample to sample
ii) Mean Deviation
The average of the deviations from the arithmetic mean, ignoring the + and – signs.
M.D = Σ|X – Xi| / n
Σ = sum of
X = arithmetic mean
Xi = value of each observation in the data
n = no. of observations in the data
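A minimal sketch of the mean deviation, reusing the decayed-teeth counts from the mean example. The function name is hypothetical.

```python
from statistics import mean

def mean_deviation(values):
    """Average absolute deviation of the observations from their arithmetic mean."""
    centre = mean(values)
    return sum(abs(centre - x) for x in values) / len(values)

decayed_teeth = [2, 2, 4, 1, 3, 0, 5, 2, 3, 4]
print(round(mean_deviation(decayed_teeth), 2))  # 1.2
```

Taking absolute values is what "ignoring the + and – signs" means in practice. Without it the deviations from the mean would always sum to zero.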
iii) Standard Deviation Most important and widely used measure of variation Also known as root mean square deviation It is square root of the mean of the squared deviations from arithmetic mean Greater the deviation – greater the magnitude of dispersion from mean Small standard deviation – higher degree of uniformity of the observations.
S.D = √( Σ(Xi – X)² / n )
Steps:
Calculate the mean, X
Find the deviation (Xi – X) of each individual observation from the mean
Square these deviations and add them up, Σ(Xi – X)²
Divide the result by the total no. of observations, n (or n – 1 if the sample size is less than 30)
Then obtain the square root. This gives the standard deviation
Uses:
Summarizes the deviations of a large distribution
Indicates whether the variation from the mean is by chance or real
Helps in finding the standard error and a suitable sample size
S.D is only interpretable as a summary measure of variation for approximately symmetric distributions
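The steps for the standard deviation can be followed literally in code. This is a sketch on the decayed-teeth counts from the mean example. The small_sample flag is a hypothetical parameter that switches between dividing by n – 1 and by n.

```python
import math
from statistics import mean

def standard_deviation(values, small_sample=True):
    """Root of the mean squared deviation from the arithmetic mean."""
    centre = mean(values)                               # step 1: the mean
    squared = [(x - centre) ** 2 for x in values]       # steps 2-3: squared deviations
    divisor = len(values) - 1 if small_sample else len(values)
    return math.sqrt(sum(squared) / divisor)            # steps 4-5: divide, then root

decayed_teeth = [2, 2, 4, 1, 3, 0, 5, 2, 3, 4]
print(round(standard_deviation(decayed_teeth), 3))                      # 1.506
print(round(standard_deviation(decayed_teeth, small_sample=False), 3))  # 1.428
```

The n – 1 divisor gives the slightly larger value. The gap between the two results narrows as the sample grows.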
Normal Curve / Gaussian Distribution / Normal Distribution
When data is collected from a very large number of people and a frequency distribution is made with narrow class intervals, the resulting curve is smooth and symmetric and is called a normal curve. In a normal curve,
a) Mean ± 1 S.D covers 68.3% of the observations
b) Mean ± 2 S.D covers 95.4% of the observations
c) Mean ± 3 S.D covers 99.7% of the observations
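The three coverage figures follow from the normal distribution itself and can be reproduced with the error function in Python's math module. The function name coverage_within is hypothetical.

```python
import math

def coverage_within(k):
    """Fraction of a normal distribution lying within mean ± k S.D."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Mean ± {k} S.D covers {100 * coverage_within(k):.1f}% of the observations")
```

The same function gives the coverage for any other multiple of the S.D, for example the 1.96 used for the 5% level of significance.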
Standard Normal Curve
Bell shaped
Perfectly symmetrical
The maximum number of observations is at the mean, and the number of observations gradually decreases on either side, with few observations at the extreme points
Total area of curve = 1
Mean = 0
S.D = 1
All the 3 measures of central tendency, the mean, median and mode, coincide.
Rule of thumb: if mean ≥ 2 S.D, the values are taken to be approximately normally distributed. This is only a rough screen, not a formal test of normality.
Skewness
Skewness is a measure of the degree of asymmetry (tailing) of a frequency distribution.
Probability
Probability may be defined as the relative frequency or probable chance of occurrence of an event. It is usually expressed by the symbol 'p' and ranges from zero (0) to one (1).
When p = 0, there is no chance of the event happening, i.e. its occurrence is impossible. Eg. Chances of survival after rabies is zero or nil.
If p = 1, the chances of the event happening are 100%. Eg. Chances of survival after sandfly fever is 100%.
The P-value can be more or less than α depending on the data. When the P-value is less than α, the result is statistically significant.
The level of significance is usually fixed at 5% (0.05), 1% (0.01), 0.5% (0.005) or 0.1% (0.001).
The maximum desirable is the 5% level.
P between 0.05 and 0.01 = statistically significant
P < 0.01 = highly statistically significant
P < 0.001 or 0.005 = very highly significant
Hypothesis
Can be defined as a tentative prediction or explanation of the relationship between 2 or more variables.
Null Hypothesis: States that there is no real (true) difference between the means (or proportions) of the groups being compared. Generally symbolized as H0.
Alternative Hypothesis: States that the sample result is different, i.e. greater or smaller than the hypothetical value of the population. Generally symbolized as H1.
Eg: weight gain / loss due to a new feeding regimen
1. Zone of Acceptance 2. Zone of Rejection
Zone of Acceptance
If the result of a sample falls in the plain area, i.e. within mean ± 1.96 SE, H0 is accepted. Hence this area is called the zone of acceptance for the null hypothesis.
Zone of Rejection
If the result of a sample falls outside the plain area, in the shaded area, i.e. beyond mean ± 1.96 SE, it is significantly different from the universe value. H0 is rejected and H1 is accepted. This area is called the zone of rejection for H0.
Types of Errors
Type I Error: Rejection of a hypothesis which should have been accepted. Denoted by α.
Type II Error: Acceptance of a hypothesis which should have been rejected. Denoted by β.
Tests of Significance
Can be broadly classified into 2 types
1. Parametric tests (or) standard tests of hypothesis
2. Non-parametric tests (or) distribution-free tests of hypothesis
PARAMETRIC TESTS
A parametric test is a statistical test that makes assumptions about the parameters of the population distribution(s) from which one's data is drawn.
When to use a parametric test?
Subjects should be randomly selected
Data should be normally distributed
Homogeneity of variances
The important parametric tests are:
1) Z-test
2) t-test
3) ANOVA
4) Pearson correlation coefficient
Z-test
This is one of the most frequently used tests in research studies.
The Z-test is based on the normal probability distribution and is used for judging the significance of several statistical measures, particularly the mean.
The Z-test is used when the sample size is greater than 30. It is a test of significance for large samples.
Z = (observation – mean) / SD
Prerequisites to apply the Z-test
Sample must be selected randomly
Data must be quantitative
Variable is assumed to follow a normal distribution in the population
Sample size must be greater than 30. If the SD of the population is known, the Z-test can be applied even when the sample size is less than 30.
Z-test for means has two applications
1) To test the significance of the difference between the sample mean (X) and a known value of the population mean (μ).
Z = (sample mean – population mean) / SE of sample mean
2) To test the significance of the difference between 2 sample means, or between experimental and control sample means.
Z = (observed difference between 2 sample means) / (SE of difference between 2 sample means)
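Both applications of the Z-test can be sketched as follows. The slides do not spell out the SE of the difference between two means, so the usual large-sample form √(SD1²/n1 + SD2²/n2) is assumed here. The function names and all figures are hypothetical.

```python
import math

def z_one_sample(sample_mean, pop_mean, sd, n):
    """Z = (sample mean - population mean) / SE of sample mean."""
    se = sd / math.sqrt(n)
    return (sample_mean - pop_mean) / se

def z_two_sample(mean1, sd1, n1, mean2, sd2, n2):
    """Z = observed difference / SE of difference (assumed large-sample form)."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff

# Hypothetical large samples (n > 30 in every group).
print(z_one_sample(52, 50, 6, 36))                    # 2.0
print(round(z_two_sample(52, 6, 36, 50, 8, 64), 3))   # 1.414
```

A Z value beyond ±1.96 falls in the zone of rejection at the 5% level, so the first result would be judged significant and the second would not.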
One-tailed and Two-tailed Z-tests
Z values on either side of the mean are calculated as –Z / +Z
Value larger than the mean: +Z
Value smaller than the mean: –Z
One-tailed Z-test
Used in a test of significance when one wants to know specifically whether the difference between the two groups is higher or lower, i.e. the direction (plus or minus side) is specified. One end or tail of the distribution is then excluded.
Eg. If one wants to know whether malnourished children have a lower mean IQ than well-nourished children, the higher side of the distribution will be excluded. Such a test of significance is called a one-tailed test.
Two-tailed Z-test
This test determines whether there is a difference between the two groups without specifying whether the difference is higher or lower. It includes both ends or tails of the normal distribution. Such a test is called a two-tailed test.
Eg: When one wants to know whether the mean IQ in malnourished children is different from that in well-nourished children, without specifying whether it is more or less.
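The practical difference between the two kinds of test shows up in the P value obtained for the same Z. A minimal sketch using NormalDist from Python's statistics module follows. The function names and the Z value of –1.8 are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, S.D 1

def p_lower_tail(z):
    """One-tailed P value when the alternative is 'lower than'."""
    return standard_normal.cdf(z)

def p_two_tailed(z):
    """Two-tailed P value: both ends of the distribution are counted."""
    return 2 * (1 - standard_normal.cdf(abs(z)))

z = -1.8  # hypothetical result, e.g. mean IQ lower in the malnourished group
print(round(p_lower_tail(z), 3))   # 0.036
print(round(p_two_tailed(z), 3))   # 0.072
```

With this Z the one-tailed test is significant at the 5% level and the two-tailed test is not, which is why the direction must be specified before the data are examined.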
t-test
For samples smaller than 30, the Z value does not follow the normal distribution
Hence the Z-test will not give the correct level of significance
In such cases Student's t-test is used
It was devised by W. S. Gosset, whose pen name was "Student", so it is also called Student's test.
There are two types of student t Test Unpaired t test Paired t test
Criteria for applying t - test Random samples Quantitative data Variable normally distributed Sample size less than 30
Unpaired test Applied to unpaired data of independent observation made on individuals of 2 separate groups or samples drawn from the population To test if the difference between the 2 means is real or it can be due to sampling variability
Steps in unpaired t-test:
As per the null hypothesis, assume that there is no real difference between the means of the 2 samples
Calculate the means of the two samples
Calculate the observed difference between the means of the 2 samples, X1 – X2
Calculate the standard error of the difference between the means, which is given by SE = SD × √(1/n1 + 1/n2), where SD is the combined (pooled) standard deviation of the two samples
Calculate t = (X1 – X2) / SE
Determine the degrees of freedom, which is one less than the no. of observations in a sample (n – 1). For 2 samples the combined degrees of freedom will be df = (n1 – 1) + (n2 – 1) = n1 + n2 – 2
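The unpaired steps can be sketched as below. The pooled standard deviation is an assumption: the slide's SE formula is incomplete, and pooling is the form that matches df = n1 + n2 – 2. The function name and both samples are hypothetical.

```python
import math
from statistics import mean, variance

def unpaired_t(sample1, sample2):
    """Return (t, df) for two independent samples using a pooled SD."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its own n - 1.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return (mean(sample1) - mean(sample2)) / se, df

# Hypothetical scores from two separate groups of 5 subjects each.
group_a = [10, 12, 14, 16, 18]
group_b = [8, 10, 12, 14, 16]
t, df = unpaired_t(group_a, group_b)
print(round(t, 3), df)  # 1.0 8
```

The t value is then looked up in the t table at the computed degrees of freedom. For df = 8 the 5% two-tailed table value is 2.306, so a t of 1.0 would not be judged significant.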
Paired t - test It is applied to paired data of observation from one sample only . The individual gives a pair of observation i.e. observation before and after taking a drug Examples : Pulse rate before and after exertion Plaque scores before and after using oral hygiene aid
Steps in paired t-test:
As per the null hypothesis, assume that there is no real difference between the means before and after the experiment
Calculate the difference in each pair of observations, i.e. before and after, x1 – x2 = x, and the mean difference X
Calculate the SD of the differences and then SE = SD / √n
Determine t = X / SE
Determine the degrees of freedom. Since there is one sample, df = n – 1
Refer to the t table and find the probability of the t value corresponding to the degrees of freedom
P < 0.05: the difference is significant
P > 0.05: the difference is not significant
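The paired steps translate into a short function. The pulse readings below are hypothetical and only mimic the "before and after exertion" example. stdev divides by n – 1, as is usual for small samples.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for paired observations on a single sample."""
    diffs = [b - a for b, a in zip(before, after)]   # x1 - x2 for each pair
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)                 # SE = SD / sqrt(n)
    return mean(diffs) / se, n - 1                   # t = mean difference / SE

# Hypothetical pulse rates of 5 subjects before and after exertion.
pulse_before = [80, 84, 78, 90, 86]
pulse_after = [84, 90, 80, 96, 88]
t, df = paired_t(pulse_before, pulse_after)
print(round(t, 3), df)  # -4.472 4
```

The sign of t only reflects the order of subtraction. The absolute value is what is compared with the t table at df = n – 1.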
ANOVA (Analysis of Variance) Investigations may not always be confined to comparison of 2 samples only In such cases where more than 2 samples are used ANOVA can be used Also when measurements are influenced by several factors playing their role e.g. factors affecting retention of a denture, ANOVA can be used. ANOVA helps to decide which factors are more important
Indications: To compare more than two sample means Criteria for applying ANOVA: Randomly selected samples from the corresponding populations Quantitative data Variables are normally distributed
Types: One way ANOVA Two way ANOVA Multi way ANOVA
One-way ANOVA
When the design includes only one independent variable (e.g. treatment group), the technique applied is called one-way ANOVA.
Eg:
1. Compare a control group with three different doses of aspirin in rats
2. Effect of supplementation of vitamin C in each subject before, during and after the treatment.
Two way ANOVA Used to determine the effect of two nominal predictor variables on a continuous outcome variable. A two-way ANOVA test analyzes the effect of the independent variables on the expected outcome along with their relationship to the outcome itself.
Multi way ANOVA Three or more factors affect the result or outcomes between the groups
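The slides do not give the arithmetic behind ANOVA. For the one-way case, a minimal sketch under the standard textbook decomposition (variation between groups against variation within groups) is shown below. The function name and the three groups are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical responses in a control group and two dose groups.
control, low_dose, high_dose = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f, dfb, dfw = one_way_anova_f([control, low_dose, high_dose])
print(round(f, 2), dfb, dfw)  # 13.0 2 6
```

A large F means the group means differ by more than the scatter within groups would explain. The F value is read against an F table at the two degrees of freedom.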
Knowledge, Attitude, and Perceived Barriers toward Evidence‑Based Practice among Dental and Medical Academicians and Private Practitioners in Pune: A Comparative Cross‑sectional Study
Pearson's correlation coefficient
Measures the relationship or association between two quantitatively measured or continuous variables.
Eg: Height and weight, temperature and pulse, age and vital capacity, etc.
The extent of the relationship between two quantitative variables is measured by Pearson's correlation coefficient. It is denoted by the letter 'r'.
–1 ≤ r ≤ +1
Types of correlation
Perfect positive correlation, r = +1
Perfect negative correlation, r = –1
Absolutely no correlation, r = 0
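The three types of correlation can be reproduced with a short function. The slides do not give the formula for r, so the standard product-moment form is assumed here. The function name and the data are hypothetical.

```python
import math
from statistics import mean

def pearson_r(xs, ys):
    """Product-moment correlation coefficient between two paired series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))    # perfect positive correlation, r = +1
print(pearson_r(x, [8, 6, 4, 2]))    # perfect negative correlation, r = -1
print(pearson_r(x, [1, -1, -1, 1]))  # no linear correlation, r = 0
```

Real clinical data, such as systolic against diastolic blood pressure, will normally give a value strictly between these extremes.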
Z-test
Type of data: Continuous data
Sample size: > 30
Types: One-tailed Z-test, Two-tailed Z-test
Application: To compare the differences between proportions
Example: Proportion of patients surviving in a treated group differs from that in an untreated group

t-test
Type of data: Independent variable qualitative (nominal), dependent variable quantitative (continuous)
Sample size: < 30
Types: Paired t-test, Unpaired t-test
Application: To compare the means of two independent or two related samples
Examples: 1) Unpaired t-test: compare the mean systolic blood pressure of male and female participants 2) Paired t-test: plaque scores before and after using an oral hygiene aid

ANOVA
Type of data: Independent variable qualitative (nominal), dependent variable quantitative (continuous)
Sample size: Large enough
Types: 1) One-way ANOVA 2) Two-way ANOVA 3) Multi-way ANOVA
Application: To compare the means of 3 or more independent samples
Example: Effect of supplementation of vitamin C in each subject before, during and after the treatment

Pearson correlation coefficient
Type of data: Continuous data
Sample size: -
Types: Perfect positive correlation, Perfect negative correlation, No correlation
Application: To determine the relationship or association between two quantitatively measured or continuous variables
Example: Correlation between diastolic and systolic blood pressure
CONCLUSION Tests of significance play an important role in conveying the results of any research and thus the choice of an appropriate statistical test is very important as it decides the fate of outcome of the study. Hence the emphasis placed on tests of significance in clinical research must be tempered with an understanding that they are tools for analyzing data and should never be used as a substitute for knowledgeable interpretation of outcomes.