Statistical Methods in Research

Dr Kiran Gaur
Associate Professor & Head
Department of Statistics, Mathematics & Computer Science
SKN College of Agriculture, Jobner

Statistics
Descriptive statistics – Methods of organizing, summarizing, and
presenting data in an informative way
Inferential statistics – The methods used to determine something about a
population on the basis of a sample
Inference is the process of drawing conclusions or making decisions
about a population based on sample results

Types of variables
Variables
  Qualitative
    Dichotomic: gender, marital status
    Polytomic: brand of PC, hair color
  Quantitative
    Discrete: children in family, strokes on a golf hole
    Continuous: amount of income tax paid, weight of a student

Types of Measurement Scale
Nominal Scale
Colour, region, gender, etc.
Ordinal Scale
Size, grades, SEB etc.
Interval Scale
Temperature, certain size measurement etc.
Ratio Scale
Height, weight, income etc.

Frequency distribution
The frequency with which observations are assigned to each category
or point on a measurement scale.
Most basic form of descriptive statistics
May be expressed as a percentage of the total sample found in
each category
The distribution is “read” differently depending upon the
measurement level
Nominal scales are read as discrete measurements at each level
Ordinal measures show tendencies, but categories should not be
compared
Interval and ratio scales allow for comparison among categories
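
A frequency distribution with percentages can be sketched in a few lines of Python. The region data and the helper name frequency_table below are hypothetical, chosen only to illustrate counting a nominal variable.

```python
from collections import Counter


def frequency_table(observations):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(observations)
    total = sum(counts.values())
    return [(cat, n, 100.0 * n / total) for cat, n in counts.most_common()]


# Hypothetical nominal data: region of 10 respondents.
regions = ["North", "South", "North", "East", "South",
           "North", "West", "North", "South", "East"]

for category, count, percent in frequency_table(regions):
    print(f"{category:<6} {count:>3} {percent:6.1f}%")
```

Because region is a nominal variable, the rows are read as separate counts; their order carries no meaning.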

Cross Tabulation

Chart Guide

Commonly Used Graphs in Business Research

A Taxonomy of Statistics

Central Tendency
•Statistical measure that determines a single value that accurately describes
the center of the distribution and represents the entire distribution of
scores.
•By identifying the "average score," central tendency allows
researchers to summarize or condense a large set of data into a single
value.
•In addition, it is possible to compare two (or more) sets of data by
simply comparing the average score (central tendency) for one set
versus the average score for another set.

Measures of central tendency
•These measures give us an idea of what the ‘typical’ case in a distribution is
•Mean -
•The ‘average’ score: sum of all individual scores divided by the number of scores
•Has a number of useful statistical properties;
however, it can be sensitive to extreme scores (“outliers”)
•Many statistics are based on the mean
•Mode - the most frequent score in a distribution
•Good for nominal data
•Median - the midpoint or mid score in a distribution
•50% of cases above / 50% of cases below
•Insensitive to extreme cases
•Suitable for ordinal, interval, or ratio data
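
The sensitivity of the mean to outliers can be seen in a short sketch using Python's standard statistics module. The scores are made up for illustration.

```python
import statistics

# Hypothetical test scores, then the same scores with one extreme value added.
scores = [52, 55, 55, 58, 60, 61, 63]
with_outlier = scores + [250]

for label, data in (("original", scores), ("with outlier", with_outlier)):
    print(
        f"{label:<13}"
        f" mean={statistics.mean(data):7.2f}"
        f" median={statistics.median(data):6.1f}"
        f" mode={statistics.mode(data)}"
    )
```

Adding the single extreme score pulls the mean up sharply, while the median moves only slightly and the mode does not change.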

[Figure: Box-Plot Chart marking min, q1, median, q3 and max on a vertical scale from 0 to 160]

Dispersion
•Some statistics look at how widely scattered over the scale the
individual scores are
•Groups with identical means can be more or less widely dispersed
•To find out how the group is distributed, we need to know how far
from or close to the mean individual scores are
•Like the mean, these statistics are only meaningful for interval or
ratio-level measures

Estimates of Dispersion
•Range
•Distance between the highest and lowest scores in a distribution;
•sensitive to extreme scores;
•Can compensate by calculating the interquartile range (distance between the 25th and 75th
percentile points), which represents the range of scores for the middle half of a
distribution
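
A minimal sketch of the range and the interquartile range, using made-up data with one extreme score. Quartile conventions differ between textbooks and software; the "inclusive" method of statistics.quantiles is an assumption made here, not something the slides specify.

```python
import statistics

# Hypothetical scores; 40 is an extreme value.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]

value_range = max(data) - min(data)

# Quartiles under the "inclusive" convention (one of several in use).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("range:", value_range)
print("q1, median, q3:", q1, q2, q3)
print("interquartile range:", iqr)
```

The range is driven almost entirely by the single extreme score, whereas the interquartile range describes only the middle half of the data.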

Variance ($S^2$)
•Average of squared distances of individual points from the mean
•The sample variance divides the sum of squared deviations by n − 1
•High variance means that most scores are far away from the mean. Low variance
indicates that most scores cluster tightly about the mean.
•The amount that one score differs from the mean is called its deviation score
(deviate)
•The sum of all squared deviation scores in a sample is called the sum of squares

Estimates of dispersion
Standard Deviation (SD)
•A summary statistic of how much scores vary from the mean
•Square root of the variance
•Expressed in the original units of measurement
•Represents the typical amount of dispersion in a sample
•Used in a number of inferential statistics
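
The chain from deviation scores to standard deviation can be worked through on a small made-up sample. The sketch assumes the sample variance, which divides the sum of squares by n − 1.

```python
import math
import statistics

# Hypothetical sample of five scores.
data = [4, 8, 6, 5, 7]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]            # deviation scores
sum_of_squares = sum(d ** 2 for d in deviations)  # squared deviations, summed
variance = sum_of_squares / (len(data) - 1)       # sample variance (n - 1)
sd = math.sqrt(variance)                          # back in original units

print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
print("sum of squares:", sum_of_squares)
print("sample variance:", variance)
print("standard deviation:", round(sd, 3))
```

The raw deviation scores always sum to zero, which is why they are squared before being added up.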

Shape of the Distribution
Skewness:
Measures the asymmetry of a distribution;
Positive or negative skewness
Kurtosis:
Measures the peakedness of a distribution;
Leptokurtic (positive excess kurtosis, i.e. fatter tails),
Mesokurtic,
Platykurtic (negative excess kurtosis, i.e. thinner tails)

[Figure: three curves showing the typical order of the averages, from left to right]
Negatively skewed: Mean, Median, Mode
Symmetric (not skewed): Mean = Median = Mode
Positively skewed: Mode, Median, Mean
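
Skewness and excess kurtosis can be computed from the central moments of the data. The sketch below uses the simple moment-based (population) definitions, which is one convention among several; statistical packages often apply small-sample corrections. All data are made up.

```python
import statistics


def central_moment(data, order):
    """Average of (x - mean) ** order over the data."""
    m = statistics.fmean(data)
    return sum((x - m) ** order for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]   # long tail to the right
left_skewed = [-x for x in right_skewed]         # mirror image
flat = [1, 2, 3, 4, 5]                           # symmetric, no tails

for name, data in (("right", right_skewed), ("left", left_skewed), ("flat", flat)):
    print(name, round(skewness(data), 3), round(excess_kurtosis(data), 3))
```

For the right-skewed sample the mean lies above the median, in line with the usual ordering for positive skew.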

Normal distribution
•Many characteristics are distributed through the
population in a ‘normal’ manner
•Normal curves have well-defined statistical properties
•Parametric statistics, the most commonly used statistics, are based on
the assumption that the variables are distributed normally
•This is the famous “Bell curve” where many cases fall near
the middle of the distribution and few fall very high or
very low
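
One of the well-defined properties of the normal curve is the share of cases lying within one, two and three standard deviations of the mean. The sketch below uses statistics.NormalDist from the standard library; the mean of 100 and standard deviation of 15 follow a common IQ-scale convention and are an assumption here, not values taken from the slides.

```python
from statistics import NormalDist


def share_within(dist, k):
    """Proportion of a normal distribution within k SDs of its mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)


# Assumed scale for illustration: mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)

for k in (1, 2, 3):
    low, high = iq.mean - k * iq.stdev, iq.mean + k * iq.stdev
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share_within(iq, k):.1%}")
```

The shares are roughly 68%, 95% and 99.7%, regardless of which mean and standard deviation are chosen.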

I.Q. Distribution

Data Transformation
•With skewed data, the mean is not a good measure of central
tendency because it is sensitive to extreme scores
•May need to transform skewed data to make distribution appear
more normal or symmetrical
•Must determine the degree & type of skewness prior to
transformation
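
As an illustration, a logarithmic transformation is one common choice for positively skewed data. The sketch below measures skewness before and after the transformation; the income figures are invented and the moment-based skewness formula is one of several conventions.

```python
import math
import statistics


def skewness(data):
    """Moment-based skewness: m3 / m2^(3/2)."""
    m = statistics.fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / len(data)
    m3 = sum((x - m) ** 3 for x in data) / len(data)
    return m3 / m2 ** 1.5


# Hypothetical incomes (in thousands), strongly skewed to the right.
incomes = [12, 15, 18, 20, 22, 25, 30, 45, 80, 200]
log_incomes = [math.log(x) for x in incomes]  # requires strictly positive values

print("skewness before:", round(skewness(incomes), 2))
print("skewness after log:", round(skewness(log_incomes), 2))
print("mean vs median before:", statistics.fmean(incomes), statistics.median(incomes))
```

The log transformation only applies to positive values, and negatively skewed data would need a different transformation, which is why the type of skewness is checked first.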

Correlation and Regression
Correlation describes the strength of a linear relationship between two variables
Linear means “straight line”
Measures-
Scatter Plot
Karl Pearson Correlation Coefficient
Spearman’s Rank Correlation
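
The two coefficients can be compared on a small made-up data set in which Y rises steadily with X but not along a straight line. The helper names are hypothetical; Spearman's coefficient is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
import math


def pearson_r(x, y):
    """Karl Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # increases with x, but along a curve

print("Pearson r:", round(pearson_r(x, y), 4))
print("Spearman rho:", round(spearman_rho(x, y), 4))
```

Pearson's coefficient falls short of 1 because the relationship is not a straight line, while Spearman's coefficient depends only on the ordering of the values.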

Regression tells us how to draw the straight line described by the correlation.
It is the technique concerned with predicting some variables by knowing others, i.e.
the process of predicting variable Y using variable X
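
A minimal sketch of fitting the straight line Y = a + bX by ordinary least squares, which is the usual way of drawing that line. The data and the function name are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + b * X."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Hypothetical data: X = hours of study, Y = score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

a, b = fit_line(hours, score)
print(f"fitted line: Y = {a:.2f} + {b:.2f} X")
print("predicted Y at X = 6:", round(a + b * 6, 2))
```

The fitted line always passes through the point formed by the two means.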

Multiple regression analysis
Multiple regression analysis is a straightforward extension of simple regression
analysis which allows more than one independent variable.
$Y = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$
The b’s are called partial regression coefficients
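
The partial regression coefficients can be estimated by least squares. The sketch below solves the normal equations in plain Python for a tiny made-up data set generated exactly from Y = 1 + 2 X1 + 3 X2, so the fit should recover those values. The function name is hypothetical; in practice a statistics package would normally be used.

```python
def fit_multiple_regression(rows, y):
    """Least-squares estimates [a, b1, ..., bk] via the normal equations."""
    # Design matrix with a leading 1 in each row for the intercept a.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(xi[r] * xi[c] for xi in X) for c in range(p)]
        + [sum(xi[r] * yi for xi, yi in zip(X, y))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data built from Y = 1 + 2*X1 + 3*X2 with no noise.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
response = [9, 8, 19, 18, 29]

a, b1, b2 = fit_multiple_regression(predictors, response)
print(f"Y = {a:.2f} + {b1:.2f} X1 + {b2:.2f} X2")
```

Each b is read as the change in Y for a one-unit change in its X while the other X is held fixed.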

Statistical Inference
Use a random sample to learn something about a
larger population
Statistical inference: Drawing conclusions about the whole
population on the basis of a sample
Precondition for statistical inference: A sample is randomly
selected from the population (probability sample)

Hypotheses
The null hypothesis, denoted H0, is the claim that is initially assumed to be true. The alternative hypothesis,
denoted by Ha, is the assertion that is contrary to H0. Possible conclusions from hypothesis-testing analysis are
reject H0 or fail to reject H0.

Rules for Hypotheses
H0 is always stated as an equality claim involving parameters.
Ha is an inequality claim that contradicts H0. It may be one-sided (using either > or <) or two-sided (using ≠).

Steps for Hypothesis Testing
1. Formulate H0 and H1 (the null and alternative hypotheses)
2. Select Appropriate Test
3. Choose Level of Significance, α
4. Calculate Test Statistic TS_CAL
5. Then follow one of two equivalent routes:
   a. Determine Probability Associated with Test Statistic, then Compare with Level of Significance, α
   b. Determine Critical Value of Test Statistic TS_CR, then Determine if TS_CAL falls into (Non) Rejection Region
6. Reject/Do not Reject H0
7. Draw Marketing Research Conclusion
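
The steps can be traced in a short sketch of a two-sided one-sample z-test. The choice of test, the sample values, the hypothesised mean of 50 and the assumption that the population standard deviation is known to be 4 are all made up for illustration.

```python
import math
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0, assuming sigma is known."""
    std_normal = NormalDist()
    # Calculate the test statistic.
    z_cal = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    # Route (a): probability associated with the test statistic.
    p_value = 2 * (1 - std_normal.cdf(abs(z_cal)))
    # Route (b): critical value and rejection region.
    z_cr = std_normal.inv_cdf(1 - alpha / 2)
    return {
        "z_cal": z_cal,
        "p_value": p_value,
        "z_cr": z_cr,
        "reject_by_p": p_value < alpha,
        "reject_by_critical_value": abs(z_cal) > z_cr,
    }


# Hypothetical sample; H0: mu = 50 against H1: mu != 50, alpha = 0.05.
sample = [52, 55, 49, 58, 54, 56, 53, 57, 51, 55]
result = one_sample_z_test(sample, mu0=50, sigma=4, alpha=0.05)

for key, value in result.items():
    print(key, value)
```

Both routes lead to the same decision, because the p-value is below α exactly when the calculated statistic lies beyond the critical value.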

Choice of an Appropriate Test

What size sample do we need?
The answer to this question is influenced by a number of factors, viz.
➢The purpose of the study
➢Population size
➢The risk of selecting a “bad” sample
➢The allowable sampling error
➢Most of all, whether undertaking a qualitative or quantitative study
Different approaches apply to different study designs, such as cross-sectional, case-control, cohort,
longitudinal and diagnostic test studies.

Sample Size Determination
Criteria
➢Level of confidence (normally 95%)
➢Margin of error (usually 1%, 3% or 5%)
➢Degree of variability in the attributes being measured (prevalence)
More homogeneous population → Smaller sample size
More heterogeneous population → Larger sample size for desired precision.

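Sample size
Infinite Population
Quantitative: $n = \dfrac{Z^2 \sigma^2}{e^2}$
Qualitative: $n = \dfrac{Z^2 P Q}{e^2}$
Finite Population
Quantitative: $n = \dfrac{Z^2 \sigma^2 N}{e^2 (N - 1) + Z^2 \sigma^2}$
Qualitative: $n = \dfrac{Z^2 P Q N}{e^2 (N - 1) + Z^2 P Q}$
where $Z$ is the standard normal value for the chosen level of confidence, $\sigma$ the population standard deviation, $e$ the margin of error, $P$ the proportion (prevalence), $Q = 1 - P$, and $N$ the population size.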
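
The formulas for n = Z²σ²/e² and n = Z²PQ/e², with their finite-population versions, can be sketched as below. Function names and example inputs are hypothetical; Z is taken from the standard normal distribution for a two-sided confidence level, and results are rounded up to the next whole respondent.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal value, e.g. about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size(variability, e, confidence=0.95, population=None):
    """Required n; variability is sigma^2 (quantitative) or P*Q (qualitative)."""
    z2v = z_value(confidence) ** 2 * variability
    if population is None:                       # infinite population
        n = z2v / e ** 2
    else:                                        # finite population of size N
        n = z2v * population / (e ** 2 * (population - 1) + z2v)
    return math.ceil(n)


# Qualitative: P = 0.5 (maximum variability), 5% margin of error.
p = 0.5
print(sample_size(p * (1 - p), e=0.05))                   # infinite population
print(sample_size(p * (1 - p), e=0.05, population=1000))  # N = 1000

# Quantitative: assumed sigma = 15, margin of error of 2 units.
print(sample_size(15 ** 2, e=2))
```

With P = 0.5 and a 5% margin at 95% confidence, the infinite-population formula gives 385, and the requirement drops to 278 when the population is only 1000.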

Sample Size Table

Online Sample Size Calculator
https://www.surveysystem.com/sscalc.htm
https://www.calculator.net/sample-size-calculator
http://www.raosoft.com/samplesize.html
https://www.stat.ubc.ca/~rollin/stats/ssize/n2.html

P-Value
Definition: P-value is the probability of obtaining a result at least as extreme as
the one observed in the sample data, if the null hypothesis is true
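
A small exact calculation may make the definition concrete. Suppose, hypothetically, that a coin lands heads 9 times in 10 tosses and the null hypothesis is that the coin is fair. The one-sided P-value is the probability of 9 or more heads under that hypothesis.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical observation: 9 heads in 10 tosses; H0: the coin is fair (p = 0.5).
one_sided = binomial_upper_tail(9, 10)
two_sided = 2 * one_sided  # the fair-coin distribution is symmetric

print("one-sided P-value:", round(one_sided, 4))
print("two-sided P-value:", round(two_sided, 4))
```

Here "at least as extreme" means 9 or 10 heads, which together have probability 11/1024 under the null hypothesis.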

Understanding P value

Interpreting P-value

Caution: The P-value was never intended to be a substitute for scientific reasoning

Multivariate Analysis Techniques
•Multiple regression
•Canonical correlation
•Discriminant analysis
•Logistic regression
•Survival analysis
•Principal component analysis
•Factor analysis
•Cluster analysis

Thank You…
“All the statistics in the world can’t measure the warmth of a smile.” Chris Hart
