1. Review Statistics and Probability.pdf

MuhammadMishbah1 32 views 82 slides Jul 18, 2024

Slide Content

Introduction and
Descriptive Statistics
Review Statistics and Probability
Modified by:
Dr. Achmad Nizar Hidayanto
Nur Fitriah Ayuning Budi
Khumaisa Nuraini

Learning Outcomes
1. Review key statistical and research terms
2. Review the concept of central tendency
3. Review the concept of variability

Introduction to Statistics
PowerPoint Lecture Slides
Essentials of Statistics for the
Behavioral Sciences
Eighth Edition
by Frederick J Gravetter and Larry B. Wallnau

1.1 Statistics, Science and
Observations
•“Statistics” means “statistical procedures”
•Uses of Statistics
–Organize and summarize information
–Determine exactly what conclusions are
justified based on the results that were
obtained
•Goals of statistical procedures
–Accurate and meaningful interpretation
–Provide standardized evaluation procedures

1.2 Populations and Samples
•Population
–The set of all the individuals of interest in a
particular study
–Vary in size; often quite large
•Sample
–A set of individuals selected from a population
–Usually intended to represent the population
in a research study

Figure 1.1
Relationship between population and sample

Variables and Data
•Variable
–Characteristic or condition that changes or has
different values for different individuals
•Data (plural)
–Measurements or observations of a variable
•Data set
–A collection of measurements or observations
•A datum (singular)
–A single measurement or observation
–Commonly called a score or raw score

Parameters and Statistics
•Parameter
–A value, usually a
numerical value, that
describes a population
–Derived from
measurements of
the individuals in
the population
•Statistic
–A value, usually a
numerical value, that
describes a sample
–Derived from
measurements of
the individuals in
the sample

Descriptive & Inferential Statistics
•Descriptive statistics
–Summarize data
–Organize data
–Simplify data
•Familiar examples
–Tables
–Graphs
–Averages
•Inferential statistics
–Study samples to make
generalizations about
the population
–Interpret experimental
data
•Common terminology
–“Margin of error”
–“Statistically significant”

Sampling Error
•Sample is never identical to population
•Sampling Error
–The discrepancy, or amount of error, that
exists between a sample statistic and the
corresponding population parameter
•Example: Margin of Error in Polls
–“This poll was taken from a sample of registered
voters and has a margin of error of plus-or-minus 4
percentage points” (Box 1.1)
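
The definition of sampling error above can be illustrated with a minimal sketch. The population values, the fixed seed, and the helper name `sampling_error` are assumptions made up for illustration; they do not come from the slides.

```python
import random
from statistics import mean

def sampling_error(sample, population):
    """Discrepancy between a sample statistic (M) and the parameter (mu)."""
    return mean(sample) - mean(population)

# Hypothetical population of 10 scores; its mean is the parameter mu.
population = [2, 4, 4, 5, 6, 7, 7, 8, 9, 10]

rng = random.Random(0)               # fixed seed so the run is repeatable
sample = rng.sample(population, 4)   # a sample of n = 4 individuals

error = sampling_error(sample, population)
print(f"mu = {mean(population)}, M = {mean(sample)}, sampling error = {error}")
```

Drawing a different sample (for example, with another seed) generally gives a different sample mean, and therefore a different sampling error.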

Figure 1.2
A demonstration of sampling error

Figure 1.3
Role of statistics in experimental research

1.3 Data Structures, Research
Methods, and Statistics
•Individual Variables
–A variable is observed
–“Statistics” describe the observed variable
–Category and/or numerical variables
–Descriptive statistics
•Relationships between variables
–Two variables observed and measured
–One of two possible data structures used to
determine what type of relationship exists

Relationships Between Variables
•Data Structure I: The Correlational Method
–One group of participants
–Measurement of two variables for each
participant
–Goal is to describe type and magnitude of the
relationship
–Patterns in the data reveal relationships
–Non-experimental method of study

Figure 1.4
Data structures for studies evaluating the
relationship between variables

Correlational Method Limitations
•Can demonstrate the existence of a
relationship
•Does not provide an explanation for the
relationship
•Most importantly, does not demonstrate a
cause-and-effect relationship between the
two variables

Relationships Between Variables
•Data Structure II: Comparing two (or
more) groups of Scores
–One variable defines the groups
–Scores are measured on second variable
–Both experimental and non-experimental
studies use this structure

Figure 1.5
Data structure for studies comparing groups

Experimental Method
•Goal of Experimental Method
–To demonstrate a cause-and-effect
relationship
•Manipulation
–The level of one variable is determined by the
experimenter
•Control rules out influence of other
variables
–Participant variables
–Environmental variables

Figure 1.6
The structure of an experiment

Independent/Dependent Variables
•Independent Variable is the variable
manipulated by the researcher
–Independent because no other variable in the
study influences its value
•Dependent Variable is the one observed
to assess the effect of treatment
–Dependent because its value is thought to
depend on the value of the independent
variable

Experimental Method: Control
•Methods of control
–Random assignment of subjects
–Matching of subjects
–Holding level of some potentially influential variables
constant
•Control condition
–Individuals do not receive the experimental treatment.
–They either receive no treatment or they receive a neutral,
placebo treatment
–Purpose: to provide a baseline for comparison with the
experimental condition
•Experimental condition
–Individuals do receive the experimental treatment

Non-experimental Methods
•Non-equivalent Groups
–Researcher compares groups
–Researcher cannot control who goes into which
group
•Pre-test / Post-test
–Individuals measured at two points in time
–Researcher cannot control influence of the
passage of time
•Independent variable is quasi-independent

Figure 1.7
Two examples of non-experimental studies

1.4 Variables and Measurement
•Scores are obtained by observing and
measuring variables that scientists use to
help define and explain external behaviors
•The process of measurement consists of
applying carefully defined measurement
procedures for each variable

Constructs & Operational Definitions
•Constructs
–Internal attributes
or characteristics
that cannot be
directly observed
–Useful for
describing and
explaining behavior
•Operational definition
–Identifies the set of
operations required to
measure an external
(observable) behavior
–Uses the resulting
measurements as both
a definition and a
measurement of a
hypothetical construct

Discrete and Continuous
Variables
•Discrete variable
–Has separate, indivisible categories
–No values can exist between two neighboring
categories
•Continuous variable
–Has an infinite number of possible values
between any two observed values
–Every interval is divisible into an infinite
number of equal parts

Figure 1.8
Example: Continuous Measurement

Real Limits of Continuous
Variables
•Real Limits are the boundaries of each
interval representing scores measured on
a continuous number line
–The real limit separating two adjacent scores
is exactly halfway between the two scores
–Each score has two real limits
•The upper real limit marks the top of the
interval
•The lower real limit marks the bottom of the
interval
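
A minimal sketch of the real-limits rule above, assuming scores are recorded to the nearest whole unit by default. The function name `real_limits` and the `unit` parameter are hypothetical names chosen for this example.

```python
def real_limits(score, unit=1.0):
    """Return (lower real limit, upper real limit) for a measured score.

    Each limit lies exactly halfway between the score and its neighbor,
    so it sits half a measurement unit below or above the score.
    """
    half = unit / 2
    return score - half, score + half

lower, upper = real_limits(150)   # a score of 150 covers the interval 149.5 to 150.5
print(lower, upper)
```

The upper real limit of one score equals the lower real limit of the next, which is why adjacent intervals tile the number line without gaps.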

Scales of Measurement
•Measurement assigns individuals or events to
categories
–The categories can simply be names such as
male/female or employed/unemployed
–They can be numerical values such as 68 inches
or 175 pounds
•The complete set of categories makes up a
scale of measurement
•Relationships between the categories determine
different types of scales

Scales of Measurement
Scale | Characteristics | Examples
Nominal | Label and categorize; no quantitative distinctions | Gender; diagnosis; experimental or control
Ordinal | Categorizes observations; categories organized by size or magnitude | Rank in class; clothing sizes (S, M, L, XL); Olympic medals
Interval | Ordered categories; intervals between categories of equal size; arbitrary or absent zero point | Temperature; IQ; golf scores (above/below par)
Ratio | Ordered categories; equal intervals between categories; absolute zero point | Number of correct answers; time to complete task; gain in height since last year

Central Tendency
PowerPoint Lecture Slides
Essentials of Statistics for the Behavioral
Sciences
Seventh Edition
by Frederick J Gravetter and Larry B. Wallnau

1.5 Overview of central tendency
•Central tendency
–A single score to define the “center” of a
distribution
•Purpose: find the single score that is most
typical or best represents the entire group

Figure 1.9
What is the “center” of each distribution?

1.6 The Mean
•The mean is the sum of all the scores
divided by the number of scores in the
data.
Population mean: μ = ΣX / N
Sample mean: M = ΣX / n

The Mean: Three definitions
•Sum of the scores divided by the number
of scores in the data
•Amount each individual receives when
total is divided equally among all: M = ∑X /
n
•The balance point for the distribution

Figure 1.10

Computing the Mean from a
Frequency Distribution Table
Quiz Score (X) | f | fX
10 | 1 | 10
9 | 2 | 18
8 | 4 | 32
7 | 0 | 0
6 | 1 | 6
Total | n = Σf = 8 | ΣfX = 66
M = ?
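
The mean for this frequency table can be worked out with a short sketch. The variable names are illustrative; the scores and frequencies are the ones in the table.

```python
# (score X, frequency f) pairs taken from the frequency distribution table
table = [(10, 1), (9, 2), (8, 4), (7, 0), (6, 1)]

n = sum(f for _, f in table)            # n = Σf
sum_fx = sum(x * f for x, f in table)   # ΣfX
M = sum_fx / n                          # M = ΣfX / Σf

print(n, sum_fx, M)   # 8 66 8.25
```

Dividing by the number of rows (5) instead of Σf (8) is a common slip; the divisor must be the number of scores.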

The Weighted Mean
•Combine two sets of scores
•Three steps:
–Determine the combined sum of all the scores
–Determine the combined number of scores
–Divide the sum of scores by the total number
of scores
overall (weighted) mean: M = (ΣX₁ + ΣX₂) / (n₁ + n₂)
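
The three steps can be sketched as follows. The two samples (n₁ = 12 with M₁ = 6, and n₂ = 8 with M₂ = 7) are made-up numbers for illustration, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(sum1, n1, sum2, n2):
    """Overall mean of two combined sets: (ΣX1 + ΣX2) / (n1 + n2)."""
    return (sum1 + sum2) / (n1 + n2)

# Hypothetical samples: ΣX can be recovered from each mean as ΣX = n * M.
n1, M1 = 12, 6
n2, M2 = 8, 7
overall = weighted_mean(n1 * M1, n1, n2 * M2, n2)

print(overall)   # 128 / 20 = 6.4
```

The result (6.4) is closer to 6 than to 7 because the first sample contributes more scores; simply averaging the two means would give 6.5.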





Characteristics of the Mean
•Changing the value of any score changes the
mean.
•Introducing a new score or removing a score
usually changes the mean.
•Adding or subtracting a constant from each
score changes the mean by the same constant.
•Multiplying or dividing each score by a constant
multiplies or divides the mean by
that constant.
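
A quick numerical check of the last two characteristics, using a small made-up set of scores (the data and names are assumptions for illustration only):

```python
def mean(scores):
    """Arithmetic mean: sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

scores = [2, 4, 6, 8]                  # hypothetical scores
M = mean(scores)                       # 5.0

shifted = [x + 3 for x in scores]      # add a constant of 3 to every score
scaled = [x * 2 for x in scores]       # multiply every score by 2

print(M, mean(shifted), mean(scaled))  # 5.0 8.0 10.0
```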

Figure 1.11

1.7 The Median
•The median is the midpoint of the scores
in a distribution when they are listed in
order from smallest to largest.
•The median divides the scores into two
groups of equal size.

Figure 1.12

Figure 1.13

The Precise Median for a
Continuous Variable
•A continuous variable can be infinitely divided
•The precise median is located in the interval
defined by the real limits of the value.
•It may be necessary to determine the fraction of
the interval needed to divide the distribution
exactly in half.
•fraction = (number needed to reach 50%) / (number in the interval)
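
One way to sketch the interpolation described above, assuming scores are measured in whole units so that each score X covers the interval from X − 0.5 to X + 0.5. The function name `precise_median` and the example scores are assumptions made for illustration.

```python
from collections import Counter

def precise_median(scores, unit=1.0):
    """Interpolated median for scores measured on a continuous variable."""
    counts = Counter(scores)
    half = len(scores) / 2          # number of scores needed to reach 50%
    below = 0                       # scores that fall below the current interval
    for value in sorted(counts):
        in_interval = counts[value]
        if below + in_interval >= half:
            # fraction = number needed to reach 50% / number in the interval
            fraction = (half - below) / in_interval
            lower_real_limit = value - unit / 2
            return lower_real_limit + fraction * unit
        below += in_interval
    raise ValueError("scores must not be empty")

# Three scores lie below the interval for X = 4; one of the four 4s is needed
# to reach 50%, so the median sits 1/4 of the way into the interval 3.5 to 4.5.
print(precise_median([1, 2, 3, 4, 4, 4, 4, 6]))   # 3.75
```

When the middle of the distribution falls in a gap between two different scores, this sketch returns the upper real limit of the lower score, which can differ from the simple average of the two middle scores.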

Figure 1.14

Median, Mean, and Middle
•Mean is the balance point of a distribution
–Defined by distances
–Often is not the midpoint of the scores
•Median is the midpoint of a distribution
–Defined by number of scores
–Often is not the balance point of the scores
•Both measure central tendency, using two
different concepts of middle or “central.”

Figure 1.15

1.8 The Mode
•The mode is the score or category that has
the greatest frequency of any in the
frequency distribution
–Can be used with any scale of measurement
–Corresponds to an actual score in the data
–The only one used with nominal data
•It is possible to have more than one mode

Figure 1.16

1.9 Selecting a Measure of Central
Tendency
Measure of Central Tendency | Appropriate to choose when … | Should not be used when …
Mean | No situation precludes it | Extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale; nominal scale
Median | Extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale | Nominal scale
Mode | Nominal scales; discrete variables; describing shape | Interval or ratio data, except to accompany mean or median

Figure 1.17

Figure 1.18
Means or Medians in a Line Graph

Figure 1.19
Means or Medians in a Bar Graph

1.10 Central Tendency and the
Shape of the Distribution
•Symmetrical distributions
–Mean and median have same value
–If exactly one mode, it has same value as the
mean and the median
–Distribution may have more than one mode,
or no mode at all

Figure 1.20

Central Tendency in Skewed
Distributions
•Mean is found far toward the long tail (positive or
negative)
•Median is found toward the long tail, but not as
far as the mean
•Mode is found near the piled-up scores.
•If positively skewed, order from left to right is
mode, median, mean;
•If negatively skewed, order from left to right is
mean, median, mode
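
The ordering for a positively skewed distribution can be checked on a small example. The scores below are invented so that most pile up at low values with a long tail to the right; they are not from the slides.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: piled up near 2-3, long tail toward 12
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]

m_mode = mode(scores)       # most frequent score
m_median = median(scores)   # midpoint of the ordered scores
m_mean = mean(scores)       # balance point, pulled toward the tail

print(m_mode, m_median, m_mean)   # 2 3.0 4.3
```

Mirroring the scores around a fixed point would produce a negatively skewed set and reverse the order to mean, median, mode.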

Figure 1.21

Variability
PowerPoint Lecture Slides
Essentials of Statistics for the Behavioral
Sciences
Seventh Edition
by Frederick J Gravetter and Larry B. Wallnau

1.11 Overview
•Variability can be defined several ways
–A quantitative measure of the differences
between scores
–Describes the degree to which the scores are
spread out or clustered together
•Purposes of Measure of Variability
–Describe the distribution
–Measure how well an individual score
represents the distribution

Figure 1.22
Population Distributions: Height, Weight

Three Measures of Variability
•The Range
•The Standard Deviation
•The Variance

1.12 The Range
•The distance covered by the scores in a
distribution
–From smallest value to highest value
•For continuous data, real limits are used
•For discrete variables, the range is the number of
categories
range = URL for Xmax − LRL for Xmin
(URL = upper real limit; LRL = lower real limit)
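
A minimal sketch of the range computed from real limits, assuming scores are measured in whole units. The function name `score_range` and the data are hypothetical.

```python
def score_range(scores, unit=1.0):
    """Range = upper real limit of Xmax minus lower real limit of Xmin."""
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit

scores = [3, 7, 12, 8, 5]      # hypothetical scores
print(score_range(scores))     # 12.5 - 2.5 = 10.0
```

With whole-number scores this equals Xmax − Xmin + 1, one more than the simple difference between the largest and smallest score.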

1.13 Standard Deviation and
Variance for a Population
•Most common and most important measure
of variability
–A measure of the standard, or average, distance from
the mean
–Describes whether the scores are clustered closely
around the mean or are widely scattered
•Calculation differs for population and samples

Developing the Standard Deviation
•Step One: Determine the Deviation Score (distance
from the mean) for each score:
Deviation score = X − μ
•Step Two: Calculate Mean (Average) of Deviations
–Deviations sum to 0 because the mean is the balance point of the
distribution
–The Mean (Average) Deviation will always equal 0;
another method must be found

Developing the Standard Deviation (2)
•Step Three: Get rid of negatives in
Deviations:
–Square each deviation score
–Using the squared values, compute the Mean
Squared Deviation, known as the Variance

•Variability is now measured in squared
units and is called the Variance.
•Population variance equals the mean squared
deviation: the average squared
distance from the mean

Developing the Standard Deviation (2)
•Step Four:
–Variance measures the average squared
distance from the mean, which is not quite the goal
•Correct for having squared all the
deviations by taking the square root of the
variance
Standard Deviation = √Variance
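
The four steps can be traced on a small population. The five scores below are an assumed example chosen so the mean is a whole number; the variable names are illustrative.

```python
import math

population = [1, 9, 5, 8, 7]                 # hypothetical population, N = 5
N = len(population)
mu = sum(population) / N                     # population mean = 6.0

# Step one: deviation score for each X
deviations = [x - mu for x in population]    # [-5.0, 3.0, -1.0, 2.0, 1.0]

# Step two: the deviations always sum to 0, so their mean is useless
assert sum(deviations) == 0

# Step three: square each deviation, then average -> variance
squared = [d ** 2 for d in deviations]
variance = sum(squared) / N                  # 40 / 5 = 8.0

# Step four: take the square root to undo the squaring -> standard deviation
std_dev = math.sqrt(variance)

print(mu, variance, round(std_dev, 3))       # 6.0 8.0 2.828
```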

Figure 1.23
Calculation of the Variance

Formulas for Population
Variance and Standard Deviation

Variance = (sum of squared deviations) / (number of scores)
•SS (sum of squares) is the sum of the
squared deviations of scores from the
mean
•Two equations for computing SS

Two formulas for SS
Definitional Formula: SS = Σ(X − μ)²
•Find each deviation
score (X − μ)
•Square each deviation
score, (X − μ)²
•Sum up the squared
deviations
Computational Formula: SS = ΣX² − (ΣX)² / N
•Square each score and
sum the squared scores
•Find the sum of scores,
square it, divide by N
•Subtract the second
part from the first
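
Both formulas give the same SS, which a short sketch can confirm. The four scores are an assumed example, and the function names are hypothetical.

```python
def ss_definitional(scores):
    """SS = Σ(X − μ)²: sum of squared deviations from the mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)

def ss_computational(scores):
    """SS = ΣX² − (ΣX)² / N: uses the raw scores only, no deviations."""
    N = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return sum_x_squared - sum_x ** 2 / N

scores = [1, 0, 6, 1]                 # hypothetical population, μ = 2
print(ss_definitional(scores))        # 1 + 4 + 16 + 1 = 22.0
print(ss_computational(scores))       # 38 − 64 / 4 = 22.0
```

The computational formula is convenient when the mean is not a whole number, because it avoids squaring many fractional deviations by hand.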


Population Variance: Formula
and Notation
Formula
variance = SS / N
standard deviation = √(SS / N)
Notation
•Lowercase Greek letter
sigma is used to denote
the standard deviation of
a population: σ
•Because the standard
deviation is the square
root of the variance, we
write the variance of a
population as σ²

Figure 1.24
Graphic Representation of Mean and Standard Deviation

1.14 Standard Deviation and
Variance for a Sample
•Goal of inferential statistics:
–Draw general conclusions about population
–Based on limited information from a sample
•Samples differ from the population
–Samples have less variability
–Computing the Variance and Standard
Deviation in the same way as for a population
would give a biased estimate of the
population values

Figure 1.25
Population of Adult Heights

Variance and Standard Deviation
for a Sample
•Sum of Squares (SS) is computed as
before
•Formula has n − 1 rather than N in the
denominator
•Notation uses s instead of σ
variance of sample: s² = SS / (n − 1)
standard deviation of sample: s = √(SS / (n − 1))
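
A minimal sketch of the sample formulas, checked against Python's `statistics` module (`variance` and `stdev` divide by n − 1; `pvariance` divides by N). The sample values are assumed for illustration.

```python
import math
import statistics

sample = [1, 6, 4, 3, 8, 7, 6]          # hypothetical sample, n = 7
n = len(sample)
M = sum(sample) / n                     # sample mean = 5.0
SS = sum((x - M) ** 2 for x in sample)  # sum of squares = 36.0

df = n - 1                              # degrees of freedom
s2 = SS / df                            # sample variance = 6.0
s = math.sqrt(s2)                       # sample standard deviation

biased = SS / n                         # dividing by n gives a smaller value

print(s2, round(s, 3), round(biased, 3))   # 6.0 2.449 5.143
```

Dividing by n gives about 5.14 here, smaller than 6.0; the n − 1 denominator is what corrects for the tendency of samples to be less variable than the population.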

Degrees of Freedom
•Population variance
–Mean is known
–Deviations are computed from a known mean
•Sample variance as estimate of population
–Population mean is unknown
–Using sample mean restricts variability
•Degrees of freedom
–Number of scores in sample that are
independent and free to vary
–Degrees of freedom (df) = n − 1

1.15 More about Variance and
Standard Deviation
•Unbiased estimate of a population
parameter
–Average value of statistic is equal to parameter
–Average value uses all possible samples of a
particular size n
•Biased estimate of a population parameter
–Systematically overestimates or
underestimates (as with variance) the
population parameter

Table 4.1 Biased & Unbiased Estimates
Sample | 1st Score | 2nd Score | Mean | Biased variance (used n) | Unbiased variance (used n − 1)
1 | 0 | 0 | 0.00 | 0.00 | 0.00
2 | 0 | 3 | 1.50 | 2.25 | 4.50
3 | 0 | 9 | 4.50 | 20.25 | 40.50
4 | 3 | 0 | 1.50 | 2.25 | 4.50
5 | 3 | 3 | 3.00 | 0.00 | 0.00
6 | 3 | 9 | 6.00 | 9.00 | 18.00
7 | 9 | 0 | 4.50 | 20.25 | 40.50
8 | 9 | 3 | 6.00 | 9.00 | 18.00
9 | 9 | 9 | 9.00 | 0.00 | 0.00
Totals | | | 36.00 | 63.00 / 9 = 7.00 | 126.00 / 9 = 14.00
Actual σ² = 14
This is an adaptation of Table 4.1
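
The table can be reproduced by enumerating every possible sample of n = 2. This sketch assumes a population made up of the three equally likely values 0, 3, and 9, which is consistent with the nine samples listed and with σ² = 14; the helper name `sample_stats` is hypothetical.

```python
from itertools import product
from statistics import mean

population = [0, 3, 9]     # assumed population values (equally likely)
mu = mean(population)      # 4
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)   # 14.0

def sample_stats(sample):
    """Return (mean, biased variance using n, unbiased variance using n - 1)."""
    n = len(sample)
    m = mean(sample)
    ss = sum((x - m) ** 2 for x in sample)
    return m, ss / n, ss / (n - 1)

# All 9 ordered samples of size 2, drawn with replacement
rows = [sample_stats(s) for s in product(population, repeat=2)]

avg_mean = mean(r[0] for r in rows)       # 36 / 9 = 4, equals mu
avg_biased = mean(r[1] for r in rows)     # 63 / 9 = 7, underestimates sigma2
avg_unbiased = mean(r[2] for r in rows)   # 126 / 9 = 14, equals sigma2

print(avg_mean, avg_biased, avg_unbiased, sigma2)
```

Averaged over all possible samples, the n − 1 formula recovers the population variance exactly, while the formula that divides by n comes out at half of it in this example.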

Figure 1.26
Sample of n = 20, M = 36, and s = 4

Transformations of Scale
•Adding a constant to each score
–The Mean is changed
–The standard deviation is unchanged
•Multiplying each score by a constant
–The Mean is changed
–Standard Deviation is also changed
–The Standard Deviation is multiplied by
that constant
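
A quick check of both transformations using `statistics.pstdev` (population standard deviation). The scores and the constants 10 and 3 are assumptions chosen for illustration.

```python
import math
from statistics import mean, pstdev

scores = [1, 9, 5, 8, 7]                 # hypothetical scores
shifted = [x + 10 for x in scores]       # add a constant to each score
scaled = [x * 3 for x in scores]         # multiply each score by a constant

# Adding a constant moves the mean but leaves the spread alone.
print(mean(scores), mean(shifted))       # 6 16
print(math.isclose(pstdev(shifted), pstdev(scores)))       # True

# Multiplying by a constant multiplies both the mean and the spread.
print(mean(scaled))                      # 18
print(math.isclose(pstdev(scaled), 3 * pstdev(scores)))    # True
```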

Variance and Inferential
Statistics
•Goal of inferential statistics: To detect
meaningful and significant patterns in
research results
•Variability in the data influences how easy it
is to see patterns
–High variability obscures patterns that would
be visible in low variability samples
–Variability is sometimes called error variance

Figure 1.27
Experiments with high and low variability