Standard Deviation
•In statistics, the standard deviation is a measure of the amount of
variation of a random variable expected about its mean.
•A low standard deviation indicates that the values tend to be close to
the mean of the set, while a high standard deviation indicates that
the values are spread out over a wider range.
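The contrast between a low and a high standard deviation can be sketched in Python. The two data sets below are made up for illustration (they are not from the slides); both have a mean of 50, and the sketch assumes each is a whole population, so it uses the population standard deviation `statistics.pstdev`.

```python
import statistics

# Two hypothetical data sets with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumed)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

The `tight` values cluster near the mean and give a small SD, while the `wide` values give a much larger one.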
Raw Scores
•The definition of a raw score in statistics is an unaltered
measurement.
•Raw scores have not been weighted, manipulated, calculated,
transformed, or converted. An entire data set that has been unaltered
is a raw data set.
Z-Score
•A Z-score is a statistical measure that quantifies the distance between a
data point and the mean of a dataset.
•It's expressed in terms of standard deviations. It indicates how many
standard deviations a data point is from the mean of the distribution.
Z-Score
•For a recent final exam in STAT 500, the mean was 68.55 with
a standard deviation of 15.45.
•If you scored an 80%: $z = \frac{80 - 68.55}{15.45} = 0.74$, which
means your score of 80 was 0.74 SD above the mean.
•If you scored a 60%: $z = \frac{60 - 68.55}{15.45} = -0.55$, which
means your score of 60 was 0.55 SD below the mean.
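The two exam calculations can be reproduced with a short Python sketch. The helper name `z_score` is introduced here for illustration only; the mean and standard deviation are the values quoted for the STAT 500 exam.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

exam_mean, exam_sd = 68.55, 15.45  # values quoted for the STAT 500 exam

z_80 = z_score(80, exam_mean, exam_sd)
z_60 = z_score(60, exam_mean, exam_sd)
print(f"z for a score of 80: {z_80:.2f}")
print(f"z for a score of 60: {z_60:.2f}")
```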
Z-Score
•Z-scores can be positive or negative.
•For data that are symmetric (i.e. bell-shaped) or nearly symmetric, a
common use of Z-scores is to flag as potential outliers any values
whose Z-scores lie beyond ±3.
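A minimal sketch of the ±3 screening rule follows. The scores are hypothetical (tightly clustered around 50, with one extreme value), and the use of the population standard deviation is an assumption of this sketch.

```python
import statistics

# Hypothetical scores clustered around 50, plus one extreme value (95).
scores = [50, 51, 49, 50, 52, 48, 50, 51, 49, 50,
          50, 51, 49, 50, 52, 48, 50, 51, 49, 95]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population SD; an assumption of this sketch

# Flag every value whose z-score lies beyond +/-3.
outliers = [x for x in scores if abs((x - mean) / sd) > 3]
print(outliers)
```

Only the value 95 is flagged; every other score lies well within one standard deviation of the mean.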
Using z-scores to standardise a distribution
•Every X value in a distribution can be transformed into a
corresponding z-score
•Any normal distribution can be standardized by converting its values
into z scores.
•Z scores tell you how many standard deviations from the mean each
value lies.
•Converting a normal distribution into a z-distribution allows you to
calculate the probability of certain values occurring and to compare
different data sets
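Standardising can be sketched as follows: every X value is converted to a z-score, and the resulting z-scores have a mean of 0 and a standard deviation of 1. The X values are hypothetical, and the population standard deviation is assumed.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical X values
mean = statistics.mean(values)
sd = statistics.pstdev(values)  # population SD (assumed)

# Transform every X value into its corresponding z-score.
z_scores = [(x - mean) / sd for x in values]
print(z_scores)

# After standardising, the mean is 0 and the SD is 1.
print(statistics.mean(z_scores), statistics.pstdev(z_scores))
```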
Using z-scores to make comparisons
•We can compare performance [values] in two different distributions,
based on their z-scores.
•A z-score closer to zero means the value is closer to the mean, while a
z-score of larger magnitude means the value is farther away.
•A positive z-score means the value lies to the right of (is greater than)
the mean, while a negative z-score means it is smaller than the mean.
Using z-scores to make comparisons
•Jared scored a 92 on a test with a mean of 88 and a standard
deviation of 2.7. Jasper scored an 86 on a test with a mean
of 82 and a standard deviation of 1.8. Find the Z-scores for
Jared's and Jasper's test scores, and use them to determine
who did better on their test relative to their class.
Using z-scores to make comparisons
•Step 1: Compute each test score's Z-score using the mean
and standard deviation for that test.
•For Jared's test, the Z-score is:
$z = \frac{X - \mu}{\sigma} = \frac{92 - 88}{2.7} = \frac{4}{2.7} = 1.48$
•For Jasper's test, the Z-score is:
$z = \frac{X - \mu}{\sigma} = \frac{86 - 82}{1.8} = \frac{4}{1.8} = 2.22$
Using z-scores to make comparisons
•Step 2: Use Z-scores to compare across data sets.
•Jared's Z-score of 1.48 says that his score of 92 was between
1 and 2 standard deviations above the mean. Jasper's Z-score
of 2.22 says that his score of 86 was a bit more than 2
standard deviations above the mean. So, Jasper's score of 86
was relatively higher for his class than Jared's 92 was for his
class.
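The Jared and Jasper comparison can be checked with a few lines of Python. The helper `z_score` and the variable names are illustrative, not part of the original slides.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

jared_z = z_score(92, 88, 2.7)   # Jared: score 92, class mean 88, SD 2.7
jasper_z = z_score(86, 82, 1.8)  # Jasper: score 86, class mean 82, SD 1.8

print(f"Jared:  z = {jared_z:.2f}")
print(f"Jasper: z = {jasper_z:.2f}")

# The larger z-score marks the better result relative to the class.
better = "Jasper" if jasper_z > jared_z else "Jared"
print(f"{better} did better relative to his class.")
```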
Probability
•Probability is simply how likely something is to happen.
•Whenever we're unsure about the outcome of an event, we
can talk about the probabilities of certain outcomes—how
likely they are.
•The analysis of events governed by probability is called
statistics.
What are Equally Likely Events?
•When events have the same theoretical probability of happening, they are
called equally likely events. The outcomes in a sample space are called
equally likely if all of them have the same probability of occurring.
For example, if you throw a die, the probability of getting a 1 is 1/6.
Similarly, the probability of getting each of the numbers 2, 3, 4, 5 and 6
is also 1/6. Hence, the following are some examples of equally likely
events when throwing a die:
•Getting a 3 and getting a 5
•Getting an even number and getting an odd number
•Getting a 1, getting a 2 and getting a 3
Within each example, the events are equally likely because their probabilities are equal.
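The die examples can be verified by counting outcomes in the sample space. This sketch assumes a fair six-sided die; the helper `probability` is a name introduced here for illustration.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]  # outcomes of one roll of a fair die

def probability(event):
    """P(event) = favourable outcomes / total outcomes (fair die assumed)."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_three = probability(lambda o: o == 3)
p_five = probability(lambda o: o == 5)
p_even = probability(lambda o: o % 2 == 0)
p_odd = probability(lambda o: o % 2 == 1)

print(p_three, p_five)  # equal, so the two events are equally likely
print(p_even, p_odd)    # equal, so the two events are equally likely
```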
Random sampling
Simple random sample
•Each member of the population has an equal chance of being
selected
Independent random sample
•Each member of the population has an equal chance of being
selected
AND
•The probability of being selected stays constant from one selection
to the next [if more than one individual is selected]
•i.e. Sampling with replacement
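The difference that replacement makes can be sketched with exact fractions. The population size of 52 is a hypothetical choice (for instance, a deck of cards); the sketch tracks the chance of selecting any one particular remaining individual on three successive selections.

```python
from fractions import Fraction

population_size = 52  # hypothetical population, e.g. a deck of cards

# With replacement: the selected individual is returned before the next
# selection, so the probability stays constant.
with_replacement = [Fraction(1, population_size) for _ in range(3)]

# Without replacement: the pool shrinks after each selection, so the
# probability changes from one selection to the next.
without_replacement = [Fraction(1, population_size - k) for k in range(3)]

print(with_replacement)
print(without_replacement)
```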
Independent Random Sampling
•Probability of event A = $\frac{\text{number of outcomes classified as A}}{\text{total number of possible outcomes}}$
Probability and Frequency distributions
•Probability usually involves a population of scores displayed in a
frequency distribution graph.
•What is the probability of obtaining an individual score of less than 3?
[i.e. either 1 or 2?]
N = 20
Probability and the normal distribution
•In any normal distribution the percentage of values that lie within a
specified number of standard deviations from the mean is the same
Graphing Probability …
68–95–99.7% Rule of Thumb revisited
•One standard deviation either side of the mean captures:
•Approx. 68% of our data
•Mathematically: 68.26%
•Two standard deviations either side of the mean captures:
•Approx. 95% of our data
•Mathematically: 95.44%
•Three standard deviations either side of the mean captures:
•Approx. 99.7% of our data
•Mathematically: 99.73%
68%–95%–99.7% Rule of Thumb revisited
68.26%–95.44%–99.73% Maths calculation
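These percentages can be checked against the standard normal distribution using `statistics.NormalDist` from the Python standard library. Computed directly, the first two come out as 68.27% and 95.45%; the figures 68.26% and 95.44% arise when unit normal table entries rounded to four decimal places are doubled.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

within = {}
for k in (1, 2, 3):
    # Proportion of values lying within k standard deviations of the mean.
    within[k] = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} SD: {within[k]:.2%}")
```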
Probability
What is the probability that a randomly selected data value in a normal distribution
lies more than 1 standard deviation below the mean?
p(z < -1.00)
What is the probability that a randomly selected data value in a normal distribution
lies more than 1 standard deviation above the mean?
p(z > 1.00)
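Both tail probabilities can be computed from the standard normal cumulative distribution function. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

p_below = std_normal.cdf(-1.00)      # p(z < -1.00)
p_above = 1 - std_normal.cdf(1.00)   # p(z > 1.00)

print(f"p(z < -1.00) = {p_below:.4f}")
print(f"p(z > 1.00)  = {p_above:.4f}")
```

Because the normal distribution is symmetric, the two tails are equal, each about 0.1587.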
Calculating probability in a normal distribution
•When calculating a probability, first standardise the distribution by
converting the value of interest to a z-score:
$z = \frac{X - \mu}{\sigma}$
If scores on a test were normally distributed with:
•a mean of $\mu = 60$ and a standard deviation of $\sigma = 12$,
•what is the probability [of a randomly selected person who took the
test] of a score greater than 84?
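One way to work this question is to standardise the score of 84 and then take the upper-tail area of the standard normal distribution. The sketch below uses `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 60, 12  # test mean and standard deviation from the question

z = (84 - mu) / sigma                # standardise the score of 84
p_greater = 1 - NormalDist().cdf(z)  # area above z in the standard normal

print(f"z = {z:.2f}")
print(f"p(X > 84) = {p_greater:.4f}")
```

A score of 84 lies exactly 2 standard deviations above the mean, so the probability is roughly 0.0228.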
Probability using Unit Normal Table
•Quite often the values we are interested in are not exactly 1, 2 or 3
standard deviations away from the mean. Statistical tables [or online
probability calculators] can be used to calculate the probability
Probability using Unit Normal Table
The body always corresponds to the larger part of the distribution
•can be located on the left or the right of the distributions
The tail always corresponds to the smaller part of the distribution
•again, can be located on the left or the right of the distributions
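The body and tail idea can be sketched as a small function: a z-score splits the distribution into two parts, and the larger part is the body whichever side it falls on. The function name `body_and_tail` and the z values are hypothetical.

```python
from statistics import NormalDist

def body_and_tail(z):
    """Split the standard normal at z and return (body, tail) proportions."""
    left = NormalDist().cdf(z)  # area to the left of z
    right = 1 - left            # area to the right of z
    return max(left, right), min(left, right)

body, tail = body_and_tail(0.25)           # body is on the left here
body_neg, tail_neg = body_and_tail(-0.25)  # body is on the right here

print(f"z = +0.25: body = {body:.4f}, tail = {tail:.4f}")
print(f"z = -0.25: body = {body_neg:.4f}, tail = {tail_neg:.4f}")
```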
Probability using Unit Normal Table
Example
Information from the Department of Motor Vehicles indicates that the
average age of licensed drivers is $\mu = 45.7$ years with a standard
deviation of $\sigma = 12.5$ years. Assuming that the distribution of drivers'
ages is approximately normal,
1. What proportion of licensed drivers are older than 50 years old?
$z = \frac{X - \mu}{\sigma} = \frac{50 - 45.7}{12.5} = \frac{4.3}{12.5} = 0.34$
2. What proportion of licensed drivers are younger than 30 years old?
$z = \frac{X - \mu}{\sigma} = \frac{30 - 45.7}{12.5} = \frac{-15.7}{12.5} = -1.26$ [so, 30 is 1.26 SDs below the mean]
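The proportions for the two driver-age questions can be found by looking up each z-score. The sketch below uses `statistics.NormalDist` in place of a printed unit normal table and rounds each z-score to two decimal places, as a table lookup would.

```python
from statistics import NormalDist

mu, sigma = 45.7, 12.5  # mean and SD of licensed drivers' ages

z_50 = round((50 - mu) / sigma, 2)  # rounded as for a table lookup
z_30 = round((30 - mu) / sigma, 2)

p_older_50 = 1 - NormalDist().cdf(z_50)  # proportion older than 50
p_younger_30 = NormalDist().cdf(z_30)    # proportion younger than 30

print(f"z = {z_50}: proportion older than 50 = {p_older_50:.4f}")
print(f"z = {z_30}: proportion younger than 30 = {p_younger_30:.4f}")
```

Under these assumptions, roughly 37% of licensed drivers are older than 50 and roughly 10% are younger than 30.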
Examples
•The length of a human pregnancy is normally distributed with a mean
of 272 days with a standard deviation of 9 days.
1. State the random variable.
2. Find the probability of a pregnancy lasting more than 280 days.
3. Find the probability of a pregnancy lasting less than 250 days.
4. Find the probability that a pregnancy lasts between 265 and 280
days.
5. Suppose you meet a woman who says that she was pregnant for
less than 250 days. Would this be unusual and what might you
think?
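Questions 2 to 4 can be worked numerically by modelling pregnancy length as a normal distribution with the stated mean and standard deviation. This sketch uses `statistics.NormalDist` without rounding z-scores, so results may differ slightly in the last digit from a table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

# Random variable: length of a pregnancy in days, modelled as normal.
pregnancy = NormalDist(mu=272, sigma=9)

p_more_280 = 1 - pregnancy.cdf(280)                  # question 2
p_less_250 = pregnancy.cdf(250)                      # question 3
p_between = pregnancy.cdf(280) - pregnancy.cdf(265)  # question 4

print(f"P(X > 280)       = {p_more_280:.4f}")
print(f"P(X < 250)       = {p_less_250:.4f}")
print(f"P(265 < X < 280) = {p_between:.4f}")
```

The probability for question 3 comes out below 1%, which bears on question 5: a pregnancy shorter than 250 days would be rare under this model.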