edu.Chapter-3.1-Statistic-Refresher-1.pptx

lmareno 49 views 100 slides Aug 19, 2024

About This Presentation

Psychological Statistics Refresher


Slide Content

Chapter 3 A statistics refresher

STATISTICS is the science of gathering, analyzing, interpreting, and presenting data. Test scores are frequently expressed as numbers, and statistical tools are used to describe, make inferences from, and draw conclusions about numbers. In this statistics refresher, we cover scales of measurement, tabular and graphic presentations of data, measures of central tendency, measures of variability, aspects of the normal curve, and standard scores.

Scales of Measurement

Measurement is the act of assigning numbers or symbols to characteristics of things (people, events, whatever) according to rules. The rules used in assigning numbers are guidelines for representing the magnitude (or some other characteristic) of the object being measured. A scale is a set of numbers (or other symbols) whose properties model empirical properties of the objects to which the numbers are assigned.

Types of Scale. DISCRETE SCALE: contains numeric data that have a finite number of possible values and can only be whole numbers. Examples are counts of things that are fixed, such as the number of chairs, a score on a test, or the number of children in a school. You can count whole individuals; you can't count 1.5 kids. Discrete data consist of ordinal or nominal data.

CONTINUOUS SCALE: exists when it is theoretically possible to divide any of the values of the scale. Examples are variables, that is, anything that varies or changes, such as age or level of stress. You can measure your height at very precise scales: meters, centimeters, millimeters, and so on. Continuous scales consist of ratio and interval data.

FOUR TYPES OF SCALES OF MEASUREMENT. A mnemonic: the French word for black is noir (pronounced "nwar"). N = nominal, O = ordinal, I = interval, R = ratio. Stanley Smith Stevens developed these four scales of measurement in 1946. Analysts continue to use them today because how you record your data affects what you can learn from them and the statistical analyses you can perform.

Nominal Scale

NOMINAL scales are the simplest form of measurement. These scales involve classification or categorization based on one or more distinguishing characteristics, where all things measured must be placed into mutually exclusive and exhaustive categories. The nominal scale is the lowest level of measurement because you can only group the observations, and you cannot order the groups. Ex. gender (male/female), hair color, nationalities, names of people, coding labels, and so on.

Operations on nominal data include counting, for the purpose of determining how many cases fall into each category, and a resulting determination of proportions or percentages.
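The counting and percentage steps described above can be sketched in a few lines of Python. The hair-color data below are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: hair color recorded for ten people.
hair_colors = ["black", "brown", "black", "blond", "brown",
               "black", "red", "black", "brown", "black"]

# Count how many cases fall into each category.
counts = Counter(hair_colors)
total = len(hair_colors)

# Convert each count into a percentage of all cases.
percentages = {category: 100 * n / total for category, n in counts.items()}

for category, n in counts.most_common():
    print(f"{category}: {n} cases ({percentages[category]:.0f}%)")
```

Note that the categories are only counted, never ordered or averaged, which is all a nominal scale supports.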

Ordinal Scale

ORDINAL: Like nominal scales, ordinal scales permit classification. However, in addition to classification, rank ordering on some characteristic is also permissible with ordinal scales. In business and organizational settings, job applicants may be rank-ordered according to their desirability for a position. Ordinal scales have no absolute zero point. In the case of a test of job performance ability, every testtaker, regardless of standing on the test, is presumed to have some ability. No testtaker is presumed to have zero ability. Because there is no zero point on an ordinal scale, the ways in which data from such scales can be analyzed statistically are limited. One cannot average the qualifications of the

Alfred Binet, a developer of the intelligence test that today bears his name, believed strongly that the data derived from an intelligence test are ordinal in nature. He emphasized that what he tried to do with his test was not to measure people (as one might measure a person's height), but merely to classify (and rank) people on the basis of their performance on the tasks.

The ordinal level of measurement is the one most frequently used in psychology. According to Kerlinger, intelligence, aptitude, and personality test scores are ordinal.

Interval Scale

INTERVAL scales contain equal intervals between numbers. Each unit on the scale is exactly equal to any other unit on the scale. Like ordinal scales, interval scales contain no absolute zero point. With interval scales, we have reached a level of measurement at which it is possible to average a set of measurements and obtain a meaningful result. Attributes of a variable are differentiated by the degree of difference between them, but there is no absolute zero, and the ratio between the attributes is unknown. Due to this lack of a true zero, measurement ratios are not valid for interval scales. Examples of interval scales include temperature in Celsius and Fahrenheit, credit scores (300-850), and SAT scores (200-800).

Interval data always take numerical values where the distance between two points on the scale is standardised and equal. Arithmetic operations can be performed on interval data, but these operations are limited to addition and subtraction. A simple example of interval data: the difference between 100 degrees Fahrenheit and 90 degrees Fahrenheit is the same as the difference between 70 degrees Fahrenheit and 60 degrees Fahrenheit. The Celsius and Fahrenheit temperature scales are a very good example: each notch on the thermometer directly follows the previous one, and each is the same distance apart.

Ratio Scale

RATIO: A ratio scale has a true zero point. All mathematical operations can meaningfully be performed because there exist equal intervals between the numbers on the scale as well as a true or absolute zero point. It is the true zero point in ratio scales that allows us to multiply and divide as well as add. Examples of ratio scales in psychological research: height, weight, volume, Kelvin temperature.

Additionally, ratio variables have zero measurements that represent a lack of the attribute. For example, zero kilograms indicates a lack of weight. Consequently, measurement ratios are valid for these scales: 30 kg is three times the weight of 10 kg. You can add, subtract, multiply, and divide values on a ratio scale.
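A short sketch can illustrate why ratios are meaningful on a ratio scale but not on an interval scale. It compares the same two temperatures in Celsius (interval) and Kelvin (ratio). The two temperature values are hypothetical, and the helper name `celsius_to_kelvin` is an illustrative choice.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (Kelvin has a true zero)."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0  # hypothetical readings in degrees Celsius

# Differences agree on both scales, so subtraction is meaningful on each.
diff_celsius = warm_c - cool_c
diff_kelvin = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios disagree: "twice as hot" in Celsius is an artifact of the arbitrary zero.
ratio_celsius = warm_c / cool_c
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_celsius:.2f} C vs {diff_kelvin:.2f} K")
print(f"ratio: {ratio_celsius:.3f} (Celsius) vs {ratio_kelvin:.3f} (Kelvin)")
```

The differences match (10 degrees either way), while the Celsius ratio of 2 shrinks to roughly 1.035 once the readings are expressed on a scale with a true zero.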

Nominal: attributes of a variable are differentiated only by name (category) and there is no order (rank, position).
Ordinal: attributes of a variable are differentiated by order (rank, position), but we do not know the relative degree of difference between them.
Interval: attributes of a variable are differentiated by the degree of difference between them, but there is no absolute zero, and the ratio between the attributes is unknown.
Ratio: attributes of a variable are differentiated by the degree of difference between them, there is absolute zero, and we could find the ratio between the attributes.

Describing Data

You have just administered an examination that consists of 100 multiple-choice items (where 1 point is awarded for each correct answer). The distribution of scores for the 25 students enrolled in your class could theoretically range from 0 (none correct) to 100 (all correct). The scores from the 25 students in this distribution are referred to as raw scores. A raw score is a straightforward, unmodified accounting of performance that is usually numerical. A raw score may reflect a simple tally, as in the number of items responded to correctly on an achievement test.

Frequency Distribution

The data from the test could be organized into a distribution of the raw scores. One way the scores could be distributed is by the frequency with which they occur. In a frequency distribution, all scores are listed alongside the number of times each score occurred. It is often referred to as a simple frequency distribution to indicate that individual scores have been used and the data have not been grouped.
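A simple frequency distribution can be sketched as follows. The ten raw scores are hypothetical and are not the class data referred to in the slides.

```python
from collections import Counter

# Hypothetical raw scores on a 100-item test.
raw_scores = [96, 94, 92, 92, 87, 85, 85, 85, 78, 72]

# Each distinct score is paired with the number of times it occurred.
frequency = Counter(raw_scores)

print("score  f")
for score in sorted(frequency, reverse=True):
    print(f"{score:>5}  {frequency[score]}")
```

Every individual score keeps its own row, so no detail is lost; the cost is a long table when there are many distinct scores.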

Group Frequency Distribution

In a grouped frequency distribution, test-score intervals, also called class intervals, replace the actual test scores. The number of class intervals used and the size or width of each class interval (that is, the range of test scores contained in each class interval) are for the test user to decide. But how? In most instances, a decision about the size of a class interval in a grouped frequency distribution is made on the basis of convenience. Of course, virtually any decision will represent a trade-off of sorts. A convenient, easy-to-read summary of the data is the trade-off for the loss of detail.
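Grouping into class intervals can be sketched like this. The scores and the interval width of 10 are hypothetical choices made for convenience, and `grouped_frequency` is an illustrative helper name.

```python
from collections import Counter

def grouped_frequency(scores, width):
    """Count scores per class interval of the given width.

    Intervals start at multiples of `width`, e.g. 70-79, 80-89 for width 10.
    """
    counts = Counter((score // width) * width for score in scores)
    return {(low, low + width - 1): counts[low]
            for low in sorted(counts, reverse=True)}

# Hypothetical raw scores on a 100-item test.
raw_scores = [96, 94, 92, 92, 87, 85, 85, 85, 78, 72]
grouped = grouped_frequency(raw_scores, width=10)

for (low, high), f in grouped.items():
    print(f"{low}-{high}: {f}")
```

Ten rows of raw scores collapse into three intervals, which is easier to read but no longer shows which exact scores occurred.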

[Tables: a simple frequency distribution and a grouped frequency distribution of the range of test scores]

Measures of Central Tendency

Formulas to remember:
Summation notation: $\Sigma$, called sigma, means "the sum of."
Arithmetic mean ("X bar"): $\bar{X} = \frac{\Sigma X}{n}$
Arithmetic mean from a frequency distribution: $\bar{X} = \frac{\Sigma (fX)}{n}$

A measure of central tendency is a statistic that indicates the average or midmost score between the extreme scores in a distribution. The center of a distribution can be defined in different ways: the mean, the median, and the mode.

Mean

MEAN: The arithmetic mean (or, more simply, mean) is referred to in everyday language as the "average." The mean takes into account the actual numerical value of every score. It is written in the standard statistical shorthand called summation notation. In special instances, such as when there are only a few scores and one or two of the scores are extreme in relation to the remaining ones, a measure of central tendency other than the mean may be desirable. The arithmetic mean is typically the most appropriate measure of central tendency for interval or ratio data when the distributions are believed to be approximately normal.

Ex. 2 + 4 + 6 + 7 + 10 + 9 = 38; 38 / 6 = 6.33 (mean). Exercise: find the mean of 5, 6, 10, 7, 8, 6.
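The two mean formulas can be sketched in Python. The first call uses the six scores from the example (sum 38, divided by 6); the frequency table passed to the second function is hypothetical, and both function names are illustrative.

```python
def mean(scores):
    """Arithmetic mean: the sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)

def mean_from_frequencies(frequency):
    """Mean from a frequency distribution: sum of f * X divided by n."""
    n = sum(frequency.values())
    return sum(score * f for score, f in frequency.items()) / n

scores = [2, 4, 6, 7, 10, 9]
print(round(mean(scores), 2))  # 38 / 6

# Hypothetical frequency distribution: score -> number of occurrences.
frequency = {70: 2, 80: 5, 90: 3}
print(mean_from_frequencies(frequency))  # (140 + 400 + 270) / 10
```

Both functions give the same answer for the same data; the frequency version just avoids listing repeated scores one by one.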

MEAN: The distribution can be grouped or ungrouped. Can be used with both discrete and continuous data. Susceptible to the influence of outliers, that is, data values that are markedly different from the rest of the data items. Affected by extreme values.

Median

MEDIAN: The median, defined as the middle score in a distribution, is another commonly used measure of central tendency. We determine the median of a distribution of scores by ordering the scores in a list by magnitude, in either ascending or descending order. If the total number of scores ordered is an odd number, then the median will be the score that is exactly in the middle, with one-half of the remaining scores lying above it and the other half of the remaining scores lying below it. Ex. 4, 7, 9, 13, 15, 19, 20: the median is 13.

When the total number of scores ordered is an even number, the median can be calculated by determining the arithmetic mean of the two middle scores. SET B: 53 + 52 = 105; 105 / 2 = 52.5. The median is 52.5.
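Both cases, odd and even, can be sketched in one function. The odd-length list is the example given above; the even-length list is hypothetical, chosen so that its two middle scores are 52 and 53.

```python
def median(scores):
    """Middle score of the ordered list; mean of the two middle scores if even."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_set = [4, 7, 9, 13, 15, 19, 20]
even_set = [48, 50, 52, 53, 57, 60]  # hypothetical scores

print(median(odd_set))
print(median(even_set))
```

Sorting first is essential: the middle position only identifies the median once the scores are in order of magnitude.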

Median: the middle score in a distribution, found by ordering the scores in a list by magnitude, in ascending or descending order. Less affected by outliers and skewed data. When your data are positively or negatively skewed, use the median. Best used with interval, ratio, and ordinal variables.

Mode

MODE: The most frequently occurring score in a distribution of scores. For example, in the data 6, 8, 9, 3, 4, 6, 7, 6, 3, the value 6 appears most often. Thus, mode = 6. An easy way to remember mode is: Most Often Data Entered. The mode tells you what the most common answer is. The most common score among the group represents the mode.

These scores are said to have a bimodal distribution because there are two scores (51 and 66) that occur with the highest frequency (of two)

Except with nominal data, the mode tends not to be a very commonly used measure of central tendency. Unlike the arithmetic mean, which has to be calculated, the value of the modal score is not calculated; one simply counts and determines which score occurs most frequently. Because the mode is arrived at in this manner, the modal score may be totally atypical (for instance, one at an extreme end of the distribution) and nonetheless occur with the greatest frequency. The mode is useful in analyses of a qualitative or verbal nature. It is the value that occurs most often, is not affected by extreme values, and can be used for either numerical or categorical data.
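Counting to find the mode, including the bimodal case, can be sketched as below. The first list is the example data given earlier; the bimodal list and the categorical list are hypothetical, and `modes` is an illustrative function name.

```python
from collections import Counter

def modes(scores):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, n in counts.items() if n == highest)

print(modes([6, 8, 9, 3, 4, 6, 7, 6, 3]))       # one mode
print(modes([45, 51, 51, 58, 66, 66, 70]))      # hypothetical bimodal scores
print(modes(["brown", "black", "brown"]))       # works for categorical data too
```

Returning a list rather than a single value keeps bimodal distributions from being silently reduced to one mode.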

Measures of Variability

VARIABILITY: Variability is an indication of how scores in a distribution are scattered or dispersed. In statistics, variability, dispersion, and spread are synonyms that denote the width of the distribution. Just as there are multiple measures of central tendency, there are several measures of variability.

Low variability is ideal because it means that you can better predict information about the population based on sample data. High variability means that the values are less consistent, so it’s harder to make predictions. A low dispersion indicates that the data points tend to be clustered tightly around the center. High dispersion signifies that they tend to fall further away.

Measures of variability: range, interquartile range, semi-interquartile range, average deviation, standard deviation, variance.

RANGE: The range of a distribution is equal to the difference between the highest and the lowest scores. It is the most straightforward measure of variability to calculate and the simplest to understand, but its potential use is limited. Because the range is based entirely on the values of the lowest and highest scores, one extreme score (if it happens to be the lowest or the highest) can radically alter the value of the range.

We could describe distribution B of Figure 3–3, for example, as having a range of 20 if we knew that the highest score in this distribution was 60 and the lowest score was 40 (60 − 40 = 20).

With respect to distribution A, if we knew that the lowest score was 0 and the highest score was 100, the range would be equal to 100 − 0, or 100. When the value of the range is based on extreme scores in a distribution, the resulting description of variation may be understated or overstated. Better measures of variation include the interquartile range and the semi-interquartile range.
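The sensitivity of the range to a single extreme score can be sketched as follows. The score lists are hypothetical; only their lowest and highest values (40 and 60) follow the distribution B example.

```python
def score_range(scores):
    """Range: highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores with a lowest value of 40 and a highest value of 60.
scores = [40, 48, 50, 52, 60]
print(score_range(scores))

# One extreme score at the high end changes the range radically.
with_outlier = scores + [100]
print(score_range(with_outlier))
```

Adding a single score of 100 triples the range from 20 to 60, even though the other five scores are unchanged.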

The Interquartile and Semi-interquartile ranges

The interquartile range is a measure of variability equal to the difference between Q3 and Q1. Like the median, it is an ordinal statistic. IQR = Q3 − Q1.

23 27 30 36 38 44 38 39 50 57 62 65 68 70 75 88 90 100 (Q1, Q2, and Q3 are marked on the slide.) The interquartile range (IQR) extends from Q1 to Q3. For this dataset, with Q1 = 36 and Q3 = 70 as marked, the IQR is 70 − 36 = 34.

The semi-interquartile range (SIR), also called the quartile deviation, is a measure of spread equal to the interquartile range divided by 2. It tells you something about how data are dispersed around a central point. SIR = (Q3 − Q1) / 2. With Q1 = 36 and Q3 = 70: (70 − 36) / 2 = 34 / 2 = 17.
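The IQR and SIR arithmetic can be sketched as below. The `quartiles` helper uses the median-of-halves convention, which is an assumption: the slides do not say which quartile method they use, and other conventions can give slightly different quartiles. The eight-score dataset is hypothetical.

```python
def median(values):
    """Middle value of the ordered list; mean of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(scores):
    """Q1, Q2, Q3 by the median-of-halves convention (assumed, not from the slides)."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]  # middle score excluded if n is odd
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8])  # hypothetical scores
iqr = q3 - q1
sir = iqr / 2
print(q1, q2, q3, iqr, sir)

# Using the quartiles stated on the slide directly: Q1 = 36, Q3 = 70.
slide_iqr = 70 - 36
slide_sir = slide_iqr / 2
print(slide_iqr, slide_sir)
```

Whatever convention produces the quartiles, the last two steps are fixed: subtract Q1 from Q3 for the IQR, then halve it for the SIR.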

[Figure: quartile markers at 25%, 50%, and 75%]

AVERAGE DEVIATION: Another tool that could be used to describe the amount of variability in a distribution is the average deviation, also called the average absolute deviation. It describes how far, on average, each score is from the mean. The average deviation is rarely used. Perhaps this is so because the deletion of algebraic signs renders it a useless measure for purposes of any further operations.

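Formula to follow in solving the average deviation: $AD = \frac{\Sigma |x|}{n}$, where $x = X - \bar{X}$ is each score's deviation from the mean, $|x|$ is its absolute value, $\Sigma$ is the summation, and $n$ is the total number of scores.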
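The average deviation steps (deviation from the mean, absolute value, summation, division by the number of scores) can be sketched as follows. The scores 5, 10, 15, 20, 25 are the set used in this deck's standard deviation example; the function name is illustrative.

```python
def average_deviation(scores):
    """Mean of the absolute deviations of the scores from their mean."""
    n = len(scores)
    mean = sum(scores) / n
    # Dropping the sign of each deviation keeps positives and negatives from cancelling.
    return sum(abs(score - mean) for score in scores) / n

scores = [5, 10, 15, 20, 25]
print(average_deviation(scores))  # (10 + 5 + 0 + 5 + 10) / 5
```

Without the absolute value the deviations would sum to zero for any dataset, which is why the signs have to be removed first.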

STANDARD DEVIATION: A more frequently used "cousin of the average deviation" is the standard deviation. Instead of using the absolute value of each deviation score, we use the square of each deviation score. With each deviation squared, the sign of any negative deviation becomes positive. The standard deviation is a measure of variability equal to the square root of the average squared deviations about the mean.

Standard deviation is a statistic that measures the dispersion of a dataset relative to its mean and is calculated as the square root of the variance. The variance is calculated by squaring and summing all the deviation scores and then dividing by the total number of scores (in other words, the variance is the mean of the squared deviations, and its square root is the standard deviation).

A standard deviation (or σ) is a measure of how dispersed the data is in relation to the mean . The standard deviation measures the spread of the data about the mean value. It is useful in comparing sets of data which may have the same mean but a different range

Standard Deviation Formula to follow in solving the standard deviation:

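Consider a set X = [5, 10, 15, 20, 25].
Mean: (5 + 10 + 15 + 20 + 25) / 5 = 75 / 5 = 15
Deviations: 5 − 15 = −10, 10 − 15 = −5, 15 − 15 = 0, 20 − 15 = 5, 25 − 15 = 10
Squared deviations: (−10)² = 100, (−5)² = 25, (0)² = 0, (5)² = 25, (10)² = 100
Sum of squared deviations: 100 + 25 + 0 + 25 + 100 = 250
Variance: 250 / (5 − 1) = 250 / 4 = 62.5
Standard deviation: √62.5 ≈ 7.906
Note that this example divides by n − 1, the formula for a sample. Dividing by the total number of scores, n = 5, as in the definition given earlier, yields 250 / 5 = 50 and √50 ≈ 7.071.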
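The worked example can be reproduced with a short sketch. The `sample` flag is an illustrative parameter that switches between dividing by n − 1, as the worked example does, and dividing by n, as the verbal definition of the variance does.

```python
import math

def variance(scores, sample=True):
    """Sum of squared deviations divided by n - 1 (sample) or n (population)."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((score - mean) ** 2 for score in scores)
    return sum_of_squares / (n - 1 if sample else n)

def standard_deviation(scores, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(scores, sample))

X = [5, 10, 15, 20, 25]
print(variance(X), round(standard_deviation(X), 3))                  # n - 1 divisor
print(variance(X, sample=False),
      round(standard_deviation(X, sample=False), 3))                 # n divisor
```

The choice of divisor matters most for small datasets like this one; as n grows, the two results converge.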

SD vs. AD: Standard deviation is the most common measure of variability and is more frequently used than average deviation. Standard deviation is considered the most appropriate measure of variability when using a sample from a population, when the mean is the best measure of center, and when the distribution of data is normal. Some argue that average deviation, or mean absolute deviation, is a better gauge of variability when there are distant outliers or the data are not well distributed.

Skewness

SKEWNESS: Skewness is the nature and extent to which symmetry is absent. "Skewed" is associated with "abnormal," perhaps because the skewed distribution deviates from the symmetrical or so-called normal distribution.

Normal Distribution or symmetrical.

Positive Skew: A distribution has a positive skew when relatively few of the scores fall at the high end of the distribution. Positively skewed examination results may indicate that the test was too difficult.

Negative Skew: A distribution has a negative skew when relatively few of the scores fall at the low end of the distribution. Negatively skewed examination results may indicate that the test was too easy.

Skewness by quartiles. Positive skew: Q3 − Q2 > Q2 − Q1. Negative skew: Q3 − Q2 < Q2 − Q1. Symmetrical distribution: Q3 − Q2 = Q2 − Q1, that is, the distances from Q1 and Q3 to Q2 are equal.
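The quartile comparison can be sketched as a small function. The quartile values passed in are hypothetical, and `quartile_skew` is an illustrative name.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the distances from Q2 to Q3 and from Q1 to Q2."""
    upper_distance = q3 - q2
    lower_distance = q2 - q1
    if upper_distance > lower_distance:
        return "positive skew"
    if upper_distance < lower_distance:
        return "negative skew"
    return "symmetrical"

# Hypothetical quartiles (Q1, Q2, Q3).
print(quartile_skew(10, 20, 40))  # upper half stretched out
print(quartile_skew(10, 30, 40))  # lower half stretched out
print(quartile_skew(10, 25, 40))  # equal distances
```

This check looks only at the middle 50% of the scores, so it is a rough indicator rather than a full measure of skewness.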

Kurtosis

KURTOSIS: The term testing professionals use to refer to the steepness of a distribution in its center. Kurtosis is a measure of the tailedness of a distribution, where tailedness is how often outliers occur. Platykurtic = relatively flat. Leptokurtic = relatively peaked. Mesokurtic = somewhere in the middle.

Platykurtic=(relatively flat) A platykurtic distribution is thin-tailed, meaning that outliers are infrequent. Platykurtic distributions have less kurtosis than a normal distribution. Platykurtosis is sometimes called negative kurtosis , since the excess kurtosis is negative.

Leptokurtic A leptokurtic distribution is fat-tailed , meaning that there are a lot of outliers. Leptokurtic distributions are more kurtotic than a normal distribution. Leptokurtosis is sometimes called positive kurtosis , since the excess kurtosis is positive.

Mesokurtic A mesokurtic distribution is medium-tailed, so outliers are neither highly frequent, nor highly infrequent. Kurtosis is measured in comparison to normal distributions. Normal distributions have a kurtosis of 3 , so any distribution with a kurtosis of approximately 3 is mesokurtic. Normal distributions have an excess kurtosis of 0, so any distribution with an excess kurtosis of approximately 0 is mesokurtic.
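Excess kurtosis and the three labels can be sketched as follows. This uses the simple moment-based formula (fourth central moment divided by the squared variance, minus 3) with no small-sample correction. The cutoff `tolerance=0.5` for "approximately 0" and both datasets are hypothetical choices.

```python
def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

def classify_kurtosis(scores, tolerance=0.5):
    """Label a distribution; `tolerance` is an arbitrary illustrative cutoff."""
    excess = excess_kurtosis(scores)
    if excess > tolerance:
        return "leptokurtic"
    if excess < -tolerance:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]                          # evenly spread, thin tails
heavy = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]       # mostly central, two outliers

print(round(excess_kurtosis(flat), 2), classify_kurtosis(flat))
print(round(excess_kurtosis(heavy), 2), classify_kurtosis(heavy))
```

Statistical packages often apply bias corrections for small samples, so their values may differ somewhat from this plain moment calculation.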