What is a Point Biserial Correlation?

plummer48 27,808 views 67 slides Oct 01, 2014


Point Biserial Correlation: Welcome to the Point Biserial Correlation Conceptual Explanation

Point biserial correlation is an estimate of the coherence between two variables, one of which is dichotomous and one of which is continuous. Coherence means how much the two variables covary.

Let’s look at an example of two variables cohering.

The data set below represents the average decibel levels at which different age groups listen to music.

Age Group    Decibels
Teens        95
20s          75
30s          50
40s          45
50s          39
60s          37
70s          35
80s          30

The reason these two variables (age group and decibel level) cohere is that as one increases, the other either increases or decreases commensurately.

In this case, as age goes up, decibels go down. This is called a negative relationship.

This is called a negative correlation or coherence, because when one variable increases, the other decreases (or vice versa).
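As a rough sketch of what "covary" means for the decibel data, the sample covariance can be computed directly. The numeric ages below are assumed stand-ins for the age-group labels (for example, 15 for "Teens"); the slides give only the labels, so treat these values as an illustration.

```python
# Assumed numeric stand-ins for the age groups ("Teens" -> 15, "20s" -> 25, ...).
# The slides list labels only, so these midpoints are an illustration.
ages = [15, 25, 35, 45, 55, 65, 75, 85]
decibels = [95, 75, 50, 45, 39, 37, 35, 30]


def sample_covariance(xs, ys):
    """Sum of paired deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


cov_age_db = sample_covariance(ages, decibels)

# A negative covariance means the variables move in opposite directions.
print("covariance is negative:", cov_age_db < 0)
```

The sign of the covariance is what gives a correlation its direction; the correlation coefficient rescales it to fall between -1 and +1.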

A positive correlation would occur when, as one variable increases, the other increases, or when, as one decreases, the other decreases.

Example: As the temperature rises, the average daily purchase of popsicles increases.

Average Daily Temp    Average Daily Popsicle Purchases Per Person
100                   2.30
95                    1.20
90                    1.00
85                    0.80
80                    0.70
75                    0.10
70                    0.03
65                    0.01

These variables are positively correlated because as one variable (average daily temperature) increases, another variable (average daily popsicle purchases) increases.

It can be stated another way: As the average daily temperature decreases, the average daily popsicle purchases decrease as well. These variables are also positively correlated because as one variable (average daily temperature) decreases, another variable (average daily popsicle purchases) decreases.
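The popsicle example can be checked with a short sketch that computes a Pearson correlation from the temperature and purchase values in the table. The helper name `pearson_r` is chosen here for illustration. Reading the table top to bottom or bottom to top gives the same coefficient, which is why both ways of stating the relationship describe the same positive correlation.

```python
import math

temps = [100, 95, 90, 85, 80, 75, 70, 65]
popsicles = [2.30, 1.20, 1.00, 0.80, 0.70, 0.10, 0.03, 0.01]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r_temp_popsicle = pearson_r(temps, popsicles)

# Reversing the row order (temperature falling instead of rising)
# keeps the pairs together, so the coefficient does not change.
r_reversed = pearson_r(temps[::-1], popsicles[::-1])

print("positive:", r_temp_popsicle > 0)
```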

Let’s return to our Point Biserial Correlation definition: “Point biserial correlation is an estimate of the coherence between two variables, one of which is dichotomous and one of which is continuous.” We discussed coherence. But what is a dichotomous variable?

A dichotomous variable is a variable that can only be one thing or another. Here are some examples:

When you can only answer “Yes” or “No”.

When your statement can only be categorized as “Fact” or “Opinion”.

When you either are something or you are not: “Catholic” or “Not Catholic”.

The dichotomous variable may be naturally occurring, as in gender, or may be arbitrarily dichotomized, as in depressed/not depressed.
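To make "arbitrarily dichotomized" concrete, here is a minimal sketch that turns a score into a two-valued code using a cutoff. The scores and the cutoff of 20 are hypothetical and invented for this illustration; the slides do not say how depressed status was assigned.

```python
# Hypothetical questionnaire scores and cutoff, invented for illustration only.
scores = {"A": 31, "B": 27, "C": 35, "D": 8, "E": 12}
CUTOFF = 20


def dichotomize(score, cutoff=CUTOFF):
    """Return 2 (attribute present) at or above the cutoff, else 1 (absent)."""
    return 2 if score >= cutoff else 1


codes = {person: dichotomize(score) for person, score in scores.items()}
print(codes)
```

Moving the cutoff changes who lands in each group, which is the sense in which such a split is arbitrary.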

The range of a point biserial correlation is from -1 to +1.

Let’s return again to our Point Biserial Correlation definition: “Point biserial correlation is an estimate of the coherence between two variables, one of which is dichotomous and one of which is continuous.” So, we now know what a dichotomous variable is (either/or). What is a continuous variable?

Definition of Continuous Variable: If a variable can take on any value between its minimum value and its maximum value, it is called a continuous variable.

Here is an example: Suppose the fire department mandates that all fire fighters must weigh between 150 and 250 pounds. The weight of a fire fighter would be an example of a continuous variable, since a fire fighter's weight could take on any value between 150 and 250 pounds.

The direction of the correlation depends on how the variables are coded. Let’s say we are comparing shame scores (continuous variable from 1-10) and whether someone is depressed or not (dichotomous variable: not depressed = 1 and depressed = 2).

If the dichotomous variable is coded with the higher value representing the presence of an attribute (depressed)

Person    Status           Depressed (1 = not depressed, 2 = depressed)
A         Depressed        2
B         Depressed        2
C         Depressed        2
D         Not Depressed    1
E         Not Depressed    1

. . . and the continuous variable is coded with higher values representing the increasing presence of an attribute (shame), then positive values of the point biserial would indicate higher shame associated with depressed status. In this case we would compute a point biserial of +.99.

Person    Depressed (1 = not depressed, 2 = depressed)    Amount of Shame
A         2                                               10
B         2                                               9
C         2                                               10
D         1                                               2
E         1                                               2
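The coefficient for this table can be reproduced with a short sketch. It uses the common group-means form of the point biserial, r = (M1 - M0) / s * sqrt(p * q), where M1 and M0 are the mean scores of the "present" and "absent" groups, s is the standard deviation of all scores (computed with n in the denominator), and p and q are the group proportions. The slides do not show which formula was used, so this is one standard way to arrive at the value; the function and parameter names are illustrative.

```python
import math

codes = [2, 2, 2, 1, 1]    # 1 = not depressed, 2 = depressed
shame = [10, 9, 10, 2, 2]  # amount of shame for persons A-E


def point_biserial(codes, scores, present_code):
    """Group-means form of the point biserial correlation."""
    present = [s for c, s in zip(codes, scores) if c == present_code]
    absent = [s for c, s in zip(codes, scores) if c != present_code]
    n = len(scores)
    mean_all = sum(scores) / n
    # Standard deviation of all scores, using n (not n - 1) in the denominator.
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    p = len(present) / n
    q = len(absent) / n
    mean_diff = sum(present) / len(present) - sum(absent) / len(absent)
    return mean_diff / sd_all * math.sqrt(p * q)


r_pb = point_biserial(codes, shame, present_code=2)
print(round(r_pb, 3))
```

For these five people the result is about 0.995, close to the +.99 reported above.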

If we switch the codes so that not depressed = 2 and depressed = 1, we would have a -.99 correlation.

Person    Depressed (1 = depressed, 2 = not depressed)    Amount of Shame
A         1                                               10
B         1                                               9
C         1                                               10
D         2                                               2
E         2                                               2

Therefore, instead of looking at the numbers, we think in terms of whether something is present or not (in this case, the presence or lack of depression) and how that relates to the amount of shame.
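The effect of switching the codes can be sketched by correlating the same shame scores with both codings. The example relies on the fact that a point biserial correlation equals the Pearson correlation computed with the dichotomous variable entered as numbers; the helper name `pearson_r` is chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


shame = [10, 9, 10, 2, 2]
original_codes = [2, 2, 2, 1, 1]                  # 1 = not depressed, 2 = depressed
switched_codes = [3 - c for c in original_codes]  # swaps 1 and 2

r_original = pearson_r(original_codes, shame)
r_switched = pearson_r(switched_codes, shame)

# Same magnitude, opposite sign: only the labeling changed, not the data.
print(r_original > 0, r_switched < 0)
```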

The strength of the association can be tested against chance just as with the Pearson Product Moment Correlation Coefficient.
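The slides do not show the test itself. The usual approach for a Pearson-type coefficient is to convert r to a t statistic with n - 2 degrees of freedom, t = r * sqrt((n - 2) / (1 - r^2)). The sketch below applies that to the +.99 value and the five people from the example; the critical value of 3.182 (two-tailed, alpha = .05, 3 degrees of freedom) is taken from a standard t table, and the function name is illustrative.

```python
import math


def t_from_r(r, n):
    """Convert a correlation r from n paired observations to (t, df)."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df


# r = .99 and n = 5 come from the depressed/shame example above.
t_stat, df = t_from_r(0.99, 5)

# Two-tailed critical t for alpha = .05 with 3 degrees of freedom (standard t table).
CRITICAL_T_DF3 = 3.182
significant = abs(t_stat) > CRITICAL_T_DF3

print(df, significant)
```

With so few observations the test has little power in general; it comes out significant here only because the correlation is so close to 1.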