Ordinal logistic regression

DrAtharKhan 2,080 views 44 slides Mar 29, 2020


ORDINAL LOGISTIC REGRESSION
Dr. Athar Khan
3/29/2020 DR ATHAR KHAN 1

Overview

Ordinal logistic regression (OLR) is generally used when the dependent variable has categories that are ordered (i.e., ranked).

When the proportional odds assumption is violated, multinomial logistic regression (MLR) provides a viable alternative to OLR. The proportional odds assumption essentially states that the relationship between the independent variable and the dependent variable is constant, irrespective of which groups are being compared on the dependent variable (see Osborne, 2015, 2017).
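
In symbols, one common way to write the proportional odds model for an outcome with ordered categories j = 1, ..., J is sketched below. The notation (theta_j for the thresholds, beta_k for the slopes, x_k for the predictors) is an assumption introduced here and does not come from the slides. The sign convention shown is the one under which a positive slope means a greater chance of falling in a higher category.

```latex
\ln\!\left(\frac{P(Y \le j)}{P(Y > j)}\right)
  = \theta_j - \left(\beta_1 x_1 + \dots + \beta_p x_p\right),
  \qquad j = 1, \dots, J - 1
```

Each cut point has its own threshold, but the slopes carry no j subscript. That shared set of slopes is what the proportional odds assumption amounts to.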

▪ Logistic regression is a version of multiple regression where the outcome variable is binary (dichotomous), meaning there are only two possible outcomes. The model can be used to calculate the probability of one of the two outcomes occurring over the other for a given case/observation by using the values of a set of known explanatory variables.
▪ The logit is the natural log of the odds, ln(P / (1 - P)), where P is the probability of the outcome (ranging from 0 to 1). It converts a probability into a value that can range over all real numbers, and the model's predicted logits can be converted back into probabilities.
▪ A logit curve is therefore a graph of these transformed values plotted against an explanatory variable.
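
The link between probabilities, odds and logits can be sketched in a few lines of Python. This is only an illustration: the function names `logit` and `inverse_logit` and the example probabilities are chosen here and are not part of the slides or of SPSS.

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z: float) -> float:
    """Map a logit (any real number) back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative probabilities only.
for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(f"P = {p:.1f}  odds = {p / (1 - p):.3f}  "
          f"logit = {z:+.3f}  back to P = {inverse_logit(z):.1f}")
```

A probability of .5 corresponds to odds of 1 and a logit of 0. Probabilities above .5 give positive logits and probabilities below .5 give negative logits.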

▪ -2 log-likelihood (-2LL) provides us with an indication of the total error that is in a logistic regression model. The larger the value of the -2LL, the less accurate the predictions of the model.
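
To see why a larger -2LL means poorer predictions, here is a minimal sketch for the binary case. The outcomes and predicted probabilities are made up for illustration, and the function name `minus_two_log_likelihood` is an assumption rather than anything SPSS exposes.

```python
import math

def minus_two_log_likelihood(outcomes, predicted_probs):
    """-2LL for binary outcomes (0/1) given the predicted P(outcome = 1)."""
    log_lik = 0.0
    for y, p in zip(outcomes, predicted_probs):
        # Each case contributes the log of the probability assigned to what actually happened.
        log_lik += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * log_lik

outcomes = [1, 0, 1, 1, 0]                      # illustrative data, not from the slides
good_predictions = [0.9, 0.2, 0.8, 0.7, 0.1]    # close to the observed outcomes
poor_predictions = [0.5, 0.5, 0.5, 0.5, 0.5]    # no better than a coin flip

print(minus_two_log_likelihood(outcomes, good_predictions))
print(minus_two_log_likelihood(outcomes, poor_predictions))
```

The predictions that sit closer to the observed outcomes produce the smaller -2LL.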


▪ The log of the odds, sometimes called the logit, is a mathematical transformation of the odds which will help in creating a regression model.

Scenario: Let’s say you are a researcher studying predictors of student interest. You collect data from 200 students on several variables.

INDEPENDENT VARIABLES
“Pass” indicates whether a student passed (coded 1) or failed (coded 0) a previous subject matter test.
“Masteryg” is mastery goals (higher scores indicate greater mastery goals).
“Fearfail” is fear of failure (higher scores indicate greater fear of failure). “Masteryg” and “Fearfail” are treated as continuous variables.
“Genderid” is a binary variable (like pass), dummy coded 0 = identified male, 1 = identified female.

DEPENDENT VARIABLE
“Interestlev” is an ordered, categorical variable indicating students’ self-reported interest for the next topic in class. It is coded 1 = low interest, 2 = medium interest, 3 = high interest.

Ordinal logistic regression (using SPSS): Route 1

Here, we place the “Interestlev” variable in the Dependent box and the remaining variables (IVs) in the Covariate(s) box. Although they are categorical variables, we can include “pass” and “genderid” as covariates.

However, if you have categorical variables with more than two levels, then you must use the “Factor(s)” box for them. [FYI, I could also have entered the above variables as factors, but I prefer having control over the designation of the reference category; SPSS defaults to treating the category with the higher value as the reference category.]

Categorical (nominal or ordinal) explanatory variables are entered in the Factor(s) box, so this is where we enter ethnic2 and gender. Continuous explanatory variables (in this case sec2) are entered as covariates.

Click “OUTPUT”.
Select “Test of parallel lines”, which provides a test of the proportional odds assumption.

The Case Processing Summary tells you the proportion of cases falling at each level of the dependent variable (Interestlev).
1 = Low interest
2 = Medium interest
3 = High interest

The Model Fitting Information table contains the -2 Log Likelihood for the Intercept Only (or null) model and the Final model (containing the full set of predictors).

We also have a likelihood ratio chi-square test of whether there is a significant improvement in fit of the Final model relative to the Intercept Only model.

In this case, we see a significant improvement in fit of the Final model over the null model [χ²(4) = 30.249, p < .001].

We compare the final model against the baseline to see whether it has significantly improved the fit to the data. The Model Fitting Information table gives the -2 log-likelihood values for the baseline and the final model, and SPSS performs a chi-square test of the difference between the -2LL for the two models.

The statistically significant chi-square statistic (p < .0005) indicates that the Final model gives a significant improvement over the baseline intercept-only model. This tells you that the final model gives better predictions.
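
The p-value for this test can be reproduced by hand. The sketch below assumes the reported likelihood ratio chi-square of 30.249 on 4 degrees of freedom (one per predictor) and uses the closed-form upper-tail area that holds for exactly 4 degrees of freedom. The function name is hypothetical.

```python
import math

def lr_test_p_value_df4(chi_square: float) -> float:
    """Upper-tail chi-square probability for exactly 4 degrees of freedom.

    For df = 4 the tail area has the closed form exp(-x/2) * (1 + x/2).
    """
    half = chi_square / 2.0
    return math.exp(-half) * (1.0 + half)

# Reported statistic: (-2LL of the intercept-only model) minus (-2LL of the final model).
lr_chi_square = 30.249
p_value = lr_test_p_value_df4(lr_chi_square)
print(f"chi-square(4) = {lr_chi_square}, p = {p_value:.1e}")
```

The result is far below .0005, in line with the significance level reported in the output.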

The “Goodness of Fit” table contains the Deviance and Pearson chi-square tests, which are useful for determining whether a model exhibits good fit to the data. Non-significant test results are indicators that the model fits the data well (Field, 2018; Petrucci, 2009).

Deviance (-2LL)
This is the log-likelihood multiplied by -2 and is commonly used to explore how well a logistic regression model fits the data. The lower this value is, the better your model is at predicting your binary outcome variable.

In this analysis, we see that both the Pearson chi-square test [χ²(394) = 400.412, p = .401] and the deviance test [χ²(394) = 403.353, p = .362] were non-significant. These results suggest good model fit.
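
As a check on the two reported p-values, the sketch below evaluates the chi-square upper tail directly. It relies on the exact finite-sum formula that applies when the degrees of freedom are even (394 here). The function name `chi_square_upper_tail` is an assumption for illustration.

```python
import math

def chi_square_upper_tail(x: float, df: int) -> float:
    """P(X >= x) for a chi-square variable with positive even df.

    Uses the exact identity exp(-x/2) * sum over i < df/2 of (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term = math.exp(-half)      # the i = 0 term
    total = term
    for i in range(1, df // 2):
        term *= half / i        # turns the (i-1)th term into the ith term
        total += term
    return total

# Values reported in the Goodness of Fit table.
pearson_p = chi_square_upper_tail(400.412, 394)
deviance_p = chi_square_upper_tail(403.353, 394)
print(f"Pearson p = {pearson_p:.3f}, Deviance p = {deviance_p:.3f}")
```

Both values are well above .05, which is why the tests are read as showing no evidence of poor fit.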

Here, we have the regression coefficients and significance tests for each of the independent variables in the model. The regression coefficients are literally interpreted as:

The predicted change in the log odds of being in a higher (as opposed to a lower) group/category on the dependent variable (controlling for the remaining independent variables) per unit increase on the independent variable.
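
To make the log-odds interpretation concrete, the sketch below turns coefficients into predicted probabilities for the three interest levels. The four slopes are the estimates reported in this analysis. The two thresholds and both example students are HYPOTHETICAL, because the slides do not report threshold estimates or raw data. The sketch also assumes the cumulative logit form logit[P(Y <= j)] = threshold_j - (sum of slope * value).

```python
import math

def inverse_logit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(linear_predictor: float, thresholds: list) -> list:
    """Probability of each ordered category, assuming
    logit[P(Y <= j)] = threshold_j - linear_predictor."""
    cumulative = [inverse_logit(t - linear_predictor) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)   # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs

# Slope estimates reported in the slides.
coefficients = {"masteryg": 0.026, "fearfail": -0.015, "pass": 0.820, "genderid": 0.232}
# HYPOTHETICAL thresholds (low|medium, medium|high); not reported in the slides.
thresholds = [1.0, 3.0]

def linear_predictor(student: dict) -> float:
    return sum(coefficients[name] * student[name] for name in coefficients)

# Two hypothetical students who differ only in whether they passed the earlier test.
failed = {"masteryg": 50, "fearfail": 30, "pass": 0, "genderid": 1}
passed = {**failed, "pass": 1}

for label, student in (("failed", failed), ("passed", passed)):
    low, medium, high = category_probabilities(linear_predictor(student), thresholds)
    print(f"{label}: P(low) = {low:.3f}, P(medium) = {medium:.3f}, P(high) = {high:.3f}")
```

With a positive slope for pass, the student who passed gets more probability in the high interest category and less in the low interest category.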

We interpret a positive Estimate (b) in the following way:
For every one unit increase on an independent variable, there is a predicted increase (of a certain amount) in the log odds of falling at a higher level of the dependent variable.

More generally, this indicates that as scores increase on an independent variable, there is an increased probability of falling at a higher level on the dependent variable.

We interpret a negative Estimate (b) in the following way:
For every one unit increase on an independent variable, there is a predicted decrease (of a certain amount) in the log odds of falling at a higher level of the dependent variable.

More generally, this indicates that as scores increase on an independent variable, there is a decreased probability of falling at a higher level on the dependent variable.

Mastery goals was a significant positive predictor of Interest in the next topic. For every one unit increase on mastery goals, there is a predicted increase of .026 in the log odds of a student being in a higher (as opposed to lower) category on Interest. This indicates that students scoring higher on mastery goals were more likely to indicate greater interest in the next topic.

Fear of failure was not a significant predictor in the model. [The coefficient is interpreted as follows: For every one unit increase on fear of failure, there is a predicted decrease of .015 in the log odds of being in a higher level of the dependent variable.]

Pass was a significant positive predictor of Interest. Since Pass is a binary variable, the slope represents the difference in log odds between individuals in the “failed” group and the “passed” group. The log odds of being in a higher level on Interest was .820 points higher on average for those who passed the previous subject matter test as compared to those who failed the test.

Gender identification was not a significant predictor. [Again, because this is a binary variable, the slope can be thought of as the difference in log odds between groups. On average, the log odds of being in a higher Interest category was .232 points greater for persons identified as female than for persons identified as male.]

▪ Assumption of proportional odds (SPSS calls this the assumption of parallel lines, but it’s the same thing). This assumes that the explanatory variables have the same effect on the odds regardless of the threshold.
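
What "the same effect on the odds regardless of the threshold" means can be shown numerically. The sketch below uses the reported slope for pass (.820) together with two HYPOTHETICAL thresholds, since threshold estimates are not given in the slides. It assumes the cumulative logit form logit[P(Y <= j)] = threshold_j - slope * x.

```python
import math

def odds_of_higher_category(x: float, threshold: float, slope: float) -> float:
    """Odds of Y > j versus Y <= j when logit[P(Y <= j)] = threshold - slope * x."""
    return math.exp(slope * x - threshold)

slope = 0.820               # reported estimate for "pass"
thresholds = [1.0, 3.0]     # HYPOTHETICAL cut points; not reported in the slides

for j, threshold in enumerate(thresholds, start=1):
    odds_failed = odds_of_higher_category(0, threshold, slope)
    odds_passed = odds_of_higher_category(1, threshold, slope)
    print(f"above category {j}: odds ratio, passed vs failed = {odds_passed / odds_failed:.3f}")
```

The odds themselves differ at each cut point, but the ratio between the passed and failed groups is identical at both. A model that needed a different slope at each cut point would violate the assumption.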

▪ As mentioned previously, OLR assumes that the relationship between the IVs and the dependent variable is the same “across all possible comparisons” (Osborne, 2017, p. 147) involving the dependent variable, an assumption referred to as Proportional Odds.
▪ When the result of the test of Parallel Lines (i.e., the assumption of Proportional Odds) indicates non-significance, we interpret it to mean that the assumption is satisfied. Statistical significance is taken as an indicator that the assumption is not satisfied.
▪ In the results from our analysis, we interpret the results to mean that the assumption is satisfied (as p = .854).


Ordinal logistic regression (using SPSS): Route 2
(using generalized linear models option)

One downside of using the previous option is that we cannot get odds ratios (ORs), reflecting the changing odds of a case falling at the next higher level on the dependent variable. Moreover, the test results associated with the independent variables are based solely on the Wald test. These results can be less powerful than test results based on likelihood ratio chi-square tests. Using the Generalized Linear Models option, we can obtain all of this additional information.


If you have “factor” variables then you could include them in the Factors box. Unlike Route 1, you can actually specify the reference category.

Include independent variables (not treated as factors) in the Covariates box.


Here, I have requested likelihood ratio chi-square statistics and odds ratios to be printed in the output.

These are various goodness of fit statistics. You’ll notice that although the Pearson chi-square and Deviance appear in this table, test results are not provided (unlike in the Goodness of Fit table we saw via Route 1). Nevertheless, both values and degrees of freedom are provided, which could be used to test for model fit using the chi-square distribution. (Of course, it’s probably less work to obtain that information via Route 1.)

This is the likelihood ratio chi-square test we saw via Route 1. We see that our full model was a significant improvement in fit over the null (no predictors) model [χ²(4) = 30.249, p < .001].

Running your logistic regression through this route will allow you to obtain both Wald tests of the predictors (see test results under Parameter Estimates) and likelihood ratio tests (see Tests of Model Effects). For the most part, the p-values from both tables are very consistent.

A closer look at the table:

Here, you’ll see roughly the same information contained in the previous table of regression coefficients through Route 1. One of the main differences is the Exp(B) column (and confidence interval). The Exp(B) column contains odds ratios reflecting the multiplicative change in the odds of being in a higher category on the dependent variable for every one unit increase on the independent variable, holding the remaining independent variables constant.

An odds ratio > 1 suggests an increasing probability of being in a higher level on the dependent variable as values on an independent variable increase, whereas a ratio < 1 suggests a decreasing probability with increasing values on an independent variable. An odds ratio = 1 suggests no predicted change in the likelihood of being in a higher category as values on an independent variable increase.

As before, mastery goals was a significant positive predictor of Interest in the next topic. For every one unit increase on mastery goals, there is a predicted increase of .026 in the log odds of a student being in a higher level of the Interest (dependent) variable. This indicates that students scoring higher on mastery goals were more likely to indicate greater interest in the next topic.

The odds ratio indicates that the odds of being in a higher category on Interest increase by a factor of 1.027 for every one unit increase on mastery goals.

Fear of failure was not a significant predictor in the model. [The regression coefficient indicates that for every one unit increase on fear of failure, there is a predicted decrease of .015 in the log odds of being in a higher level of the dependent variable (controlling for the remaining predictors).]

The odds ratio indicates that the odds of being in a higher category on Interest change by a factor of .985 (i.e., are multiplied by .985) for every one unit increase on fear of failure. [Given that the odds ratio is < 1, this indicates a decreasing probability of being in a higher level on the Interest variable as scores increase on fear of failure.]

▪ Pass was a significant positive predictor of Interest. The log odds of being in a higher level on Interest was .820 points higher on average for those who passed the previous subject matter test than for those who failed the test.
▪ The odds of students who passed (the previous subject matter test) being in a higher category on the dependent variable were 2.270 times that of those who failed the test.
▪ Gender identification was not a significant predictor. [On average, the log odds of being in a higher Interest category was .232 points greater for females than for males.]
▪ The odds of a student identified as female being in a higher category on the dependent variable were 1.261 times that of a student identified as male (although again, gender identification was not a significant predictor).
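
Each odds ratio in the Exp(B) column is simply the exponential of the corresponding log-odds estimate. The sketch below recomputes the four ratios from the coefficients quoted in these slides. Small differences in the third decimal are expected because the printed coefficients are rounded, whereas SPSS presumably works from unrounded values.

```python
import math

# (log-odds estimate b, odds ratio Exp(B)) as reported in the slides.
reported = {
    "masteryg": (0.026, 1.027),
    "fearfail": (-0.015, 0.985),
    "pass": (0.820, 2.270),
    "genderid": (0.232, 1.261),
}

for name, (b, reported_or) in reported.items():
    odds_ratio = math.exp(b)                      # Exp(B)
    percent_change = (odds_ratio - 1.0) * 100.0   # change in the odds per unit increase
    print(f"{name}: exp({b:+.3f}) = {odds_ratio:.3f} "
          f"(reported {reported_or:.3f}), odds change {percent_change:+.1f}% per unit")
```

A negative coefficient always yields an odds ratio below 1 and a positive coefficient yields one above 1, which matches the interpretation given for fear of failure versus the other three predictors.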

References

Mike Crowson. Ordinal logistic regression using SPSS.
https://www.youtube.com/watch?v=rSCdwZD1DuM