Association Rule Mining(ARM) notes for class

racoon97970301 7 views 8 slides Jul 07, 2024
Slide 1
Slide 1 of 8
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8

About This Presentation

ARM slideshow notes


Slide Content

Association Rule Mining

Association Rule Mining (ARM)
•Theclassicalproblemofassociationpatternminingisdefinedinthecontextof
supermarketdatacontainingsetsofitemsboughtbycustomers,whichare
referredtoastransactions.
•Thegoalistodetermineassociationsbetweengroupsofitemsboughtby
customers.
•Themostpopularmodelforassociationpatternminingusesthefrequenciesof
sets.
•Thediscoveredsetsofitemsarereferredtoaslargeitemsets,frequentitemsets,or
frequentpatterns.
•ApplicationofARM -Supermarketdata,Textmining,Generalizationto
dependency-orienteddatatypes,MarketBasketAnalysis.

Association Rule Mining (ARM)
•FrequentitemsetscanbeusedtogenerateassociationrulesoftheformX⇒Y,
whereXandYaresetsofitems.
•This rule suggests that buying of X makes it more likely that Y will also be
bought.
•Givenasetoftransactions,findrulesthatwillpredicttheoccurrenceofanitem
basedontheoccurrencesofotheritemsinthetransaction.
•ExampleofAssociationRules
{Diaper}→{Beer},
{Milk,Bread}→{Eggs,Coke},
{Beer,Bread}→{Milk},

Association Rule Mining (ARM)
•Itemset
•Acollectionofoneormoreitems,eg;{Milk,Bread,Diaper}
•k-itemset-Anitemsetthatcontainskitems.
•Supportcount(σ)
•Frequencyofoccurrenceofanitemset
•E.g.σ({Milk,Bread,Diaper})=2
•Support
•Fractionoftransactionsthatcontainanitemset
•E.g.s({Milk,Bread,Diaper})=2/5
•FrequentItemset
•Anitemsetwhosesupportisgreaterthanorequaltoaminsupthreshold

Association Rule Mining (ARM)
•RuleEvaluationMetrics
•Support(s)
•FractionoftransactionsthatcontainbothXandY
•ThesupportofanitemsetIisdefinedasthefractionofthe
transactionsinthedatabaseT={T1...Tn}thatcontainIasasubset.
•Confidence(c)
•MeasureshowoftenitemsinYappearintransactionsthatcontainX.
•Theconfidenceconf(X∪Y)oftheruleX∪Yistheconditional
probabilityofX∪Yoccurringinatransaction,giventhatthe
transactioncontainsX.Therefore,theconfidenceconf(X⇒Y)is
definedasfollows–
•AssociationRuleMiningTask
•GivenasetoftransactionsT,thegoalofassociationrule
miningistofindallruleshaving
•support≥minsupthreshold
•confidence≥minconfthreshold

Association Rule Mining Task
•Brute-forceapproach:
•Listallpossibleassociationrules
•Computethesupportandconfidenceforeachrule
•Prunerulesthatfailtheminsupandminconfthresholds.
•ComputationallyExpensive
•StrategiestoreduceComputationalComplexity:
•Reducethenumberofcandidates(M)
•Reducethenumberoftransactions(N)
•Reducethenumberofcomparisons(NM)

Association Rule Mining Task
•AprioriPrinciple:Ifanitemsetisfrequent,thenallofitssubsetsmustalsobe
frequent.
•Aprioriprincipleholdsduetothefollowingpropertyofthesupportmeasure:
•Supportofanitemsetneverexceedsthesupportofitssubsets.
•Thisisknownastheanti-monotonepropertyofsupport.
•Apriorialsousesthedownwardclosureproperty.
•Apriorialgorithm generates candidates with smaller length k first and counts their
supports before generating candidates of length (k+1).
•The resulting frequent k-itemsetsare used to restrict the number of (k + 1)-
candidates with the downward closure property.

Association Rule Mining Task
Tags