Association Rule Mining Max Miner Itemset Mining

VishwajeetSingh416757 33 views 17 slides Aug 11, 2024
Slide 1
Slide 1 of 17
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17

About This Presentation

Association Rule Mining


Slide Content

Association Rule Mining
- MaxMiner

Mining Association Rules in
Large Databases
Association rule mining
Algorithms Apriori and FP-Growth
Max and closed patterns
Mining various kinds of association/correlation
rules

Max-patterns & Close-patterns
If there are frequent patterns with many
items, enumerating all of them is costly.
We may be interested in finding the
‘boundary’ frequent patterns.
Two types…

Max-patterns
Frequent pattern {a
1
, …, a
100
}  (
100
1
) +
(
100
2
) + … + (
1
1
0
0
0
0
) = 2
100
-1 = 1.27*10
30
frequent sub-patterns!
Max-pattern: frequent patterns without
proper frequent super pattern
BCDE, ACD are max-patterns
BCD is not a max-pattern
TidItems
10A,B,C,D,E
20B,C,D,E,
30A,C,D,F
Min_sup=2

Maximal Frequent Itemset
null
AB AC AD AE BC BD BE CD CE DE
A B C D E
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCD
E
Border
Infrequent
Itemsets
Maximal
Itemsets
An itemset is maximal frequent if none of its immediate supersets
is frequent

Closed Itemset
An itemset is closed if none of its immediate
supersets has the same support as the itemset
TID Items
1 {A,B}
2 {B,C,D}
3{A,B,C,D}
4 {A,B,D}
5{A,B,C,D}
ItemsetSupport
{A} 4
{B} 5
{C} 3
{D} 4
{A,B} 4
{A,C} 2
{A,D} 3
{B,C} 3
{B,D} 4
{C,D} 3
ItemsetSupport
{A,B,C} 2
{A,B,D} 3
{A,C,D} 2
{B,C,D} 3
{A,B,C,D} 2

Maximal vs Closed Itemsets
TID Items
1 ABC
2 ABCD
3 BCE
4 ACDE
5 DE
null
AB AC AD AE BC BD BE CD CE DE
A B C D E
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE
124 123 1234 245 345
12 124 24 4 123 2 3 24 34 45
12 2 24 4 4 2
3 4
2 4
Transaction Ids
Not supported by
any transactions

Maximal vs Closed Frequent Itemsets
null
AB AC AD AE BC BD BE CD CE DE
A B C D E
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE
124 123 1234 245 345
12 124 24 4 123 2 3 24 34 45
12 2 24 4 4 2 3 4
2 4
Minimum support = 2
# Closed = 9
# Maximal = 4
Closed and
maximal
Closed but
not
maximal

Maximal vs Closed Itemsets
Frequent
Itemsets
Closed
Frequent
Itemsets
Maximal
Frequent
Itemsets

MaxMiner: Mining Max-patterns
Idea: generate the complete set-
enumeration tree one level at a time, while
prune if applicable.
 (ABCD)
A (BCD) B (CD) C (D)D ()
AB (CD)AC (D)AD ()BC (D)BD ()CD ()
ABC (C)
ABCD ()
ABD ()ACD ()BCD ()

Local Pruning Techniques (e.g. at node A)
Check the frequency of ABCD and AB, AC, AD.
If ABCD is frequent, prune the whole sub-tree.
If AC is NOT frequent, remove C from the
parenthesis before expanding.
 (ABCD)
A (BCD) B (CD) C (D)D ()
AB (CD)AC (D)AD ()BC (D)BD ()CD ()
ABC (C)
ABCD ()
ABD ()ACD ()BCD ()

Algorithm MaxMiner
Initially, generate one node N= ,
where h(N)= and t(N)={A,B,C,D}.
Consider expanding N,
If h(N)t(N) is frequent, do not expand N.
If for some it(N), h(N){i} is NOT frequent,
remove i from t(N) before expanding N.
Apply global pruning techniques…
 (ABCD)

Global Pruning Technique (across sub-trees)
When a max pattern is identified (e.g. ABCD),
prune all nodes (e.g. B, C and D) where
h(N)t(N) is a sub-set of it (e.g. ABCD).
 (ABCD)
A (BCD) B (CD) C (D)D ()
AB (CD)AC (D)AD ()BC (D)BD ()CD ()
ABC (C)
ABCD ()
ABD ()ACD ()BCD ()

Example
TidItems
10A,B,C,D,E
20B,C,D,E,
30A,C,D,F
 (ABCDEF)
ItemsFrequency
ABCDEF 0
A 2
B 2
C 3
D 3
E 2
F 1
Min_sup=2
Max patterns:
A (BCDE)B (CDE)C (DE) E ()D (E)

Example
TidItems
10A,B,C,D,E
20B,C,D,E,
30A,C,D,F
 (ABCDEF)
ItemsFrequency
ABCDE 1
AB 1
AC 2
AD 2
AE 1
Min_sup=2
A (BCDE)B (CDE)C (DE) E ()D (E)
AC (D)AD ()
Max patterns:
Node A

Example
TidItems
10A,B,C,D,E
20B,C,D,E,
30A,C,D,F
 (ABCDEF)
Items Frequency
BCDE 2
BC
BD
BE
Min_sup=2
A (BCDE)B (CDE)C (DE) E ()D (E)
AC (D)AD ()
Max patterns:
BCDE
Node B

Example
TidItems
10A,B,C,D,E
20B,C,D,E,
30A,C,D,F
 (ABCDEF)
Items Frequency
ACD 2
Min_sup=2
A (BCDE)B (CDE)C (DE) E ()D (E)
AC (D)AD ()
Max patterns:
BCDE
ACD
Node AC