Explorations of a self-replication algorithm to flexibly match patterns

FLAGreserachlaborato 12 views 43 slides Sep 04, 2024
Slide 1
Slide 1 of 57
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57

About This Presentation

Presentation of our IEEEAccess paper, also presented at the 18CCC

Abstract: Pattern matching algorithms have been studied on numerous occasions, mainly focusing on performance because of the large amount of data used in a matching process. However, a strong focus on performance can entail particula...


Slide Content

EXPLORING A SELF-REPLICATION
ALGORITHM TO FLEXIBLY MATCH PATTERNSPAUL LEGER, HIROAKI FUKUDA, NICOLAS CARDOZO, DANIEL SAN MARTIN
1 2 3 1 UNIVERSIDAD CATÓLICA DEL NORTE, COQUIMBO - CHILE
SHIBURA INSTITUTE OF TECHNOLOGY, TOKYO, JAPÓN
UNIVERSIDA DE LOS ANDES, BOGOTÁ - COLOMBIA
1
2
3 3
[email protected]/ @ncardoz

2
The pattern matching problem
Reference string:ABC ABCDAB ABCDABCDABDE
Pattern: ABCDABD
Find the occurrences of the pattern in the reference string

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABCDABD
ABC ABCDAB ABCDABCDABDE

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABCDABD
ABC ABCDAB ABCDABCDABDE

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABCDABD
ABC ABCDAB ABCDABCDABDE

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABCDABD
ABC ABCDAB ABCDABCDABDE

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABCDABD
ABC ABCDAB ABCDABCDABDE

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABC ABCDAB ABCDABCDABDE
ABCDABD

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABC ABCDAB ABCDABCDABDE
ABCDABD
Optimal time complexity O(n)

2
The pattern matching problem
Reference string:
Pattern:
Find the occurrences of the pattern in the reference string
ABC ABCDAB ABCDABCDABDE
ABCDABD
Optimal time complexity O(n)
Must have complete information about the pattern and the reference string

3
Flexibility in matching algorithms
request

3
Flexibility in matching algorithms
request
malicious request

3
Flexibility in matching algorithms
request
malicious request

4
Flexibility in matching algorithms
request
malicious request
Request filter

4
Flexibility in matching algorithms
request
malicious request
Request filter

4
Flexibility in matching algorithms
request
malicious request
Request filter

4
Flexibility in matching algorithms
Request filter

4
Flexibility in matching algorithms
Request filter

5
1.The stream sequence and the
pattern may be unknown
beforehand
2.New patterns are required to
satisfy new requirements

6
Matcher Cells
biologically inspired algorithm using self-replication to flexibly match patterns
Replicase pattern
original component
Replicated component
(with possible mutations)

7
The Matcher Cells patterns
a
a
a
$
Pattern sequence definition
Received token
Parent-child link
$Matched cell
Solution (environment)

7
The Matcher Cells patterns
a
a
a
$
a
b
ab→
ab→
Pattern sequence definition
Received token
Parent-child link
$Matched cell
Solution (environment)

8
Flexible semantics
Match rules can be applied to each cell using the current token
a
a
a
$

8
Flexible semantics
Match rules can be applied to each cell using the current token
a
a
a
$
per-cell rules

8
Flexible semantics
Match rules can be applied to each cell using the current token
After all per-cell rules are applied, it is possible to keep/kill cells in a solution to
enable further matching in the same solution
a
a
a
$
per-cell rules

8
Flexible semantics
Match rules can be applied to each cell using the current token
After all per-cell rules are applied, it is possible to keep/kill cells in a solution to
enable further matching in the same solution
a
a
a
$
per-cell rules
per-solution rules

8
Flexible semantics
Match rules can be applied to each cell using the current token
After all per-cell rules are applied, it is possible to keep/kill cells in a solution to
enable further matching in the same solution
a
a
$
per-cell rules
per-solution rules

8
Flexible semantics
Match rules can be applied to each cell using the current token
After all per-cell rules are applied, it is possible to keep/kill cells in a solution to
enable further matching in the same solution
a
a
$
per-cell rules
per-solution rules
Rules can be composed by means of a composable rule that takes a rule as input
and returns another rule. This behavior is used to generate more complex
matching behavior that adapts to the needs of programers

9
Cell and rule creation
(define (Cell pattern meta-inf)
(lambda (token null)
(if(null? token) meta-inf
(let ((result ( react token pattern)))
(if (pattern-matched? result)
(return-list-with-new-cells result)
(return-empty-list))))))

9
Cell and rule creation
(define (Cell pattern meta-inf)
(lambda (token null)
(if(null? token) meta-inf
(let ((result ( react token pattern)))
(if (pattern-matched? result)
(return-list-with-new-cells result)
(return-empty-list))))))
;;per-cell rule
(define (identity token cells)
cells)
;;per-solution rule
(define (remove-matched-cells cells pattern)
(remove-matched-cells cells))

10
Matching semantics
Single matches come from the combination of a match token and a kill creator rule

10
Matching semantics
Single matches come from the combination of a match token and a kill creator rule
a b
ab→
a b b
$

10
Matching semantics
Single matches come from the combination of a match token and a kill creator rule
a b
ab→
a b b
$
(define single−match
(kill−creator
(match−token identity )))
(single−match (Cell a->b) a)

11
Matching semantics
multiple matches allow us to generate multiple match tokens for a sequence

11
Matching semantics
multiple matches allow us to generate multiple match tokens for a sequence
a
b
ab→ ab→
a
b
ab→
b
b
b
ab→
b
$$

11
Matching semantics
multiple matches allow us to generate multiple match tokens for a sequence
a
b
ab→ ab→
a
b
ab→
b
b
b
ab→
b
$$
(define multiple−match
(match−token identity))
(multiple−match (Cell a->b) a)

12
Matching semantics
One match at a time allow us to match a pattern and to create the seed again to match a new pattern

12
Matching semantics
One match at a time allow us to match a pattern and to create the seed again to match a new pattern
a b
ab→
a b b
$
ab→

12
Matching semantics
One match at a time allow us to match a pattern and to create the seed again to match a new pattern
a b
ab→
a b b
$
ab→
(define one-match-at-a-time
(add-seed
(kill−creator
(match−token identity ))))
(one-match-at-a-time (Cell a->b) a)

13
Matching semantics
Always start a match has the seed of a match to create a new seed

13
Matching semantics
Always start a match has the seed of a match to create a new seed
a
b
ab→
a
b
b
$
ab→
ab→
b
ab→
$

13
Matching semantics
Always start a match has the seed of a match to create a new seed
a
b
ab→
a
b
b
$
ab→
ab→
b
ab→
$
(define always-start-match
(always-add
(kill−creator
(apply-reaction identity ))))
(always-start-match (Cell a->b) a)

14
Matching semantics
Match per time frame enables matches exclusively within a time frame, killing all cells once the
given time elapses
a b
ab→

14
Matching semantics
Match per time frame enables matches exclusively within a time frame, killing all cells once the
given time elapses
a b
ab→
time of first
token match

14
Matching semantics
Match per time frame enables matches exclusively within a time frame, killing all cells once the
given time elapses
a b
ab→
bb
time of first
token match

14
Matching semantics
Match per time frame enables matches exclusively within a time frame, killing all cells once the
given time elapses
a b
ab→
b
(define match-per-time-frame
(trace-life-time
(kill−creator
(match-token identity ))))
(match-per-time-frame (Cell a->b) a)
b
time of first
token match

15
Matcher Cells at work
1.
Patterns can use
Regular Expressions
2.
Customized
Semantics
100,000 tweets
1 tweet every 60s
choice of common
patterns using regex
matching type
semantics, multiple
matching

15
Matcher Cells at work
1.
Patterns can use
Regular Expressions
2.
Customized
Semantics
100,000 tweets
1 tweet every 60s
choice of common
patterns using regex
a
+b→ a
+b→
a
*b→
a
matching type
semantics, multiple
matching

16
Matcher Cells at work

16
Matcher Cells at work
Identify sequences of events in the application to adapt its difficulty
- Good performance: answering N exercises correctly in a row
- Bad performance: answering M exercises wrong in a row
- No performance: skipping K exercises in a row

17
Matcher Cells proved to be useful and applicable in
data processing scenarios where different
semantics are required

18
Application areas
Stream processing
Reactive programming
Execution trace analysis

19
Conclusion
1.Matcher Cell offers a flexible pattern
matching algorithm through the definition
of language abstractions and functional
composition

19
Conclusion
1.Matcher Cell offers a flexible pattern
matching algorithm through the definition
of language abstractions and functional
composition
2.Practical solution with implementations
across programming paradigms (!paper)

19
Conclusion
1.Matcher Cell offers a flexible pattern
matching algorithm through the definition
of language abstractions and functional
composition
2.Practical solution with implementations
across programming paradigms (!paper)
3.Usable system with comparable
performance up 500000 tokens (!paper)

@FLAGlab research