zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs

AlexPruden 86 views 34 slides Jun 03, 2024
Slide 1
Slide 1 of 34
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34

About This Presentation

This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redaction...


Slide Content

Reef
Sebastian Angel
+, Eleftherios Ioannidis
+, Eli Margolin
+, Srinath Setty*, Jess Woods
+
+
University of Pennsylvania, *Microsoft Research
Fast Succinct Non-Interactive Zero-Knowledge
Regex Proofs
1

Why do we want private Regex matching?
•Oblivious DNS over HTTPS
•Strong passwords with APAKE
•Spam Keyword Filtering
•Redacted documents and FOIA requests
•Genetic Mutations
2
!

#

Scenario - Clinical Trials
3
$
Vikki can’t ensure Patty actually qualifies
%&
Patty registers online for Vikki’s clinical trial and is
automatically accepted
!
Patty Vikki
.*ACTG…

Have Patty send her DNA
4
Vikki has Patty’s DNA in plain text, even if Patty
isn’t accepted into the trial
%&
Patty can send Vikki her DNA to verify
!
Patty Vikki
.*ACTG…

We want to prove Patty qualifies for
Vikki’s trial
5
Without Vikki learning Patty’s DNA before
she’s enrolled

Zero Knowledge Proofs
6
Vikki only learns if Patty qualifies for her trial
%&
Patty can register for Vikki’s trial and attach a proof that
she qualifies
!
$
Patty Vikki
.*ACTG…

How to make a ZKP in 4 easy steps…
1.Vikki expresses her regex matching statement as a circuit satisfiability
instance
2.Vikki publishes her circuit
3.Patty finds a satisfying witness
4.Patty proves to Vikki that she knows the satisfying witness
7
1.Vikki expresses her regex matching statement as a circuit satisfiability
instance
2.Vikki publishes her circuit
3.Patty finds a satisfying witness
4.Patty proves to Vikki that she knows the satisfying witness

Naive Solution
a+b.*
8
0 1 2
a b
ab {a,b}
(Q: {0,1,2}, Σ: {a,b}, δ, q 0:{0}, f:{2})
field delta(field q, field c){
if (q == 0 && c == 0)
return 1;
if (q == 0 && c == 1)
return 0;
……
if (q == 2) return 2;
return −1;
}
for c in Doc: do delta

Problem 1: Expressive Regexes
Look-arounds → NFA state explosion
9
^(?=.*a)(?=.*b).{2,6}$

Problem 2: Unfolding delta
We have to unfold delta |Doc| times
10
for c in Doc:
do delta(q,c)
delta(q0,c0);
delta(q1,c1);
delta(q2,c2);
delta(q3,c3);
delta(q4,c4);
delta(q5,c5);
delta(q6,c6);
delta(q7,c7);
delta(q8,c8);
delta(q9,c9);
delta(q10,c10);

Problem 3: delta as if statements
Requires O(|Σ|*|NFA|) constraints
11
field delta(field q, field c){
if (q == 0 && c == 0)
return 1;
if (q == 0 && c == 1)
return 0;
……
if (q == 2) return 2;
return −1;
}

12
O(|Doc|*|Σ|*|NFA|) constraints
Large Documents
Large Alphabets
Large Automata
Naive Solution requires…

Key Insight 1 - Skipping Alternating Finite
Automata
Alternating Finite Automata give us greater expressivity
We can extend AFA to skip irrelevant parts of the document
13

Key Insight 2 - Lookup Arguments
We can represent the cascading if statements as (start state,
character, end state) lookup arguments in the circuit
14

Key Insight 3 - Recursion
We can make iterating through the document much faster using a
recursive proof system
15

Reef is able to decouple Prover running time from
document size
16

Reef - Zero Knowledge Regex Proofs
•Patty commits to her DNA
•Vikki publishes a regex of the
genetic variant required to
participate in her trial
•Patty proves (in zero knowledge)
that her committed DNA
matches the public regex
•Vikki verifies Patty’s proof
17
%&
Patty Vikki
!
.*ACTG

Roadmap
Reef
Skipping Alternating Finite Automata
Lookup Arguments
Adding Recursion
Optimizations
Evaluation
Summary
18

Skipping Alternating Finite Automata
•Compress multiple wildcards (.{n},.{m,n},.*) into a single transition
19
^(?=.*a)(?=.*b).{2,6}$
•Introduce quantifiers (∃, ∀)

c=2c=3
c=1
Skipping Alternating Finite Automata
20
^(?=.*a)(?=.*b).{2,6}$
c=6c=2
abbbaa
123456
c=1
c=6
c=6

Lookup Arguments
•Prove that some values {v1,…,vm} are in a table ??????
•2 tables
21
ACGGCTACGTCACT
GACTCTCAGACGTC
ACTGACGCTATATAC
GCGCTACGTATCAC
GGCACTTACAGTTA
ACACTGTGGGAC
0 A
1 C
2 G
3 G
… …
0 G 0
0 T 0
0 C 0
0 A 1
… … …
Commitment -
ties characters to
indices
Replaces the
delta function0:∃ 1:∃ 2:∃
A C
A{C,T,G} {A,T,G}

Background - Recursive Proof Systems
•Produce a proof π for each foo()
•Circuit size is parameterized by just one iteration
for (i = 0; i < j; i++) {
foo();
}
π0
foo()
V()
π1
foo()
V(π0)
πj-1
foo()
V(πj-2)
. . .

nlookup
•Table represented as a multilinear polynomial ?????? such that ??????(index) = value at
index
•Applies sum-check to reduce m “lookups” (??????(q) == v) → one ??????(r) == v
•O(mlog|??????|) to reduce, O(|??????|) for final check
0 A
1 C
2 G
3 G
… …
??????(0) = A
??????(1) = C
??????(2) = G
??????(3) = G
….

•With recursion maintain a running claim to check at the end
•Round 1: ??????(q1) = v1, …, ??????(qm) = vm →
•Round 2: ??????(q1) = v1, …, ??????(qm) = vm →
•Round 3: ??????(q1) = v1, …, ??????(qm) = vm → ??????(r3) = v’3
•Reduce all lookups to one evaluation of ?????? polynomial
•As the number of lookups increases, cost-per-lookup decreases
??????(r1) = v’1
??????(r2) = v’2
??????(r1) = v’1
??????(r2) = v’2
Everything’s Better with Recursion
24
m·log|??????| per round
|??????| per round

m·log|??????| per round
|??????| once

m * #foldings lookups
??????(100) == A
??????(111) == G
??????(101) == C

Reduced to one check:
??????(r0, r1, r2) == v
m * #foldings lookups
??????1(00) == A
??????1(11) == G
??????1(01) == C
….
Reduced to one check:
??????1(r0, r1) == v → ??????(1, r0, r1) == v
0b000 G
0b001 A
0b010 T
0b011 T
0b100 A
0b101 C
0b110 A
0b111 G
0b100A
0b101C
0b110A
0b111G
Prover: O(|??????| · #iterations)
Verifier: O(mlog(|??????|)) + O(|??????|)
nlookup Optimizations
25
O(|??????1| · #iterations)
O(mlog(|??????1|)) + O(|??????|)

0b10 0A
0b10 1C
0b00 0G
0b00 1A
??????00(0) == G
??????00(1) == A

??????10(1) == C

??????hybrid(00) == G
??????hybrid(01) == A

??????hybrid(11) == C

Reduced to one check:
??????hybrid(r0, r1) == v →
(1 - r0) · ??????(0 0 r1) + r0 · ??????(1 0 r1) == v
0b000 G
0b001 A
0b010 T
0b011 T
0b100 A
0b101 C
0b110 A
0b111 G
nlookup Optimizations
26
0b00 0G
0b00 1A
0b10 0A
0b10 1C
0
1

nlookup Optimizations
•Document Projections
•Run lookup over subsets of the larger document table
•Works for regexs with prescribed offsets
•.{10}abc.*
•.*abc
•Hybrid tables
•Public SAFA Table + Private Document Table → mlog(|Document|·|SAFA|)
constraints
•Single Hybrid Public/Private Table → mlog(|Document|+|SAFA|) constraints
27

Roadmap
Reef
Skipping Alternating Finite Automata
Lookup Arguments
Adding Recursion
Optimizations
Evaluation
Summary
28

Evaluation
•Can Reef support a variety of regexs?
•Can Reef support a variety of document sizes?
•Is Reef efficient for the prover?
•Is Reef efficient for the verifier?
•How does Reef compare to existing/alternative solution?
•Do SAFA meaningfully reduce the size of the automata?
•What impact do Reef’s additional optimizations have?
29
•Can Reef support a variety of regexs?
•Can Reef support a variety of document sizes?
•Is Reef efficient for the prover?
•Is Reef efficient for the verifier?
•How does Reef compare to existing/alternative solution?
•Do SAFA meaningfully reduce the size of the automata?
•What impact do Reef’s additional optimizations have?

Reef supports more robust Regexs
More expressivity with fewer constraints
30
Fixed String
Matching
Wildcards
Kleene
stars
NegationAlternationLookarounds # Constraints
Reef O(⍺log(D+QSAFA+|Σ|))
Zombie O(D·QTNFA)
ZKRegex O(D·QTNFA·log(|Σ|))
ZK Email O(D·QTNFA·|Σ|)
zkreg O(D+QAC-DFA)
D - Document Size, Q - # Transitions in Automata, Σ - Alphabet, ⍺ - Number of
lookups

Reef Supports a Variety of Applications
31
Document Size
Average SAFA Size -
States
Average SAFA Size -
Transitions
Document Redaction
(Small)
415 331 42,318
Document Redaction
(Large)
1,000 908 116,751
ODoH 128 16 1,958
Strong Password
Match/Non-Match
12/6-9 21 1,188
DNA Match/Non-Match 32 million - 43 million 546 29,832

And It’s More Efficient - Fewer Constraints
32
DFA DFA+Recursion SAFA+lookup Reef
Document
Redaction
(Small)
23,041,771 67,472 54,679 52,631
Document
Redaction (Large)
X 141,712 57,628 54,636
ODoH 1,552,754 24,131 22,573 18,437
Password Match/
Non-Match
X X 21,002/21,721 19,982/20,725
DNA Match/Non-
Match
X X 96,296/107,184 85,352/95,916

DNA Non-Match/MatchDocument Redactions (Small and Large)
And It’s More Efficient - Faster Total Prover Time
33
Even without Reef’s additional optimizations, SAFA+lookup is orders of
magnitude faster

Summary
We want…
•Succinct Zero Knowledge Proofs
•For a variety of regexs
•That scale well for large documents
We can…
•Use SAFA to get better expressivity
•And skip irrelevant parts of the document
•Use lookup tables for document commitment and SAFA transitions
Reef!
•Support for a variety of applications
•Fast Prover and Verifier times
•Fewer constraints
34
Thank you!
Contact:
elefthei,ecmargo,woodsjk @
seas.upenn.edu