zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
AlexPruden
86 views
34 slides
Jun 03, 2024
Slide 1 of 34
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
About This Presentation
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redaction...
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Size: 917.77 KB
Language: en
Added: Jun 03, 2024
Slides: 34 pages
Slide Content
Reef
Sebastian Angel
+, Eleftherios Ioannidis
+, Eli Margolin
+, Srinath Setty*, Jess Woods
+
+
University of Pennsylvania, *Microsoft Research
Fast Succinct Non-Interactive Zero-Knowledge
Regex Proofs
1
Why do we want private Regex matching?
•Oblivious DNS over HTTPS
•Strong passwords with APAKE
•Spam Keyword Filtering
•Redacted documents and FOIA requests
•Genetic Mutations
2
!
⚖
#
Scenario - Clinical Trials
3
$
Vikki can’t ensure Patty actually qualifies
%&
Patty registers online for Vikki’s clinical trial and is
automatically accepted
!
Patty Vikki
.*ACTG…
Have Patty send her DNA
4
Vikki has Patty’s DNA in plain text, even if Patty
isn’t accepted into the trial
%&
Patty can send Vikki her DNA to verify
!
Patty Vikki
.*ACTG…
We want to prove Patty qualifies for
Vikki’s trial
5
Without Vikki learning Patty’s DNA before
she’s enrolled
Zero Knowledge Proofs
6
Vikki only learns if Patty qualifies for her trial
%&
Patty can register for Vikki’s trial and attach a proof that
she qualifies
!
$
Patty Vikki
.*ACTG…
How to make a ZKP in 4 easy steps…
1.Vikki expresses her regex matching statement as a circuit satisfiability
instance
2.Vikki publishes her circuit
3.Patty finds a satisfying witness
4.Patty proves to Vikki that she knows the satisfying witness
7
1.Vikki expresses her regex matching statement as a circuit satisfiability
instance
2.Vikki publishes her circuit
3.Patty finds a satisfying witness
4.Patty proves to Vikki that she knows the satisfying witness
Naive Solution
a+b.*
8
0 1 2
a b
ab {a,b}
(Q: {0,1,2}, Σ: {a,b}, δ, q 0:{0}, f:{2})
field delta(field q, field c){
if (q == 0 && c == 0)
return 1;
if (q == 0 && c == 1)
return 0;
……
if (q == 2) return 2;
return −1;
}
for c in Doc: do delta
Problem 1: Expressive Regexes
Look-arounds → NFA state explosion
9
^(?=.*a)(?=.*b).{2,6}$
Problem 2: Unfolding delta
We have to unfold delta |Doc| times
10
for c in Doc:
do delta(q,c)
delta(q0,c0);
delta(q1,c1);
delta(q2,c2);
delta(q3,c3);
delta(q4,c4);
delta(q5,c5);
delta(q6,c6);
delta(q7,c7);
delta(q8,c8);
delta(q9,c9);
delta(q10,c10);
Problem 3: delta as if statements
Requires O(|Σ|*|NFA|) constraints
11
field delta(field q, field c){
if (q == 0 && c == 0)
return 1;
if (q == 0 && c == 1)
return 0;
……
if (q == 2) return 2;
return −1;
}
12
O(|Doc|*|Σ|*|NFA|) constraints
Large Documents
Large Alphabets
Large Automata
Naive Solution requires…
Key Insight 1 - Skipping Alternating Finite
Automata
Alternating Finite Automata give us greater expressivity
We can extend AFA to skip irrelevant parts of the document
13
Key Insight 2 - Lookup Arguments
We can represent the cascading if statements as (start state,
character, end state) lookup arguments in the circuit
14
Key Insight 3 - Recursion
We can make iterating through the document much faster using a
recursive proof system
15
Reef is able to decouple Prover running time from
document size
16
Reef - Zero Knowledge Regex Proofs
•Patty commits to her DNA
•Vikki publishes a regex of the
genetic variant required to
participate in her trial
•Patty proves (in zero knowledge)
that her committed DNA
matches the public regex
•Vikki verifies Patty’s proof
17
%&
Patty Vikki
!
.*ACTG
Lookup Arguments
•Prove that some values {v1,…,vm} are in a table ??????
•2 tables
21
ACGGCTACGTCACT
GACTCTCAGACGTC
ACTGACGCTATATAC
GCGCTACGTATCAC
GGCACTTACAGTTA
ACACTGTGGGAC
0 A
1 C
2 G
3 G
… …
0 G 0
0 T 0
0 C 0
0 A 1
… … …
Commitment -
ties characters to
indices
Replaces the
delta function0:∃ 1:∃ 2:∃
A C
A{C,T,G} {A,T,G}
Background - Recursive Proof Systems
•Produce a proof π for each foo()
•Circuit size is parameterized by just one iteration
for (i = 0; i < j; i++) {
foo();
}
π0
foo()
V()
π1
foo()
V(π0)
πj-1
foo()
V(πj-2)
. . .
nlookup
•Table represented as a multilinear polynomial ?????? such that ??????(index) = value at
index
•Applies sum-check to reduce m “lookups” (??????(q) == v) → one ??????(r) == v
•O(mlog|??????|) to reduce, O(|??????|) for final check
0 A
1 C
2 G
3 G
… …
??????(0) = A
??????(1) = C
??????(2) = G
??????(3) = G
….
•With recursion maintain a running claim to check at the end
•Round 1: ??????(q1) = v1, …, ??????(qm) = vm →
•Round 2: ??????(q1) = v1, …, ??????(qm) = vm →
•Round 3: ??????(q1) = v1, …, ??????(qm) = vm → ??????(r3) = v’3
•Reduce all lookups to one evaluation of ?????? polynomial
•As the number of lookups increases, cost-per-lookup decreases
??????(r1) = v’1
??????(r2) = v’2
??????(r1) = v’1
??????(r2) = v’2
Everything’s Better with Recursion
24
m·log|??????| per round
|??????| per round
➕
m·log|??????| per round
|??????| once
➕
m * #foldings lookups
??????(100) == A
??????(111) == G
??????(101) == C
…
Reduced to one check:
??????(r0, r1, r2) == v
m * #foldings lookups
??????1(00) == A
??????1(11) == G
??????1(01) == C
….
Reduced to one check:
??????1(r0, r1) == v → ??????(1, r0, r1) == v
0b000 G
0b001 A
0b010 T
0b011 T
0b100 A
0b101 C
0b110 A
0b111 G
0b100A
0b101C
0b110A
0b111G
Prover: O(|??????| · #iterations)
Verifier: O(mlog(|??????|)) + O(|??????|)
nlookup Optimizations
25
O(|??????1| · #iterations)
O(mlog(|??????1|)) + O(|??????|)
0b10 0A
0b10 1C
0b00 0G
0b00 1A
??????00(0) == G
??????00(1) == A
…
??????10(1) == C
…
??????hybrid(00) == G
??????hybrid(01) == A
…
??????hybrid(11) == C
…
Reduced to one check:
??????hybrid(r0, r1) == v →
(1 - r0) · ??????(0 0 r1) + r0 · ??????(1 0 r1) == v
0b000 G
0b001 A
0b010 T
0b011 T
0b100 A
0b101 C
0b110 A
0b111 G
nlookup Optimizations
26
0b00 0G
0b00 1A
0b10 0A
0b10 1C
0
1
nlookup Optimizations
•Document Projections
•Run lookup over subsets of the larger document table
•Works for regexs with prescribed offsets
•.{10}abc.*
•.*abc
•Hybrid tables
•Public SAFA Table + Private Document Table → mlog(|Document|·|SAFA|)
constraints
•Single Hybrid Public/Private Table → mlog(|Document|+|SAFA|) constraints
27
Evaluation
•Can Reef support a variety of regexs?
•Can Reef support a variety of document sizes?
•Is Reef efficient for the prover?
•Is Reef efficient for the verifier?
•How does Reef compare to existing/alternative solution?
•Do SAFA meaningfully reduce the size of the automata?
•What impact do Reef’s additional optimizations have?
29
•Can Reef support a variety of regexs?
•Can Reef support a variety of document sizes?
•Is Reef efficient for the prover?
•Is Reef efficient for the verifier?
•How does Reef compare to existing/alternative solution?
•Do SAFA meaningfully reduce the size of the automata?
•What impact do Reef’s additional optimizations have?
Reef supports more robust Regexs
More expressivity with fewer constraints
30
Fixed String
Matching
Wildcards
Kleene
stars
NegationAlternationLookarounds # Constraints
Reef O(⍺log(D+QSAFA+|Σ|))
Zombie O(D·QTNFA)
ZKRegex O(D·QTNFA·log(|Σ|))
ZK Email O(D·QTNFA·|Σ|)
zkreg O(D+QAC-DFA)
D - Document Size, Q - # Transitions in Automata, Σ - Alphabet, ⍺ - Number of
lookups
Reef Supports a Variety of Applications
31
Document Size
Average SAFA Size -
States
Average SAFA Size -
Transitions
Document Redaction
(Small)
415 331 42,318
Document Redaction
(Large)
1,000 908 116,751
ODoH 128 16 1,958
Strong Password
Match/Non-Match
12/6-9 21 1,188
DNA Match/Non-Match 32 million - 43 million 546 29,832
And It’s More Efficient - Fewer Constraints
32
DFA DFA+Recursion SAFA+lookup Reef
Document
Redaction
(Small)
23,041,771 67,472 54,679 52,631
Document
Redaction (Large)
X 141,712 57,628 54,636
ODoH 1,552,754 24,131 22,573 18,437
Password Match/
Non-Match
X X 21,002/21,721 19,982/20,725
DNA Match/Non-
Match
X X 96,296/107,184 85,352/95,916
DNA Non-Match/MatchDocument Redactions (Small and Large)
And It’s More Efficient - Faster Total Prover Time
33
Even without Reef’s additional optimizations, SAFA+lookup is orders of
magnitude faster
Summary
We want…
•Succinct Zero Knowledge Proofs
•For a variety of regexs
•That scale well for large documents
We can…
•Use SAFA to get better expressivity
•And skip irrelevant parts of the document
•Use lookup tables for document commitment and SAFA transitions
Reef!
•Support for a variety of applications
•Fast Prover and Verifier times
•Fewer constraints
34
Thank you!
Contact:
elefthei,ecmargo,woodsjk @
seas.upenn.edu