Push the Database Beyond the Edge by Nikita Sivukhin
ScyllaDB
About This Presentation
Almost any application can benefit from having data available locally - enabling blazing-fast access and optimized write patterns. This talk will walk you through one approach to designing a full-featured sync engine, applicable across a wide range of domains, including front-end, back-end, and machine learning training.
Size: 4.5 MB
Language: en
Added: Oct 14, 2025
Slides: 55 pages
Slide Content
A ScyllaDB Community
Push the Database
Beyond the Edge
Nikita Sivukhin
Software Engineer
Nikita Sivukhin (he/him)
Software Engineer at Turso
■Currently works at Turso (previously at ShareChat/Moj)
■Learning to play the ukulele
(so far, it feels harder than writing code)
■Likes algorithms and data structures
●Especially if the DS is compact or succinct!
Talk structure
We will discuss how to bring a database locally to improve read/write patterns
1.Use cases from different domains where this can be useful
2.Design considerations for the solution we will build
3.Storage model overview
4.Database replication with read-only queries
5.Database replication with read/write queries
Use cases
Let’s look at a few examples where local state can greatly improve the app
Example #1: backend router
■Router app: proxy requests by hostname to specific IP
●Hostname: sivukhin.p99conf.io → 10.4.113.154
●100 million mappings in total, 1000+ reads per sec, 10 writes per sec
●Eventual consistency: route update can be delayed by a few seconds
■Solution:
●Pull changes from the remote periodically and maintain local state
■Complexity:
●Updates must be streamed consistently from the main DB
●Full snapshot must be sent from the main DB in an efficient manner
●Local state must be maintained durably to recover from crashes
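A minimal sketch of the pull-based approach above (all names here are hypothetical; `fetch_changes` stands in for the remote change feed):

```python
class LocalRoutes:
    """Local replica of hostname -> IP mappings, refreshed by periodically
    pulling changes made since the last version we have seen."""

    def __init__(self, fetch_changes):
        # fetch_changes(since_version) -> (new_version, {hostname: ip})
        self.fetch_changes = fetch_changes
        self.version = 0
        self.routes = {}

    def refresh(self):
        # Pull only the delta; updates are streamed consistently from the
        # main DB, so applying them in order keeps us eventually consistent.
        new_version, updates = self.fetch_changes(self.version)
        self.routes.update(updates)
        self.version = new_version

    def lookup(self, hostname):
        # Served entirely from local state: no network hop on the hot path.
        return self.routes.get(hostname)


# Fake remote change feed: version -> (next_version, updates since version).
feed = {
    0: (1, {"sivukhin.p99conf.io": "10.4.113.154"}),
    1: (2, {"other.p99conf.io": "10.4.113.155"}),
}
router = LocalRoutes(lambda since: feed[since])
router.refresh()
print(router.lookup("sivukhin.p99conf.io"))  # → 10.4.113.154
```

A production router would also persist this state durably (the talk uses SQLite for exactly that), so a crash does not force re-pulling the full 100-million-entry snapshot.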
Example #2: ML training pipeline
■Train embeddings in real-time with some algorithm
●Train on events (user, post, +/-), 10-100k events at peak
●Distributed training pipeline to handle the load
●Consistency: workers can accumulate gradients for some time and
flush them periodically
■Solution:
●Maintain embeddings in local state and accumulate gradients locally
●Flush gradients and update local embeddings periodically
■Complexity:
●Need to track accumulated gradients to send updates later
●Need to select embeddings to store locally – as the dataset is huge
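The accumulate-then-flush pattern above can be sketched as follows (scalar "embeddings" for brevity; all names are illustrative, not a real training framework):

```python
from collections import defaultdict

class LocalEmbeddings:
    """Worker that caches embeddings locally and accumulates gradients
    between periodic flushes to the authoritative remote store."""

    def __init__(self, remote, lr=0.1):
        self.remote = remote              # authoritative store: id -> value
        self.local = {}                   # locally replicated embeddings
        self.grads = defaultdict(float)   # accumulated, not-yet-flushed gradients
        self.lr = lr

    def train_step(self, item_id, grad):
        # Pull the embedding into local state on first touch.
        if item_id not in self.local:
            self.local[item_id] = self.remote[item_id]
        self.grads[item_id] += grad       # track what we owe the remote
        self.local[item_id] -= self.lr * grad

    def flush(self):
        # Relaxed consistency: the remote lags behind local state
        # until the accumulated gradients are pushed.
        for item_id, grad in self.grads.items():
            self.remote[item_id] -= self.lr * grad
        self.grads.clear()


remote = {"a": 1.0}
worker = LocalEmbeddings(remote)
worker.train_step("a", 0.5)
worker.train_step("a", 0.5)
worker.flush()
print(remote["a"])  # → 0.9, remote caught up with local state
```

Tracking `grads` separately from `local` is exactly the complexity the slide mentions: the worker must remember what to send later, not just the current local value.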
Insight!
■Examples from completely different domains have benefited from local state
●One of the key enabling components is relaxed consistency
■Examples are pretty simple – in reality, complexity usually grows fast
●Complex queries, support for local DDL statements, indexes, transactions, etc.
■We must push the database beyond the “edge”!
Design considerations
Communication
■Remote DB is the source of truth
●It’s enough for all examples
●Simpler to reason about the data and perform complex actions (e.g. CREATE INDEX)
■“Star” topology for synchronization
●Tree topology is possible: just add intermediate nodes running the same DB
Technology choice: SQLite!
■Embedded database
●Must be pluggable into any language and any runtime
■Few “moving” components
●Otherwise replication complexity will increase
■SQLite is an almost* perfect technology for us!
* we will discuss later why it’s not the perfect technology
SQLite storage model
■The DB is “virtually” represented by a single file
●The file consists of same-size pages linked as specified in the SQLite database file format
■“Physically”, the DB consists of 2 files: the main DB file and the WAL
●The WAL is a sequence of frames, each holding the updated page content, its page ID, and a commit marker
●SQLite forbids concurrent writers
■The WAL makes “time travel” easy
■The WAL is checkpointed into the DB file
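The two-file layout is easy to observe from any SQLite client. The sketch below (Python's built-in `sqlite3`) switches a fresh database into WAL mode and checks that the `-wal` side file grows in whole frames: a 32-byte WAL header, then per frame a 24-byte frame header plus one 4096-byte page.

```python
import os
import sqlite3
import tempfile

PAGE_SIZE, FRAME_HEADER, WAL_HEADER = 4096, 24, 32

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "conferences.db")
    con = sqlite3.connect(path)
    con.execute("PRAGMA page_size = 4096")
    con.execute("PRAGMA journal_mode = WAL")   # switch to the WAL layout
    con.execute("CREATE TABLE conferences(id INTEGER PRIMARY KEY, name TEXT)")
    con.execute("INSERT INTO conferences VALUES (1, 'p99')")
    con.commit()
    con.execute("UPDATE conferences SET name = 'p99.conf' WHERE id = 1")
    con.commit()

    # Every committed write appended frames to the side file, not the main DB.
    wal_size = os.path.getsize(path + "-wal")
    assert (wal_size - WAL_HEADER) % (FRAME_HEADER + PAGE_SIZE) == 0
    frames = (wal_size - WAL_HEADER) // (FRAME_HEADER + PAGE_SIZE)
    print(f"{frames} frames in {wal_size} bytes of WAL")
    con.close()  # closing checkpoints the WAL back into the main DB file
```

Because every frame is self-describing (page ID plus commit marker), replaying a prefix of the WAL reconstructs any earlier committed state – the "time travel" the slide refers to.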
Read-only replication
■Replicate data from remote to local client
■Read queries can be executed completely locally
Read-only replication
■Replicate the DB file and stream the WAL frames to the client
■Client DB can checkpoint the WAL at its own pace
Read-only replication
■Pros:
●✅ Heavy operations are “executed” only once at the remote DB (e.g. CREATE INDEX)
●✅ Impossible to diverge from the remote (no worries about differences in SQLite versions)
■Cons:
●❌ WAL frames from different transactions contain the same pages – unnecessary data transfer
●❌ Any update will transfer at least one frame – which is 4KB by default
sqlite> UPDATE conferences SET name = 'p99.conf' WHERE id = 1;
sqlite> UPDATE conferences SET dates = '22 October 2025 - 23 October 2025' WHERE id = 1;
$> ls -l conferences.db-wal
8272 conferences.db-wal
■*SQLite doesn’t have a WAL API for replication and needs to be extended
●wal_begin_session / wal_read_page_at / wal_insert_page_at / wal_end_session
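The 8272 bytes above are exactly two frames' worth of WAL: the 32-byte WAL file header plus, per frame, a 24-byte frame header and one full 4096-byte page. In other words, each tiny UPDATE shipped an entire page:

```python
WAL_HEADER = 32     # bytes, once per WAL file
FRAME_HEADER = 24   # bytes, per frame
PAGE_SIZE = 4096    # default SQLite page size

frames = 2          # one single-page commit per UPDATE statement
wal_size = WAL_HEADER + frames * (FRAME_HEADER + PAGE_SIZE)
print(wal_size)  # → 8272, matching the ls output on the slide
```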
Read-only replication: deduplicate pages
■Fetch unique page updates and maintain “virtual” DB equivalence
●WAL history will be different between client and server
●But every commit boundary from the client will match some commit boundary on the server
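A sketch of the deduplication step, assuming frames are delivered in WAL (commit) order: only the newest version of each page survives into the update set sent to the client.

```python
def dedupe_frames(frames):
    """Collapse a range of committed WAL frames into a minimal update set:
    for every page, keep only its latest content."""
    latest = {}
    for page_no, content in frames:  # frames arrive in WAL order
        latest[page_no] = content    # a later frame supersedes an earlier one
    return latest

# Two transactions both touched page 2; only its final version is transferred.
update_set = dedupe_frames([(1, b"v1"), (2, b"v1"), (2, b"v2")])
print(update_set)  # → {1: b'v1', 2: b'v2'}
```

Because the deduplicated range ends on a commit boundary, applying the whole set atomically moves the client from one consistent virtual DB state to another, even though its WAL history now differs from the server's.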
Read-only replication: compress pages
■Pages in a single update set can be compressed
■Even more: client and remote can establish a shared compression “dictionary”
●The server has the full DB history and can choose a page version matching the client side
●No need to transfer the “dictionary” – client and server only need to agree on page IDs
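The shared-dictionary idea can be illustrated with zlib's preset dictionaries (this only demonstrates the technique; it is not Turso's wire format): the server compresses a page against an older version of the same page that it knows the client already has, so only the dictionary's page ID needs to be agreed on, never the dictionary bytes.

```python
import random
import zlib

def compress_page(page: bytes, shared_dict: bytes) -> bytes:
    # The preset dictionary is never transmitted - both sides derive it
    # from page versions they already store.
    c = zlib.compressobj(zdict=shared_dict)
    return c.compress(page) + c.flush()

def decompress_page(blob: bytes, shared_dict: bytes) -> bytes:
    d = zlib.decompressobj(zdict=shared_dict)
    return d.decompress(blob) + d.flush()

# An "old" 4KB page the client already stores, and a new version that
# differs only slightly - the common case for B-Tree page updates.
old_page = random.Random(42).randbytes(4096)
new_page = b"updated row!" + old_page[12:]

with_dict = compress_page(new_page, shared_dict=old_page)
without_dict = zlib.compress(new_page)
assert decompress_page(with_dict, old_page) == new_page
print(len(with_dict), "vs", len(without_dict))  # dictionary wins by a wide margin
```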
Read-only replication: intelligent server
■For optimizations to work – server must be “intelligent” (just code – no AI)
●This is not just some trivial file server which streams DB and WAL files
■Keep “immutable” history of the database and efficiently store it
■Efficiently compute page updates delta for client between virtual DB states
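The server-side bookkeeping can be modeled as an immutable per-commit page log (a toy model, not Turso's actual storage format); computing a client's delta is then a fold over the commits after its watermark.

```python
class PageHistory:
    """Toy immutable history: one {page_no: content} dict per commit."""

    def __init__(self):
        self.commits = []

    def append_commit(self, pages):
        self.commits.append(dict(pages))

    def head(self):
        return len(self.commits)

    def delta(self, client_commit):
        """Unique latest page versions a client synced at `client_commit`
        needs in order to reach the current head."""
        changed = {}
        for commit in self.commits[client_commit:]:
            changed.update(commit)  # later commits supersede earlier ones
        return changed


history = PageHistory()
history.append_commit({1: b"root v1", 2: b"leaf v1"})
client_at = history.head()                 # client fully synced here
history.append_commit({2: b"leaf v2"})
history.append_commit({2: b"leaf v3", 3: b"leaf v1"})
print(history.delta(client_at))  # → {2: b'leaf v3', 3: b'leaf v1'}
```

Keeping the history immutable is what also enables the compression trick from the previous slide: the server can always reproduce the exact page version the client holds.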
Read-only replication: summary
■✅ Heavy operations are “executed” only once at the remote DB
■✅ Impossible to diverge from the remote
■✅ Straightforward consistency guaranteed by the simplicity of the WAL
■Data transfer is close enough to “logical” replication
●It depends: some workloads will benefit more from page-based replication with compression
●If mutations cause lots of updates (indices, triggers, etc.) – “logical” replication will be better
Writes support
Local write support
■Client must be able to write data locally
■Local updates can be synced on demand with remote
●E.g. client is back online or large enough batch is accumulated to amortize write cost
Local write support
■For the sync of local writes we need two components:
●Push local changes to the remote
●Pull remote changes to the local
Local write support: push
■Using physical replication for the push phase is problematic due to conflicts
■Capture deterministic logical changes and “replay” them on the remote
●*SQLite needs to be extended to have a smooth CDC capability
■Maintain the last client change ID applied on the remote in a separate table
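The watermark table makes the push retryable: if a push is interrupted, the client can resend its whole change log and the remote skips what it already applied. A sketch with `sqlite3` (schema and names are illustrative, not Turso's actual format):

```python
import sqlite3

def push_changes(remote, client_id, local_changes):
    """Replay logical changes on the remote at most once per change:
    a watermark table records the last change ID applied per client."""
    row = remote.execute(
        "SELECT last_change_id FROM sync_watermark WHERE client_id = ?",
        (client_id,)).fetchone()
    watermark = row[0] if row else 0
    for change_id, sql, params in local_changes:
        if change_id <= watermark:
            continue  # already applied by a previous (interrupted) push
        remote.execute(sql, params)
        remote.execute(
            "INSERT INTO sync_watermark(client_id, last_change_id) VALUES (?, ?) "
            "ON CONFLICT(client_id) DO UPDATE SET last_change_id = excluded.last_change_id",
            (client_id, change_id))
    remote.commit()


remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE counters(id INTEGER PRIMARY KEY, value INTEGER)")
remote.execute("CREATE TABLE sync_watermark(client_id TEXT PRIMARY KEY, last_change_id INTEGER)")
changes = [
    (1, "INSERT INTO counters VALUES (?, ?)", (1, 10)),
    (2, "UPDATE counters SET value = value + ? WHERE id = ?", (5, 1)),
]
push_changes(remote, "client-a", changes)
push_changes(remote, "client-a", changes)  # retry is a no-op: value stays 15
```

Note that each mutation and its watermark update land in the same transaction, which is what makes the replay exactly-once rather than merely at-least-once.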
Local write support: pull
■Need to “rebase” local changes on top of the WAL without disruption
■Instead of “fixing” the WAL file directly – append “revert” pages to fix it “virtually”
●Revert frames F#C and F#D
●Apply remote frames F#A and F#B
●Replay local changes from the logical log
■Use the WAL “time travel” capability to “revert” page changes
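The "virtual fix" can be sketched over a toy WAL modeled as a list of `(page_no, content)` frames: to revert everything after the last synced position, append frames that restore each touched page to its version at that position (falling back to the main DB file when a page has no earlier frame).

```python
def revert_frames(wal, synced_len, db_pages):
    """Frames to append so the DB 'travels back' to the state at
    wal[:synced_len]. db_pages holds page content from the main DB file,
    used when a page has no frame before the sync point."""
    before = dict(db_pages)
    for page_no, content in wal[:synced_len]:
        before[page_no] = content            # last version before local writes
    touched = {page_no for page_no, _ in wal[synced_len:]}
    return [(p, before[p]) for p in sorted(touched)]


db_pages = {1: b"P1@db", 2: b"P2@db"}
wal = [(2, b"P2@wal"),                       # synced with the remote
       (1, b"P1#local"), (2, b"P2#local")]   # local frames (F#C, F#D)
print(revert_frames(wal, 1, db_pages))  # → [(1, b'P1@db'), (2, b'P2@wal')]
```

After appending these revert frames, the remote frames (F#A, F#B) go on top, and the local logical log is replayed – all by appending, never by rewriting the WAL in place.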
Local write support: checkpoint
■We use a portion of the WAL for “time travel” – so how do we checkpoint the WAL?
■We need to keep frames somewhere to revert local changes before the “replay”
■Before a checkpoint – just put the revert frames into a separate file
Local write support: conflict resolution
■Easy to get an LWW conflict resolution strategy with logical “replay”
●Convert mutations to INSERT INTO … ON CONFLICT DO UPDATE … statements
■Some applications need more sophisticated rules
●Use the power of SQL and let the user decide!
transform: m => ({
  sql: `update ${m.tableName} set counter = counter + ?`,
  values: [m.after.counter - (m.before?.counter ?? 0)]
})
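The default LWW replay boils down to an upsert per mutation; a sketch with `sqlite3` (the schema and the shape of the row image are illustrative):

```python
import sqlite3

def replay_lww(con, table, row):
    """Replay one mutation with last-writer-wins semantics: the incoming
    row image simply overwrites the remote row."""
    cols = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    updates = ", ".join(f"{c} = excluded.{c}" for c in row if c != "id")
    con.execute(
        f"INSERT INTO {table}({cols}) VALUES ({marks}) "
        f"ON CONFLICT(id) DO UPDATE SET {updates}",
        tuple(row.values()))


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE conferences(id INTEGER PRIMARY KEY, name TEXT)")
replay_lww(con, "conferences", {"id": 1, "name": "p99"})
replay_lww(con, "conferences", {"id": 1, "name": "p99.conf"})  # last write wins
print(con.execute("SELECT name FROM conferences WHERE id = 1").fetchone())
```

A custom `transform` like the counter example above swaps this upsert for any SQL the user wants, turning "overwrite" into "merge".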
Final design
■Designed a fully-functional, writable SQL DB replicated locally into the app
●Built on top of SQLite
●Pull changes by replicating physical pages
●Push changes by executing SQL mutations on the remote DB
Further improvements
■Logical mode for pull operation
●Logical mode has different trade-offs and expands supported use cases
■Partial synchronization
●Physical mode allows lazy pulling of B-Tree pages only when a query actually needs them.
■Local changes deduplication
●In most cases, only the latest local update matters. This reduces the storage required for CDC.
■…and more!
●We’re listening, and your feedback shapes the future of sync!
Implementation details: meet Turso
■SQLite is almost a perfect solution – but it lacks features we used:
●Raw WAL API for pull
●Logical logging support for push
●Other things left overboard: precise checkpoint control, schema versioning, and more
■SQLite is extensible – but the core is hard to improve without major changes.
■Meet turso-database.
Implementation details: meet Turso
■Written in Rust, a very expressive language with a big community.
●The sync engine uses the genawaiter crate
●Async code can be written naturally – while the core stays completely runtime-agnostic
async fn apply(&self, coro: &Coro, changes: Changes) -> Result<()> {
    let changes_file = &changes.file_slot;
    let apply_result = self.apply_internal(coro, changes_file).await;
    let watermark = match apply_result {
        Ok(watermark) => watermark,
        Err(err) => return Err(err),
    };
    let revert_wal_file = self.io.open_file(&self.revert_wal)?;
    reset_wal_file(coro, revert_wal_file, 0).await?;
    self.update_meta(coro, |m| m.watermark = watermark).await?;
    Ok(())
}
Implementation details: meet Turso
■CDC required for the sync engine is implemented as an independent feature
●Users can enable it with a special pragma and track all changes happening in the database
●All database connections created by the sync engine enable CDC by default
turso> SELECT
  bin_record_json_object(table_columns_json_array(table_name), before) before,
  bin_record_json_object(table_columns_json_array(table_name), after) after
FROM turso_cdc;
┌─────────────────────────────────────────────┬────────────────────────────────────────────────────┐
│ before                                      │ after                                              │
├─────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│                                             │ {"text":"record p99 talk","date":"23-09-2025"}     │
│                                             │ {"text":"play ukulele","date":"27-09-2025"}        │
│ {"text":"play ukulele","date":"27-09-2025"} │ {"text":"improve sync engine","date":"27-09-2025"} │
└─────────────────────────────────────────────┴────────────────────────────────────────────────────┘
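Plain SQLite has no `turso_cdc` table, but the shape of the feature can be approximated with triggers and the built-in JSON functions – a rough emulation for illustration only, not Turso's implementation:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE todos(id INTEGER PRIMARY KEY, text TEXT, date TEXT);
-- Poor man's CDC: one row per change, with before/after images as JSON.
CREATE TABLE cdc(change_id INTEGER PRIMARY KEY AUTOINCREMENT, before TEXT, after TEXT);
CREATE TRIGGER todos_cdc_insert AFTER INSERT ON todos BEGIN
    INSERT INTO cdc(before, after)
    VALUES (NULL, json_object('text', NEW.text, 'date', NEW.date));
END;
CREATE TRIGGER todos_cdc_update AFTER UPDATE ON todos BEGIN
    INSERT INTO cdc(before, after)
    VALUES (json_object('text', OLD.text, 'date', OLD.date),
            json_object('text', NEW.text, 'date', NEW.date));
END;
""")
con.execute("INSERT INTO todos(text, date) VALUES ('play ukulele', '27-09-2025')")
con.execute("UPDATE todos SET text = 'improve sync engine' WHERE id = 1")
for before, after in con.execute("SELECT before, after FROM cdc ORDER BY change_id"):
    print(before, "->", after)
```

A trigger-based emulation must be declared per table and per statement kind, which is exactly why a built-in, pragma-enabled CDC (as in Turso) is much smoother for a sync engine to rely on.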
Implementation details: meet Turso
■WAL API exposed to support precise control over frames in the sync engine
●Basic operations like read/append
●More advanced stuff like generation of “revert” frames for pull implementation
trait DatabaseWalSession {
fn append_page(&mut self, page_no: u32, page: &[u8]) -> Result<()>;
fn revert_page(&mut self, page_no: u32, frame_watermark: u64) -> Result<()>;
fn revert_changes_after(&mut self, frame_watermark: u64) -> Result<usize>;
fn commit(&mut self, db_size: u32) -> Result<()>;
}
Call to action
■Consider whether local state can improve the performance of your application
■There are already some options for you to consider:
●SQLite – a battle-tested and very extensible embedded DB which can cover simple use cases
●Turso-db – an SQLite-compatible rewrite powered by Deterministic Simulation Testing
■The sync engine described in this talk is already available for use: @tursodatabase/sync