Push the Database Beyond the Edge by Nikita Sivukhin


About This Presentation

Almost any application can benefit from having data available locally, enabling blazing-fast access and optimized write patterns. This talk will walk you through one approach to designing a full-featured sync engine, applicable across a wide range of domains, including front-end, back-end, and machine learning.


Slide Content

A ScyllaDB Community
Push the Database Beyond the Edge
Nikita Sivukhin, Software Engineer

Nikita Sivukhin (he/him)

Software Engineer at Turso
■Currently working at Turso (previously at ShareChat/Moj)
■Learning to play the ukulele
(so far, it feels harder than writing code)
■I like algorithms and data structures
●Especially if the DS is compact or succinct!

Talk structure
We will discuss how to bring a database locally to improve read/write patterns
1.Use cases from different domains where this can be useful
2.Design considerations for the solution we will build
3.Storage model overview
4.Database replication with read-only queries
5.Database replication with read/write queries

Use cases
Let’s discover a few examples where local state can greatly improve the app

Example #1: backend router
■Router app: proxy requests by hostname to specific IP
●Hostname: sivukhin.p99conf.io → 10.4.113.154
●100 million mappings in total, 1000+ reads per sec, 10 writes per sec
●Eventual consistency: route update can be delayed by a few seconds
■Solution:
●Pull changes from the remote periodically and maintain local state
■Complexity:
●Updates must be streamed consistently from the main DB
●Full snapshot must be sent from the main DB in an efficient manner
●Local state must be maintained durably to recover from crashes
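
A minimal sketch of this pattern (hypothetical types and change-feed API, not from the talk): reads are served from a local map, while a background loop pulls the change stream from the remote. Durability, the third complexity bullet above, is omitted here.

use std::collections::HashMap;
use std::sync::RwLock;
use std::time::Duration;

// Hypothetical change-feed entry pulled from the remote DB.
struct RouteChange { hostname: String, ip: Option<String>, seq: u64 }

struct Router {
    routes: RwLock<HashMap<String, String>>, // hostname -> IP, kept locally
    last_seq: RwLock<u64>,                   // last change applied from the remote
}

impl Router {
    // Reads are served entirely from local state: no network round-trip.
    fn resolve(&self, hostname: &str) -> Option<String> {
        self.routes.read().unwrap().get(hostname).cloned()
    }

    // Periodically pull and apply the change stream (eventual consistency:
    // a route update may lag by one polling interval).
    fn sync_loop(&self, pull: impl Fn(u64) -> Vec<RouteChange>) {
        loop {
            let since = *self.last_seq.read().unwrap();
            for change in pull(since) {
                let mut routes = self.routes.write().unwrap();
                match change.ip {
                    Some(ip) => { routes.insert(change.hostname, ip); }
                    None => { routes.remove(&change.hostname); }
                }
                *self.last_seq.write().unwrap() = change.seq;
            }
            std::thread::sleep(Duration::from_secs(1));
        }
    }
}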

Example #2: ML-training pipeline
■Train embeddings in real-time with some algorithm
●Train on events (user, post, +/-), 10-100k events at peak
●Distributed training pipeline to handle the load
●Consistency: workers can accumulate gradients for some time and
flush them periodically
■Solution:
●Maintain embeddings in local state and accumulate gradients locally
●Flush gradients and update local embeddings periodically
■Complexity:
●Need to track accumulated gradients to send updates later
●Need to select embeddings to store locally – as the dataset is huge
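
A sketch of the same idea for a training worker (hypothetical names, not the talk's code): gradients accumulate locally per embedding key and are flushed to the remote in batches.

use std::collections::HashMap;

// Hypothetical local worker state: a cached subset of embeddings plus the
// gradients accumulated since the last flush.
struct Worker {
    embeddings: HashMap<u64, Vec<f32>>,
    gradients: HashMap<u64, Vec<f32>>,
}

impl Worker {
    // Accumulate a gradient locally instead of hitting the remote DB per event.
    fn accumulate(&mut self, key: u64, grad: &[f32]) {
        let acc = self.gradients.entry(key).or_insert_with(|| vec![0.0; grad.len()]);
        for (a, g) in acc.iter_mut().zip(grad) { *a += g; }
    }

    // Periodically push the accumulated batch and apply it to local embeddings.
    fn flush(&mut self, push: impl FnOnce(&HashMap<u64, Vec<f32>>)) {
        push(&self.gradients);
        for (key, grad) in self.gradients.drain() {
            if let Some(emb) = self.embeddings.get_mut(&key) {
                // Assumed plain SGD step with a fixed learning rate.
                for (e, g) in emb.iter_mut().zip(&grad) { *e -= 0.01 * g; }
            }
        }
    }
}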

Insight!
■Examples from completely different domains have benefited from local state
●One of the key components is relaxed consistency
■The examples are pretty simple – in reality, complexity usually grows fast
●Complex queries, support for local DDL statements, indexes, transactions, etc.
■We must push the database beyond the “edge”!

Design considerations

Communication
■Remote DB is the source of truth
●It’s enough for all examples
●Simpler to reason about the data and perform complex actions (e.g. CREATE INDEX)
■“Star” topology for synchronization
●Tree topology is possible: just add intermediate nodes running the same DB

Technology choice: SQLite!
■Embedded database
●Must be pluggable into any language and any runtime
■Few “moving” components
●Otherwise replication complexity will increase
■SQLite is an almost* perfect technology for us!


* we will discuss later why it’s not the perfect technology

SQLite storage model

SQLite storage model
■The DB is “virtually” represented by a single file
●The file consists of same-size pages linked as specified in the SQLite database file format
■“Physically”, the DB consists of 2 files: the main DB file and the WAL
●The WAL is a sequence of frames, each holding a page content update, the page ID, and a commit marker
●SQLite forbids concurrent writers
■The WAL makes “time travel” easy
■The WAL is checkpointed into the DB file
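
To make the frame layout concrete, here is a small parser sketch based on the publicly documented SQLite WAL format (not code from the talk): every frame is a 24-byte big-endian header followed by one full page image, and a non-zero “database size after commit” field marks the commit frame.

// Each WAL frame header precedes exactly one page image.
struct WalFrameHeader {
    page_no: u32,       // which DB page this frame replaces
    db_size_after: u32, // non-zero => this frame commits the transaction
    salt: [u32; 2],     // must match the WAL header salts to be valid
    checksum: [u32; 2], // cumulative checksum over headers + page data
}

fn parse_frame_header(buf: &[u8; 24]) -> WalFrameHeader {
    let be = |i: usize| u32::from_be_bytes([buf[i], buf[i + 1], buf[i + 2], buf[i + 3]]);
    WalFrameHeader {
        page_no: be(0),
        db_size_after: be(4),
        salt: [be(8), be(12)],
        checksum: [be(16), be(20)],
    }
}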

SQLite storage model: simple as 2 files
$> echo -n 'H4sIAAAAAAAAAwsO9MksSVVIyy/KTSxRMGZgYmBiYnBQUGBgADIhGAYYgZgFjU8IMDHo9f7gBSlm
3McARKOAqsCBkU1cWZnRsyQxKSc1OT8vLbUoNS85tRiJyeQc5OoY4qoQ4ujk46qAJKGRl5ibqgmKG
5RYHgUjCgAAXKzHWwAEAAA=' | base64 -d | zstd -d > conferences.db
$> echo -n 'H4sIAAAAAAAAAzOvZ2ti0H0kwcDAxAACq+ov/npx5ufO83Oa9joWqHsxQCSYYOILjmoYKNS9n8QL
FGRk/MIARKNghAAuRibZAktLveT8vDQAbci+bDgCAAA=' | base64 -d | zstd -d > conferences.db-wal
$> sqlite3 conferences.db 'SELECT * FROM conferences'
p99.conf

Read-only replication

Read-only replication
■Replicate data from the remote to the local client
■Read queries can be executed completely locally

Read-only replication
■Replicate the DB file and stream the WAL frames to the client
■The client DB can checkpoint the WAL at its own pace

Read-only replication
■Pros:
●Heavy operations are “executed” only once at the remote DB (e.g. CREATE INDEX)
●Impossible to diverge from the remote (no worries about differences in SQLite versions)
■Cons:
●WAL frames from different transactions contain the same pages – unnecessary data transfer
●Any update transfers at least one frame – which is 4KB by default

sqlite> UPDATE conferences SET name = 'p99.conf' WHERE id = 1;
sqlite> UPDATE conferences SET dates = '22 October 2025 - 23 October 2025' WHERE id = 1;
$> ls -l conferences.db-wal
8272 conferences.db-wal

■*SQLite doesn’t have a WAL API for replication and needs to be extended:
●wal_begin_session / wal_read_page_at / wal_insert_page_at / wal_end_session

Read-only replication: deduplicate pages
■Fetch unique page updates and maintain “virtual” DB equivalence
●WAL history will be different between client and server
●But every commit boundary from the client will match some commit boundary on the server
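
A minimal sketch of the deduplication step (the Frame type is a stand-in, not the talk's code): between two commit boundaries, only the latest version of each page has to be shipped to the client.

use std::collections::HashMap;

struct Frame { page_no: u32, data: Vec<u8> }

// Keep only the latest update per page: later frames overwrite earlier ones,
// so the client receives one frame per touched page instead of the whole WAL
// history between the two commit boundaries.
fn dedup_frames(frames: Vec<Frame>) -> Vec<Frame> {
    let mut latest: HashMap<u32, Frame> = HashMap::new();
    for frame in frames {
        latest.insert(frame.page_no, frame); // the later insert wins
    }
    latest.into_values().collect()
}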

Read-only replication: compress pages
■Pages in a single update set can be compressed
■Even more: client and remote can establish a shared compression “dictionary”
●The server has the full DB history and can choose a page version the same as on the client side
●No need to transfer the “dictionary” – client and server only need to agree on page IDs
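
A sketch of the shared-dictionary trick using the zstd crate (the codec choice is an assumption; the talk doesn't name one): the server compresses the new page version against the version the client is known to hold, so only the page ID and a small delta cross the wire.

use zstd::bulk::{Compressor, Decompressor};

// Server side: the page version already held by the client acts as the
// compression dictionary, so bytes unchanged within the page cost almost nothing.
fn compress_page(client_version: &[u8], new_version: &[u8]) -> std::io::Result<Vec<u8>> {
    Compressor::with_dictionary(3, client_version)?.compress(new_version)
}

// Client side: rebuild the new page from its own old copy plus the delta.
fn decompress_page(client_version: &[u8], delta: &[u8], page_size: usize) -> std::io::Result<Vec<u8>> {
    Decompressor::with_dictionary(client_version)?.decompress(delta, page_size)
}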

Read-only replication: intelligent server
■For these optimizations to work – the server must be “intelligent” (just code – no AI)
●This is not just some trivial file server which streams DB and WAL files
■Keep “immutable” history of the database and efficiently store it
■Efficiently compute page updates delta for client between virtual DB states

Read-only replication: summary
■Heavy operations are “executed” only once at the remote DB
■Impossible to diverge from the remote
■Straightforward consistency guaranteed by the simplicity of the WAL
■Data transfer is close enough to “logical” replication
●It depends: some workloads will benefit more from page-based replication with compression
●If mutations cause lots of updates (indices, triggers, etc) – “logical” replication will be better

Writes support

Local write support
■The client must be able to write data locally
■Local updates can be synced on demand with the remote
●E.g. the client is back online, or a large enough batch has accumulated to amortize the write cost

Local write support
■For the sync of local writes we need two components:
●Push local changes to the remote
●Pull remote changes to the local DB

Local write support: push
■Using physical replication for the push phase is problematic due to conflicts
■Capture deterministic logical changes and “replay” them on the remote
●*SQLite needs to be extended to have a smooth CDC capability
■Maintain the last change ID from the client applied on the remote in a separate table
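
A rough sketch of the push loop (the Change and Remote types and the sync_watermark table are hypothetical stand-ins): replay captured changes past the last acknowledged ID and persist the new watermark on the remote so retries stay idempotent.

// Hypothetical captured change: the SQL to replay plus its change ID.
struct Change { id: u64, sql: String, values: Vec<String> }

// Hypothetical remote handle executing statements on the remote DB.
trait Remote {
    fn execute(&self, sql: &str, values: &[String]) -> Result<(), String>;
}

fn push(remote: &impl Remote, changes: &[Change], last_applied: u64) -> Result<u64, String> {
    let mut watermark = last_applied;
    for change in changes.iter().filter(|c| c.id > last_applied) {
        remote.execute(&change.sql, &change.values)?;
        // Record the last applied change ID so a retry after a crash can skip
        // already-replayed changes (in practice both statements would run in
        // one remote transaction).
        remote.execute(
            "INSERT INTO sync_watermark(client, change_id) VALUES (?, ?) \
             ON CONFLICT(client) DO UPDATE SET change_id = excluded.change_id",
            &["client-1".into(), change.id.to_string()],
        )?;
        watermark = change.id;
    }
    Ok(watermark)
}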

Local write support: pull
■Need to “rebase” local changes on top of the WAL without disruption
■Instead of “fixing” the WAL file directly – append “revert” pages to fix it “virtually”
●Revert frames F#C and F#D
●Apply remote frames F#A and F#B
●Replay local changes from the logical log
■Use the WAL “time travel” capability to “revert” page changes

Local write support: checkpoint
■We use a portion of the WAL for “time travel” – so how do we checkpoint the WAL?
■We need to keep frames somewhere to revert local changes before the “replay”
■Before a checkpoint – just put the revert frames into a separate file
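
A sketch of that idea (the PageStore trait is a hypothetical stand-in, not the talk's API): before a checkpoint, the pre-change image of every locally touched page is written into the separate revert file, so local changes can still be undone after the WAL is folded into the main DB file.

use std::collections::HashSet;

// Hypothetical view over the DB history.
trait PageStore {
    fn pages_touched_after(&self, watermark: u64) -> HashSet<u32>;
    fn page_at(&self, page_no: u32, watermark: u64) -> Vec<u8>;
    fn write_revert_frame(&mut self, page_no: u32, data: &[u8]);
}

// Save the pre-change image of every page touched by local writes into the
// revert file; only then is it safe to checkpoint the WAL.
fn preserve_revert_frames(store: &mut impl PageStore, watermark: u64) {
    for page_no in store.pages_touched_after(watermark) {
        let before = store.page_at(page_no, watermark); // version as of the last sync
        store.write_revert_frame(page_no, &before);
    }
}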

Local write support: conflict resolution
■Easy to get an LWW conflict resolution strategy with logical “replay”
●Convert mutations to statements: INSERT INTO … ON CONFLICT DO UPDATE …
■Some applications need more sophisticated rules
●Use the power of SQL and let the user decide!
transform: m => ({
  sql: `update ${m.tableName} set counter = counter + ?`,
  values: [m.after.counter - (m.before?.counter ?? 0)]
})

Final design
■Designed a fully functional, writable SQL DB replicated locally into the app
●Built on top of SQLite
●Pull changes by replicating physical pages
●Push changes by executing SQL mutations on the remote DB

Further improvements
■Logical mode for pull operation
●Logical mode has different trade-offs and expands supported use cases
■Partial synchronization
●Physical mode allows lazy pulling of B-Tree pages only when a query actually needs them.
■Local changes deduplication
●In most cases, only the latest local update matters. This reduces the storage required for CDC.
■…and more!
●We’re listening, and your feedback shapes the future of sync!

Implementation details: meet Turso
■SQLite is almost a perfect solution – but it lacks features which we used:
●Raw WAL API for pull
●Logical logging support for push
●Other things left overboard: precise checkpoint control, schema versioning, and more
■SQLite is extensible – but the core is hard to improve without major changes
■Meet turso-database

Implementation details: meet Turso
■Written in Rust – a very expressive language with a big community
●The sync engine uses the genawaiter crate
●Async code can be written naturally – but the core is completely runtime-agnostic
async fn apply(&self, coro: &Coro, changes: Changes) -> Result<()> {
    let changes_file = &changes.file_slot;
    let watermark = self.apply_internal(coro, changes_file).await?;
    let revert_wal_file = self.io.open_file(&self.revert_wal)?;
    reset_wal_file(coro, revert_wal_file, 0).await?;
    self.update_meta(coro, |m| m.watermark = watermark).await?;
    Ok(())
}

Implementation details: meet Turso
■The CDC required for the sync engine is implemented as an independent feature
●Users can enable it with a special pragma and track all changes happening in the database
●All database connections created by the sync engine enable CDC by default
turso> SELECT
  bin_record_json_object(table_columns_json_array(table_name), before) before,
  bin_record_json_object(table_columns_json_array(table_name), after) after
FROM turso_cdc;
┌─────────────────────────────────────────────┬────────────────────────────────────────────────────┐
│ before                                      │ after                                              │
├─────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│                                             │ {"text":"record p99 talk","date":"23-09-2025"}     │
│                                             │ {"text":"play ukulele","date":"27-09-2025"}        │
│ {"text":"play ukulele","date":"27-09-2025"} │ {"text":"improve sync engine","date":"27-09-2025"} │
└─────────────────────────────────────────────┴────────────────────────────────────────────────────┘

Implementation details: meet Turso
■WAL API exposed to support precise control over frames in the sync engine
●Basic operations like read/append
●More advanced stuff like generating “revert” frames for the pull implementation
trait DatabaseWalSession {
    fn append_page(&mut self, page_no: u32, page: &[u8]) -> Result<()>;
    fn revert_page(&mut self, page_no: u32, frame_watermark: u64) -> Result<()>;
    fn revert_changes_after(&mut self, frame_watermark: u64) -> Result<usize>;
    fn commit(&mut self, db_size: u32) -> Result<()>;
}
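
As an illustration, the pull's “rebase” could drive this session API roughly as follows (the trait comes from the slide above; RemotePage and the replay callback are assumptions):

// Hypothetical deduplicated page update pulled from the remote.
struct RemotePage { page_no: u32, data: Vec<u8> }

fn rebase(
    session: &mut dyn DatabaseWalSession,
    frame_watermark: u64,  // last WAL frame synced with the remote
    remote: &[RemotePage], // page updates pulled from the remote
    db_size: u32,          // DB size in pages after the remote commit
    replay_local: impl FnOnce() -> Result<()>,
) -> Result<()> {
    // 1. Append "revert" frames so the WAL virtually rolls back local changes.
    session.revert_changes_after(frame_watermark)?;
    // 2. Apply the remote frames on top of the reverted state.
    for page in remote {
        session.append_page(page.page_no, &page.data)?;
    }
    session.commit(db_size)?;
    // 3. Replay local changes from the logical log on top of the remote state.
    replay_local()
}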

Call to action
■Think about whether local state can improve the performance of your application
■There are already some options for you to consider:
●SQLite – a battle-tested and very extensible embedded DB which can cover simple use cases
●turso-db – an SQLite-compatible rewrite powered by Deterministic Simulation Testing
■The sync engine described in this talk is already available for use: @tursodatabase/sync

Thank you! Let’s connect.
Nikita Sivukhin
[email protected]
@SivukhinN
sivukhin.github.io