From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
ScyllaDB
33 slides
Jun 26, 2024
About This Presentation
ShareChat's Ivan Burmistrov walks through how the team built a low-latency ML feature store on ScyllaDB. The initial version did not meet the scalability requirements and failed under a load of 1 million features per second, but it was then scaled 1000x to handle 1 billion features per second without scaling the underlying database.
Slide Content
From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store
Ivan Burmistrov, Principal Software Engineer, ShareChat
Andrei Manakov, Staff Software Engineer, ShareChat
Context
Moj, a short video app
What are the features anyway?
The story
The architecture
Tiles
Tiles
High-level architecture
Why did it fail?
ScyllaDB Schema (Bad)
CREATE TABLE features (
    entity_id text,
    tile_time timestamp,
    feature_name text,
    value blob,
    PRIMARY KEY ((entity_id), tile_time, feature_name)
);
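Because feature_name is a clustering column in this schema, each feature of each tile is stored as its own row. A rough sketch of how the number of rows read per request grows under that layout follows; the function name and the example numbers are hypothetical and are not taken from the slides.

```python
def rows_read_per_request(num_entities: int, num_tiles: int, num_features: int) -> int:
    """Rows fetched when every (entity, tile, feature) triple is a separate row.

    Illustrative model only: it assumes a request reads every feature
    of every tile for every entity it touches.
    """
    return num_entities * num_tiles * num_features


# Hypothetical request: 10 entities, 4 tiles each, 50 features per tile.
print(rows_read_per_request(10, 4, 50))  # 2000
```

Under this model the row count scales with the number of features, which is the factor the later schema removes.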
Tiling Configuration (Bad)
Let’s do some math
Optimisations
ScyllaDB Schema (Good)
CREATE TABLE features (
    entity_id text,
    tile_time timestamp,
    features blob,
    PRIMARY KEY (entity_id, tile_time)
);
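With this schema, all features of a tile live in a single blob, so one row per (entity_id, tile_time) is enough. The slides do not say how the blob is encoded; the sketch below assumes a simple length-prefixed binary layout with float values, and pack_features and unpack_features are hypothetical helper names used only to illustrate the idea.

```python
import struct
from typing import Dict


def pack_features(features: Dict[str, float]) -> bytes:
    """Serialize all features of one tile into a single blob.

    Assumed layout (not from the slides): a uint16 feature count, then for
    each feature a uint16 name length, the UTF-8 name, and a float64 value.
    """
    parts = [struct.pack("<H", len(features))]
    for name in sorted(features):
        encoded = name.encode("utf-8")
        parts.append(struct.pack("<H", len(encoded)))
        parts.append(encoded)
        parts.append(struct.pack("<d", features[name]))
    return b"".join(parts)


def unpack_features(blob: bytes) -> Dict[str, float]:
    """Inverse of pack_features: decode the blob back into a dict."""
    (count,) = struct.unpack_from("<H", blob, 0)
    offset = 2
    result: Dict[str, float] = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (value,) = struct.unpack_from("<d", blob, offset)
        offset += 8
        result[name] = value
    return result


# One blob holds every feature of the tile, so the tile is a single row.
blob = pack_features({"likes": 3.0, "views": 10.0})
print(unpack_features(blob))
```

The trade-off of this kind of packing is that individual features can no longer be read or written independently at the database level; the whole blob is fetched and decoded by the application.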
Conclusion
Robust, proven technologies pay off (ScyllaDB, Flink, ...)
Every next step is harder than the previous one
The simplest practical solution does work
The most optimized solution isn’t human-friendly
Don’t be scared to fork a library and adjust it for your system
Thank you! Let’s connect.
Ivan Burmistrov: burmistrov.ivan@gmail.com, @isburmistrov
Andrei Manakov: andection@gmail.com, @AndreyManakov, andection@threads