Feature Store Evolution Under Cost Constraints: When Cost is Part of the Architecture

ScyllaDB · Oct 15, 2024

About This Presentation

ShareChat's scaling ML Feature Store to handle 1B features/sec was just the start. Next challenge: cutting costs while keeping quality. Join Ivan & David to explore cloud cost optimization, Kubernetes waste reduction, and autoscaling Apache Flink. Perfect for #ML & #CloudDev. #P99Conf


Slide Content

A ScyllaDB Community
Feature Store Evolution Under Cost
Constraints: When Cost is Part of
the Architecture
David Malinge
Sr. Staff Software Engineer
ShareChat
Ivan Burmistrov
Principal Software Engineer

Previously on P99 Conf…
■ScyllaDB ftw
■Smart data model
■Various optimizations
■Cache locality
■…

Leadership: “Great system! Can you now build the one that does the same, but 10x cheaper? Thanks!”
Our reaction:

Cloud billing is complicated (or: the cloud wants to rob you)
■Billing is confusing: both AWS and GCP have ~40K SKUs
■Providers are not motivated to make it easy to understand and debug
■Easy to lose track of cost components


Step 1: ruthless cost inventory / attribution

Cost savings funnel

Cost savings funnel: Cloud traps

k8s wastage

Cloud traps: k8s wastage

Solution: Advanced k8s scheduler
■Open Source
■Cloud specific
■Commercial / Multi Cloud

“Cloud Tax”

Cloud traps: cross-AZ network egress (NIZE) cost
■3 zones
■Replication Factor = 3
(data gets copied to
all zones)

Naive reads: no settings in the client driver

Smarter reads: use token-aware routing

Smartest reads: + zone-aware routing
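The three read-routing steps above are, in practice, client-driver configuration. Below is a minimal sketch using the gocql driver (the slides don't show ShareChat's actual code; the hosts, keyspace, datacenter and rack names are made up, and it assumes a driver version that ships RackAwareRoundRobinPolicy, with cloud AZs mapped to racks):

package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("scylla-1.internal", "scylla-2.internal")
	cluster.Keyspace = "feature_store" // hypothetical keyspace

	// Token-aware: send each request directly to a replica that owns the
	// partition, skipping an extra coordinator hop.
	// Rack-aware fallback: among those replicas, prefer the one in our own
	// rack (= cloud AZ), which is what removes cross-AZ egress on reads.
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
		gocql.RackAwareRoundRobinPolicy("dc1", "us-east1-b"),
	)

	// With RF=3 spread over 3 AZs, quorum reads still have to cross zones;
	// CL=ONE served from the local rack stays zone-local (a consistency vs.
	// cost/latency trade-off to make per use case).
	cluster.Consistency = gocql.One

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()
	// ... issue feature-store reads through session ...
}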

Smartest writes: everything we learned

Smartest+ writes: 2 zones

Smartest++ writes: 2 datacenters

Smartest+++ writes: a different cloud

Cost savings funnel: Compute optimization

ShareChat’s Feature Store
Case Study

Feature store: Architecture
More on compute: The Harsh Reality of Building a Realtime ML Feature Store (slides), QCon 2024

Feature Store

#1 Database scalability
■ScyllaDB clusters
■Flat-ish cost, no autoscaling
■Why flat-ish? Not autoscalable != not scalable
●Don’t scale for yearly peak!
●Kudos to Scylla support!
■True autoscaling incoming?
●Tablets in ScyllaDB 6.0

#2 Different workloads
■Read/Write pattern and how they affect each other

LIST ALL SERVICE_LEVELS;
 service_level | timeout | workload_type | shares
---------------+---------+---------------+--------
       serving |    null |          null |   1000
     computing |    null |          null |    100
        manual |    null |          null |     50
[Chart: read latency over time]

#2 Different workloads

■Solutions
●Overscale Scylla cluster $$$
●Dual-datacenter - the classic advice: “isolate reads and writes”
●usually leads to underutilized resources $$
●What if we could choose the latency level we want for reads vs writes?
■Scylla Workload prioritization $

#2 Different workloads

■Scylla only: use Workload Prioritization (WP)
●WP is great when different workloads are clear and consistent






■Other example: compaction strategy, see Scaling ShareChat’s ML Feature Store
LIST ALL SERVICE_LEVELS;
 service_level | timeout | workload_type | shares
---------------+---------+---------------+--------
       serving |    null |          null |   1000
     computing |    null |          null |    100
        manual |    null |          null |     50
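The service levels in that listing are plain CQL objects; below is a sketch of creating and attaching them from Go via gocql (statement shapes follow ScyllaDB's workload prioritization docs; the shares values mirror the listing above, while the role names are hypothetical):

package main

import (
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("scylla-1.internal")
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	stmts := []string{
		// More shares => larger CPU/IO share when workloads compete.
		"CREATE SERVICE_LEVEL IF NOT EXISTS serving WITH shares = 1000",
		"CREATE SERVICE_LEVEL IF NOT EXISTS computing WITH shares = 100",
		// Each level takes effect for the role a workload authenticates as.
		"ATTACH SERVICE_LEVEL serving TO serving_role",
		"ATTACH SERVICE_LEVEL computing TO computing_role",
	}
	for _, stmt := range stmts {
		if err := session.Query(stmt).Exec(); err != nil {
			log.Fatalf("%s: %v", stmt, err)
		}
	}
}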

Feature Serving Layer

Feature store costs: serving features
■Distributed gRPC service deployed on K8s
■Vast majority of the cost is in compute
●Note: network costs can also be a problem
■For compute-heavy K8s deployments, once you are past the obvious pod autoscaling (HPA/VPA), it's time to look into optimizations.
■Don't optimize blindly: SET UP CONTINUOUS PROFILING!
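Continuous profiling needs a profile source on every pod. A minimal Go sketch that exposes the standard pprof endpoints so a profiling agent (or go tool pprof) can scrape them; the port and the serving logic are placeholders:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
	// Serve pprof on a separate, non-public port.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the gRPC feature-serving server here ...
	select {}
}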

Feature store costs: serving features
■The usual suspect: ser/de
■95%+ of our requests are cached
■Hottest path: request multiple entities => cache hits => merge => return
●aka: read from cache, deserialize, merge, serialize

Naive solution

Feature store costs: serving features
■The usual suspect: ser/de
■95%+ of our requests are cached
■Hottest path: request multiple entities => cache hits => merge => return
●aka: read from cache, deserialize, merge, serialize
■Biggest Win: partial deserialization!

Proto serde trick: encoding primer
■A message is a series of key-value pairs (“records”)
●the key is the field number
●the value is encoded depending on its type
●repeated fields emit one record per element
●maps are a repeated field under the hood:
map<T, U> ⇔ repeated entry<T, U>

// Example message (as JSON)
{
  "name": "hello",
  "inner": [{"val": 1}, {"val": 2}]
}

// protoscope output of the inner records
2: {7: 1} // inner: {val: 1}
2: {7: 2} // inner: {val: 2}
source: https://protobuf.dev/programming-guides/encoding/#structure

Proto serde trick: encoding primer
message InnerMessage {
  int32 val = 7;
}

message OuterMessage {
  string name = 1;
  repeated InnerMessage inner = 2;
}

// Example OuterMessage (as JSON)
{
  "name": "hello",
  "inner": [{"val": 1}, {"val": 2}]
}

// protoscope output, 3 records:
1: {"hello"} // name: "hello"
2: {7: 1}    // inner: {val: 1}
2: {7: 2}    // inner: {val: 2}
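To make the record structure concrete, here is an illustrative Go snippet (not from the talk) that hand-assembles exactly those wire bytes with google.golang.org/protobuf/encoding/protowire and prints them; protoscope would decode the result into the three records shown above:

package main

import (
	"fmt"

	"google.golang.org/protobuf/encoding/protowire"
)

func main() {
	var inner1, inner2 []byte
	inner1 = protowire.AppendTag(inner1, 7, protowire.VarintType) // val = 1
	inner1 = protowire.AppendVarint(inner1, 1)
	inner2 = protowire.AppendTag(inner2, 7, protowire.VarintType) // val = 2
	inner2 = protowire.AppendVarint(inner2, 2)

	var outer []byte
	outer = protowire.AppendTag(outer, 1, protowire.BytesType) // name: "hello"
	outer = protowire.AppendString(outer, "hello")
	outer = protowire.AppendTag(outer, 2, protowire.BytesType) // inner[0]
	outer = protowire.AppendBytes(outer, inner1)
	outer = protowire.AppendTag(outer, 2, protowire.BytesType) // inner[1]
	outer = protowire.AppendBytes(outer, inner2)

	// One record per repeated element; the embedded messages are just
	// length-delimited byte blobs, indistinguishable from a bytes field.
	fmt.Printf("% x\n", outer)
}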

Proto serde trick: how?
■Trick enabler:
●embedded messages and bytes have the same wire representation (len-delimited)
■Why is this good?
●both are interchangeable!
●bytes deserialization is just a copy
■OK… but what do I do with bytes?
●operations on maps and repeated fields!
●append, delete, swap, insert…
●all done without deserializing elements!
source: https://protobuf.dev/programming-guides/encoding/#structure

Proto serde trick: back to example
message InnerMessage {
  int32 val = 7;
}

message OuterMessage {
  string name = 1;
  repeated InnerMessage inner = 2;
}

// OuterMessage can also
// be deserialized as:
message LazyOuterMessage {
  bytes name = 1;
  repeated bytes inner = 2;
}
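Putting the two pieces together, a minimal sketch (not ShareChat's library) of the lazy merge: append the inner records of one serialized OuterMessage onto another by copying raw length-delimited records with protowire, never decoding InnerMessage:

// Package lazymerge sketches merging the `inner` elements of two serialized
// OuterMessage blobs without decoding the InnerMessage payloads.
package lazymerge

import (
	"google.golang.org/protobuf/encoding/protowire"
)

// appendInner scans src's top-level records and copies every `inner`
// (field 2, length-delimited) record onto dst verbatim. Other fields of src
// (e.g. name) are skipped; dst is assumed to already hold a full message.
func appendInner(dst, src []byte) ([]byte, error) {
	for len(src) > 0 {
		num, typ, n := protowire.ConsumeTag(src)
		if n < 0 {
			return nil, protowire.ParseError(n)
		}
		src = src[n:]

		// Length of the value bytes that follow the tag.
		m := protowire.ConsumeFieldValue(num, typ, src)
		if m < 0 {
			return nil, protowire.ParseError(m)
		}

		if num == 2 && typ == protowire.BytesType {
			// Re-emit the record as-is: tag, then the original
			// length prefix + payload, no InnerMessage decoding.
			dst = protowire.AppendTag(dst, num, typ)
			dst = append(dst, src[:m]...)
		}
		src = src[m:]
	}
	return dst, nil
}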

Proto serde trick: back to cache

■Our cache stores the serialized Features proto for a given entity
■Our response is basically a collection of Features for each entity requested

Proto serde trick: simple bench
■Repo: https://github.com/david-sharechat/lazy-proto
■Benchmarking “naive” merge vs lazy merge
●Appending 2 serialized protos with map and repeated field

Proto serde trick: simple bench

          │ Naive        │ Lazy
sec/op    │ 16.821m ± 3% │ 2.677m ± 6%  -84.09% (p=0.000 n=10)
B/op      │ 8.479Mi ± 0% │ 5.993Mi ± 0% -29.31% (p=0.000 n=10)
allocs/op │ 141.16k ± 0% │ 40.08k ± 0%  -71.61% (p=0.000 n=10)

●6x faster with ⅓ of allocs!
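The repo above has the real benchmark; for orientation only, this is roughly how such a testing.B measurement can be shaped for the lazy path, reusing the appendInner sketch on synthetic wire bytes (the naive baseline would unmarshal both blobs into generated OuterMessage structs, proto.Merge them and re-marshal, which needs the generated code and is omitted here):

package lazymerge

import (
	"testing"

	"google.golang.org/protobuf/encoding/protowire"
)

// buildOuter assembles a synthetic serialized OuterMessage with n inner
// elements (field numbers as in the slides: name = 1, inner = 2, val = 7).
func buildOuter(n int) []byte {
	var b []byte
	b = protowire.AppendTag(b, 1, protowire.BytesType)
	b = protowire.AppendString(b, "hello")
	for i := 0; i < n; i++ {
		var inner []byte
		inner = protowire.AppendTag(inner, 7, protowire.VarintType)
		inner = protowire.AppendVarint(inner, uint64(i))
		b = protowire.AppendTag(b, 2, protowire.BytesType)
		b = protowire.AppendBytes(b, inner)
	}
	return b
}

func BenchmarkLazyAppend(b *testing.B) {
	dst := buildOuter(1000)
	src := buildOuter(1000)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Lazy merge: copy dst, then append src's inner records byte-for-byte.
		out := append([]byte(nil), dst...)
		if _, err := appendInner(out, src); err != nil {
			b.Fatal(err)
		}
	}
}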

Conclusion

Learnings

Before we part…
■We could not cover everything; please reach out if you are interested in these topics:
●Computing windowed counter features (slashed costs by 5x)
●Flink autoscaling & state recovery (2x cost lever)
●Specialized Golang cache library (will likely open-source soon)
●… anything cheap and performant ;)

A ScyllaDB Community
Thanks for watching!
ShareChat


Feature Compute

Feature store costs: computing features
■Multiple Apache Flink jobs
■Problem: Flat cost, scaled for peak
■Obvious solution: autoscaling



[Chart placeholder: CPU utilization before autoscaling]
[Chart placeholder: CPU utilization after autoscaling]
More on autoscaling Flink jobs: The Harsh Reality of Building a Realtime ML Feature Store (slides), QCon 2024