"Choosing proper type of scaling", Olena Syrota

fwdays 312 views 27 slides Jun 18, 2024

About This Presentation

Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze s...


Slide Content

About me
Olena Syrota
Software architect at Star
Lecturer at SET University

20+ years of experience in IT
Worked in various positions, from developer to architect
Tech stack: Java, Spring, cloud, IoT
Certified as SEI Software Architecture Professional
Certified as Stanford Advanced Project Manager
LinkedIn

About TechTalk
Topic: “Choosing proper type of scaling”
Level 2: Intermediate

Imagine an IoT processing system which
is already quite mature and production-ready,
whose client coverage is growing, and for
which scaling and performance aspects
are life-and-death questions.

The system has Redis, MongoDB and stream
processing based on ksqldb.

In this talk we will first analyse scaling
approaches and then select the proper
ones for our system.

Agenda
Scaling perspectives:
•AKF Scale Cube
•Vertical scaling vs horizontal scaling
•Replication vs sharding


IoT processing system scaling
(Redis, MongoDB, ksqldb)

Scaling
perspectives

Perspective 1: AKF Scale Cube

https://akfpartners.com/growth-blog/scale-cube

Perspective 2
Vertical
Horizontal
Replication
Sharding
Same dataset
Dataset is split

Replication vs sharding
REPLICATION

●Same data set

●High availability

●Increases read capacity*

●Increases data locality
(data closer to the reader)*

●Dedicated replicas can be used for
specific purpose (reporting etc)
SHARDING

●Each shard is different dataset

●Accommodates more data.
Increases read and write capacity

●Each shard is actually a separate
cluster (or master-slave replica set)

●Within each shard replication
can be applied
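
The difference between the two columns above can be sketched in a few lines of Python. This is only an illustration, not how any particular database implements it: the `Shard` class, the SHA-256 based routing, and the `sensor:<id>` key format are all assumptions made for the example, and replication is done synchronously here to keep the code short.

```python
import hashlib


class Shard:
    """One shard: a primary plus replicas holding the SAME subset of keys."""

    def __init__(self, name, replica_count=1):
        self.name = name
        self.primary = {}
        self.replicas = [dict() for _ in range(replica_count)]

    def write(self, key, value):
        self.primary[key] = value
        # Replication: every replica receives a copy of the same data.
        # (Synchronous here for brevity; real systems often replicate asynchronously.)
        for replica in self.replicas:
            replica[key] = value


def shard_for(key, shards):
    """Sharding: a stable hash of the key picks exactly one shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical setup: 3 shards, each with one replica.
shards = [Shard("shard-0"), Shard("shard-1"), Shard("shard-2")]
for sensor_id in range(100):
    key = f"sensor:{sensor_id}"
    shard_for(key, shards).write(key, {"status": "connected"})

total_keys = sum(len(shard.primary) for shard in shards)
for shard in shards:
    print(shard.name, "holds", len(shard.primary), "of", total_keys, "keys")
```

Each key lives in exactly one shard (the dataset is split), while inside a shard the primary and its replicas hold identical data (the same dataset).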

Scaling use case:
IoT processing system

Simplified diagram

Redis. Why to scale
Reasons to scale
●Availability
●Accommodate more data
●Handle more reads


Our use case
●Sensors
●Validation of sensors
●Reason for scaling - availability of cache
https://architecturenotes.co/redis/

Redis. How to scale
Vertical scaling

Horizontal scaling, 1 shard, replication
●Redis cluster
●Sharding (start from 1 shard)
●Replicas for each shard (master, slave)

Horizontal scaling. N shards
●N shards to evenly distribute load
https://architecturenotes.co/redis/
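
To make "N shards to evenly distribute load" more concrete: Redis Cluster maps every key to one of 16384 hash slots using CRC16 of the key modulo 16384, and each shard owns a range of slots. The sketch below reproduces that mapping with Python's standard library, including the hash-tag rule (only the text between the first `{` and the following `}` is hashed, if it is non-empty). The function name `hash_slot` and the sample keys are illustrative, not part of the talk.

```python
import binascii

HASH_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def hash_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for a key (CRC16/XMODEM mod 16384)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Hash tag: use only the non-empty content between the braces.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 is the CRC16 variant Redis Cluster uses.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


# Hypothetical sensor keys: the shared hash tag forces both keys into one slot,
# so they end up on the same shard.
print(hash_slot(b"{sensor:42}:status"))
print(hash_slot(b"{sensor:42}:config"))
```

Keys that must be read or written together (for example in one multi-key command) can share a hash tag; keys without a tag spread across slots, which is what distributes the load over N shards.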

Redis cluster. Replication.
Should I read from slave?
Arguments
●Technically possible
●Client can obtain stale data. How stale?
●Suitable for data that changes rarely

Our use-case. Are reads of stale data
acceptable?
●Yes
●Validation of connected sensors
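
Why a replica read can be stale: Redis replication is asynchronous, so a write acknowledged by the master may not have reached the replica yet. The toy model below imitates that with an explicit `apply` step; the `Primary` and `Replica` classes and the `sensor:42` key are invented for illustration and do not use any Redis client.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.pending = []  # writes acknowledged but not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, primary):
        """Catch up with the primary (stands in for asynchronous replication)."""
        for key, value in primary.pending:
            self.data[key] = value
        primary.pending.clear()


primary, replica = Primary(), Replica()

primary.write("sensor:42", "registered")
replica.apply(primary)                    # replica is in sync

primary.write("sensor:42", "revoked")     # not replicated yet
stale_read = replica.data["sensor:42"]    # replica still answers with the old value
fresh_read = primary.data["sensor:42"]

replica.apply(primary)                    # replication catches up
caught_up_read = replica.data["sensor:42"]

print(stale_read, fresh_read, caught_up_read)
```

Whether the window between the write and the catch-up matters depends on the data. For sensor validation data that changes rarely, a briefly outdated answer is usually tolerable.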

Redis cluster.
Why may I need more
read replicas?
Up to 5 read replicas

Boost read performance
●if your use case allows stale data

Redis cluster.
Summary
Vertical
●If you cannot achieve your goals with
horizontal/replication
●If cost is the same with vertical as with
horizontal/replication

Horizontal via replication
●If you benefit from replicas (availability, reading
from replicas)

Horizontal via sharding
●Amount of data
https://architecturenotes.co/redis/

MongoDB. Why to scale
Reasons you may want to scale
●More memory to fit more indexes
●More IOPS to write/read from disk faster
●More CPU to make in-memory operations
faster

Our use case:
●Timeseries data
●Accommodate telemetry data
(specific amount)
●Handle write scenarios (predictable load)
●Handle more reads as user number grows

Vertical scaling vs horizontal
replication
Vertical
●Scaling up each node of cluster

Horizontal replication
●Adding more nodes (to default 3 instances)

Advice
●Compare cost
●Check SLA for replication

Should I read from replica?
Default strategy
●Write and read via primary replica

Reads from replica, cons
●It may take up to 90 seconds to replicate data
●When reading from secondary, you can use the
parameter maxStalenessSeconds (>= 90 sec),
which will switch you to the primary replica
in case of stale data

Our use case
●Reading from replicas was not an option
●Telemetry data should not be stale
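
As a sketch of how the read-preference settings on this slide are expressed, the helper below builds a MongoDB connection string using the standard URI options `replicaSet`, `readPreference` and `maxStalenessSeconds`. MongoDB rejects a max staleness below 90 seconds and does not allow it together with the `primary` read preference, so the helper checks both. The host names, the replica set name `rs0` and the function `build_uri` are hypothetical; only the standard library is used, and no connection is made.

```python
from urllib.parse import urlencode

MIN_MAX_STALENESS_SECONDS = 90  # smallest value MongoDB accepts


def build_uri(hosts, replica_set, read_preference="primary",
              max_staleness_seconds=None):
    """Build a mongodb:// URI with read preference options (illustrative helper)."""
    options = {"replicaSet": replica_set, "readPreference": read_preference}
    if max_staleness_seconds is not None:
        if read_preference == "primary":
            raise ValueError("maxStalenessSeconds cannot be used with primary reads")
        if max_staleness_seconds < MIN_MAX_STALENESS_SECONDS:
            raise ValueError("maxStalenessSeconds must be at least 90")
        options["maxStalenessSeconds"] = max_staleness_seconds
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Hypothetical hosts, for illustration only.
HOSTS = ["mongo-0.example.internal:27017", "mongo-1.example.internal:27017"]

# Default strategy from the slide: read and write via the primary.
primary_uri = build_uri(HOSTS, "rs0")

# Replica reads, but only from secondaries estimated to be at most 120 s behind.
replica_uri = build_uri(HOSTS, "rs0", "secondaryPreferred", 120)

print(primary_uri)
print(replica_uri)
```

For telemetry that must not be stale, the first URI (reads via the primary) matches the choice described in the use case.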

Sharding
Use sharding when:
●Your dataset size is more than 4 TB ->
use sharding (MongoDB Atlas specific)

●When it is not possible to scale further
vertically

Stream processing
with ksqldb.
Why and how to scale
●Execute more processing procedures ->
scale vertically

●Handle more workload ->
scale horizontally (add more instances
to cluster)


●Instead of sharding, spin up another cluster
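
One thing worth keeping in mind when adding instances: ksqlDB runs on Kafka Streams, where work is parallelized per input topic partition, so instances beyond the partition count have nothing to process. The sketch below illustrates this with a simple round-robin split. It is an assumption-laden simplification, not the actual Kafka Streams assignment algorithm, and `assign_partitions` is an invented helper.

```python
def assign_partitions(partition_count, instance_count):
    """Spread partitions over instances round-robin (simplified illustration)."""
    assignment = {instance: [] for instance in range(instance_count)}
    for partition in range(partition_count):
        assignment[partition % instance_count].append(partition)
    return assignment


# 6 partitions over 3 instances: every instance gets work.
balanced = assign_partitions(6, 3)

# 6 partitions over 8 instances: two instances stay idle.
oversized = assign_partitions(6, 8)
idle_instances = [i for i, parts in oversized.items() if not parts]

print(balanced)
print("idle instances:", idle_instances)
```

In this simplified model, horizontal scaling helps only up to the number of partitions; beyond that, more partitions or another cluster are needed.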

Closing questions
Is horizontal scaling always
better than vertical?
●It depends


Are reads from replicas a silver
bullet for a performance boost?
●What is SLA for replication?
Do you need sharding from day 1?
●No


Does sharding come at additional cost?
●Yes. It is one more cluster
(or master-slave replica set)

Recommendations
●Understand your workload
●Do capacity planning &
benchmarking & performance testing
●Calculate cost
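
A back-of-the-envelope capacity estimate might look like the sketch below. All workload numbers (50,000 sensors, one reading per minute, 200 bytes per reading, one year of retention) are made up for illustration. The 4 TB figure is the MongoDB Atlas sharding threshold mentioned earlier in the slides. The result is raw data size only and ignores indexes, compression and replication overhead.

```python
TB = 10**12  # decimal terabyte

SHARDING_THRESHOLD_BYTES = 4 * TB  # threshold quoted in the slides (Atlas specific)


def raw_storage_bytes(sensors, readings_per_sensor_per_minute,
                      bytes_per_reading, retention_days):
    """Estimate raw telemetry volume kept for the whole retention period."""
    readings_per_day = sensors * readings_per_sensor_per_minute * 60 * 24
    return readings_per_day * bytes_per_reading * retention_days


# Hypothetical workload, for illustration only.
estimate = raw_storage_bytes(
    sensors=50_000,
    readings_per_sensor_per_minute=1,
    bytes_per_reading=200,
    retention_days=365,
)
exceeds_threshold = estimate > SHARDING_THRESHOLD_BYTES

print(f"estimated raw data: {estimate / TB:.3f} TB")
print("above the 4 TB threshold:", exceeds_threshold)
```

An estimate like this is only a starting point; benchmarking and performance testing with realistic data are still needed before committing to a topology.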

Recap
●AKF Scale Cube
●Horizontal/vertical scaling,
replication/sharding

●Scaling for IoT processing system
(Redis, MongoDB, ksqldb)

Learn more about Master's
programs at SET University
Learn more about Star