"Choosing proper type of scaling", Olena Syrota
fwdays
27 slides
Jun 18, 2024
About This Presentation
Imagine an IoT processing system that is already mature and production-ready. Its client base is growing, so scaling and performance have become life-and-death questions. The system uses Redis, MongoDB, and stream processing based on ksqldb. In this talk, we will first analyze scaling approaches and then select the proper ones for our system.
About me
Software architect at Star
Lecturer at SET University
20+ years of experience in IT
Worked in various positions, from developer to architect
Tech stack: Java, Spring, cloud, IoT
Certified as SEI Software Architecture Professional
Certified as Stanford Advanced Project Manager
Olena Syrota
LinkedIn
About TechTalk
Topic: “Choosing proper type of scaling”
Level 2: Intermediate
Imagine an IoT processing system that is already mature and production-ready.
Its client base is growing, so scaling and performance
have become life-and-death questions.
The system uses Redis, MongoDB, and stream
processing based on ksqldb.
In this talk we will first analyse scaling
approaches and then select the proper
ones for our system.
Agenda
Scaling perspectives:
•AKF Scale Cube
•Vertical scaling vs horizontal scaling
•Replication vs sharding
IoT processing system scaling
(Redis, MongoDB, ksqldb)
Scaling
perspectives
Perspective 1: AKF Scale Cube
https://akfpartners.com/growth-blog/scale-cube
Perspective 2
Vertical
Horizontal
Replication
Sharding
Same dataset
Dataset is split
Replication vs sharding
REPLICATION
●Same data set
●High availability
●Increases read capacity*
●Increases data locality
(data closer to the reader)*
●Dedicated replicas can be used for
specific purpose (reporting etc)
SHARDING
●Each shard is different dataset
●Accommodates more data.
Increases read and write capacity
●Each shard is actually a separate
cluster (or master-slave replica set)
●Within each shard replication
can be applied
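The difference between the two columns above can be sketched in a few lines of Python. This is a toy in-memory model, not how any particular database implements it: the class names `ReplicaSet` and `ShardedStore`, the SHA-256 based routing, and the synchronous copy to replicas are all assumptions made for illustration.

```python
import hashlib
from itertools import cycle


class ReplicaSet:
    """One primary plus read replicas; every node holds the SAME dataset."""

    def __init__(self, name, replica_count):
        self.name = name
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin over replicas to spread read load (illustrative policy).
        self._next = cycle(range(replica_count)) if replica_count else None

    def write(self, key, value):
        self.primary[key] = value
        # Copied synchronously here for simplicity; real replication is
        # usually asynchronous, which is where stale reads come from.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        if self._next is None:
            return self.primary.get(key)
        return self.replicas[next(self._next)].get(key)


class ShardedStore:
    """The dataset is SPLIT: each shard is its own replica set."""

    def __init__(self, shard_count, replicas_per_shard):
        self.shards = [
            ReplicaSet(f"shard-{i}", replicas_per_shard)
            for i in range(shard_count)
        ]

    def _shard_for(self, key):
        # Hypothetical routing rule: stable hash of the key modulo shard count.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def write(self, key, value):
        self._shard_for(key).write(key, value)

    def read(self, key):
        return self._shard_for(key).read(key)
```

Adding replicas to a `ReplicaSet` increases read capacity only, because every write still goes through one primary. Adding shards to a `ShardedStore` increases both read and write capacity, at the cost of running one more replica set.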
Scaling use case:
IoT processing system
Simplified diagram
Redis. Why to scale
Reason why to scale
●Availability
●Accommodate more data
●Handle more reads
Our use case
●Sensors
●Validation of sensors
●Reason for scaling - availability of cache
https://architecturenotes.co/redis/
Vertical scaling
Horizontal scaling, 1 shard, replication
●Redis cluster
●Sharding (start from 1 shard)
●Replicas for each shard (master, slave)
Horizontal scaling. N shards
●N shards to evenly distribute load
Redis. How to scale
https://architecturenotes.co/redis/
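To make "N shards to evenly distribute load" concrete: Redis Cluster maps every key to one of 16384 hash slots using CRC16 of the key modulo 16384, and each shard owns a subset of the slots. The sketch below reimplements that mapping in plain Python for illustration. The `shard_for_slot` helper assumes slots are split into equal contiguous ranges, which is a hypothetical layout; a real cluster reports its actual slot assignment.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 variant used for Redis Cluster key slots (poly 0x1021, init 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honouring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


def shard_for_slot(slot: int, shard_count: int) -> int:
    """Hypothetical layout: slots divided into equal contiguous ranges."""
    return slot * shard_count // 16384
```

Keys that share a hash tag, for example `{sensor:42}:status` and `{sensor:42}:config`, land in the same slot and therefore on the same shard, which keeps multi-key operations on one sensor local to a single shard.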
Redis cluster. Replication.
Should I read from slave?
Arguments
●Technically possible
●Client can obtain stale data. How stale?
●Suitable for data that rarely changes
Our use-case. Are reads of stale data
acceptable?
●Yes
●Validation of connected sensors
Redis cluster.
Why may I need more
read replicas?
Up to 5 read replicas
Boost read performance
●if your use case allows stale data
Redis cluster.
Summary
Vertical
●If you cannot achieve your goals with
horizontal/replication
●If cost is the same with vertical as with
horizontal/replication
Horizontal via replication
●If you have benefits from replicas (availability, reading
from replicas)
Horizontal via sharding
●Amount of data
https://architecturenotes.co/redis/
MongoDB. Why to scale
Reasons to scale
●More memory to fit more indexes
●More IOPS to write/read from disk faster
●More CPU to make in-memory operations
faster
Our use case:
●Timeseries data
●Accommodate telemetry data
(specific amount)
●Handle write scenarios (predictable load)
●Handle more reads as user number grows
Vertical scaling vs horizontal
replication
Vertical
●Scaling up each node of cluster
Horizontal replication
●Adding more nodes (beyond the default 3 instances)
Advice
●Compare cost
●Check SLA for replication
Default strategy
●Write and read via primary replica
Reads from replica, cons
●It may take up to 90 seconds to replicate data
●When reading from a secondary, you can set the
parameter maxStalenessSeconds (minimum 90 sec),
which excludes secondaries that lag too far behind; with a
"preferred" read mode, reads then fall back to the primary
Our use case
●Reading from replicas was not an option
●Telemetry data should not be stale
Should I read from replica?
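The effect of maxStalenessSeconds can be sketched as a small selection function. This is a simplified model, not the driver's actual server-selection algorithm: the `Member` type, the `lag_seconds` field, and the "pick the freshest secondary" rule are assumptions for illustration. In a real deployment the behaviour is requested through the connection options `readPreference=secondaryPreferred` and `maxStalenessSeconds=<value>`.

```python
from dataclasses import dataclass

# MongoDB rejects maxStalenessSeconds values below 90.
MIN_MAX_STALENESS_SECONDS = 90


@dataclass
class Member:
    name: str
    is_primary: bool
    lag_seconds: float  # estimated replication lag (hypothetical input)


def select_for_read(members, max_staleness_seconds):
    """Prefer a sufficiently fresh secondary, otherwise fall back to primary."""
    if max_staleness_seconds < MIN_MAX_STALENESS_SECONDS:
        raise ValueError("maxStalenessSeconds must be at least 90")
    fresh = [
        m for m in members
        if not m.is_primary and m.lag_seconds <= max_staleness_seconds
    ]
    if fresh:
        # Simplification: take the least-lagged secondary.
        return min(fresh, key=lambda m: m.lag_seconds)
    return next(m for m in members if m.is_primary)
```

Because the smallest allowed bound is 90 seconds, this mechanism cannot guarantee fresher reads than that, which is consistent with the choice above to keep telemetry reads on the primary.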
Sharding
Use sharding when:
●Your dataset is larger than 4 TB
(MongoDB Atlas specific)
●It is not possible to scale further
vertically
Stream processing
with ksqldb.
Why and how to scale
●Execute more processing procedures ->
scale vertically
●Handle more workload ->
scale horizontally (add more instances
to cluster)
●Instead of sharding, spin another cluster
Closing questions
Is horizontal scaling always
better than vertical?
●It depends
Are reads from replicas a silver
bullet for a performance boost?
●What is SLA for replication?
Do you need sharding from day 1?
●No
Does sharding come at additional cost?
●Yes. It is one more cluster
(or master-slave replica set)