Cost-Effective Burst Scaling For Distributed Query Execution

ScyllaDB · 16 slides · Jun 25, 2024

About This Presentation

Building a query engine that scales efficiently is a difficult task. Queries over big datasets stored in object storage require a large amount of I/O and compute power. Keeping the latency of expensive queries acceptable when using a fixed-size compute cluster is only possible when over-provisioning ...


Slide Content

Cost-Effective Burst Scaling for Distributed Query Execution
Dan Harris, Principal Software Engineer @ Coralogix

Dan Harris (he/him), Principal Software Engineer @ Coralogix
- Building the DataPrime Query Engine, a distributed query engine for low-cost, real-time analytics on observability data
- Apache Arrow committer
- To the user, a query engine is only as good as its p99 latency
- When I'm not hacking, I'm a runner, hiker, husband and dad (not in that order :))

DataPrime Query Engine: Goals
- Native execution of DataPrime, a custom query language optimized for heterogeneous, semi-structured data
- Low-latency interactive query execution over petabytes of highly dynamic data
- Cost-effectiveness
  - Separation of compute and storage
  - Efficient resource utilization to manage infrastructure costs

DataPrime Query Engine: Architecture
- Based on Arrow Ballista, a distributed execution engine for Arrow DataFusion
- Two component services:
  - Scheduler: responsible for distributed query planning and orchestration
  - Executor: executes tasks assigned by the scheduler
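The scheduler/executor split can be pictured as a work queue: the scheduler owns the task queue and executors pull tasks from it. A minimal toy sketch of that relationship (these class names and the pull-based loop are illustrative, not Ballista's actual API):

```python
from collections import deque

class Scheduler:
    """Toy scheduler: holds a queue of ready tasks and hands them out."""
    def __init__(self, tasks):
        self.queue = deque(tasks)

    def next_task(self):
        return self.queue.popleft() if self.queue else None

class Executor:
    """Toy executor: pulls tasks from the scheduler and runs them."""
    def __init__(self, name, scheduler):
        self.name = name
        self.scheduler = scheduler
        self.completed = []

    def run(self):
        while (task := self.scheduler.next_task()) is not None:
            self.completed.append(task())  # execute the task closure

sched = Scheduler([lambda i=i: i * i for i in range(4)])
ex = Executor("executor-0", sched)
ex.run()
print(ex.completed)  # [0, 1, 4, 9]
```

In the real engine the scheduler also tracks task state and shuffle locations; the point here is only the division of labor between the two services.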

Autoscaling: Thesis
- Customer workloads are fairly predictable
  - Follow normal business hours in a given region
- Query execution can be parallelized very effectively
- Separation of compute and storage
  - Scaling compute does not require moving data around

Autoscaling: Antithesis
- Individual queries can be very large
  - Not uncommon to scan ~500 TB for a single query
- By the time you scale up, the query is already done or has already timed out
- Auto-scaling signals are laggy
- Servers are expensive…

Autoscaling: Synthesis
We can decompose this into two sub-problems:
- Problem 1: How do we keep enough spare capacity to manage sudden bursts without falling over?
- Problem 2: How do we manage sustained load cost-effectively?

Autoscaling: Right Tool for the Job
- Problem 1: AWS Lambda
  - Instant(ish) availability
  - Only pay for what you use
  - Really, really expensive when you do use it
- Problem 2: AWS EC2
  - Delayed provisioning
  - Pay whether you use it or not
  - Relatively cheap
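The Lambda-vs-EC2 trade-off comes down to a break-even utilization: Lambda bills only for busy time, EC2 bills for the whole provisioned window. A back-of-the-envelope sketch with illustrative prices (the numbers below are assumptions for the arithmetic, not current AWS pricing; check the pricing pages for real figures):

```python
# Illustrative prices -- assumptions for the arithmetic, not quoted AWS rates.
LAMBDA_PER_GB_SECOND = 0.0000166667   # $/GB-second of function runtime
EC2_PER_HOUR = 0.17                   # $/hour for an assumed 8 GB instance
MEMORY_GB = 8

def lambda_cost(busy_seconds, memory_gb=MEMORY_GB):
    """Lambda bills only for the seconds the function is actually running."""
    return busy_seconds * memory_gb * LAMBDA_PER_GB_SECOND

def ec2_cost(hours):
    """EC2 bills for the whole provisioned window, busy or idle."""
    return hours * EC2_PER_HOUR

full_hour_on_lambda = lambda_cost(3600)
# Utilization above which a provisioned EC2 instance becomes cheaper:
break_even = EC2_PER_HOUR / full_hour_on_lambda

print(f"1h fully busy on Lambda: ${full_hour_on_lambda:.2f}")
print(f"1h on EC2 (any load):    ${ec2_cost(1):.2f}")
print(f"EC2 is cheaper above ~{break_even:.0%} utilization")
```

Under these assumed prices, sustained load above roughly a third utilization is cheaper on EC2, while short bursts are cheaper on Lambda because you pay nothing for the idle time in between.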

Distributed Query Execution: A Brief Primer

DataPrime query:
source logs | filter kubernetes.pod_name ~ 'my-service' | groupby application, severity derive count(1)

Equivalent SQL:
SELECT application, severity, count(1) FROM logs WHERE kubernetes.pod_name LIKE '%my-service%'

Execution plan:
AggregateExec: mode=Final, gby=[application, severity], aggr=[count(1)]
  CoalescePartitionsExec
    AggregateExec: mode=Partial, gby=[application, severity], aggr=[count(1)]
      ProjectionExec: expr=[application, severity]
        TableScanExec: logs, partitions=1000

Distributed Query Execution: A Brief Primer

Distributed execution plan:

Stage 1:
ShuffleWriterExec
  AggregateExec: mode=Partial, gby=[application, severity], aggr=[count(1)]
    ProjectionExec: expr=[application, severity]
      TableScanExec: logs, partitions=1000

Stage 2:
ShuffleWriterExec
  AggregateExec: mode=Final, gby=[application, severity], aggr=[count(1)]
    CoalescePartitionsExec
      ShuffleReadExec
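The partial/final split in the plan above is the heart of distributed aggregation: each executor counts its own partitions (Stage 1), the partial counts are shuffled by group key, and one final pass merges them (Stage 2). A minimal sketch of that two-stage count, with made-up rows standing in for the `logs` table:

```python
from collections import Counter

def partial_aggregate(partition):
    """Stage 1: each executor counts (application, severity) pairs locally."""
    return Counter((row["application"], row["severity"]) for row in partition)

def final_aggregate(partials):
    """Stage 2: merge the shuffled partial counts into the final result."""
    total = Counter()
    for p in partials:
        total.update(p)  # adds counts key-by-key
    return dict(total)

# Two partitions of the `logs` table, as the scheduler might split them.
partitions = [
    [{"application": "api", "severity": "ERROR"},
     {"application": "api", "severity": "INFO"}],
    [{"application": "api", "severity": "ERROR"},
     {"application": "web", "severity": "INFO"}],
]

partials = [partial_aggregate(p) for p in partitions]
result = final_aggregate(partials)
print(result)
# {('api', 'ERROR'): 2, ('api', 'INFO'): 1, ('web', 'INFO'): 1}
```

Because partial counts are tiny compared to the scanned data, only the shuffle between stages crosses the network; the 1000-partition table scan never moves.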

Distributed Query Execution: Data Flow

Hybrid Execution

Autoscaling: Signals
- CPU utilization
  - Per-executor utilization throttled by the scheduler
- Task queue
  - Good metric for pending work
  - Too volatile when sampled
- Task load average
  - Good metric for cluster utilization
  - Accurate sampling
  - Smoothed value is a good metric for "sustained" load
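"Smoothed value" here is the usual trick of running an exponential moving average over the raw samples so a short spike does not trigger a scale-out, while sustained load still shows through. A sketch (the `alpha` value and sample series are illustrative, not the engine's actual tuning):

```python
def smoothed_load(samples, alpha=0.2):
    """Exponential moving average: damps spikes, tracks sustained load."""
    ema = samples[0]
    out = [ema]
    for s in samples[1:]:
        ema = alpha * s + (1 - alpha) * ema
        out.append(ema)
    return out

# A short burst on top of a sustained baseline of ~2 tasks per executor.
raw = [2, 2, 10, 12, 2, 2, 2, 2]
smooth = smoothed_load(raw)
print([round(x, 1) for x in smooth])
# [2, 2, 3.6, 5.3, 4.6, 4.1, 3.7, 3.3]
```

The burst to 12 only lifts the smoothed signal to ~5.3 and it decays afterwards, so it reads as a spike (handled by serverless burst), not as sustained load (which would justify adding EC2 capacity).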

Autoscaling: All Together Now
- Architect for hybrid execution (serverless + EC2)
- Burst spiky compute loads onto serverless workers
- Shift sustained load to cheaper EC2 compute
- Profit!
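Putting the pieces together, the routing decision can be sketched as: prefer free EC2 capacity, burst overflow onto serverless immediately, and use the smoothed load average only as the (slow) signal to grow the EC2 fleet. The function name and threshold below are hypothetical, a sketch of the policy rather than the engine's actual code:

```python
def route_task(ec2_free_slots, smoothed_load, scale_up_at=4.0):
    """Decide where a task runs and whether to grow the EC2 fleet.

    Returns (target, scale_out_ec2):
      target        -- "ec2" if provisioned capacity is free, else "lambda"
      scale_out_ec2 -- True when sustained load justifies adding EC2 nodes
    """
    scale_out = smoothed_load > scale_up_at  # sustained load -> cheaper EC2
    if ec2_free_slots > 0:
        return "ec2", scale_out
    return "lambda", scale_out               # spike: burst to serverless now

# A sudden spike with no free EC2 slots runs on Lambda right away,
# but does not by itself trigger an EC2 scale-out:
print(route_task(ec2_free_slots=0, smoothed_load=2.0))  # ('lambda', False)
```

The key property is that the expensive path (Lambda) absorbs latency-critical bursts instantly, while the cheap path (EC2) only grows in response to the smoothed signal, so you never pay Lambda prices for sustained load longer than the EC2 provisioning delay.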

Coda

Dan Harris
dan@coralogix
@thinkharderdev
https://github.com/thinkharderdev
https://www.linkedin.com/in/dsh2va/
Thank you! Let's connect.