Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines

ScyllaDB · 26 slides · Jul 02, 2024

About This Presentation

In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data ingestion workflows. Instead, we showcase the power of simple yet clever methods that can uncover hidden performance limitations.

Attendees will discover unc...


Slide Content

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines. Zamir Paltiel, Head of Engineering at Hyperspace.

Zamir Paltiel, Head of Engineering at Hyperspace.
Performance enthusiast, focused on search performance in recent years.
Loves building things from scratch.
Father to 4 children.

Agenda
Intro - the performance optimization process
How developers measure their pipelines today
The issues one might miss
A case study at Hyperspace
A new approach to finding bottlenecks

Intro - performance optimization process

How to increase the speed of your code: Run a test → Measure → Analyze → Fix/Optimize → Repeat.

Pareto Principle

The bottleneck syndrome

What should I measure? CPU time, cycles, context switches, cache misses, memory, disk, network, mutexes.
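Several of the metrics on this list (CPU time, context switches, peak memory) can be read directly with Python's stdlib `resource` module on Unix, with no profiler attached. A minimal sketch; the `busy` workload is an illustrative stand-in:

```python
# Read CPU time, context switches, and peak RSS for the current process
# via the Unix rusage counters (stdlib `resource`, no external profiler).
import resource
import time

def busy(n: int) -> int:
    # Burn some CPU so the counters have something to show.
    total = 0
    for i in range(n):
        total += i * i
    return total

busy(2_000_000)
time.sleep(0.05)  # sleeping forces a voluntary context switch (off-CPU time)

usage = resource.getrusage(resource.RUSAGE_SELF)
print(f"user CPU time (s):        {usage.ru_utime:.3f}")
print(f"voluntary ctx switches:   {usage.ru_nvcsw}")
print(f"involuntary ctx switches: {usage.ru_nivcsw}")
print(f"peak RSS (KiB on Linux):  {usage.ru_maxrss}")
```

Note that `ru_nvcsw` (voluntary context switches) is a cheap first hint of off-CPU waits: a process that sleeps, blocks on I/O, or waits on locks racks these up.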

How devs measure code speed today

CPU Profilers
A CPU profiler shows what functions consume what percentage of CPU time.
Event-based profilers - use built-in hooks that the runtime framework provides. Applicable to managed code like Java, Go, C#.
Instrumentation profilers - add code to the program that collects the required info. Injected at runtime or during the compilation phase. Adds overhead to code execution that sometimes distorts reality.
Sampling profilers - collect call stacks of the process at regular intervals. Less intrusive - the code runs as usual. A very good approximation of the percentage of CPU time each function took. No knowledge of the actual time an operation took - misses off-CPU time.
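The sampling approach described above can be sketched in a few lines: a background thread periodically captures the main thread's call stack and tallies function names. This is the same idea real sampling profilers (py-spy, perf) use, vastly simplified; `parse_json_like` is a made-up stand-in workload, not anything from the talk:

```python
# Toy sampling profiler: snapshot the main thread's stack at intervals
# via sys._current_frames() and count which functions appear.
import collections
import sys
import threading
import time

samples = collections.Counter()
stop = threading.Event()
main_id = threading.get_ident()  # captured in the main thread

def sampler(interval: float = 0.005) -> None:
    while not stop.is_set():
        frame = sys._current_frames().get(main_id)
        while frame is not None:           # walk the whole stack so callers
            samples[frame.f_code.co_name] += 1  # are attributed too
            frame = frame.f_back
        time.sleep(interval)

def parse_json_like() -> None:
    # CPU-bound stand-in for a hot spot such as JSON parsing.
    deadline = time.perf_counter() + 0.3
    while time.perf_counter() < deadline:
        sum(i * i for i in range(1000))

t = threading.Thread(target=sampler, daemon=True)
t.start()
parse_json_like()
stop.set()
t.join()
print(samples.most_common(3))
```

The counts approximate the share of CPU time per function, but, exactly as the slide warns, a sample taken while the thread is blocked looks identical to one taken while it is computing, so wall-clock ("actual") time per operation is invisible.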

Flame Graph

The issues one might miss

Off CPU Time
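Off-CPU time never shows up in a CPU profile, but the gap between wall-clock time and process CPU time exposes it directly. A minimal sketch, with a `sleep` standing in for any blocking wait (lock, I/O, syscall):

```python
# Detect off-CPU time as the gap between wall-clock and CPU time.
import time

def pipeline_step() -> None:
    sum(i * i for i in range(200_000))  # on-CPU work
    time.sleep(0.2)                     # off-CPU wait (lock, I/O, syscall...)

wall_start = time.perf_counter()
cpu_start = time.process_time()
pipeline_step()
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
off_cpu = wall - cpu
print(f"wall={wall:.3f}s cpu={cpu:.3f}s off-cpu={off_cpu:.3f}s")
```

A sampling CPU profiler would attribute essentially all of this step to the `sum` call, even though most of its wall-clock time is spent blocked.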

Case study at Hyperspace

Sample Ingestion Pipeline

Let’s profile it!

Flame Graph Analysis
JSON parsing seems like the most significant part - it can be replaced by binary serialization.
Challenge - using the top command we see that CPU utilization is only 2 of 8 CPUs - it seems there is a bottleneck that prevents us from reaching maximum ingestion speed.
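The "2 of 8 CPUs" signal from `top` can be reproduced in-process: CPU-seconds consumed per wall-clock second equals the number of cores the process is keeping busy. A sketch, assuming a single-threaded workload (so utilization should hover near 1 core, well under the machine's count):

```python
# Estimate how many cores a workload keeps busy: cpu time / wall time.
import os
import time

def measure_core_usage(work) -> float:
    wall0, cpu0 = time.perf_counter(), time.process_time()
    work()
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    return cpu / wall  # ~number of cores kept busy during `work`

def single_threaded_work() -> None:
    deadline = time.perf_counter() + 0.3
    while time.perf_counter() < deadline:
        pass  # spin on one core only

used = measure_core_usage(single_threaded_work)
print(f"~{used:.1f} of {os.cpu_count()} cores busy")
```

When this number sits far below the core count at full ingestion load, the pipeline is throttled by something other than CPU - exactly the situation the slide describes.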

A new approach to find bottlenecks

Finding the real bottleneck

What we found
If you have an off-CPU bottleneck, the more load you add, the higher the percentage of time it will take.
Redundant lock - it turns out that the FPGA SDK is thread-safe.
Usage of a single file descriptor to communicate with the PCI interface.
Underutilizing the speed of the PCI bus - no usage of multithreading, not utilizing the 4 channels that exist in the PCI interface.
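The fix described above can be sketched generically: instead of funneling every write through one shared file descriptor behind a lock, shard the stream across workers that each own one channel. The PCI channels are simulated with in-memory buffers here; the names and structure are illustrative assumptions, not Hyperspace's actual SDK:

```python
# Shard an ingestion stream across N independent channel writers,
# mimicking the move from one locked FD to 4 parallel PCI channels.
import queue
import threading

NUM_CHANNELS = 4  # the slide mentions 4 channels in the PCI interface

channels = [[] for _ in range(NUM_CHANNELS)]      # stand-ins for per-channel FDs
queues = [queue.Queue() for _ in range(NUM_CHANNELS)]

def channel_writer(idx: int) -> None:
    # Each worker owns one channel exclusively, so the hot path needs no lock.
    while True:
        item = queues[idx].get()
        if item is None:  # sentinel: shut down
            return
        channels[idx].append(item)

workers = [threading.Thread(target=channel_writer, args=(i,))
           for i in range(NUM_CHANNELS)]
for w in workers:
    w.start()

# Round-robin the ingestion stream across the channels.
records = [f"record-{i}" for i in range(1000)]
for i, rec in enumerate(records):
    queues[i % NUM_CHANNELS].put(rec)

for q in queues:
    q.put(None)   # one sentinel per worker
for w in workers:
    w.join()

print([len(c) for c in channels])  # → [250, 250, 250, 250]
```

Giving each worker exclusive ownership of its channel is what removes the redundant lock: coordination happens only at the (cheap) queue hand-off, not on the device write path.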

Increase of ~30% in ingestion speed

Summary
There are important factors to measure apart from CPU time.
We reviewed a new methodology for finding off-CPU bottlenecks.
This new process yielded a significant improvement in Hyperspace's ingestion speed.

LinkedIn: Zamir Paltiel. Website: hyper-space.io. Thank you! Let's connect.