The Art of Event Driven Observability with OpenTelemetry

ScyllaDB 265 views 41 slides Jun 26, 2024

About This Presentation

To make sure we provide a reliable system to our users, we have implemented observability on our critical applications and services, to be able to understand precisely what is really happening in our environment. To help us in our observability journey, OpenTelemetry has provided a standard on the way we ar...


Slide Content

The Art of Event-Driven Observability with OpenTelemetry Henrik Rexed Cloud Native Advocate | Dynatrace

Henrik Rexed Cloud Native Advocate 15+ years of Performance engineering Owner of Producer of

If you stay with me you will: get a gentle reminder on OpenTelemetry (the various components, how to produce traces); learn the various ways of instrumenting an EDA architecture; see the value of span links.

Once Upon a Time

24 years ago… My manager taught me how to use tools to get system health (perfmon, top, nmon, etc.). Web Server Health 1999

20 years ago… Solutions that kept history and provided metadata from our environment helped me to design a better visualization: Web Server Health 2004

13 years ago… With the usage of APM solutions, distributed traces, and metrics, it was easier to represent the situation: Web Server Health 2010

10 years ago… By looking at the logs produced by our application and servers, the situation was... a bit... clearer. Web Server Health 2014

What I wanted to represent

Our artistic tools

Observability pillars: Logs, Events, Metrics, Traces, Profiling, Exceptions

Why do we need several signals? Analysis Context

We isolate our signals. Most organizations tend to separate the usage and storage of: Logs, Traces, Profiling, Metrics

OpenTelemetry

What is OpenTelemetry (OTel)? OTel provides a set of APIs, libraries, and tools to capture distributed traces, metrics, and logs from your applications. Diagram: services (Python, Node.js) running on Azure, Kubernetes, or GCP use the OTel API + SDK and send their data through an optional OTel Collector. Example: const span = tracer.startSpan('operation1'); span.setAttribute('customer_level', 'gold'); span.end();

The main components of OpenTelemetry: Instrument, Collector

OpenTelemetry, the standard for observability: Metrics, Logs, Traces, Continuous Profiling

How to produce traces?

How to add OTel to your applications? Services (Java, Python, Node.js) send telemetry through an optional OTel Collector to the observability backend.
Manual instrumentation: const span = tracer.startSpan('operation1'); span.setAttribute('customer_level', 'gold'); span.end();
Semi-automatic/auto instrumentation:
    // Pre-instrumented libraries
    pip install opentelemetry-instrumentation
    opentelemetry-instrument python myapp.py
    // OTel agent
    java -javaagent:path/to/otel-javaagent.jar -jar myapp.jar

What is a trace? A trace is made of spans. The trace context glues all the various spans into a trace (diagram: Span 1 to Span 6). A span carries: Name, Context, Parent_id, Start_time, End_time, Attributes, Events.
    {
      "name": "Hello-Greetings",
      "context": {
        "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2",
        "span_id": "0x5fb397be34d26b51"
      },
      "parent_id": "0x051581bf3cb55c13",
      "status_code": "STATUS_CODE_OK",
      "start_time": "2022-04-29T18:52:58.114304Z",
      "end_time": "2022-04-29T18:52:58.114435Z",
      "attributes": { "http.route": "some_route1" },
      "events": [
        { "name": "hey there!", "timestamp": "2022-04-29T18:52:58.114561Z", "attributes": { "event_attributes": 1 } },
        { "name": "bye now!", "timestamp": "2022-04-29T22:52:58.114561Z", "attributes": { "event_attributes": 1 } }
      ]
    }
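To make the role of parent_id concrete, here is a minimal sketch in plain Python (no OpenTelemetry dependency) that rebuilds the span tree of one trace and prints each span with its duration. The span names, ids, and timestamps are hypothetical and only use the fields listed on the slide.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical spans of a single trace, reduced to the fields shown on the slide.
spans = [
    {"name": "checkout", "span_id": "a1", "parent_id": None,
     "start_time": "2022-04-29T18:52:58.000000Z", "end_time": "2022-04-29T18:52:58.300000Z"},
    {"name": "charge card", "span_id": "c3", "parent_id": "a1",
     "start_time": "2022-04-29T18:52:58.120000Z", "end_time": "2022-04-29T18:52:58.280000Z"},
    {"name": "publish order", "span_id": "b2", "parent_id": "a1",
     "start_time": "2022-04-29T18:52:58.050000Z", "end_time": "2022-04-29T18:52:58.100000Z"},
]

def parse(timestamp):
    """Parse the UTC timestamp format used in the span example."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")

def duration_ms(span):
    """Span duration in milliseconds (end_time minus start_time)."""
    delta = parse(span["end_time"]) - parse(span["start_time"])
    return delta.total_seconds() * 1000

def build_tree(spans):
    """Group spans by parent_id; the root span has parent_id None."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)
    return children

def render(children, parent_id=None, depth=0):
    """Return one indented line per span, children ordered by start time."""
    lines = []
    for span in sorted(children[parent_id], key=lambda s: s["start_time"]):
        lines.append(f"{'  ' * depth}{span['name']} ({duration_ms(span):.0f} ms)")
        lines.extend(render(children, span["span_id"], depth + 1))
    return lines

print("\n".join(render(build_tree(spans))))
```

This is essentially what a tracing backend does when it draws the waterfall view: the trace id selects the spans, and parent_id decides the nesting.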

Tracing instrumentation with the OpenTelemetry API: Propagator, TracerProvider, SpanProcessor, Sampler, Exporter, Resource, Span

Resource. A resource is the identity of a process producing telemetry data. The resource is key to naming telemetry components in the observability backends. There are standard attributes to define a resource:
    Attribute            Type    Description                     Required?
    service.name         String  Name of the service             Yes
    service.namespace    String  Namespace of the service.name   No
    service.instance.id  String                                  No
    service.version      String  Version number of the service   No
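One common way to set these attributes without touching code is through the SDK environment variables OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES (a comma-separated list of key=value pairs). The sketch below only builds those two values from a dictionary; the helper name resource_env and the attribute values are hypothetical.

```python
def resource_env(attributes):
    """Build OTEL_* environment variables from resource attributes.

    Hypothetical helper: it enforces that service.name is present, since
    that is the only attribute marked as required in the table above.
    """
    if not attributes.get("service.name"):
        raise ValueError("service.name is required")
    extra = {k: v for k, v in attributes.items() if k != "service.name"}
    return {
        "OTEL_SERVICE_NAME": attributes["service.name"],
        "OTEL_RESOURCE_ATTRIBUTES": ",".join(f"{k}={v}" for k, v in extra.items()),
    }

# Illustrative values only.
env = resource_env({
    "service.name": "checkout",
    "service.namespace": "shop",
    "service.version": "1.4.2",
})
for key, value in env.items():
    print(f"{key}={value}")
```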

Tracing Sampler. Samplers: AlwaysOn, AlwaysOff, ParentBased, TraceIdRatioBased. Sampler settings: parentbased_always_on, parentbased_traceidratio, parentbased_always_off. Diagram: Service A, Service B, Service C, Service D, and Service E send to the observability backend with sampling rates of 50%, 25%, 50%, 10%, and 10%. Result: 1000 requests = 1 request with an End2End trace with spans from Service E.
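The "1000 requests = 1 request" figure can be checked with a few lines of arithmetic. This sketch assumes that each of the five services makes its own independent ratio-based decision and that the percentages map to Services A to E in the order shown; both are assumptions about the diagram, not something the slide states.

```python
from fractions import Fraction
from math import prod

# Assumed mapping of the slide's percentages to services, in order.
sampling_rates = {
    "Service A": Fraction(50, 100),
    "Service B": Fraction(25, 100),
    "Service C": Fraction(50, 100),
    "Service D": Fraction(10, 100),
    "Service E": Fraction(10, 100),
}

# A trace is complete end to end only if every service keeps its spans,
# so with independent decisions the probabilities multiply.
end_to_end = prod(sampling_rates.values())

requests = 1000
expected_complete = requests * end_to_end

print(f"P(end-to-end trace) = {end_to_end} = {float(end_to_end):.6f}")
print(f"Expected complete traces per {requests} requests: {float(expected_complete)}")
```

The product is 1/1600, so about 0.6 complete traces per 1000 requests, which rounds to the single request on the slide. With a parent-based sampler, downstream services follow the decision made at the root instead of sampling again, which avoids this multiplication.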

What is propagation? Context propagation between Service A and Service B: Service A injects the trace context into the outgoing request, and Service B extracts it.
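As an illustration of inject and extract, here is a simplified sketch of the W3C Trace Context `traceparent` header (version 00), written with the standard library only. It is not the OpenTelemetry propagator itself, it skips parts of the specification such as rejecting all-zero ids and `tracestate`, and the function names are hypothetical.

```python
import re

# traceparent: version "00", 32 hex trace id, 16 hex parent span id, 2 hex flags.
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def inject(headers, trace_id, span_id, sampled=True):
    """Service A: write the current trace context into outgoing headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers

def extract(headers):
    """Service B: read the caller's context, or None if absent or malformed."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_span_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": (int(flags, 16) & 1) == 1,
    }

# Service A side (ids reused from the span example, without the 0x prefix).
outgoing = inject({}, "5b8aa5a2d2c872e8321cf37308d69df2", "5fb397be34d26b51")
# Service B side.
context = extract(outgoing)
print(outgoing["traceparent"])
print(context)
```

Service B then creates its own span with the extracted trace id and uses parent_span_id as its parent, which is how spans from both services end up in the same trace.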

In a traditional microservice architecture

Distributed tracing in a normal architecture. Distributed tracing allows us to keep track of all the tasks of a given transaction. With this visualization we can understand where we are spending time in our transaction flow.

Distributed tracing in EDA. With OpenTelemetry, I can represent my transactions in various ways: as one big trace, or by separating my transactions into sub-transactions.
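The "one big trace" option can be sketched as follows: the producer puts its trace context into the message headers, and the consumer continues the same trace by reusing the trace id and taking the producer span as parent. This is a plain-Python model, not the OpenTelemetry API; the Span class, the header keys, and the span names are hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    """Minimal stand-in for a span: just enough to show the relationships."""
    name: str
    trace_id: str
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    parent_id: Optional[str] = None

def produce(order_id):
    """Producer: start a trace and carry its context in the message headers."""
    span = Span("publish order", trace_id=secrets.token_hex(16))
    message = {
        "body": {"order_id": order_id},
        "headers": {"trace_id": span.trace_id, "span_id": span.span_id},
    }
    return span, message

def consume(message):
    """Consumer: continue the producer's trace instead of starting a new one."""
    headers = message["headers"]
    return Span("process order", trace_id=headers["trace_id"],
                parent_id=headers["span_id"])

producer_span, message = produce(order_id=42)
consumer_span = consume(message)
print(consumer_span.trace_id == producer_span.trace_id)   # same trace
print(consumer_span.parent_id == producer_span.span_id)   # producer is the parent
```

Because the consumer span is a child of the producer span, the time the message waits in the queue becomes part of the trace duration, which is one reason the alternative of separate sub-transactions exists.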

Example 1 – End2End trace

Our environment: Otel-demo, OpenTelemetry Operator, Default EDA. https://github.com/isItObservable/tracing_eda

Are generated distributed traces useful?

Example 2 – The usage of Span Links

Span Links. A span link is a feature that creates a relationship between separate spans. Diagram labels: ~2min, ~200ms, ~2min, ~1m30, connected by span links.
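To show the idea behind a span link, the sketch below models it with plain dataclasses rather than the OpenTelemetry API: the consumer starts its own trace and records the producer's span context as a link instead of as a parent. Class names, field names, and span names are hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class SpanContext:
    """The identity of a span: which trace it belongs to and its own id."""
    trace_id: str
    span_id: str

@dataclass
class Span:
    name: str
    context: SpanContext
    links: List[SpanContext] = field(default_factory=list)

def start_new_trace(name, links=()):
    """Start a root span in a brand-new trace, optionally linked to other spans."""
    context = SpanContext(secrets.token_hex(16), secrets.token_hex(8))
    return Span(name, context, list(links))

# Producer and consumer each get their own trace.
producer = start_new_trace("publish order")
# The consumer keeps a pointer back to the producer through a link.
consumer = start_new_trace("process order", links=[producer.context])

print(consumer.context.trace_id != producer.context.trace_id)  # separate traces
print(consumer.links[0] == producer.context)                   # but still related
```

Each trace stays short and can be sampled on its own, while the link lets a backend that supports it navigate from the consumer trace to the producer trace.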

Our environment: Otel-demo, OpenTelemetry Operator, Default EDA. https://github.com/isItObservable/tracing_eda

Conclusion

Pros Cons Sampling decision Relation between Producer & Consumer Visualize traces Keep track on the consumers Pros/Cons

Take Away. Instrumentation: make sure your code is agnostic, using no vendor library or exporter. Observability: make sure the metrics produced have enough dimensions; produce logs with contextual information; add span attributes to your spans. Creativity: understand your system; design the right observability.
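As one possible reading of "produce logs with contextual information", the sketch below uses Python's standard logging module to stamp every log record with a trace id and span id. The ids are hard-coded here for illustration; in a real service they would come from the currently active span. The class name and logger name are hypothetical.

```python
import io
import logging

class TraceContextFilter(logging.Filter):
    """Attach trace_id and span_id to every record passing through the logger."""

    def __init__(self, trace_id, span_id):
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record):
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True  # never drop the record, only enrich it

# Log to an in-memory stream so the example has no side effects.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"))

logger = logging.getLogger("checkout-example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.addFilter(TraceContextFilter("5b8aa5a2d2c872e8321cf37308d69df2",
                                    "5fb397be34d26b51"))

logger.info("payment accepted")
print(stream.getvalue(), end="")
```

With the trace id in every log line, logs and traces stored in separate systems can still be joined on that id.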

Looking for educational content on observability? Check out the YouTube channel: Is It Observable

Thank you! Let’s connect. Henrik Rexed