The Art of Event Driven Observability with OpenTelemetry
ScyllaDB
41 slides
Jun 26, 2024
About This Presentation
To make sure we provide a reliable system to our users, we have implemented observability on our critical applications and services so that we can understand precisely what is really happening in our environment. To help us in our observability journey, OpenTelemetry has provided a standard for how we produce and collect measurements in our environments. Since traces became stable in OpenTelemetry, many organizations have been using them to understand what is happening and where time is spent in their applications. We usually try to have end-to-end traces, starting from the user's mobile device or browser and going all the way to the database.
Building tracing in a microservices architecture is usually a straightforward concept, where each service has dependencies. But for architectures using events or messages to exchange information between services, it's another story. What should a distributed trace look like for event-driven architectures where you potentially have hundreds of asynchronous paths for a single message? Does it make sense to create an end-to-end trace for everything?
Observability is not a science but an art. We all get the brushes, colours, and paper, and we all try to design a painting representing our system. In messaging architectures there is no one solution for all. We need to understand our system to paint the right telemetry. Sometimes in event-driven architectures, span links will resolve our challenge: a transaction is split into several traces, and those traces are attached to one another with the help of span links. During this presentation we will explain:
• The various components of OpenTelemetry
• Examples of not-so-useful traces from event-driven architectures
• The purpose of span links
• The usage of span links in event-driven architectures
Slide Content
The Art of Event-Driven Observability with OpenTelemetry. Henrik Rexed, Cloud Native Advocate | Dynatrace
Henrik Rexed Cloud Native Advocate 15+ years of Performance engineering Owner of Producer of
If you stay with me you will: get a gentle reminder on OpenTelemetry (the various components, how to produce traces); learn the various ways of instrumenting an event-driven architecture (EDA); see the value of span links.
Once Upon a Time
24 years ago… My manager taught me how to use tools to get system health (perfmon, top, nmon, etc.). Web Server Health 1999
20 years ago… Solutions that kept history and provided metadata from our environment helped me design a better visualization: Web Server Health 2004
13 years ago… With APM solutions, distributed traces, and metrics, it was easier to represent the situation: Web Server Health 2010
10 years ago… By looking at the logs produced by our application and servers, the situation was… a bit… clearer. Web Server Health 2014
What I wanted to represent
We isolate our signals. Most organizations tend to separate the usage and storage of: logs, traces, profiling, metrics.
OpenTelemetry
What is OpenTelemetry (OTel)? OTel provides a set of APIs, libraries, and tools to capture distributed traces, metrics, and logs from your applications.
    const span = tracer.startSpan('operation1');
    span.setAttribute('customer_level', 'gold');
    span.end();
Diagram labels: Azure, Kubernetes, GCP; Service Python; Service Node.js; OTel API + SDK; OTel Collector (optional)
The main components of OpenTelemetry: Instrument, Collector
OpenTelemetry, the standard for observability: metrics, logs, traces, continuous profiling
How to produce traces?
How to add OTel to your applications?
Manual instrumentation:
    const span = tracer.startSpan('operation1');
    span.setAttribute('customer_level', 'gold');
    span.end();
Semi-automatic / auto instrumentation:
    # Pre-instrumented libraries
    pip install opentelemetry-instrumentation
    opentelemetry-instrument python myapp.py
    # OTel agent
    java -javaagent:path/to/otel-javaagent.jar \
         -jar myapp.jar
Diagram labels: Service Java; Service Python; Service Node.js; OTel Collector (optional); Observability backend
What is a trace? A trace is made of spans. The trace context glues all the various spans into a trace.
Diagram labels: Span 1, Span 2, Span 3, Span 4, Span 5, Span 6
Span fields: Name, Context, Parent_id, Start_time, End_time, Attributes, Events
    {
      "name": "Hello-Greetings",
      "context": {
        "trace_id": "0x5b8aa5a2d2c872e8321cf37308d69df2",
        "span_id": "0x5fb397be34d26b51"
      },
      "parent_id": "0x051581bf3cb55c13",
      "status_code": "STATUS_CODE_OK",
      "start_time": "2022-04-29T18:52:58.114304Z",
      "end_time": "2022-04-29T18:52:58.114435Z",
      "attributes": {
        "http.route": "some_route1"
      },
      "events": [
        {
          "name": "hey there!",
          "timestamp": "2022-04-29T18:52:58.114561Z",
          "attributes": { "event_attributes": 1 }
        },
        {
          "name": "bye now!",
          "timestamp": "2022-04-29T22:52:58.114561Z",
          "attributes": { "event_attributes": 1 }
        }
      ]
    }
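To make the role of Parent_id concrete, here is a minimal sketch of how a backend could rebuild a trace tree from a flat list of spans. It uses only the Python standard library, not the OpenTelemetry SDK; the span names, the short ids, and the helper functions `build_tree` and `render` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified span records (real spans carry many more fields).
spans = [
    {"name": "checkout", "trace_id": "t1", "span_id": "a", "parent_id": None},
    {"name": "payment", "trace_id": "t1", "span_id": "b", "parent_id": "a"},
    {"name": "db.query", "trace_id": "t1", "span_id": "c", "parent_id": "b"},
    {"name": "email", "trace_id": "t1", "span_id": "d", "parent_id": "a"},
]


def build_tree(spans):
    """Return (roots, children), where children maps a span_id to its child spans."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        if span["parent_id"] is None:
            roots.append(span)  # a span without a parent starts the trace
        else:
            children[span["parent_id"]].append(span)
    return roots, children


def render(span, children, depth=0):
    """Return the span and its descendants as indented lines."""
    lines = ["  " * depth + span["name"]]
    for child in children[span["span_id"]]:
        lines.extend(render(child, children, depth + 1))
    return lines


roots, children = build_tree(spans)
tree_lines = render(roots[0], children)
print("\n".join(tree_lines))
```

All four spans share one trace_id, and each parent_id points at the span_id of the caller, which is enough to recover the nesting shown in the span diagram.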
Resource
A resource is the identity of a process producing telemetry data. The resource is key to naming telemetry components in the observability backends. There are standard attributes to define a resource:
Attribute            Type    Description                    Required?
service.name         String  Name of the service            Yes
service.namespace    String  Namespace of the service.name  No
service.instance.id  String                                 No
service.version      String  Version number of the service  No
Tracing Sampler
Samplers: AlwaysOn, AlwaysOff, ParentBased, TraceIdRatioBased
As configuration values: parentbased_always_on, parentbased_traceidratio, parentbased_always_off
Diagram labels: Service A, Service B, Service C, Service D, Service E, Observability backend; sampling rates 50%, 25%, 50%, 10%, 10%
1000 requests = 1 request with an end-to-end trace that includes spans from Service E
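The "1000 requests = 1 request" figure can be checked with a short calculation. This sketch assumes that each of the five services makes its own independent sampling decision at the rate shown on the slide, which is what the slide's arithmetic implies. Samplers that derive the decision consistently from the trace id, or that follow the parent's decision, would behave differently. The function name is hypothetical.

```python
from math import prod

# Rates from the slide. Mapping them to services A..E in this order is an
# assumption, but the product does not depend on the order.
rates = [0.50, 0.25, 0.50, 0.10, 0.10]


def end_to_end_probability(rates):
    """Probability that every service keeps the same request,
    assuming independent sampling decisions."""
    return prod(rates)


p = end_to_end_probability(rates)
expected_complete_traces = 1000 * p

print(f"probability of a complete trace: {p:.6f}")
print(f"complete traces per 1000 requests: {expected_complete_traces:.3f}")
```

Under this assumption, fewer than one request in a thousand ends up with a full end-to-end trace, which is why the sampling decision appears among the trade-offs of the "one big trace" approach.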
What is propagation? Context propagation between Service A and Service B: inject, extract
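As an illustration of inject and extract, the sketch below writes a span context into a carrier using the W3C Trace Context `traceparent` header format (version 00) and reads it back on the receiving side. It uses only the Python standard library and is not the OpenTelemetry propagator API; the `SpanContext` type and the `inject` and `extract` functions are hypothetical names chosen to mirror the slide.

```python
import re
import secrets
from typing import NamedTuple, Optional


class SpanContext(NamedTuple):
    trace_id: str  # 32 lowercase hex characters
    span_id: str   # 16 lowercase hex characters
    sampled: bool


# Version 00 of the traceparent header: version-traceid-parentid-flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(ctx: SpanContext, headers: dict) -> None:
    """Service A: write the current context into the outgoing headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: dict) -> Optional[SpanContext]:
    """Service B: read the caller's context, or None if absent or malformed."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None  # all-zero ids are invalid
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


# Service A creates a context and injects it into the request headers.
ctx = SpanContext(secrets.token_hex(16), secrets.token_hex(8), True)
headers = {}
inject(ctx, headers)

# Service B extracts it and continues the same trace with a new span id.
received = extract(headers)
child = SpanContext(received.trace_id, secrets.token_hex(8), received.sampled)
print(headers["traceparent"])
```

The same idea applies to messaging systems: the carrier is then the message headers instead of HTTP headers.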
In a traditional microservice architecture
Distributed tracing in a normal architecture. Distributed tracing allows us to attach and keep track of all the tasks of a given transaction. With this visualization we can understand where we are spending time in our transaction flow.
Distributed tracing in EDA. With OpenTelemetry, I can represent my transactions in various ways: one big trace, or separate my transactions into sub-transactions.
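The two representations can be contrasted with a small data-model sketch. It is written with standard-library dataclasses rather than the OpenTelemetry SDK, and the span names, the message layout, and the helpers `start_root_span` and `start_child_span` are hypothetical. In the first option the consumer span becomes a child in the producer's trace; in the second the consumer starts its own trace and keeps only a span link back to the producer.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str


@dataclass
class Span:
    name: str
    context: SpanContext
    parent_id: Optional[str] = None
    links: List[SpanContext] = field(default_factory=list)


def start_root_span(name, links=()):
    """Start a new trace; optional links point at spans in other traces."""
    ctx = SpanContext(secrets.token_hex(16), secrets.token_hex(8))
    return Span(name, ctx, links=list(links))


def start_child_span(name, parent_ctx):
    """Continue the parent's trace with a new span id."""
    ctx = SpanContext(parent_ctx.trace_id, secrets.token_hex(8))
    return Span(name, ctx, parent_id=parent_ctx.span_id)


# Producer: publish a message and attach its span context to it.
publish = start_root_span("orders publish")
message = {"body": "order-42", "producer_context": publish.context}

# Option 1, one big trace: the consumer span is a child of the producer span.
consume_child = start_child_span("orders process", message["producer_context"])

# Option 2, sub-transactions: the consumer starts its own trace and links back.
consume_linked = start_root_span(
    "orders process", links=[message["producer_context"]]
)

print(consume_child.context.trace_id == publish.context.trace_id)
print(consume_linked.links[0] == publish.context)
```

With the linked form, each consumer trace can be sampled and visualized on its own while the backend can still navigate from it to the producer.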
Pros/Cons (table columns: Pros, Cons): Sampling decision; Relation between Producer & Consumer; Visualize traces; Keep track of the consumers
Take Away. Instrumentation: make sure your code is agnostic, using no vendor library or exporter. Observability: make sure the metrics produced have enough dimensions; produce logs with contextual information; add span attributes to your spans. Creativity: understand your system; design the right observability.
Looking for educational content on observability? Check out the YouTube channel: Is It Observable