stackconf 2024 | Ignite: Distributed Tracing using OpenTelemetry and Jaeger by AJ Jester.pdf

NETWAYS, 20 slides, Jul 25, 2024

About This Presentation

Several years ago, when you had a monolithic application, it was fairly easy to debug and diagnose since there was probably only one service with a couple of users. Nowadays systems are broken up into smaller microservices deployed in containers on top of Kubernetes in multiple clusters across diffe...


Slide Content

1
Distributed Tracing using OpenTelemetry and Jaeger
Prepared by: AJ

2
● Several years ago, when you had a monolithic application, it was fairly easy to debug and diagnose, since there was probably only one service with a couple of users.
● Nowadays, systems are broken up into smaller microservices deployed in containers on top of Kubernetes, in multiple clusters across different cloud environments.
● In these kinds of distributed environments, you need to observe the whole system: both the overall picture and, when needed, a more granular level.

3
● Observability can be roughly divided into three sub-categories: logging, metrics, and tracing.
● Tracing means recording and observing the requests an application makes, and how those requests propagate through a system.
● When the system is distributed, we call this distributed tracing, because it follows the application and its interactions across multiple systems.
● For example, as a developer your code probably includes multiple functions, and what you usually want to know is how long each function takes to execute and how the functions depend on one another within the application.
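To make the idea of timing functions and seeing how they nest more concrete, here is a minimal sketch using only the Python standard library. It is not the OpenTelemetry API; the `span` helper, the `finished_spans` list, and the two example functions are hypothetical names chosen for illustration.

```python
import time
from contextlib import contextmanager

# Hypothetical, stdlib-only stand-in for a tracer; NOT the OpenTelemetry API.
finished_spans = []  # completed spans, in the order they finished
_active = []         # stack of names of the spans that are currently open


@contextmanager
def span(name):
    """Record how long the wrapped block takes and which span encloses it."""
    parent = _active[-1] if _active else None
    _active.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        _active.pop()
        finished_spans.append(
            {"name": name, "parent": parent, "seconds": duration}
        )


def load_config():
    time.sleep(0.01)  # simulated work


def upload_file():
    time.sleep(0.02)  # simulated work


with span("handle request"):
    with span("load config"):
        load_config()
    with span("upload file"):
        upload_file()

for s in finished_spans:
    print(f'{s["name"]:<15} parent={s["parent"]} {s["seconds"] * 1000:.1f} ms')
```

Each record carries a name, a duration, and a parent, which is roughly the information a real tracing library collects for every span.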

4
Tracing gives the necessary insights by
● Identifying performance and latency bottlenecks
● Supporting root cause analysis after major incidents
● Determining dependencies in a multi-service architecture
● Monitoring transactions within your application

5
● Let’s start by looking at where it makes sense to add tracing.
● We’ll begin with a small Python application that performs a couple of simple operations.
● Later, we’ll add the tracing code to measure how long our code takes to execute.

6
Initially, the code would look something like this:
minio_client.fput_object(config["dest_bucket"], file_path, file_path)

7

8
Install OpenTelemetry…
$ pip install opentelemetry-api
$ pip install opentelemetry-sdk
and import the trace module
from opentelemetry import trace

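9
Initialize tracing…
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
# Name the service so its spans can be told apart later
resource = Resource(attributes={
    SERVICE_NAME: "my-minio"
})
provider = TracerProvider(resource=resource)
# Print finished spans to the console, in batches
processor = BatchSpanProcessor(ConsoleSpanExporter())
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)
and create a Span
with tracer.start_as_current_span("check if bucket exists"):
    ...  # the code to trace goes here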

10
You can create multiple additional spans, one per operation (the body of each with block is omitted here)
with tracer.start_as_current_span("create object to add"):
with tracer.start_as_current_span("add created object to bucket"):
with tracer.start_as_current_span("fetch object from bucket"):
with tracer.start_as_current_span("remove object from bucket"):
with tracer.start_as_current_span("check if bucket exists"):

11
Once a span is created, you can add custom attributes and events
# One way to get the span opened by the enclosing with block
current_span = trace.get_current_span()
● Adding a function name
current_span.set_attribute("function.name", "CHECK_BUCKET")
● Adding a Span event
current_span.add_event("Checking if bucket exists.")
if not minio_client.bucket_exists(config["dest_bucket"]):
    current_span.add_event("Bucket does not exist, going to create it.")
    minio_client.make_bucket(config["dest_bucket"])

12
You can add multiple span events
current_span.add_event("Test object has been placed in bucket.")
current_span.add_event("Test object has been fetched from bucket.")
current_span.add_event("Test object has been removed from bucket.")
current_span.add_event("Bucket exists, going to remove it.")

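13
Configuring Jaeger
● Install the Jaeger exporter package
pip install opentelemetry-exporter-jaeger
● Import and configure the Jaeger exporter
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
jaeger_exporter = JaegerExporter(
    agent_host_name="localhost",
    agent_port=6831,
)
# Register the exporter so that spans are sent to the Jaeger agent
provider.add_span_processor(BatchSpanProcessor(jaeger_exporter))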
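The OpenTelemetry Python project marks the `opentelemetry-exporter-jaeger` package as deprecated in favor of OTLP, which Jaeger can ingest directly. As a hedged alternative to the Thrift exporter above, the sketch below sends spans over OTLP/gRPC. It assumes the `opentelemetry-exporter-otlp-proto-grpc` package is installed and that a Jaeger instance with its OTLP receiver enabled is listening on `localhost:4317`; neither assumption comes from the slides.

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Assumption: Jaeger accepts OTLP over gRPC on localhost:4317 without TLS.
otlp_exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)

provider = TracerProvider(
    resource=Resource(attributes={SERVICE_NAME: "my-minio"})
)
# Batch finished spans and ship them to Jaeger in the background.
provider.add_span_processor(BatchSpanProcessor(otlp_exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
```

The rest of the tracing code stays the same, since only the exporter changes.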

14
Go to http://localhost:16686/ to access the Jaeger UI

15
Use the Jaeger UI to find the traces

16
So far we’ve only made one request, so there should be just a couple of traces from the few spans we created.

17
Click on one of the traces that shows 2 Spans; let’s choose the one that says “check if bucket exists”.

18
After running the script five or six times, we can start to see patterns emerge.
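One way to look for such patterns outside the UI is to group span durations by name across runs. The sketch below uses only the standard library; the durations are made-up numbers for illustration, not measurements from the talk, and `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (span name, duration in ms) pairs gathered over three runs.
# The values are invented for illustration only.
runs = [
    ("check if bucket exists", 12.0),
    ("add created object to bucket", 48.0),
    ("fetch object from bucket", 20.0),
    ("check if bucket exists", 10.0),
    ("add created object to bucket", 52.0),
    ("fetch object from bucket", 22.0),
    ("check if bucket exists", 14.0),
    ("add created object to bucket", 110.0),
    ("fetch object from bucket", 21.0),
]


def summarize(samples):
    """Group durations by span name and compute count, mean, and max."""
    by_name = defaultdict(list)
    for name, ms in samples:
        by_name[name].append(ms)
    return {
        name: {"count": len(values), "mean_ms": mean(values), "max_ms": max(values)}
        for name, values in by_name.items()
    }


summary = summarize(runs)
# The span with the highest mean duration is a candidate bottleneck.
slowest = max(summary, key=lambda name: summary[name]["mean_ms"])

for name, stats in summary.items():
    print(f'{name:<30} mean={stats["mean_ms"]:.1f} ms max={stats["max_ms"]:.1f} ms')
print("slowest on average:", slowest)
```

A large gap between the mean and the maximum for one span name, as with the upload step in this invented data, is the kind of pattern worth opening in the trace view.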

19
Is tracing the only option?
● Tracing is just one piece on the path towards observability.
● Generally, when incidents happen, it's not just one trace, one log, or one metric that we use to diagnose and resolve the issue.
● Often, it is understanding the combination of these signals that gets us to the root cause.

20
Thank You
LinkedIn: aj-jester
https://www.linkedin.com/in/aj-jester/