Metrics, logs, traces, and mayhem: introducing an observability adventure game powered by Grafana Alloy and OTel

Imma Valls, Oct 18, 2025 (45 slides)

About This Presentation

This hands-on workshop transforms learning about observability into an engaging adventure!

Step into a text-based game where you’ll master essential observability tools—metrics, logs, and traces—and discover how to leverage them effectively for real-world troubleshooting and gaining critical ...


Slide Content

Logs, Metrics, Traces and Mayhem:
An observability adventure game powered by Grafana Alloy and OTel

Who am I?
Imma Valls
Staff Developer Advocate
Grafana Labs
https://eyeveebee.dev/
https://github.com/immavalls
https://www.linkedin.com/in/imma-valls

Setting the scene
1. What is observability?
2. What can go wrong?
3. Some telemetry signals: metrics, logs & traces
4. Let’s play!
5. Not just playing …
6. More observability adventures, please!
7. Takeaways

1. What is observability?
Black Box
Input Output
print("Why is this broken?!")

2. Observability Lessons
When things go wrong…
●Matchmaking DB Failures (2021)
●Performance Problems (2020)
●Login & Scaling Issues (2023)

2. Why does this happen?
●Complexity
○Dawn of Microservices
●Companies are victims of their own success
○Demand
●We are all human…
○Bad code
○Bad package
○Users do User things

3. Key components of observability
Metrics
“What”
Logs
“Why”
Traces
“Where”

3.1. Metrics - What happened?
●Quantifiable data
●Time series
●Unique name
●Label key-value pairs
●Metric types (counter, gauge, etc.)
http_requests_total{method="post",code="200",path="/api/v1/users"} 1234
node_memory_usage_bytes{instance="server-01",job="node-exporter"} 8432678912
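The two sample lines above follow the Prometheus text style: a unique metric name, label key-value pairs in braces, then the value. As a minimal sketch of that anatomy (the helper name `format_sample` is hypothetical, and label values are not escaped here), a sample could be rendered like this:

```python
def format_sample(name: str, labels: dict, value: float) -> str:
    """Render one sample as name{key="value",...} value.

    Illustrative helper only: it does not escape quotes or
    backslashes inside label values.
    """
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{label_text}}} {value}"


# Rebuild the first sample shown on the slide.
sample = format_sample(
    "http_requests_total",
    {"method": "post", "code": "200", "path": "/api/v1/users"},
    1234,
)
print(sample)
```

Each distinct combination of label values identifies its own time series, so adding a label such as `path` multiplies the number of series being stored.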

3.2. Logs - Why did it happen?
●Bread crumbs…
●Log entries can take many forms.
[2024-04-09T12:44:24.076] WARN: One of your party members has fallen ill! We should find medicine soon.
[2024-04-09T13:44:24.076] ERROR: Your party member has died…
●Log entry: timestamp, severity, message, metadata
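To make the parts of a log entry concrete, here is a minimal sketch that splits the sample lines above into timestamp, severity, and message. It assumes the exact `[timestamp] SEVERITY: message` layout shown on the slide; the names `LogEntry` and `parse_log_line` are hypothetical, and real logs may use other formats entirely.

```python
import re
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    severity: str
    message: str


# Matches "[2024-04-09T12:44:24.076] WARN: some message"
LOG_PATTERN = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<severity>[A-Z]+):\s+(?P<message>.*)$"
)


def parse_log_line(line: str) -> LogEntry:
    """Split one log line into its timestamp, severity, and message."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    return LogEntry(
        timestamp=datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%f"),
        severity=match["severity"],
        message=match["message"],
    )


entry = parse_log_line(
    "[2024-04-09T12:44:24.076] WARN: One of your party members has fallen ill! "
    "We should find medicine soon."
)
print(entry.severity, entry.timestamp.isoformat(), entry.message)
```

Once the severity is a separate field, filtering for `ERROR` entries or counting warnings over time becomes a simple lookup rather than a text search.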

3.3. Traces - Where did it happen?
●Track requests across services
●Traces are made of spans
●Help pinpoint slowdowns and failures
●Essential for distributed systems

3.4. Microservices Example
Pizza Delivery Application
●Customers’ interface for placing orders
●Kitchen Service: manage order preparation
●Delivery System: tracking and final delivery
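A trace for one pizza order can be pictured as a list of spans, one per service the request passes through. The sketch below is illustrative only: the `Span` class, the service and span names, and all timings are hypothetical, chosen to show how span durations point at where the time went.

```python
from dataclasses import dataclass


@dataclass
class Span:
    name: str
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# One hypothetical trace: a single order crossing the three services.
trace = [
    Span("place_order", "ordering", 0, 120),
    Span("prepare_pizza", "kitchen", 120, 900),
    Span("deliver_pizza", "delivery", 900, 3400),
]

# The longest span shows which service contributed most to the total time.
slowest = max(trace, key=lambda span: span.duration_ms)
print(f"slowest span: {slowest.name} in {slowest.service} ({slowest.duration_ms} ms)")
```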

3.5. Exemplars
●A rich data point that links a high-level metric observation (e.g. a latency spike) to a specific trace (span / trace ID).
●Bridges the gap between metrics and traces.
●In Grafana, exemplars appear as markers on your metric graphs and link to the related trace.
https://grafana.com/docs/grafana/latest/fundamentals/exemplars

3.5. Exemplars
●Pizzas are arriving late (metric), why?
●From the latency graph, go to traces associated with those delays, and check where the delay happens (exemplar)
●It’s raining where orders are delayed!
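The late-pizza walkthrough can be sketched in a few lines: each latency sample may carry a trace ID as its exemplar, and the slow samples tell you which traces to open. Everything here is an assumption for illustration (the `LatencySample` class, the threshold, the trace IDs, and the values); it is not how Prometheus or Grafana store exemplars internally.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LatencySample:
    timestamp: str
    delivery_minutes: float
    trace_id: Optional[str] = None  # exemplar: a pointer to one concrete trace


def exemplar_trace_ids(samples: List[LatencySample], threshold: float) -> List[str]:
    """Return trace IDs attached to samples slower than the threshold."""
    return [
        sample.trace_id
        for sample in samples
        if sample.delivery_minutes > threshold and sample.trace_id is not None
    ]


samples = [
    LatencySample("18:00", 22.0, "trace-a1"),
    LatencySample("18:05", 58.0, "trace-b2"),  # late delivery
    LatencySample("18:10", 25.0),              # no exemplar recorded
    LatencySample("18:15", 61.0, "trace-c3"),  # late delivery
]

late_traces = exemplar_trace_ids(samples, threshold=45.0)
print(late_traces)
```

Not every sample needs an exemplar; a few representative trace IDs per spike are enough to jump from the graph to a trace worth inspecting.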

How do all these telemetry signals come together?

4. Let’s play!
●Forge an awesome sword
●Make it more powerful
●Take on a quest
●Defeat the wizard
●Save the town!
Teleprinter

4.1. Creating the adventure!
●Key components:
○OpenTelemetry
○Grafana Alloy
○Prometheus
○Loki
○Tempo
○Grafana
main.py

4.2. OpenTelemetry
●A method of standardizing telemetry signals:
○Framework
○API
○SDK
○Collector
●We use the Python SDK to standardize how we write metrics and logs

main.py

4.3. Grafana Alloy
●OpenTelemetry collector distribution
●Alloy performs two tasks in our adventure
○Receives
○Sends

4.4. Telemetry Storage
Loki
●Logs Store
●LogQL
Prometheus
●Metrics Store
●PromQL

Tempo
●Traces Store
●TraceQL
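As a quick reference, the mapping from signal type to store and query language can be written as a lookup table. This is a sketch only: the `SIGNAL_BACKENDS` table and `backend_for` helper are hypothetical, and the example queries (including the `job="adventure"` label) are illustrative rather than taken from the workshop.

```python
# Signal type -> where it is stored and how it is queried.
SIGNAL_BACKENDS = {
    "metrics": {
        "store": "Prometheus",
        "query_language": "PromQL",
        "example_query": "rate(http_requests_total[5m])",
    },
    "logs": {
        "store": "Loki",
        "query_language": "LogQL",
        "example_query": '{job="adventure"} |= "ERROR"',  # hypothetical label
    },
    "traces": {
        "store": "Tempo",
        "query_language": "TraceQL",
        "example_query": "{ duration > 1s }",
    },
}


def backend_for(signal: str) -> str:
    """Return the store that holds the given signal type."""
    try:
        return SIGNAL_BACKENDS[signal]["store"]
    except KeyError:
        raise ValueError(f"unknown signal type: {signal!r}") from None


for signal, info in SIGNAL_BACKENDS.items():
    print(f"{signal}: {info['store']} ({info['query_language']})")
```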

4.5. Grafana
●Open-source Observability Platform
●Visualise, Monitor, and Analyse
●Big tent philosophy
●Alerts

4. Let’s play!
https://github.com/grafana/adventure
https://bit.ly/o11y-workshop-openfest-25

4. Let’s play!
1. What metrics are useful to play the game?
2. What useful log appears when we accept the offer from the mysterious man in the village?
3. What dashboard allows us to view traces?
4. What does an exemplar allow us to view?
5. How do we navigate from logs to traces?
6. How do we navigate telemetry signals without using LogQL or PromQL?
7. Did we have any useful alerts set up?
8. Did anyone spot an Easter egg???????

6. How do we navigate telemetry signals without using LogQL or PromQL?
https://grafana.com/blog/2025/02/20/grafana-drilldown-apps-the-improved-queryless-experience-formerly-known-as-the-explore-apps/

5. Not just playing …
1. Create a cheat counter
2. Modify the dashboard to add the counter
https://opentelemetry.io/docs/specs/otel/metrics/api
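One possible shape for the cheat counter, using the OpenTelemetry Python metrics API (`get_meter`, `create_counter`, `add`). This is a hedged sketch, not the workshop's solution: the meter name `adventure.cheats`, the metric name `cheats_used`, the `cheat` attribute, and the `record_cheat` helper are all hypothetical. The import is guarded so the sketch still runs where the `opentelemetry-api` package is not installed, and without a configured SDK the counter calls are no-ops.

```python
try:
    from opentelemetry import metrics  # provided by the opentelemetry-api package
except ImportError:
    metrics = None  # assumption: fall back to the local tally only

# Local tally so the behaviour can be checked without a metrics backend.
cheats_used = {}

if metrics is not None:
    meter = metrics.get_meter("adventure.cheats")  # hypothetical meter name
    cheat_counter = meter.create_counter(
        "cheats_used",  # hypothetical metric name
        unit="1",
        description="Number of cheat codes entered by the player",
    )
else:
    cheat_counter = None


def record_cheat(cheat_name: str) -> int:
    """Count one use of a cheat and return the running total for that cheat."""
    cheats_used[cheat_name] = cheats_used.get(cheat_name, 0) + 1
    if cheat_counter is not None:
        # A counter only ever goes up; the attribute becomes a label downstream.
        cheat_counter.add(1, {"cheat": cheat_name})
    return cheats_used[cheat_name]


record_cheat("extra_gold")
record_cheat("extra_gold")
print(cheats_used)
```

Once the counter reaches the metrics store, the dashboard step amounts to adding a panel that queries it by name; the exact exported name depends on how the pipeline translates OpenTelemetry metric names.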

6. More observability adventures, please!
https://github.com/grafana/alloy-scenarios/tree/main/game-of-tracing
●OpenTelemetry tracing fundamentals
●Context propagation & span links
●Correlation for debugging
https://grafana.com/blog/2025/08/11/learn-opentelemetry-tracing-through-a-grand-strategy-game-introducing-game-of-traces/
https://killercoda.com/grafana-labs/course/workshops/game-of-traces

7. Takeaways
●Observability is no longer a luxury
●What, Why, Where
●OpenTelemetry unifies telemetry signals
●Learning Observability doesn’t have to be a chore

Thanks for listening, adventurer!
Have ye any questions?

Join our Community!