Monitoring and Managing Anomaly Detection on OpenShift.pdf

TosinAkinosho 212 views 23 slides Jun 13, 2024


About This Presentation

Monitoring and Managing Anomaly Detection on OpenShift

Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and ...

Size: 1.81 MB

Language: en

Added: Jun 13, 2024

Slides: 23 pages

Slide Content

Update confidential designator here
Version number here V00000

Monitoring and Managing Anomaly Detection on OpenShift
Tosin Akinosho

Project Overview and Purpose
▸Provide a hands-on tutorial and code for implementing anomaly detection on edge devices
▸Enable real-time identification of unusual behavior or failures on resource-constrained IoT/edge devices
▸Cover the end-to-end process, from data collection and model training to edge deployment and monitoring


Key Topics Covered
▸What is Anomaly Detection?
▸What is Edge (IoT)?
▸What is ArgoCD?
▸Deployment using ArgoCD for edge devices
▸What are Apache Kafka and S3?
▸Viewing Kafka messages in the data lake
▸What is Prometheus?
▸Monitoring application metrics with Prometheus
▸What is Camel K?
▸Configuring Camel K integrations for data pipelines
▸What is a Jupyter notebook?
▸Jupyter notebooks with code examples

What is Anomaly Detection?
●Definition: The process of identifying data points, events or observations that deviate significantly from expected
patterns or norms
●Purpose: To detect unusual, suspicious or rare occurrences that may indicate errors, threats, opportunities or insights
●Types of Anomalies:
○Point anomalies (single outlier data points)
○Contextual anomalies (anomalies based on context, e.g. time of year)
○Collective anomalies (anomalous sequences or sets of data)
●Techniques:
○Statistical methods (e.g. distribution tests)
○Machine learning (supervised, unsupervised, semi-supervised)
○Visualization and human analysis
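
As one illustration of the statistical methods listed above, the following minimal sketch flags point anomalies with a z-score test. The function name `zscore_anomalies`, the threshold, and the sample readings are assumptions made for illustration; they are not taken from the workshop code.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical: nothing deviates
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious spike at index 5.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.0, 20.2, 20.1, 19.7, 20.0]

# With only 10 samples the largest possible z-score is about 2.85,
# so a threshold of 3 would never fire here; 2.5 is used instead.
anomalies = zscore_anomalies(readings, threshold=2.5)
print(anomalies)
```

A z-score test only catches point anomalies in roughly bell-shaped data; contextual and collective anomalies generally need a model that looks at more than one value at a time.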

Applications (Use Cases):

●Cybersecurity (network intrusions, fraud detection)
●Industrial (equipment failure, sensor errors)
●Business analytics (sales outliers, customer behavior)
●Scientific research (experimental outliers)

What is Edge (IoT)?

●Definition: Edge computing brings processing power and data storage closer to the sources of data generation (e.g.
IoT devices, sensors)
●Key Concept: Processing data at or near the "edge" of the network, rather than sending all data to the cloud or a
central data center
●Why It's Important:
○Reduced latency for time-sensitive applications
○Reduced bandwidth usage by processing data locally
○Increased security and data privacy
○Resilience against network disruptions
●Edge Devices:
○IoT sensors, cameras, industrial equipment
○Gateways to aggregate and process data
○Edge servers/appliances for analytics and control

Edge (IoT) (Use Cases):

●Smart manufacturing (predictive maintenance)
●Autonomous vehicles (real-time decision making)
●Smart cities (traffic optimization, public safety)
●Remote monitoring (oil rigs, renewable energy)

What is ArgoCD?

●ArgoCD is an open-source, declarative continuous delivery tool for Kubernetes
●It follows the GitOps pattern, using Git repositories as the single source of truth for the desired application state
●It automates deployment to Kubernetes clusters by syncing the live state with the desired target state specified in Git
●It is implemented as a Kubernetes controller that continuously monitors running applications and compares their live state against the desired state
●It supports declarative application definitions using Kubernetes manifests, Helm charts, Kustomize, etc.
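
The comparison of desired state and live state described above can be pictured with a toy sketch. This is not ArgoCD's implementation: the function `sync_status`, the resource names, and the dictionary layout are hypothetical, and real manifests are far richer than these flat dictionaries.

```python
def sync_status(desired, live):
    """Compare desired (Git) and live (cluster) state, keyed by resource name."""
    missing = sorted(set(desired) - set(live))  # declared in Git, absent from the cluster
    extra = sorted(set(live) - set(desired))    # running in the cluster, not declared in Git
    drifted = sorted(
        name for name in set(desired) & set(live) if desired[name] != live[name]
    )
    status = "Synced" if not (missing or extra or drifted) else "OutOfSync"
    return {"status": status, "missing": missing, "extra": extra, "drifted": drifted}


# Hypothetical resources: the image tag in the cluster lags behind Git,
# and the service has not been created yet.
desired = {
    "deployment/anomaly-api": {"replicas": 2, "image": "example/anomaly-api:1.1"},
    "service/anomaly-api": {"port": 8080},
}
live = {
    "deployment/anomaly-api": {"replicas": 2, "image": "example/anomaly-api:1.0"},
}

report = sync_status(desired, live)
print(report)
```

A sync would then apply whatever is missing or drifted until the report comes back as "Synced".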

Deployment using ArgoCD for edge devices

●The ArgoCD dashboard lists installed operators such as amq-streams, camel-k, and cluster-config-app
●It displays the data foundation and CI/CD pipeline operators
●It indicates the sync status and health of each operator/application
●Operators can be synced, refreshed, and deleted from the dashboard

What is Apache Kafka?

●Distributed streaming platform for handling real-time data feeds
●Open-source system originally developed at LinkedIn and now maintained by the Apache Software Foundation
●Written in Java and Scala
●Provides three main capabilities:
○Publish and subscribe to streams of records
○Store streams of records in the order they were generated
○Process streams of records in real time
●Based on a partitioned log model
○Each topic is split into partitions, and each partition is an ordered, immutable log of records
○Partitions are distributed and replicated across brokers for scalability and fault tolerance
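
The partitioned log model can be sketched with a small in-memory stand-in. The class `ToyTopic` is hypothetical and uses a CRC32 hash to pick a partition purely for illustration; Kafka's own partitioner uses a different hash, and a real broker also handles replication and retention.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a topic: a set of append-only partition logs."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)

    def publish(self, key, value):
        # Records with the same key always land in the same partition,
        # which preserves their relative order.
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1  # offset = position within the partition log

    def read(self, partition, offset=0):
        # Consumers read forward from an offset; existing records are never modified.
        return self.partitions[partition][offset:]


topic = ToyTopic("Olympic")
p1, o1 = topic.publish("device-1", "reading-a")
p2, o2 = topic.publish("device-1", "reading-b")
print(p1, o1, p2, o2)
print(topic.read(p1, offset=1))
```

Offsets are per partition, which is why ordering is only guaranteed among records that share a partition.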

Viewing Kafka messages in the data lake

●Kafdrop lets us view the messages arriving in Kafka
●The example shows messages on the "Olympic" topic
●Each message is identified by its offset number

What is S3?

●S3 stands for Amazon Simple Storage Service, a highly scalable object storage service provided by Amazon Web Services (AWS).
○Object storage service for storing and retrieving any amount of data from anywhere on the internet
○Designed for 99.999999999% durability and 99.99% availability of objects over a given year
○Stores data as objects with unique key identifiers in buckets
○Supports various use cases like data lakes, backup and restore, content delivery, big data analytics, etc.

What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit widely used for collecting and querying metrics from various sources.
●Designed for monitoring cloud-native and microservices applications
●Collects time-series data as metrics by scraping targets (servers, databases, applications, etc.)
●Stores metrics data in a time-series database with configurable retention
●Allows setting up alerting rules based on metric expressions
●Provides a web UI for querying and graphing metrics and viewing alert status

Monitoring application metrics with Prometheus

●You can view the metric data in Prometheus
●The example shows the engine_fuel_consumption metric for three edge devices
●End users can view live metrics and data using this tool
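
To give a feel for what Prometheus scrapes from an application, here is a minimal sketch that renders a gauge in the Prometheus text exposition format. The metric name `engine_fuel_consumption` comes from the slide; the `device` label, the device names, the values, and the helper `render_gauge` are assumptions for illustration, and the sketch does not escape special characters in label values.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge metric in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical readings from three edge devices.
samples = [
    ({"device": "edge-1"}, 12.5),
    ({"device": "edge-2"}, 11.8),
    ({"device": "edge-3"}, 14.1),
]
text = render_gauge(
    "engine_fuel_consumption",
    "Fuel consumption reported by an edge device",
    samples,
)
print(text)
```

In practice an application would usually rely on a Prometheus client library to expose metrics rather than formatting the text by hand.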

What is Camel K?

●Camel K is a lightweight integration platform for building and
running integration applications on Kubernetes
●It is based on the popular open-source Apache Camel
integration framework
●Provides a streamlined development experience for
Kubernetes-native integration solutions
●Supports common integration patterns like routing,
transformation, orchestration
●Built on top of the Quarkus Kubernetes-native Java
framework
●Part of the Red Hat Integration product portfolio
●Enables integration with external systems like Kafka,
databases, APIs

Configuring Camel K integrations for data pipelines

●The logs show the Camel thread and the Aggregator route uploading data
●The pipeline moves data from Kafka to S3 for the "Olympic" dataset
●The data is stored as .txt files
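
The aggregation step of such a pipeline can be sketched in plain Python: messages are grouped into batches, and each batch becomes one .txt object. The function `aggregate_to_objects`, the batch size, and the key naming scheme are hypothetical; the workshop's actual integration is defined with Camel K rather than code like this.

```python
def aggregate_to_objects(messages, batch_size, prefix):
    """Group messages into batches and return {object_key: text_body}."""
    objects = {}
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        # One object per batch, one message per line.
        key = f"{prefix}/batch-{start // batch_size:05d}.txt"
        objects[key] = "\n".join(batch) + "\n"
    return objects


# Hypothetical messages read from a topic.
messages = [f"record-{i}" for i in range(5)]
objects = aggregate_to_objects(messages, batch_size=2, prefix="olympic")
for key in sorted(objects):
    print(key, repr(objects[key]))
```

Batching trades a little latency for far fewer, larger objects in the bucket, which is usually easier to work with downstream.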

What is a Jupyter Notebook?

●Jupyter Notebook is an open-source web application that allows you to create and share documents containing live
code, visualizations, and narrative text.
●It provides an interactive computational environment for developing, documenting, and executing code.
●The notebook interface consists of cells that can contain code (in various programming languages like Python, R,
Julia), markdown text, equations, or visualizations.
●Each cell can be executed independently, allowing for an iterative and exploratory workflow.
●The output of code cells (text, graphics, tables, etc.) is displayed inline below the respective cells.
●Notebooks integrate code, rich text elements, and visualizations in a single shareable document.

Jupyter notebooks with code examples

●The example is a Jupyter notebook for training an anomaly detection model on train tonnage data
●The notebook covers data exploration, preprocessing, model training, and visualization steps
●It uses an Isolation Forest algorithm for the anomaly detection model
●The notebook generates visualizations like scatter plots, box plots, and heatmaps to analyze the data
●It demonstrates converting the trained model to ONNX format for inference
●The notebook includes code for loading the ONNX model, feature extraction, inference, and visualizing model outputs
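
A minimal sketch of the Isolation Forest step, assuming scikit-learn and NumPy are installed. The data here is synthetic (two made-up features with four injected outliers), not the train tonnage dataset, and the hyperparameters are illustrative guesses rather than the notebook's settings. The ONNX conversion is not shown.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in for the training data: 200 "normal" points around the origin.
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
# Four injected outliers, far from the normal cluster and from each other.
outliers = np.array([[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0]])
X = np.vstack([normal, outliers])

# contamination is the assumed share of anomalies in the training data.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(X)

scores = model.score_samples(X)  # lower score = more anomalous
labels = model.predict(X)        # -1 = anomaly, 1 = normal
print("flagged points:", int((labels == -1).sum()))
```

Isolation Forest is unsupervised, so no labeled failures are needed; the `contamination` value mainly decides where the cut-off between normal and anomalous scores falls.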

Check the Correlations in the Data
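
A minimal sketch of a correlation check, computing pairwise Pearson coefficients with the standard library only. The column names and values are made up for illustration and are not the workshop dataset; a notebook would more likely use a dataframe library for this step.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns: two move together, the third moves the opposite way.
columns = {
    "stiffness": [1.0, 2.0, 3.0, 4.0, 5.0],
    "acceleration": [2.0, 4.0, 6.0, 8.0, 10.0],
    "tonnage": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values near 1 or -1 indicate strongly related features; a heatmap is simply this matrix drawn with a color scale.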

Correlation Heatmap

Scatter Plot of Primary Suspension Stiffness vs. Train Acceleration

Links and How to get started

https://tosin2013.github.io/edge-anomaly-detection/
●Developer Deployment Instructions
●Viewing the Kafka messages under the data lake
project
●Checking the Prometheus charts for the application
●Configuring Camel K Ship integration
●edge-anomaly-detection-notebooks: Jupyter
notebooks with code examples and tutorials related
to this workshop.
●opcua-asyncio-build-pipelines: Build pipelines for
OPC UA applications using asyncio.
●Example App: A sample application demonstrating
engine room monitoring using OPC UA and asyncio.

linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat
Red Hat is the world’s leading provider of enterprise
open source software solutions. Award-winning
support, training, and consulting services make Red
Hat a trusted adviser to the Fortune 500.

Thank you
