Amazon DevOps Guru for Serverless Applications at DevOpsCon 2024 London
VadymKazulkin
May 03, 2024
About This Presentation
In this talk, we’ll use a standard serverless application that uses API Gateway, Lambda, DynamoDB, SQS, Step Functions (and other AWS-managed services). We'll explore how Amazon DevOps Guru recognizes operational issues and anomalies like increased latency and error rates (timeouts, throttling, and resource limits) and integrate DevOps Guru with PagerDuty to provide even better incident management. Amazon DevOps Guru analyzes data like application metrics, logs, events, and traces to establish baseline operational behavior and then uses ML to detect anomalies. The service uses pre-trained ML models that are able to identify spikes in application requests, so it knows when to alert and when not to.
Size: 7.33 MB
Language: en
Added: May 03, 2024
Slides: 97 pages
Slide Content
Vadym Kazulkin | @VKazulkin | ip.labs GmbH
Amazon DevOps Guru for the Serverless Applications
Vadym Kazulkin, ip.labs, DevOpsCon London, April 9, 2024
Contact
Vadym Kazulkin
ip.labs GmbH Bonn, Germany
Co-Organizer of the Java User Group Bonn
@VKazulkin
https://dev.to/vkazulkin
https://github.com/Vadym79/
https://de.slideshare.net/VadymKazulkin/
https://www.linkedin.com/in/vadymkazulkin
https://www.iplabs.de/
About ip.labs
DevOps Lifecycle
Amazon DevOps Guru
AIOps
Artificial Intelligence for IT Operations (AIOps) is the process of using machine learning techniques to solve operational problems. The goal of AIOps is to reduce human intervention in the IT operations processes.
By using advanced machine learning techniques, you can reduce operational incidents and increase service quality. AIOps can help you to:
•Increase service quality
•for example, by grouping related incidents based on time and language or by predicting Knowledge Base articles to solve an incident
•Predict incidents before they happen
•Classify new incidents and insights
https://aws.amazon.com/devops-guru
What is Amazon DevOps Guru
Amazon DevOps Guru offers a fully managed AIOps platform powered by machine learning (ML) that is designed to make it easy to improve an application’s operational performance and availability.
DevOps Guru helps detect behaviors that deviate from normal operating patterns so you can identify operational issues long before they impact your customers:
•increased latency
•error rates (timeouts, throttles, CPU, memory, and disk utilization)
•resource constraints (exceeding AWS account limits)
https://aws.amazon.com/devops-guru
Benefits of DevOps Guru
https://aws.amazon.com/devops-guru
How DevOps Guru works
https://aws.amazon.com/devops-guru
DevOps Guru is powered by pre-trained ML models
•AWS built domain-specific, single-purpose models that identify known failure modes instead of modeling normal metric behavior.
•DevOps Guru relies on a large ensemble of detectors: statistical models tuned to detect common adverse scenarios in a variety of operational metrics.
•DevOps Guru detectors don’t need to be trained or configured. They work instantly as long as enough history is available.
•Individual detectors work in preconfigured ensembles to generate anomalies on some of the most important metrics operators monitor: error rates, availability, latency, incoming request rates, CPU, memory, and disk utilization, among others.
https://aws.amazon.com/blogs/machine-learning/amazon-devops-guru-is-powered-by-pre-trained-ml-models-that-encode-operational-excellence/
DevOps Guru pre-trained ML detectors
•The detector identified a significant trend in disk usage, heading for disk exhaustion within 24 hours.
•The model has identified a significant trend between the vertical dashed lines. Extrapolating this trend (diagonal dashed line), the detector predicts the time to resource exhaustion.
•As the metric breaches the horizontal red line, which acts as a significance threshold, the detector notifies operators.
https://aws.amazon.com/blogs/machine-learning/amazon-devops-guru-is-powered-by-pre-trained-ml-models-that-encode-operational-excellence/
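The trend-extrapolation idea on this slide can be illustrated with a small sketch. It is not the DevOps Guru detector itself, whose internals are not given here; it assumes a plain least-squares line fitted to hourly disk-usage samples and extends that line to a 100% limit. The function name `hours_to_exhaustion` and the sample values are hypothetical.

```python
from typing import Optional, Sequence


def hours_to_exhaustion(
    hours: Sequence[float],
    usage_pct: Sequence[float],
    limit_pct: float = 100.0,
) -> Optional[float]:
    """Fit a least-squares line to (hours, usage_pct) and return the hours
    from the last sample until that line reaches limit_pct.

    Returns None when the fitted trend is flat or decreasing.
    """
    n = len(hours)
    if n < 2 or n != len(usage_pct):
        raise ValueError("need at least two matching samples")
    mean_x = sum(hours) / n
    mean_y = sum(usage_pct) / n
    var_x = sum((x - mean_x) ** 2 for x in hours)
    if var_x == 0:
        raise ValueError("timestamps must not all be equal")
    # Ordinary least-squares slope and intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, usage_pct)) / var_x
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (limit_pct - intercept) / slope
    return max(0.0, crossing - hours[-1])


# Hypothetical samples: disk usage grows by 2 percentage points per hour.
eta = hours_to_exhaustion([0, 1, 2, 3, 4], [60, 62, 64, 66, 68])
# Notify only when exhaustion is predicted within the next 24 hours.
should_notify = eta is not None and eta <= 24
```

A real detector would also have to decide whether the trend is statistically significant before extrapolating; this sketch skips that step.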
DevOps Guru pre-trained ML detectors with periodic behaviors
•Many metrics, such as the number of incoming requests in customer-facing APIs, exhibit periodic behavior.
•The purpose of the causal convolution detector is to analyze temporal data with such patterns and to determine expected periodic behavior.
•When the detector infers that a metric is periodic, it adapts normal metric behavior thresholds to the seasonal pattern.
https://aws.amazon.com/blogs/machine-learning/amazon-devops-guru-is-powered-by-pre-trained-ml-models-that-encode-operational-excellence/
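To make "adapting thresholds to the seasonal pattern" concrete, here is a deliberately simple stand-in. It is not the causal convolution detector named on the slide: it assumes an hourly metric with a 24-hour period and builds one band per hour of day from the mean plus or minus k standard deviations. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Tuple

Band = Tuple[float, float]


def seasonal_bands(
    samples: Iterable[Tuple[int, float]], period: int = 24, k: float = 3.0
) -> Dict[int, Band]:
    """Group (time_index, value) samples by their slot in the period and
    return {slot: (lower, upper)} computed as mean +/- k * std deviation."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for t, value in samples:
        buckets[t % period].append(value)
    bands: Dict[int, Band] = {}
    for slot, values in buckets.items():
        centre = mean(values)
        spread = pstdev(values)
        bands[slot] = (centre - k * spread, centre + k * spread)
    return bands


def is_anomalous(bands: Dict[int, Band], t: int, value: float, period: int = 24) -> bool:
    """True when value falls outside the band of the slot that t maps to."""
    lower, upper = bands[t % period]
    return value < lower or value > upper


# Hypothetical week of hourly request counts: busy from 09:00 to 17:00.
history = [
    (day * 24 + hour, (1000 if 9 <= hour < 17 else 100) + (day % 3 - 1) * 10)
    for day in range(7)
    for hour in range(24)
]
bands = seasonal_bands(history)
```

With these bands, 1000 requests at noon is treated as normal while the same value at 03:00 is flagged, which is the effect the slide describes.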
DevOps Guru Example Application
https://github.com/Vadym79/DevOpsGuruWorkshopDemo inspired by https://github.com/aws-samples/serverless-java-frameworks-samples
What the future of software developers may look like
Monitoring & Alerting of the Serverless Applications
DevOps Guru Set Up
DevOps Guru Set Up with AWS Organizations
https://aws.amazon.com/blogs/mt/how-to-easily-configure-devops-guru-across-your-organization-with-systems-manager-quick-setup/
DevOps Guru Dashboard
DevOps Guru Reactive Insights
DevOps Guru Examples
•Warm up the application (takes between 1 and 24 hours) to create a baseline
•Design a test experiment to provoke errors and increased latency
•Reduce the service quota of the AWS service (API Gateway, Lambda, DynamoDB)
•Set very low service quotas for the sake of reducing AWS costs
•Add latency artificially
•Stress test with the hey tool to run into the operational issues
•See if DevOps Guru recognizes the operational issues
•Remediate the operational issues by increasing the service quota, removing the artificial latency, or stopping the stress test
•See whether DevOps Guru closes the incident when it’s resolved
https://github.com/rakyll/hey
DevOps Guru: Recognize Operational Issues in DynamoDB
DevOps Guru Examples: DynamoDB Throttling
hey -q 20 -z 15m -c 20 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/1
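As a rough aid for sizing such a stress test: in hey, `-q` is the rate limit in queries per second per worker, `-c` is the number of concurrent workers, and `-z` is the test duration, so the command above is capped at about 20 × 20 = 400 requests per second for 15 minutes. The sketch below assembles the same argument list and computes that cap. The helper names are hypothetical, and the `XXX` placeholders are kept exactly as they appear on the slide.

```python
from typing import Dict, List, Optional


def hey_command(
    url: str,
    qps_per_worker: int,
    workers: int,
    duration: str,
    headers: Optional[Dict[str, str]] = None,
    method: str = "GET",
    body: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the hey CLI (suitable for subprocess.run)."""
    cmd = ["hey", "-q", str(qps_per_worker), "-z", duration, "-c", str(workers)]
    if method != "GET":
        cmd += ["-m", method]  # HTTP method, e.g. PUT or DELETE
    if body is not None:
        cmd += ["-d", body]  # request body
    for name, value in (headers or {}).items():
        cmd += ["-H", f"{name}: {value}"]
    cmd.append(url)
    return cmd


def max_request_rate(qps_per_worker: int, workers: int) -> int:
    """Upper bound on requests per second: per-worker limit times workers."""
    return qps_per_worker * workers


dynamodb_test = hey_command(
    "https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/1",
    qps_per_worker=20,
    workers=20,
    duration="15m",
    headers={"X-API-Key": "XXXa6XXXX"},
)
```

The actual throughput can be lower than this cap when the endpoint responds slowly, because each worker waits for its response before sending the next request.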
Other DynamoDB Operational Issues and the Proactive Insights 1/2
•Table- or account-level read/write capacity for DynamoDB consumption reaching the account limit
•Triggered when the account's consumed capacity is approaching table- or account-level limits during a period of time
https://aws.amazon.com/de/blogs/aws/automatically-detect-operational-issues-in-lambda-functions-with-amazon-devops-guru-for-serverless/
Other DynamoDB Operational Issues and the Proactive Insights 2/2
•DynamoDB table consumed capacity reaching AutoScalingMaximum parameter limit
•Triggered when table consumed capacity is reaching AutoScalingMax parameters limit over a period.
https://aws.amazon.com/de/blogs/aws/automatically-detect-operational-issues-in-lambda-functions-with-amazon-devops-guru-for-serverless/
DevOps Guru: Recognize Operational Issues in API Gateway
DevOps Guru Examples: API Gateway
HTTP 429 "Too Many Requests" Error
Query to exhaust the quota
hey -q 10 -z 1m -c 10 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/1
DevOps Guru Examples: API Gateway
HTTP 404 "Not Found" Error
Query for a non-existing product id, e.g. 200
hey -q 1 -z 15m -c 1 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/200
DevOps Guru: Recognize Operational Issues in Lambda
DevOps Guru Examples: Lambda Throttling 1
hey -q 5 -z 15m -c 5 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/1
DevOps Guru Examples: Lambda Timeout Error
Add 31 sec latency in the code of the Lambda function
DevOps Guru Examples: Lambda Init Error
The Java 11 runtime requires 256 MB memory to start and execute this function
Java 21 can successfully start with 128 MB memory
DevOps Guru Examples: Lambda Error
DevOps Guru Examples: Lambda Increased Latency
Temporarily add 28 sec latency in the code of the Lambda function
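One way to add latency artificially without redeploying code for every experiment is to read the delay from configuration. The sketch below is an assumption-laden illustration in Python. The demo application referenced in the slides is Java-based, and the slides do not show how the delay was actually added. The environment variable name `ARTIFICIAL_LATENCY_SECONDS` and the response shape are hypothetical.

```python
import json
import os
import time


def handler(event, context):
    """Lambda-style handler that sleeps for a configurable number of seconds
    before answering, to provoke latency or timeout anomalies on purpose."""
    # Hypothetical switch: e.g. 28 for increased latency, 31 for a timeout.
    delay = float(os.environ.get("ARTIFICIAL_LATENCY_SECONDS", "0"))
    if delay > 0:
        time.sleep(delay)
    product_id = (event.get("pathParameters") or {}).get("id")
    return {
        "statusCode": 200,
        "body": json.dumps({"id": product_id, "delaySeconds": delay}),
    }
```

Setting the variable back to 0 (or removing it) corresponds to the remediation step of removing the artificial latency.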
DevOps Guru: Recognize Operational Issues in SQS
DevOps Guru: Operational Issues in SQS
Temporarily add 26 sec latency in the code of the Lambda function
DevOps Guru: Recognize Operational Issues with SNS
The external API endpoint doesn't respond with HTTP 200-499
DevOps Guru: Operational Issues in SNS
DevOps Guru: Recognize Operational Issues in Aurora Serverless v2 PostgreSQL
DevOps Guru Examples: Enabling Performance Insights for Aurora Serverless v2
DevOps Guru Examples: Operational Issues Lambda -> Aurora Serverless v2 w/o RDS Proxy
hey -q 100 -z 15m -c 100 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/productsWithoutDataApi/2
DevOps Guru: Recognize Operational Issues in Aurora Serverless v2 PostgreSQL using DataAPI
DevOps Guru Examples: Operational Issues Lambda -> Aurora Serverless v2 using DataAPI
hey -q 100 -z 15m -c 100 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/productsWithDataApi/2
No Aurora Serverless DB anomalous metrics detected
[Figure: chart panels labeled "Data API" and "Non Data API"]
DevOps Guru Examples: Operational Issues Lambda -> Aurora Serverless v2 using DataAPI, second experiment
hey -q 100 -z 15m -c 100 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/productsWithDataApi/2
DevOps Guru: Recognize Operational Issues in Amazon Kinesis
DevOps Guru Examples: Operational Issues in Amazon Kinesis Data Stream -> Lambda -> (S3)
DevOps Guru: Recognize Operational Issues with DynamoDB Stream and Dead Letter SQS Queue
DevOps Guru Examples: Operational Issues with DynamoDB Stream and Dead Letter SQS Queue, 1st try
hey -q 1 -z 15m -c 1 -m PUT -d '{"id": 1, "name": "Print 10x13", "price": 0.15}' -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products
DevOps Guru Examples: Operational Issues with DynamoDB Stream and Dead Letter SQS Queue, 2nd try
hey -q 1 -z 15m -c 1 -m PUT -d '{"id": 1, "name": "Print 10x13", "price": 0.15}' -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products
DevOps Guru: Recognize Operational Issues in AWS Step Functions
DevOps Guru Examples: Operational Issues in AWS Step Functions -> Lambda
DevOps Guru Proactive Insights
DevOps Guru Proactive Examples: DynamoDB table reads/writes are underutilized
DevOps Guru Proactive Examples: DynamoDB table point-in-time recovery not enabled
DevOps Guru Proactive Examples: Lambda Timeout Exceeds Recommended SQS Visibility
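The configuration problem behind this proactive insight can be checked with simple arithmetic. AWS documentation for Lambda with SQS recommends setting the queue's visibility timeout to at least six times the function timeout. Whether DevOps Guru applies exactly this factor is an assumption here, and the function names below are hypothetical.

```python
def recommended_visibility_timeout(function_timeout_s: int, factor: int = 6) -> int:
    """Smallest SQS visibility timeout (seconds) that satisfies the
    'at least factor times the function timeout' guidance."""
    return function_timeout_s * factor


def visibility_timeout_ok(
    function_timeout_s: int, visibility_timeout_s: int, factor: int = 6
) -> bool:
    """True when the queue's visibility timeout meets the guidance."""
    return visibility_timeout_s >= recommended_visibility_timeout(function_timeout_s, factor)


# Example: a 30-second function timeout with the default 30-second
# SQS visibility timeout does not meet the guidance; 180 seconds does.
default_queue_ok = visibility_timeout_ok(30, 30)
tuned_queue_ok = visibility_timeout_ok(30, 180)
```

If the visibility timeout is too short, a message can become visible again while the function is still processing it and be delivered a second time.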
DevOps Guru Proactive Examples: SQS Triggered Lambda Does Not Have a DLQ
DevOps Guru Proactive Examples: Lambda Function Consuming DynamoDB/Kinesis Stream Without Failure Destination
DevOps Guru Proactive Examples: Enhanced stream metrics disabled for Kinesis Stream
DevOps Guru Proactive Examples: Lambda Function Has Concurrency Spillover
hey -q 1 -z 30m -c 9 -m DELETE -H "X-API-Key: XXXa6XXXX" -H "Content-Type: application/json;charset=utf-8" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/11
DevOps Guru Proactive Examples: Lambda Function does not have enough subnets
DevOps Guru Integration in Incident Management Tools
•AWS OpsCenter (via AWS Systems Manager)
•PagerDuty
•Atlassian Opsgenie
DevOps Guru Integration Settings
DevOps Guru Integration with PagerDuty
https://www.pagerduty.com/docs/guides/amazon-devops-guru-integration-guide/
Enter the "Integration URL" generated by PagerDuty
DevOps Guru PagerDuty Incidents
DevOps Guru Supported Services and Pricing
https://aws.amazon.com/de/devops-guru/pricing/
$3.024 per resource per month
$2.016 per resource per month
DevOps Guru Cost Estimator
https://aws.amazon.com/de/devops-guru/pricing/
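A back-of-the-envelope estimate can be derived from the two per-resource monthly figures on the pricing slide, read here as $3.024 and $2.016 (the slide is taken from the German pricing page, which uses a decimal comma). Which AWS services fall under which rate is not stated in this transcript, so the sketch only distinguishes a higher and a lower rate. The resource counts are hypothetical, and the official cost estimator linked above remains the authoritative source.

```python
from decimal import Decimal

# Monthly per-resource rates as read from the slide (USD).
HIGHER_RATE = Decimal("3.024")
LOWER_RATE = Decimal("2.016")


def monthly_cost(resources_at_higher_rate: int, resources_at_lower_rate: int) -> Decimal:
    """Estimated monthly DevOps Guru resource-analysis cost in USD."""
    return (
        HIGHER_RATE * resources_at_higher_rate
        + LOWER_RATE * resources_at_lower_rate
    )


# Hypothetical stack: 3 resources billed at the higher rate, 5 at the lower.
estimate = monthly_cost(3, 5)
```

`Decimal` is used instead of `float` so that the sums stay exact to the cent fractions shown on the slide.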
DevOps Guru Conclusions, Observations, Suggestions
•Most operational issues have been correctly recognized so far
•It took several (at least 7) minutes to create an incident after the anomaly appeared
•Correctly, no insights were created for temporary incidents
•Short-lived Lambda, DynamoDB and API Gateway throttling
•Recommendations for the insight reason could be more precise
•No differentiation between Lambda throttling caused by reaching the individual function concurrency limit and throttling caused by reaching the total AWS account concurrency limit
•No differentiation between Lambda Timeout and Init Error
•Sometimes relevant metrics are missing (DynamoDB Streams)
•Some expected insights are missing
•Insights for SNS not being able to deliver a message to the destination are only created when other anomalies are recognized in parallel
•Lambda duration anomalous insights (Duration p90)
•took a long time to create (sometimes more than 30 minutes), maybe because of the medium severity
•DevOps Guru Proactive Insights
•Improved: they are now displayed more quickly
•Missed some important ones, like Lambda Provisioned Concurrency not being used for a long period of time
•#AWS #Wishlist for DevOps Guru
•Support for EventBridge (and EventBridge Pipes)
•Support for AppSync
•Support for Aurora Serverless v2 over DataAPI
•Better support for tracing, e.g. AWS X-Ray and CloudWatch ServiceLens, and integrations with 3rd-party observability tools, e.g. Lumigo, Datadog
FAQ Ask me Anything
Thank you