Making sense of AWS Serverless operations - Serverless Architecture Conference Berlin 2025

Vadym Kazulkin, 59 slides, Oct 22, 2025

About This Presentation

There is a common misconception that AWS Serverless services can scale without limits. But every AWS service (not only the Serverless ones) has a long list of quotas that everybody needs to understand and take into account during development. In this talk, I'll highlight the key considerations for the...


Slide Content

Making sense of AWS Serverless operations
Vadym Kazulkin, ip.labs, Serverless Architecture Conference Berlin, 22 October 2025

Vadym Kazulkin
ip.labs GmbH Bonn, Germany
Co-Organizer of the Java User Group Bonn
[email protected]
@VKazulkin
https://dev.to/vkazulkin
https://github.com/Vadym79/
https://de.slideshare.net/VadymKazulkin/
https://www.linkedin.com/in/vadymkazulkin
https://www.iplabs.de/
Contact

Image: burst.shopify.com/photos/a-look-across-the-landscape-with-view-of-the-sea
The future of software development (prior to LLMs)

Challenges and lessons learned
Architect with AWS Serverless quotas and technical concepts in mind
General architectural decisions
SQS vs SNS vs Kinesis vs EventBridge
Aurora (Serverless) vs DynamoDB vs Aurora DSQL
Challenging Serverless observability

Serverless Application

Service Quotas

Service Quotas

account and current region limits
Service Quotas

Service Quotas

Service Quotas Request History

Service Quotas Automatic Management

Service Quotas Automatic Management

Serverless Application

https://aws.amazon.com/de/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-2/
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html
•The throttle rate determines how many requests are allowed per second
•The throttle burst determines how many additional requests are allowed per second
API Gateway throttling-related settings are applied in the following order:
•Per-client or per-method throttling limits that you set for an API stage in a usage plan
•Per-method throttling limits that you set for an API stage
•Account-level throttling per Region
•AWS Regional throttling
Token bucket algorithm
API Gateway Token Bucket Algorithm
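
The token bucket idea behind the throttle rate and throttle burst can be sketched in a few lines. This is a minimal illustration, not API Gateway's implementation: the class name TokenBucket and the parameters rate, burst and now are hypothetical, and time is passed in explicitly (in seconds) to keep the example deterministic.

```javascript
// Minimal token bucket sketch (illustrative only, not API Gateway's code).
// `rate`  = tokens refilled per second (compare: throttle rate)
// `burst` = bucket capacity (compare: throttle burst)
class TokenBucket {
  constructor(rate, burst, now = 0) {
    this.rate = rate;
    this.burst = burst;
    this.tokens = burst; // the bucket starts full
    this.last = now;     // time of the last refill, in seconds
  }

  // Returns true if a request arriving at time `now` (seconds) is admitted,
  // false if it would be throttled.
  tryAcquire(now) {
    const elapsed = Math.max(0, now - this.last);
    // Refill according to elapsed time, never above the bucket capacity.
    this.tokens = Math.min(this.burst, this.tokens + elapsed * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With rate 2 and burst 3, three requests arriving at the same instant are admitted and the fourth is rejected; one second later two more tokens are available.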

Quota | Description | Value | Adjustable
Default throughput / Throttle rate | The maximum number of requests per second that your APIs can receive | 10,000 |
Throttle burst rate | The maximum number of additional requests per second that you can send in one burst | 5,000 |
API Gateway Important Service Quotas

Quota | Description | Value | Adjustable | Mitigation
Max timeout | The maximum integration timeout in milliseconds | 29 sec | | 1) Increase the limit 2) Lambda Function URL with response streaming
API payload size | Maximum payload size for non-WebSocket API | 10 MB | | 1) The client makes an HTTP GET request to API Gateway, and the Lambda function generates and returns a presigned S3 URL 2) The client uploads the image to S3 directly, using the presigned S3 URL
API Gateway Important Service Quotas

Serverless Application

https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html
Concurrency is the number of in-flight requests your AWS Lambda function is handling at the same time
Lambda Concurrency

Quota | Description | Value | Adjustable | Mitigation
Concurrent executions / Concurrency limit | The maximum number of events that functions can process simultaneously in the current region | 1,000 | | Rearchitect
Burst Concurrency Limit | After the initial burst, concurrency scales by 1,000 executions every 10 seconds up to your account concurrency limit. Each function within an account now scales independently from each other | US West (Oregon), US East (N. Virginia), Europe (Ireland) = 3,000; Asia Pacific (Tokyo), Europe (Frankfurt), US East (Ohio) = 1,000; all other Regions = 500 | | Use provisioned concurrency
Lambda Important Service Quotas New
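
As a rough back-of-the-envelope aid, the scaling rule in the table (1,000 additional concurrent executions every 10 seconds, capped by the account concurrency limit) can be turned into an estimate of how long a function needs to reach a target concurrency. The function name and its parameters are hypothetical, and treating the ramp as discrete 10-second steps is a simplifying assumption.

```javascript
// Sketch under the assumption stated on the slide: a function gains up to
// 1,000 concurrent executions every 10 seconds, up to the account limit.
const SCALE_STEP = 1000;        // executions added per step
const SCALE_STEP_SECONDS = 10;  // length of one step

// Estimates the seconds needed to scale from `current` to `target`.
// Returns Infinity when the target exceeds the account concurrency limit.
function secondsToReachConcurrency(target, current, accountLimit) {
  if (target > accountLimit) return Infinity; // needs a quota increase first
  if (target <= current) return 0;
  const steps = Math.ceil((target - current) / SCALE_STEP);
  return steps * SCALE_STEP_SECONDS;
}
```

For example, going from 500 to 2,500 concurrent executions with an account limit of 5,000 takes about two steps, or 20 seconds, in this model.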

https://aws.amazon.com/de/blogs/aws/aws-lambda-functions-now-scale-12-times-faster-when-handling-high-volume-requests/
Lambda Concurrency and throttling

https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
Lambda concurrency limit is a limit on the simultaneous in-flight invocations allowed at the same time
Transactions per second (TPS) = concurrency / function duration in seconds
Lambda Concurrency and TPS

Quota | Description | Value | Adjustable | Mitigation
TPS (Transactions per Second) | The maximum number of TPS | TPS = min(10 x concurrency, concurrency / function duration in seconds) | |
•If the function duration is exactly 100ms (or 1/10th of a second), both terms in the min function are equal
•If the function duration is over 100ms, the second term is lower and TPS is limited as per concurrency / function duration
•If the function duration is under 100ms, the first term is lower and TPS is limited as per 10 x concurrency
Lambda Important Service Quotas
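
The TPS formula above is easy to check numerically. The sketch below implements it directly; the function name lambdaMaxTps is hypothetical.

```javascript
// TPS = min(10 x concurrency, concurrency / function duration in seconds)
function lambdaMaxTps(concurrency, durationSeconds) {
  if (concurrency <= 0 || durationSeconds <= 0) {
    throw new RangeError("concurrency and duration must be positive");
  }
  return Math.min(10 * concurrency, concurrency / durationSeconds);
}

// With a concurrency limit of 1,000:
const slowFunction = lambdaMaxTps(1000, 0.5);  // limited by concurrency / duration
const fastFunction = lambdaMaxTps(1000, 0.05); // limited by 10 x concurrency
```

A 500 ms function is limited to 2,000 TPS by the concurrency term, while a 50 ms function is limited to 10,000 TPS by the 10 x concurrency term rather than the 20,000 TPS its duration alone would suggest.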

https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
https://www.linkedin.com/pulse/how-aws-lambda-works-underneath-shwetabh-shekhar/
The TPS limit exists to protect the Invoke Data Plane from the high churn of short-lived invocations. In case of short invocations of under 100ms, throughput is capped as though the function duration is 100ms (at 10 x concurrency). This implies that short-lived invocations may be TPS limited, rather than concurrency limited.
Lambda TPS Quota

Lambda Function level Concurrency

•Optimize for cost-performance
•Use AWS Lambda Power Tuning
•Reuse AWS Service clients/connections outside of the Lambda handler
•Use a keep-alive directive to maintain persistent connections
•Use the newest version of the AWS SDK for the programming language of your choice
•Import only dependencies that you need (especially from AWS SDK)
•Minimize dependencies and package size
•Implement (other) best practices to reduce cold starts
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
General Best Practices for using Lambda

Quota | Description | Value | Adjustable | Mitigation
Function timeout | The maximum timeout that you can configure for a function | 15 min | |
Synchronous payload | The maximum size of an incoming synchronous invocation request or outgoing response | 6 MB | | For the request: use API Gateway service proxy to S3, or use a pre-signed S3 URL and upload directly to S3. For the response: use response streaming (with AWS Lambda Web Adapter)
https://theburningmonk.com/2020/04/hit-the-6mb-lambda-payload-limit-heres-what-you-can-do/
Lambda Important Service Quotas

https://aws.amazon.com/de/blogs/compute/introducing-aws-lambda-response-streaming/
You can use response streaming to send responses larger than Lambda’s 6 MB response payload limit up to a soft limit of 20 MB.
•Response streaming currently supports the Node.js 14.x and subsequent managed runtimes
•To indicate to the runtime that Lambda should stream your function’s responses, you must wrap your function handler with the streamifyResponse() decorator. This tells the runtime to use the correct stream logic path, allowing the function to stream responses
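// `awslambda` is a global provided by the Lambda Node.js managed runtime
exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    responseStream.setContentType("text/plain");
    responseStream.write("Hello, world!");
    responseStream.end(); // signals the end of the stream
  }
);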
Lambda Response Streaming

Serverless Application

Quota | Description | Value | Adjustable | Mitigation
Table-level read/write throughput limit | The maximum read/write throughput allocated for a table or global secondary index | 40,000 RCU / 40,000 WCU | | Ask for a quota increase
Table-level burst capacity for provisioned capacity mode | During an occasional burst of read or write activity, these extra capacity units can be consumed quickly | Up to 300 seconds of unused RCUs and WCUs | |
Partition-level read/write throughput | The maximum read/write throughput allocated for a partition | 3,000 RCU / 1,000 WCU | | Use best practices to avoid hot partitions
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html#default-limits-throughput-capacity-modes
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-throughput-bursting
DynamoDB Important Service Quotas

On-Demand Capacity Mode is ideal for:
•Unknown workloads
•Frequently idle workloads
•Staging and (individual) test environments
•Unpredictable application traffic
•Low management overhead (truly serverless mode) is preferable
DynamoDB Provisioned Throughput vs On-Demand Capacity Mode

Quota | Description | Value | Adjustable
Initial throughput for on-demand capacity mode | Initial throughput for on-demand capacity mode | See further details |
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html#default-limits-throughput-capacity-modes
DynamoDB Important Service Quotas

Newly created table with on-demand capacity mode:
•Can serve up to 4,000 WCUs or 12,000 RCUs
•If you exceed double your previous traffic's peak within 30 minutes, then you might experience throttling
•Solutions for pre-warming:
  •Pre-warm Amazon DynamoDB tables/indexes with warm throughput
  •Perform a load test on your own
  •Create the table in provisioned mode with high enough WCUs/RCUs and then switch to on-demand mode
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.InitialThroughput
https://aws.amazon.com/blogs/database/pre-warming-amazon-dynamodb-tables-with-warm-throughput/
Initial Throughput for DynamoDB On-Demand Capacity Mode
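
The two rules on this slide (the initial throughput of a new on-demand table and the "double the previous peak within 30 minutes" guideline) can be combined into a simple pre-launch check. This is only a planning sketch: the function name, its parameter object and the way the two rules are combined are assumptions made for illustration, not DynamoDB's actual admission logic.

```javascript
// Figures from the slide for a newly created on-demand table.
const NEW_TABLE_WRITE_UNITS = 4000;
const NEW_TABLE_READ_UNITS = 12000;
const RAMP_WINDOW_MINUTES = 30;

// Returns true if the planned traffic ramp risks throttling and the table
// is a candidate for pre-warming. All parameter names are hypothetical.
function onDemandThrottleRisk({ previousPeak, targetRate, minutesToReachTarget, newTableFloor }) {
  // Assumption: the table can absorb the larger of its initial throughput
  // and double its previous peak without throttling.
  const safeRate = Math.max(newTableFloor, 2 * previousPeak);
  return targetRate > safeRate && minutesToReachTarget < RAMP_WINDOW_MINUTES;
}
```

For example, a table with a previous write peak of 5,000 units that has to reach 12,000 units within 10 minutes is flagged, while the same ramp spread over 45 minutes is not.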

Serverless Application with Aurora (Serverless) instead of DynamoDB

Quota | Description | Value | Adjustable
Data API HTTP request body size | The maximum size allowed for the HTTP request body | 4 MB |
Data API maximum result set size | The maximum size of the database result set that can be returned by the Data API | 1 MB |
Data API maximum size of JSON response string | The maximum size of the simplified JSON response string returned by the RDS Data API | 10 MB |
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/CHAP_Limits.html
Aurora Serverless v2 Important Service Quotas

Quota | Description | Value | Adjustable
Data API requests per second | The maximum number of requests to the Data API per second allowed | |
The max_connections value for Aurora Serverless v2 DB instances is based on the memory size derived from the maximum ACUs. However, when you specify a minimum capacity of 0.5 ACUs on PostgreSQL-compatible DB instances, the maximum value of max_connections is capped at 2,000.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.setting-capacity.html#aurora-serverless-v2.max-connections
Aurora Serverless v2 Important Service Quotas


Serverless Application with Aurora DSQL instead of DynamoDB

Quota | Description | Value | Adjustable
Maximum transaction time | Maximum allowed transaction time | 5 minutes |
Maximum number of rows to be mutated in a transaction block | Maximum number of table and index rows that can be mutated in a transaction block | 3,000 |
Maximum size of all data modified in a write transaction | Maximum size of all data modified in a write transaction | 10 MB |
https://docs.aws.amazon.com/aurora-dsql/latest/userguide/CHAP_quotas.html
https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-unsupported-features.html#working-with-postgresql-compatibility-unsupported-limitations
Aurora DSQL Important Service Quotas
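
One way to live with the 3,000-row limit per transaction block is to split large writes into several transactions. The sketch below only shows the chunking step; the function name chunkRows is hypothetical. Because the quota counts table and index rows, the batch size may need to be lowered for tables with secondary indexes, which is why it is a parameter here.

```javascript
// Row limit per transaction block, as listed in the table above.
const DSQL_MAX_ROWS_PER_TRANSACTION = 3000;

// Splits `rows` into batches that each fit into one transaction block.
// Pass a lower `maxRows` if every record also mutates index rows.
function chunkRows(rows, maxRows = DSQL_MAX_ROWS_PER_TRANSACTION) {
  if (!Number.isInteger(maxRows) || maxRows < 1) {
    throw new RangeError("maxRows must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < rows.length; i += maxRows) {
    batches.push(rows.slice(i, i + maxRows));
  }
  return batches;
}
```

Each batch would then be written in its own transaction, so the caller has to accept that the overall write is no longer atomic.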

Serverless Application

Quota | Description | Value | Adjustable
Throughput per Standard Queue | Standard queues support a nearly unlimited number of transactions per second (TPS) per API action | Nearly unlimited |
SQS (Standard) Important Service Quotas

https://aws.amazon.com/about-aws/whats-new/2023/11/aws-lambda-polling-scale-rate-sqs-event-source/?nc1=h_ls
https://aws.amazon.com/blogs/compute/introducing-faster-polling-scale-up-for-aws-lambda-functions-configured-with-amazon-sqs/
•When a Lambda function subscribes to an SQS queue, Lambda polls the queue as it waits for messages to arrive. It consumes messages in batches, starting with 5 functions at a time
•If there are more messages in the queue, Lambda adds up to 300 functions/concurrent executions per minute, up to 1,250 concurrent executions, to consume those messages from the SQS queue
•This scaling behavior is managed by AWS and cannot be modified
•To process more messages, you can optimize your Lambda configuration for higher throughput
Lambda scaling with SQS standard queues
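
The scaling figures above give a rough upper bound on how many concurrent executions can be consuming from a queue a given number of minutes after a backlog builds up. The function name is hypothetical, and modelling the ramp as linear is a simplifying assumption.

```javascript
// Figures from the slide: start at 5, add up to 300 per minute, cap at 1,250.
const SQS_INITIAL_CONCURRENCY = 5;
const SQS_SCALE_UP_PER_MINUTE = 300;
const SQS_MAX_CONCURRENCY = 1250;

// Upper bound on concurrent executions polling the queue after `minutes`.
function sqsConsumerConcurrencyCeiling(minutes) {
  if (minutes < 0) throw new RangeError("minutes must not be negative");
  const ramped = SQS_INITIAL_CONCURRENCY + SQS_SCALE_UP_PER_MINUTE * minutes;
  return Math.min(SQS_MAX_CONCURRENCY, ramped);
}
```

In this model the ceiling is 305 after one minute and the 1,250 cap is reached a little after four minutes, which is worth keeping in mind when sizing for sudden backlogs.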

https://aws.amazon.com/de/blogs/compute/understanding-how-aws-lambda-scales-when-subscribed-to-amazon-sqs-queues/
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting
•Increase the allocated memory for your Lambda function
•Optimize batching behavior:
  •By default, Lambda batches up to 10 messages in a queue to process them during a single Lambda execution. You can increase this number up to 10,000 messages, or up to 6 MB of messages in a single batch for standard SQS queues
  •If each payload size is 1 MB (the maximum message size for SQS), Lambda can only take 6 messages per batch, regardless of the batch size setting
•Implement partial batch responses
Lambda scaling with SQS standard queues
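
The batching arithmetic and the partial batch response pattern from this slide can be sketched as follows. The helper names effectiveBatchSize and makePartialBatchHandler, and the injected processMessage callback, are hypothetical. The returned batchItemFailures structure is the shape Lambda expects when ReportBatchItemFailures is enabled on the event source mapping. The size calculation ignores per-message metadata overhead, which is a simplification.

```javascript
const MAX_BATCH_BYTES = 6 * 1024 * 1024; // 6 MB payload limit per batch

// Number of messages that actually fit into one batch, given the configured
// batch size and an (assumed uniform) message size in bytes.
function effectiveBatchSize(configuredBatchSize, messageBytes) {
  if (messageBytes <= 0) throw new RangeError("messageBytes must be positive");
  return Math.min(configuredBatchSize, Math.floor(MAX_BATCH_BYTES / messageBytes));
}

// Builds an SQS handler that reports only the failed messages, so that
// successfully processed messages are not retried with the whole batch.
function makePartialBatchHandler(processMessage) {
  return async (event) => {
    const batchItemFailures = [];
    for (const record of event.Records) {
      try {
        await processMessage(record);
      } catch (err) {
        // Only this message becomes visible in the queue again.
        batchItemFailures.push({ itemIdentifier: record.messageId });
      }
    }
    return { batchItemFailures };
  };
}
```

With a configured batch size of 10,000 and 1 MB messages, effectiveBatchSize returns 6, matching the example on the slide.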

Quota | Description | Value | Adjustable
Throughput per Standard Queue | Standard queues support a nearly unlimited number of transactions per second (TPS) per API action | Nearly unlimited |
Message size | The size of a message | 256 KB, recently increased to 1 MB |
SQS (Standard) Important Service Quotas

The BatchWriteItem operation puts or deletes multiple items in one or more tables.
A single call to BatchWriteItem can transmit up to 16 MB of data over the network, consisting of up to 25 item put or delete operations.
use BatchWriteItem
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
Use BatchWriteItem requests for storing to DynamoDB
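
To stay within the 25-operation limit, a batch of messages can be split into several BatchWriteItem request payloads before calling the SDK. The sketch below only builds the payloads and does not call AWS; the function name and the table name used in the example are hypothetical, items are assumed to already be in DynamoDB attribute-value format, and the 16 MB size limit is not checked.

```javascript
const MAX_BATCH_WRITE_ITEMS = 25; // put/delete operations per BatchWriteItem call

// Builds one BatchWriteItem input per group of up to 25 items.
// `items` must already use DynamoDB attribute values, e.g. { id: { S: "1" } }.
function buildBatchWriteRequests(tableName, items) {
  const requests = [];
  for (let i = 0; i < items.length; i += MAX_BATCH_WRITE_ITEMS) {
    const putRequests = items
      .slice(i, i + MAX_BATCH_WRITE_ITEMS)
      .map((item) => ({ PutRequest: { Item: item } }));
    requests.push({ RequestItems: { [tableName]: putRequests } });
  }
  return requests;
}

// Example: 60 items for a hypothetical "Orders" table become 3 requests.
const exampleItems = Array.from({ length: 60 }, (_, i) => ({ id: { S: String(i) } }));
const exampleRequests = buildBatchWriteRequests("Orders", exampleItems);
```

A real caller would also need to resend anything returned in UnprocessedItems, ideally with backoff.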

AWS services such as S3, SNS, EventBridge
and others invoke Lambda functions
asynchronously.
Lambda uses an internal queue to store
events. A separate process reads events from
the queue and sends them to the function.
https://aws.amazon.com/blogs/compute/introducing-new-asynchronous-invocation-metrics-for-aws-lambda/
Asynchronous invocations of Lambda functions

https://aws.amazon.com/blogs/compute/introducing-new-asynchronous-invocation-metrics-for-aws-lambda/
https://docs.aws.amazon.com/lambda/latest/dg/invocation-async.html
Asynchronous invocations of Lambda functions and retry behavior
Lambda discards events from its event queue if the retry policy has exceeded the number of configured retries or the event has reached its maximum age.
Once discarded from the event queue, the event goes to the destination or DLQ, if configured.
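
The discard rule above can be expressed as a small decision function. This models the behavior described on the slide and is not Lambda's internal code: the function name, the event fields and the hasFailureTarget flag are hypothetical, while the defaults of 2 retries and a maximum event age of 6 hours (21,600 seconds) are Lambda's defaults for asynchronous invocation.

```javascript
// Decides what happens to an asynchronously invoked event after a failed
// attempt. `event.failedAttempts` counts all failed attempts so far
// (the first attempt is not a retry); `event.ageSeconds` is its age.
function afterFailedAttempt(event, config = {}) {
  const {
    maximumRetryAttempts = 2,       // Lambda default for async invocation
    maximumEventAgeSeconds = 21600, // Lambda default: 6 hours
    hasFailureTarget = false,       // on-failure destination or DLQ configured?
  } = config;

  const retriesUsed = event.failedAttempts - 1;
  const expired = event.ageSeconds >= maximumEventAgeSeconds;

  if (retriesUsed < maximumRetryAttempts && !expired) return "retry";
  return hasFailureTarget ? "send-to-destination-or-dlq" : "drop";
}
```

Without a destination or DLQ, the "drop" outcome means the event is lost, which is why configuring one of them matters for observability.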

More Serverless services, more service quotas ☺
•CloudFront
•EventBridge
•SNS
•Kinesis
•Step Functions
Service Quotas of other Serverless services

•Understand the concepts of distributed systems
  •Token Bucket Algorithm
  •How asynchronous invocation patterns work
    •Polling from the queue and synchronously invoking the Lambda function
  •Retries with (exponential) backoff pattern and jitter, load-shedding and how AWS services and AWS SDKs support them
•Understand individual service-specific terms
  •Concurrency, transactions per second (TPS)
  •Throttle/Concurrency limit, burst limit
  •Event Source Mapping (ESM)
Understand Technical Concepts
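
Retries with exponential backoff and jitter can be sketched as below. This uses the "full jitter" variant (a random delay between zero and an exponentially growing ceiling), which is one common choice and not necessarily what a particular AWS SDK does. All names, the 100 ms base and the 20 s cap are illustrative assumptions; random and sleep are injectable so the behavior can be tested.

```javascript
// Delay before retry number `attempt` (0-based), using full jitter.
function backoffDelayMs(attempt, { baseMs = 100, capMs = 20000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceiling;
}

// Runs `operation` until it succeeds or `maxAttempts` is reached,
// sleeping with backoff and jitter between attempts.
async function retryWithBackoff(operation, options = {}) {
  const {
    maxAttempts = 5,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err; // give up, surface the error
      await sleep(backoffDelayMs(attempt, options));
    }
  }
}
```

The jitter spreads retries from many clients over time, so that a throttled service is not hit by synchronized retry waves.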

•Know, understand and observe the service quotas
•Architect with service quotas in mind
•AWS adjusts them from time to time
•When requesting a quota increase, provide a valid justification for the new desired value
•Service quotas are valid per AWS account (per region)
•Use different AWS accounts for development and testing
•Use different AWS accounts for independent (micro-)services
•Separate AWS accounts on the team level
•Use AWS Organizations
General best practices for Service Quotas

It’s also about latency

Challenges and lessons learned
Architect with AWS Serverless quotas and technical concepts in mind
General architectural decisions
SQS vs SNS vs Kinesis vs EventBridge
Aurora (Serverless) vs DynamoDB vs Aurora DSQL
Challenging Serverless observability

When to use SQS, SNS, Kinesis and EventBridge
https://www.serverlessguru.com/tips/sqs-vs-sns-vs-kinesis-vs-eventbridge

Database Choice
DynamoDB vs Aurora vs Aurora Serverless vs Aurora DSQL

Challenges and lessons learned
Architect with AWS Serverless quotas and technical concepts in mind
General architectural decisions
SQS vs SNS vs Kinesis vs EventBridge
Aurora (Serverless) vs DynamoDB vs Aurora DSQL
Challenging Serverless observability

Current architecture of our image storage solution
You need
•Observability (Logging, Monitoring, Tracing)
•Alerting
•Incident Management solution (PagerDuty)
https://aws.amazon.com/blogs/compute/introducing-new-asynchronous-invocation-metrics-for-aws-lambda/

The future of software development (prior to LLMs)

Serverless reduces the need for (readily
available) ops skills but increases the
demand for (less readily available)
distributed system design skills.
https://architectelevator.com/cloud/serverless-illusion/
Serverless challenges

Ask Me Anything and Feedback

Thank you