Making sense of service quotas of AWS Serverless services and how to deal with them at AWS SLA London 2024
VadymKazulkin
66 slides
May 03, 2024
About This Presentation
There is a common misconception that anything is possible with Serverless services in AWS, for example that your Lambda function can scale without limits. But each AWS service (not only Serverless) has a long list of quotas that everybody needs to be aware of, understand, and take into account during development. In this talk, I'll explain the most important quotas (in terms of scaling, but not only that) of Serverless services like API Gateway, Lambda, DynamoDB, SQS, and Aurora Serverless, and how to architect your solution with these quotas in mind.
Slide Content
Making sense of service quotas of AWS Serverless services and how to deal with them. Vadym Kazulkin, ip.labs, Serverless Architecture Conference, 9 April 2024
Contact: Vadym Kazulkin, ip.labs GmbH, Bonn, Germany. Co-organizer of the Java User Group Bonn. [email protected] @VKazulkin https://dev.to/vkazulkin https://github.com/Vadym79/ https://de.slideshare.net/VadymKazulkin/ https://www.linkedin.com/in/vadymkazulkin https://www.iplabs.de/
ip.labs https://www.iplabs.de/
Image: burst.shopify.com/photos/a-look-across-the-landscape-with-view-of-the-sea
Agenda:
- Start with a basic AWS Serverless application
- Look at the various service quotas from different perspectives: (hyper) scaling, plus other important quotas to be aware of
- Re-architect our application to be more scalable, performant, and resilient
API Gateway Important Service Quotas:
- Default throughput / throttle rate: the maximum number of requests per second that your APIs can receive. Value: 10,000
- Throttle burst rate: the maximum number of additional requests per second that you can send in one burst. Value: 5,000
Service Quotas
https://aws.amazon.com/de/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-2/
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html
The throttle rate determines how many requests are allowed per second; the throttle burst determines how many additional requests are allowed per second (token bucket algorithm).
API Gateway throttling-related settings are applied in the following order:
1. Per-client or per-method throttling limits that you set for an API stage in a usage plan
2. Per-method throttling limits that you set for an API stage
3. Account-level throttling per Region
4. AWS Regional throttling
API Gateway Important Service Quotas:
- Max timeout: the maximum integration timeout. Value: 29 sec
- API payload size: maximum payload size for non-WebSocket APIs. Value: 10 MB
Mitigation for large payloads:
1. The client makes an HTTP GET request to API Gateway, and the Lambda function generates and returns a presigned S3 URL
2. The client uploads the image directly to S3, using the presigned S3 URL
Lambda Important Service Quotas:
- Concurrent executions / concurrency limit: the maximum number of events that functions can process simultaneously in the current Region. Value: 1,000. Mitigation: rearchitect
- Burst concurrency limit: after the initial burst, concurrency scales by 1,000 executions every 10 seconds up to your account concurrency limit; each function within an account now scales independently of the others. Values: US West (Oregon), US East (N. Virginia), Europe (Ireland) = 3,000; Asia Pacific (Tokyo), Europe (Frankfurt), US East (Ohio) = 1,000; all other Regions = 500. Mitigation: use provisioned concurrency
Lambda Concurrency and TPS
https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html
https://aws.amazon.com/de/blogs/aws/aws-lambda-functions-now-scale-12-times-faster-when-handling-high-volume-requests/
Concurrency is the number of in-flight requests your AWS Lambda function is handling at the same time.
Lambda Concurrency and TPS
https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
The Lambda concurrency limit is a limit on the number of simultaneous in-flight invocations allowed at the same time.
Transactions per second (TPS) = concurrency / function duration in seconds
Lambda Burst Limit and Cold Start
https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
Sudden and steep spikes in the number of cold starts can put pressure on the invoke services that handle these cold start operations, and can also cause undesirable side effects for your application such as increased latencies, reduced cache efficiency, and increased fan-out on downstream dependencies. The burst limit exists to protect against such surges of cold starts, especially for accounts that have a high concurrency limit.
Lambda Burst Limit
https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
https://docs.aws.amazon.com/lambda/latest/dg/burst-concurrency.html
The chart above shows the burst limit in action with a maximum concurrency limit of 3,000, a maximum burst (B) of 1,000 and a refill rate (r) of 500/minute. The token bucket starts full with 1,000 tokens, as does the available burst headroom.
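The token-bucket behavior described above can be sketched in a few lines. This is an illustrative model only (the class name and method names are made up, not an AWS API): bucket capacity B = 1,000 tokens, refill rate r = 500 tokens per minute, and every unit of new concurrency consumes one token.

```python
class BurstTokenBucket:
    """Toy model of the Lambda burst token bucket described in the slide."""

    def __init__(self, capacity=1000, refill_per_minute=500):
        self.capacity = capacity
        self.refill_per_minute = refill_per_minute
        self.tokens = float(capacity)  # the bucket starts full

    def advance(self, minutes):
        """Refill tokens for elapsed time, never exceeding capacity."""
        self.tokens = min(self.capacity,
                          self.tokens + self.refill_per_minute * minutes)

    def request_concurrency_increase(self, amount):
        """Try to raise concurrency by `amount`; return how much is granted."""
        granted = min(amount, int(self.tokens))
        self.tokens -= granted
        return granted

bucket = BurstTokenBucket()
print(bucket.request_concurrency_increase(1500))  # burst capped at 1000
bucket.advance(1)                                 # one minute later: +500 tokens
print(bucket.request_concurrency_increase(1500))  # only 500 more granted
```

The point of the model: concurrency can jump by at most B immediately, then grow only at the refill rate r, which is exactly why steep cold-start surges get throttled.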
Lambda Important Service Quotas:
- TPS (transactions per second): the maximum number of TPS. Value: TPS = min(10 x concurrency, concurrency / function duration in seconds)
- If the function duration is exactly 100 ms (1/10th of a second), both terms in the min function are equal
- If the function duration is over 100 ms, the second term is lower and TPS is limited by concurrency / function duration
- If the function duration is under 100 ms, the first term is lower and TPS is limited by 10 x concurrency
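The min formula above is easy to check numerically. A minimal sketch (the helper name is mine, not an AWS API):

```python
def max_tps(concurrency, duration_seconds):
    """TPS quota from the slide: min(10 * concurrency, concurrency / duration)."""
    return min(10 * concurrency, concurrency / duration_seconds)

# duration exactly 100 ms: both terms are equal (10,000)
print(max_tps(1000, 0.1))
# duration 200 ms (> 100 ms): limited by concurrency / duration -> 5000.0
print(max_tps(1000, 0.2))
# duration 50 ms (< 100 ms): limited by 10 * concurrency -> 10000
print(max_tps(1000, 0.05))
```

This makes the slide's claim concrete: a very fast function is capped at 10 x concurrency, while a slow one is capped by how many sequential invokes its concurrency can sustain.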
Lambda TPS Limit
https://aws.amazon.com/de/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/
https://www.linkedin.com/pulse/how-aws-lambda-works-underneath-shwetabh-shekhar/
The burst limit isn't a rate limit on the invoke itself, but a rate limit on how quickly concurrency can rise. However, since invoke TPS is a function of concurrency, it also clamps how quickly TPS can rise.
The TPS limit exists to protect the Invoke Data Plane from the high churn of short-lived invocations. For short invocations of under 100 ms, throughput is capped as though the function duration were 100 ms (at 10 x concurrency). This implies that short-lived invocations may be TPS-limited rather than concurrency-limited.
Lambda Function-level Concurrency
General Best Practices for using Lambda:
- Optimize for cost-performance: use AWS Lambda Power Tuning
- Reuse AWS service clients/connections outside of the Lambda handler
- Use the newest version of the AWS SDK for the programming language of your choice
- Minimize dependencies and package size: import only the dependencies that you need (especially from the AWS SDK)
- Use a keep-alive directive to maintain persistent connections
- Implement (other) best practices to reduce cold starts
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html
Lambda Important Service Quotas:
- Function timeout: the maximum timeout that you can configure for a function. Value: 15 min
- Synchronous payload: the maximum size of an incoming synchronous invocation request or outgoing response. Value: 6 MB
Mitigation for the request: use an API Gateway service proxy to S3, or use a pre-signed S3 URL and upload directly to S3.
Mitigation for the response: use response streaming (e.g. with the AWS Lambda Web Adapter).
https://theburningmonk.com/2020/04/hit-the-6mb-lambda-payload-limit-heres-what-you-can-do/
Lambda Response Streaming
https://aws.amazon.com/de/blogs/compute/introducing-aws-lambda-response-streaming/
You can use response streaming to send responses larger than Lambda's 6 MB response payload limit, up to a soft limit of 20 MB. Response streaming currently supports the Node.js 14.x and subsequent managed runtimes.
To indicate to the runtime that Lambda should stream your function's responses, you must wrap your function handler with the streamifyResponse() decorator. This tells the runtime to use the correct stream logic path, allowing the function to stream responses:

exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    responseStream.setContentType("text/plain");
    responseStream.write("Hello, world!");
    responseStream.end();
  }
);
Lambda Response Streaming (with AWS Lambda Web Adapter)
https://aws.amazon.com/de/blogs/compute/using-response-streaming-with-aws-lambda-web-adapter-to-optimize-performance/
The Lambda Web Adapter, written in Rust, serves as a universal adapter between the Lambda Runtime API and HTTP APIs. It allows developers to package familiar HTTP 1.1/1.0 web applications, such as Express.js, Next.js, Flask, Spring Boot, or Laravel, and deploy them on AWS Lambda. This removes the need to modify the web application to accommodate Lambda's input and output formats, reducing the complexity of adapting code to meet Lambda's requirements.
Lambda Response Streaming (with AWS Lambda Web Adapter)
https://github.com/awslabs/aws-lambda-web-adapter
https://aws.amazon.com/de/blogs/compute/using-response-streaming-with-aws-lambda-web-adapter-to-optimize-performance/
Spring Boot 3 example with AWS Lambda Web Adapter: https://github.com/Vadym79/AWSLambdaJavaWithSpringBoot/tree/master/spring-boot-3.2-with-lambda-web-adapter
SQS (Standard) Important Service Quotas:
- Throughput per standard queue: standard queues support a nearly unlimited number of transactions per second (TPS) per API action. Value: nearly unlimited
Lambda scaling with SQS standard queues
https://aws.amazon.com/about-aws/whats-new/2023/11/aws-lambda-polling-scale-rate-sqs-event-source/?nc1=h_ls
https://aws.amazon.com/blogs/compute/introducing-faster-polling-scale-up-for-aws-lambda-functions-configured-with-amazon-sqs/
When a Lambda function subscribes to an SQS queue, Lambda polls the queue as it waits for messages to arrive. It consumes messages in batches, starting with 5 functions at a time. If there are more messages in the queue, Lambda adds up to 300 functions/concurrent executions per minute, up to 1,000 functions (or up to your account concurrency limit), to consume those messages from the SQS queue. This scaling behavior is managed by AWS and cannot be modified. To process more messages, you can optimize your Lambda configuration for higher throughput.
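The scaling ramp just described can be sketched as simple arithmetic. This is an illustrative model under the assumption of a linear ramp (the function name and the linearity are mine; AWS only documents the start value, the per-minute increment, and the cap):

```python
def sqs_consumer_concurrency(minutes_elapsed, account_limit=1000,
                             initial=5, ramp_per_minute=300, cap=1000):
    """Approximate consumer concurrency for an SQS-triggered Lambda:
    start at `initial`, add up to `ramp_per_minute` each minute,
    capped at 1,000 or the account concurrency limit."""
    limit = min(cap, account_limit)
    return min(initial + ramp_per_minute * minutes_elapsed, limit)

for minute in (0, 1, 2, 4):
    print(minute, sqs_consumer_concurrency(minute))
```

Under these assumptions the consumer fleet reaches the 1,000-function cap after roughly three to four minutes of sustained backlog.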
Lambda scaling with SQS standard queues
https://aws.amazon.com/de/blogs/compute/understanding-how-aws-lambda-scales-when-subscribed-to-amazon-sqs-queues/
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting
- Increase the allocated memory for your Lambda function
- Optimize batching behavior: by default, Lambda batches up to 10 messages in a queue to process them during a single Lambda execution. You can increase this number up to 10,000 messages, or up to 6 MB of messages in a single batch for standard SQS queues. If each payload is 256 KB (the maximum message size for SQS), Lambda can only take 23 messages per batch, regardless of the batch size setting.
- Implement partial batch responses
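The batch-size math above can be sketched as follows (the helper name is mine; note that raw division of 6 MB by 256 KB gives 24, while the slide quotes 23 in practice, presumably because of per-message overhead):

```python
def max_messages_per_batch(message_size_kb, batch_size_setting=10_000,
                           batch_limit_mb=6):
    """Upper bound on SQS messages per Lambda batch: the configured batch
    size, or the 6 MB payload limit divided by the message size,
    whichever is smaller. Ignores per-message overhead."""
    by_bytes = (batch_limit_mb * 1024) // message_size_kb
    return min(batch_size_setting, by_bytes)

print(max_messages_per_batch(256))    # 24 by raw division (23 in practice)
print(max_messages_per_batch(1))      # small messages: limited by bytes, 6144
print(max_messages_per_batch(1024))   # 1 MB messages: only 6 per batch
```

The takeaway: with large messages, raising the batch size setting past the byte-limit ceiling buys nothing.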
SQS (Standard) Important Service Quotas:
- Throughput per standard queue: standard queues support a nearly unlimited number of transactions per second (TPS) per API action. Value: nearly unlimited
- In-flight messages per standard queue: the number of in-flight messages (received from a queue by a consumer, but not yet deleted from the queue) in a standard queue. Value: 120,000
- Message size: the size of a message. Value: 256 KB
Use BatchWriteItem for storing to DynamoDB
The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can transmit up to 16 MB of data over the network, consisting of up to 25 item put or delete operations.
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
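Because of the 25-operation cap, a common pattern is to chunk the items before calling BatchWriteItem. A minimal sketch of just the chunking step (the actual SDK call is omitted, and the function name is mine):

```python
def chunk(items, size=25):
    """Split `items` into lists of at most `size` elements,
    matching BatchWriteItem's 25-operation limit."""
    return [items[i:i + size] for i in range(0, len(items), size)]

batches = chunk(list(range(60)))
print([len(b) for b in batches])  # [25, 25, 10]
```

Each resulting sub-list would then be sent as one BatchWriteItem request (with retry handling for any UnprocessedItems the service returns).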
SQS FIFO
https://aws.amazon.com/blogs/compute/solving-complex-ordering-challenges-with-amazon-sqs-fifo-queues/
https://jayendrapatil.com/aws-sqs-standard-vs-fifo-queue/
SQS Standard Queue vs SQS FIFO Queue:
- Ordering: best-effort ordering (Standard) vs first-in-first-out ordering within the message group (FIFO)
- Delivery: at-least-once (Standard) vs exactly-once (FIFO)
SQS (FIFO) Important Service Quotas:
- Batched message throughput for FIFO queues: the number of batched transactions per second (TPS) for FIFO queues. Value: 3,000
- In-flight messages per FIFO queue: the number of in-flight messages in a FIFO queue. Value: 20,000
SQS FIFO Message Groups and Multiple Consumers
https://aws.amazon.com/blogs/compute/solving-complex-ordering-challenges-with-amazon-sqs-fifo-queues/
https://jayendrapatil.com/aws-sqs-standard-vs-fifo-queue/
The combination of increased messages and extra processing time for the new features means that a single consumer is too slow. The solution is to scale out to more consumers and process messages in parallel. To work in parallel, only the messages related to a single auction must be kept in order. FIFO handles that case with a feature called message groups: each transaction related to auction A is placed by your producer into message group A, and so on.
SQS FIFO High-Throughput Mode
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/high-throughput-fifo.html
High throughput for FIFO queues supports a higher number of requests per API, per second. To increase the number of requests in high-throughput mode, you can increase the number of message groups you use: each message group supports 300 requests per second.
- Throughput for FIFO high-throughput mode: the number of transactions per second (TPS) per API in the high-throughput mode of a FIFO queue. Value: 2,400 to 9,000
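The relationship between message groups and total throughput can be sketched as follows (an illustrative helper, not an AWS API; 9,000 is used here as the upper end of the regional quota range quoted above):

```python
def fifo_high_throughput_tps(message_groups, per_group_tps=300,
                             regional_quota=9000):
    """Total FIFO throughput grows with the number of message groups
    (300 TPS each) until the per-API regional quota caps it."""
    return min(message_groups * per_group_tps, regional_quota)

print(fifo_high_throughput_tps(4))    # 1200: group count is the bottleneck
print(fifo_high_throughput_tps(100))  # 9000: capped by the regional quota
```

In other words: if you are below the regional quota, adding message groups (i.e. more independent ordering keys) is the lever for more FIFO throughput.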
DynamoDB Important Service Quotas:
- Table-level read/write throughput limit: the maximum read/write throughput allocated for a table or global secondary index. Value: 40,000 RCU / 40,000 WCU. Mitigation: ask for a quota increase
- Table-level burst capacity for provisioned capacity mode: during an occasional burst of read or write activity, these extra capacity units can be consumed quickly. Value: up to 300 seconds of unused RCUs and WCUs
- Partition-level read/write throughput: the maximum read/write throughput allocated for a partition. Value: 3,000 RCU / 1,000 WCU. Mitigation: use best practices to avoid hot partitions
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html#default-limits-throughput-capacity-modes
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-throughput-bursting
Recommendations for partition keys:
- Use high-cardinality attributes: attributes that have distinct values for each item, like email_id, employee_no, customer_id, session_id, order_id, and so on
- Use composite attributes: try to combine more than one attribute to form a unique key, if that meets your access pattern. For example, consider an orders table with customerid#productid#countrycode as the partition key and order_date as the sort key, where the symbol # is used to separate the different fields
- Add random numbers or digits from a predetermined range for write-heavy use cases: suppose that you expect a large volume of writes for a partition key (for example, greater than 1,000 writes per second). In this case, add a prefix or suffix (a fixed number from a predetermined range, say 0-9) to the partition key, like InvoiceNumber#Random(0-N)
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-uniform-load.html
https://aws.amazon.com/de/blogs/database/choosing-the-right-dynamodb-partition-key/
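The random-suffix (write-sharding) recommendation can be sketched like this (an illustrative helper; the function name and shard count are mine). With 10 shards, writes for one hot key spread across 10 partitions, but readers must then query and merge all 10 shards:

```python
import random

def sharded_key(base_key, shards=10):
    """Append a random suffix from 0..shards-1 to spread writes for a
    hot partition key across multiple partitions."""
    return f"{base_key}#{random.randrange(shards)}"

def all_shard_keys(base_key, shards=10):
    """Every shard of the key; readers must scatter-gather across these."""
    return [f"{base_key}#{i}" for i in range(shards)]

print(sharded_key("InvoiceNumber"))       # e.g. "InvoiceNumber#7"
print(all_shard_keys("InvoiceNumber")[:3])
```

The design trade-off: write throughput scales roughly with the shard count, at the cost of fan-out on reads.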
DynamoDB Important Service Quotas:
- Initial throughput for on-demand capacity mode: see further details on the following slides
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html#default-limits-throughput-capacity-modes
DynamoDB On-Demand Capacity Mode
On-demand capacity mode is ideal for:
- Unknown workloads
- Frequently idle workloads
- Unpredictable application traffic
- Low management overhead (truly serverless mode)
Pricing is based on the actual data reads and writes your application performs on your tables.
Initial Throughput for DynamoDB On-Demand Capacity Mode
Newly created table with on-demand capacity mode: newly created on-demand tables can serve up to 4,000 WCUs or 12,000 RCUs. If you exceed double your previous traffic peak within 30 minutes, you might experience throttling. One solution is to pre-warm the table to the anticipated peak capacity of the spike by:
- Performing a load test
- Creating the table in provisioned mode with high enough WCUs/RCUs and then switching to on-demand mode
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.InitialThroughput
Initial Throughput for DynamoDB On-Demand Capacity Mode
Existing table switched from provisioned to on-demand capacity mode: the previous peak is half the maximum write capacity units and read capacity units provisioned since the table was created, or the settings for a newly created table with on-demand capacity mode, whichever is higher. In other words, your table will deliver at least as much throughput as it did prior to switching to on-demand capacity mode.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.InitialThroughput
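The "previous peak" rule for a table switched from provisioned mode reduces to simple arithmetic. A sketch under the assumption that the table's provisioned capacity exceeded the new-table defaults (so the "whichever is higher" clause from the slide doesn't apply; function names are mine):

```python
def previous_peak(max_provisioned_units):
    """Previous peak after switching to on-demand: half the maximum
    capacity ever provisioned on the table."""
    return max_provisioned_units / 2

def throttle_threshold(max_provisioned_units):
    """Throttling may start above double the previous peak (within a
    30-minute window), i.e. at the old provisioned level."""
    return 2 * previous_peak(max_provisioned_units)

print(previous_peak(10_000))      # 5000.0
print(throttle_threshold(10_000)) # 10000.0
```

Doubling the previous peak lands exactly back at the old provisioned capacity, which is why the slide can say the table delivers at least as much throughput as before the switch.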
Aurora (Serverless v2) Important Service Quotas:
- Data API requests per second: the maximum number of requests to the Data API per second allowed
- Max connections: the max_connections value for Aurora Serverless v2 DB instances is based on the memory size derived from the maximum ACUs. However, when you specify a minimum capacity of 0.5 ACUs on PostgreSQL-compatible DB instances, the maximum value of max_connections is capped at 2,000.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.setting-capacity.html#aurora-serverless-v2.max-connections
DynamoDB vs Aurora (Serverless v2)
DynamoDB vs DynamoDB + DAX:
- Investment in knowledge: understanding of NoSQL databases and single-table design principles; the same applies with DAX
- DAX additionally requires putting your Lambda function into a VPC for access
DynamoDB vs Aurora (Serverless v2)
Aurora Serverless v2 vs Aurora Serverless v2 + Data API:
- Investment in knowledge: relational databases are familiar to many; the same applies with the Data API
- Engine support: MySQL and PostgreSQL (Aurora Serverless v2); currently only PostgreSQL (with the Data API)
- Aurora Serverless v2 requires putting your Lambda function into a VPC for access, and may require Amazon RDS Proxy for connection pooling
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Concepts.Aurora_Fea_Regions_DB-eng.Feature.ServerlessV2.html
https://dev.to/aws-builders/data-api-for-amazon-aurora-serverless-v2-with-aws-sdk-for-java-part-1-introduction-and-set-up-of-the-sample-application-3g71
S3 Important Service Quotas:
- Maximum number of requests per second per partitioned Amazon S3 prefix: at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD
- Maximum file size: individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 5 TB. The largest object that can be uploaded in a single PUT is 5 GB. For objects larger than 100 MB, customers should consider using the multipart upload capability.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html
Other Optimizations: Caching
- Put DynamoDB Accelerator (DAX) in front of DynamoDB: requires putting Lambda into a VPC
- Put ElastiCache in front of Aurora Serverless: requires putting Lambda into a VPC; no "pay as you go" pricing
- Enable API Gateway caching: uses ElastiCache behind the scenes; no "pay as you go" pricing for ElastiCache
- Use CloudFront (and its caching capabilities) in front of API Gateway
Other Optimizations: Error Handling and Retries
- Set meaningful timeouts for API Gateway and Lambda
- Retry with exponential backoff and jitter: the AWS SDK supports this out of the box
- Implement idempotency: AWS Lambda Powertools (Java, Python) provides an idempotency module
https://docs.aws.amazon.com/sdkref/latest/guide/feature-retry-behavior.html
https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/
https://aws.amazon.com/de/blogs/architecture/exponential-backoff-and-jitter/
https://aws.amazon.com/blogs/compute/handling-lambda-functions-idempotency-with-aws-lambda-powertools/
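Exponential backoff with "full jitter", as discussed in the AWS architecture blog linked above, can be sketched in a few lines (an illustrative helper; in practice the AWS SDK's built-in retry modes implement this for you):

```python
import random

def backoff_delay(attempt, base=0.1, cap=20.0):
    """Full-jitter backoff: for retry `attempt` (0-based), sleep a random
    duration in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

for attempt in range(5):
    print(f"attempt {attempt}: sleep up to {min(20.0, 0.1 * 2 ** attempt):.1f}s,"
          f" chose {backoff_delay(attempt):.3f}s")
```

The jitter spreads retries from many concurrent clients over time, which avoids synchronized retry storms against an already-throttled service; the cap keeps worst-case waits bounded.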
AWS "Virtual Waiting Room" Solution
- Open-source project written in Python that can be integrated into existing applications; source code available on GitHub
- Different CloudFormation templates to choose from (from minimal to extended solutions)
- Estimated costs are provided for a 50,000-user and a 100,000-user waiting room with an event duration ranging from 2 to 4 hours
- Virtual Waiting Room on AWS has been load tested with a tool called Locust; the simulated event sizes ranged from 10,000 to 100,000 clients
https://docs.aws.amazon.com/solutions/latest/virtual-waiting-room-on-aws/architecture-overview.html
https://docs.aws.amazon.com/pdfs/solutions/latest/virtual-waiting-room-on-aws/virtual-waiting-room-on-aws.pdf
Service Quotas of other Serverless services
More Serverless services, more service quotas: CloudFront, EventBridge, SNS, Kinesis, Step Functions
General Best Practices for Service Quotas
- Know, understand, and observe the service quotas; architect with service quotas in mind (AWS adjusts them from time to time)
- When requesting a quota increase, provide a valid justification for the new desired value
- Service quotas apply per AWS account (per Region): use different AWS accounts for development and testing, use different AWS accounts for independent (micro-)services, separate AWS accounts at the team level, and use AWS Organizations