Apidays New York 2024 - The subtle art of API rate limiting by Josh Twist, Zuplo
APIdays_official
16 slides
May 23, 2024
About This Presentation
The subtle art of API rate limiting
Josh Twist, Co-founder & CEO at Zuplo
Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024)
------
Check out our conferences at https://www.apidays.global/
Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8
Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io
Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/
Slide Content
The subtle art of API rate limiting
Josh Twist
CEO @zuplo
Me
Founded several services in Microsoft Azure:
API Management
Logic Apps
Mobile Services
Power Automate Pro
Head of Product at Stripe and Facebook
Been a vendor and a customer of API gateways / management
Agenda
Why do we need this?
What’s the canonical approach?
Private or Public
Latency tradeoffs and optimizing the customer experience
Observability
Demo
Content
Am I actually going to be attacked?
Yes, but not by who you think.
Why do we need rate limiting?
Limit resource consumption
Prevent resource starvation
Ensure good experience for all customers
Discourage abusers
Enforce business model
A canonical response
Status - 429
Status Text - Too Many Requests
Header - Retry-After: 3600
Body - Use Problem Details format (IETF RFC 7807, now superseded by RFC 9457)
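The canonical response above can be sketched as a small, framework-agnostic helper. This is only an illustration: the function name `too_many_requests`, the dict shape of the return value, and the `type` URI are assumptions made for the example and are not from the talk. The `application/problem+json` media type and the `type`, `title`, `status`, `detail`, and `instance` members come from the Problem Details specification.

```python
import json


def too_many_requests(retry_after_seconds: int, instance: str = "") -> dict:
    """Build a 429 response as a plain dict (hypothetical shape, not tied to any framework)."""
    problem = {
        # Hypothetical problem-type URI; replace with a URI you control.
        "type": "https://example.com/problems/rate-limit-exceeded",
        "title": "Too Many Requests",
        "status": 429,
        "detail": f"Rate limit exceeded. Retry after {retry_after_seconds} seconds.",
    }
    if instance:
        # Optional member identifying the specific request that was rejected.
        problem["instance"] = instance
    return {
        "status": 429,
        "reason": "Too Many Requests",
        "headers": {
            "Content-Type": "application/problem+json",
            # Retry-After accepts a delay in seconds (or an HTTP date).
            "Retry-After": str(retry_after_seconds),
        },
        "body": json.dumps(problem),
    }


response = too_many_requests(3600, instance="/v1/orders")
```

Whatever web framework is in use would then copy the status, headers, and body from this dict into its own response object.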
Do you want to do this?
Private or public limits?
Disclosing limits allows malicious users to easily maximize their consumption of your resources without hitting the limit
Not disclosing limits makes it hard for consumers to avoid hitting your rate limit
Informal Partnership (e.g. Free Tier): Hide limits
Formal Partnership (e.g. Contracted Customer): Disclose limits
How to apply rate limiting
IP based?
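The slide raises IP-based limiting as a question without settling it. As a minimal sketch of how the choice of key plugs into a limiter, the example below uses a simple fixed-window counter and a key function that prefers an authenticated user ID and falls back to the IP address. Both the fixed-window algorithm and the fallback rule are assumptions made for illustration; the talk does not prescribe either. `FixedWindowLimiter` and `bucket_key` are hypothetical names.

```python
import time


class FixedWindowLimiter:
    """In-memory fixed-window counter, one window per key (single process only)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.counts = {}            # key -> (window_start, request_count)

    def check(self, key):
        """Return (allowed, retry_after_seconds) for one request from `key`."""
        now = self.clock()
        start, count = self.counts.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            return False, max(0.0, start + self.window - now)
        self.counts[key] = (start, count + 1)
        return True, 0.0


def bucket_key(ip, user_id=None):
    """Assumed policy: key by user when known, otherwise by IP address."""
    return f"user:{user_id}" if user_id else f"ip:{ip}"


limiter = FixedWindowLimiter(limit=100, window_seconds=60)
allowed, retry_after = limiter.check(bucket_key("203.0.113.7", user_id="alice"))
```

Keying by IP alone treats everyone behind a shared address as one consumer, which is one reason an identity-based key is often preferred when one is available.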
Rate limiting in distributed systems is hard
The latency / accuracy tradeoff
Being accurate means completing the rate limit check before allowing the request to proceed == a slower API for everyone
You incur the latency on every request, whether a problem is afoot or not
Consider a more lenient approach where you run the check asynchronously and cache blocked consumers
Here’s how
Async rate limiting
Have a local cache to store known limited consumers (store the retry time also)
Check the rate limit in parallel with performing the primary request
If the rate limit check comes back blocked, update the cache and, if the primary request has not completed, override the response.
Otherwise, the next request will be blocked anyway.
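The steps above can be sketched with `asyncio`. This is an illustration under stated assumptions, not the speaker's implementation: `handle`, `blocked_until`, `primary`, and `check_limit` are hypothetical names, the cache is a plain in-process dict, and `check_limit` is assumed to return the number of seconds to wait, or `None` when the consumer is under the limit. A failed check is treated as "not limited" here, which is also an assumption.

```python
import asyncio
import time

blocked_until = {}   # local cache: consumer -> monotonic time when retries are allowed
_background = set()  # strong references so unfinished checks are not garbage collected


async def handle(consumer, primary, check_limit):
    """Serve one request; `primary` and `check_limit` are caller-supplied coroutines."""
    # Step 1: consult the local cache of known limited consumers.
    retry_at = blocked_until.get(consumer)
    if retry_at is not None:
        now = time.monotonic()
        if retry_at > now:
            return {"status": 429, "retry_after": retry_at - now}
        del blocked_until[consumer]  # the block has expired

    # Step 2: run the rate limit check in parallel with the primary request.
    primary_task = asyncio.create_task(primary())
    check_task = asyncio.create_task(check_limit(consumer))
    _background.add(check_task)

    def record(task):
        # Step 3a: whenever the check finishes, cache a blocked verdict.
        _background.discard(task)
        if task.cancelled() or task.exception() is not None:
            return  # assumed policy: a failed check does not block anyone
        retry_after = task.result()
        if retry_after is not None:
            blocked_until[consumer] = time.monotonic() + retry_after

    check_task.add_done_callback(record)

    await asyncio.wait({primary_task, check_task}, return_when=asyncio.FIRST_COMPLETED)

    # Step 3b: if the verdict arrived before the primary finished, override the response.
    if check_task.done() and not primary_task.done():
        if check_task.exception() is None and check_task.result() is not None:
            primary_task.cancel()
            return {"status": 429, "retry_after": check_task.result()}

    # Otherwise let this request through; the cache blocks the next one.
    return await primary_task
```

The tradeoff is deliberate: a consumer may slip a few requests past the limit while the check is in flight, in exchange for not adding the check's latency to every request.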
Observability?
RPS (requests per second) report
See rate limited responses
Break down by bucket (IP, user, etc.)
What are the minimum requirements
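As a rough sketch of the three requirements listed on this slide, the in-memory recorder below counts requests per second, counts rate limited (429) responses, and breaks the latter down by bucket. `RateLimitMetrics` and its methods are hypothetical names for this example; a real deployment would more likely emit these counts to an existing metrics or logging system.

```python
from collections import Counter


class RateLimitMetrics:
    """Tiny in-memory recorder for request and rate-limit counts."""

    def __init__(self):
        self.requests = Counter()  # (second, bucket) -> number of requests
        self.limited = Counter()   # (second, bucket) -> number of 429 responses

    def record(self, timestamp, bucket, status):
        """Record one response; `timestamp` is seconds since any fixed epoch."""
        key = (int(timestamp), bucket)
        self.requests[key] += 1
        if status == 429:
            self.limited[key] += 1

    def rps(self, second):
        """Total requests seen during the given whole second, across all buckets."""
        return sum(n for (s, _), n in self.requests.items() if s == second)

    def limited_by_bucket(self):
        """Rate limited responses broken down by bucket (IP, user, etc.)."""
        totals = Counter()
        for (_, bucket), n in self.limited.items():
            totals[bucket] += n
        return dict(totals)


metrics = RateLimitMetrics()
metrics.record(100.2, "ip:203.0.113.7", 200)
metrics.record(100.7, "ip:203.0.113.7", 429)
metrics.record(101.1, "user:alice", 429)
```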
This content available online now
Blog post: zuplo.link/rate-limiting-blog
Video: zuplo.link/rate-limiting-video