AWS Lambda Reserved Concurrency
The most important new text box of re:Invent?
Lambda is a wonderful platform. However, it is still in its early years, and has a number of limitations and rough edges. One of those edges — Lambda's ability to overwhelm downstream systems with unfettered load — got a lot smoother at re:Invent 2017.
In Learning Lambda Part 9, I described Lambda's scaling behavior — that it will automatically scale a thousand or more instances wide to handle incoming load. Fantastic! Miraculous auto-scaling without effort! The only limit to Lambda's scaling is the account-wide concurrent execution limit.
This scaling is a double-edged sword though — Lambda can overwhelm downstream resources that do not have similar scaling properties. A thousand-times scaled Lambda could easily cause significant performance problems for a modest SQL database server. As I said in the article:
There is nothing we can do, from a Lambda scaling point of view, to stop this. In other words, there is no natural architectural backpressure with Lambda apart from the account-wide concurrency limit.
That's a problem. For some teams it has meant adapting their architecture, inserting extra layers of traditional non-scaling components to slow down an event storm. For others it has meant that Lambda is simply not a suitable technology for certain types of system.
But it turns out this wasn't the only problem in Lambda's concurrency model. As I also said in Learning Lambda 9:
Since the concurrency limit [of Lambda] is account-wide, one particularly important aspect to be aware of is that one Lambda function that has scaled particularly wide can impact the performance of every other Lambda function in the same AWS account + region pair. Because of this it is strongly recommended that, at the very least, you use separate AWS accounts for production and testing — deliberately DoS'ing (Denial-of-Servicing) your production application because of a load test against a staging environment is a particularly embarrassing situation to explain!
Could Amazon improve Lambda to aid us in these concerns?
Enter Reserved Concurrency
At re:Invent 2017 the Lambda team announced a new feature — Concurrency Limits on Individual AWS Lambda Functions, described briefly in the AWS press release here.
Every Lambda function now has a new configuration option — Reserved Concurrency. This is a single numerical value that may be assigned to a Lambda function's configuration, either through the web console or via the SDK / CLI:
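For example, a minimal sketch with the AWS CLI might look like the following (the function name here is a placeholder):

```shell
# Reserve 20 concurrent executions for a (hypothetical) function "my-function".
aws lambda put-function-concurrency \
    --function-name my-function \
    --reserved-concurrent-executions 20

# Remove the reservation again, returning the function to the shared pool.
aws lambda delete-function-concurrency \
    --function-name my-function
```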
This configuration value has two effects if set:
- It limits the number of instances of your Lambda function that can be instantiated at any time to the value specified.
- It makes sure there is always at least enough concurrency capacity available in the account to run the number of instances requested.
In the screenshot above I've specified that the reserved concurrency for my function is 20. This means that at most 20 instances of my Lambda will run at any given time.
Further, it means that of my account-wide concurrency limit (which defaults to 1000), 20 units worth will always be available for this function. That's not to say 20 instances will always be running — Lambda's load-based general scale-up and scale-down properties still apply — but that if 20 instances need to run then the Lambda platform will be able to do so, no matter what else is going on in the account.
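As a concrete sketch of that accounting (assuming the default 1000 limit; the function names and numbers below are purely illustrative):

```python
ACCOUNT_LIMIT = 1000  # default account-wide concurrent execution limit

# Hypothetical per-function reservations in one account + region pair.
reservations = {"order-processor": 20, "image-resizer": 50}

# Each reservation is carved out of the account-wide limit, so the
# unreserved pool shared by all other functions shrinks accordingly.
unreserved_pool = ACCOUNT_LIMIT - sum(reservations.values())
print(unreserved_pool)  # 930
```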
Problems Solved! (?)
This single new addition solves the two problems I described.
First — we can restrict the negative impact our Lambda functions have on downstream systems. Say we only want to allow 10 instances of a Lambda function to be able to run, in order to load-protect a database that the function uses — we can now do that.
Second — say we have some critical production Lambda functions that we want to make sure always have capacity to run. We can now do that too, protecting ourselves from accidental DoS.
Great! So we're done now, right? No, we're never done, and there are some limitations to this solution.
Limitations of Reserved Concurrency
This new property works very well for the majority of Lambda event sources — specifically sources like SNS and S3 that are asynchronous and non-stream based. Say, for example, we get an unexpected flood of messages on an SNS topic. If we configure a Lambda function that is listening to that topic to be concurrency-restricted, then once we reach that limit Lambda will throttle the function, retrying the events later using its built-in backoff mechanism. I talk about throttling, again, in Part 9 of Learning Lambda.
However what happens if we're using a synchronous event source, like API Gateway? In this case once we hit the concurrency limit the throttling event will be passed back upstream to the caller as an error. It's up to the caller in that case to implement their own retry / backoff mechanism. (This problem exists without Reserved Concurrency too, but becomes more pronounced if we limit the scaling of our Lambda functions.)
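A minimal sketch of the kind of client-side retry logic this pushes on us — exponential backoff with jitter, using a stand-in exception in place of whatever throttling error the real caller would see:

```python
import random
import time

class ThrottledError(Exception):
    """Stand-in for the throttling error a synchronous caller receives."""

def call_with_backoff(fn, max_attempts=5, base_delay=0.05):
    """Retry fn() on throttling, with exponential backoff plus full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise
            # Sleep a random amount between 0 and base_delay * 2^attempt seconds.
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))

# Demo: a fake endpoint that throttles twice, then succeeds.
calls = {"n": 0}
def flaky_endpoint():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ThrottledError()
    return "ok"

print(call_with_backoff(flaky_endpoint))  # ok
```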
‘Mr. Microservices’ Sam Newman had this to say about the new reserved concurrency:
In other words, while we've protected our downstream resources with the new Reserved Concurrency configuration we now need more smarts in our upstream clients to handle throttling. Also there's still nothing like a circuit breaker construct built into the platform yet.
But, again, asynchronous sources definitely get a boost with this update, and stream-based sources already had their own concurrency control, as I described in the Scaling article.
A second limitation of the new feature is that it ties together a solution for the two separate problems I described at the beginning of this article, and this might not be ideal. For instance, say you want to guarantee that your Lambda function can scale to at least 10 concurrent instances, but you don't want to specify an amount more than that because you have hundreds of Lambda functions in your account, and many of them will be configured similarly. The problem now is that you've limited your maximum scaling upper-bound to be equal to your guaranteed minimum scaling — if you guarantee to be able to go at least 10 wide, you'll never go more than 10 wide. This is a peculiar artifact of how AWS have solved this problem; we assume it's related to the implementation, likely to how functions are isolated. We hope in future to be able to control these two concerns separately.
A final aspect to be aware of is that this new feature must be configured on every individual function, and each function's reservation is independent of the rest of the account. If we wanted to pool the concurrency of 5 functions, for example, that is not currently an option.
But still a big improvement
Limitations aside, this is still a big improvement for the Lambda platform. Especially for event-oriented systems, the ability to protect from event storms brings Lambda much more easily into architectural conversations. If you've ruled out Lambda before because of unbounded concurrency we recommend taking a fresh look, with this new feature in play.