Defining Serverless — Part 2
Revolutionizing how we approach allocation, provisioning and scaling
This is Part 2 of a series defining Serverless Services in terms of five common traits, which John Chapin and I introduced in our free ebook from O'Reilly — What is Serverless?. If this is the first part you've come to I advise taking a look at Part 1 first, for context. In this part we discuss the second trait common to Serverless services.
Trait #2 — Self auto-scale and auto-provision, based on load
With traditional server-side architectures we are responsible for a number of activities around the resource usage of our applications:
- Resource planning — estimating what types of resources our application requires, and how many. For instance, which machine or instance types are best suited to the various components of our application, and how many of each we need to satisfy expected load.
- Resource allocation — mapping each component of our application to the resource we will deploy it to.
- Resource provisioning — acquiring our machine instances and setting them up ready for deployment.
- Resource scaling — adjusting all three previous activities based upon the actual load our application needs to handle at any given time.
There's no doubt that the cloud has made these activities easier, but we still need to perform them. The problem is that they all take time, and resource planning especially is far from a precise activity. We'll always need to adjust our plan, and sometimes that means that previous iterations of provisioning will lead to wasted effort or even wasted infrastructure commitment.
Typically what we do to avoid some of this churn is to plan for more load than we expect, leading us to over-provision our resources. This is often the right thing to do, but leads to a lack of efficiency — we have more resources than our system needs at any one time.
Serverless changes the management of resources entirely. No longer do we need to plan, allocate or provision server instances, or often any other resource. That's because a Serverless service self auto-provisions. In other words, we just start using the service, and it figures out how much, and what type of, resource is required and automatically provisions it for our needs.
Let's look at Serverless Functions-as-a-Service / Serverless Compute as an example. When we create an AWS Lambda function we tie our code to an event, add some logical configuration, and that's that. When the triggering event occurs, the Lambda platform will automatically instantiate a container to execute our code within, and will itself allocate and provision a host for the container to run on. We have no part in the process.
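To make that concrete, here is a minimal sketch of a Python Lambda handler. The event shape and names below are illustrative assumptions, not tied to any real deployment; a real event's fields depend on the trigger (API Gateway, S3, and so on).

```python
# A minimal AWS Lambda handler sketch (Python runtime).
# The event fields here are hypothetical examples.

def handler(event, context):
    # The platform invokes this function with the triggering event as a dict
    # and invocation metadata in `context`. We write no code to manage the
    # host or container it runs on.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}

# Called locally here for illustration; in production the Lambda platform
# instantiates a container and calls it for us.
print(handler({"name": "serverless"}, None))
```

Notice that nothing in the function concerns itself with servers: the only "deployment architecture" left to us is the logical configuration (memory size, timeout, trigger) we attach to it.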
This alone takes a huge amount of effort out of our deployment architecture process, but it doesn't stop there. Beyond auto-provisioning one unit of usage, Serverless services will continue to allocate and provision two, three, or more units to handle the load at any given time. In other words, Serverless services are self auto-scaling. Again, using our example of AWS Lambda: if another source event occurs while the original one is being processed, Lambda will create a second instance of our code, and will allocate and provision for that too. It will keep scaling up (within limits) to handle any load at any time, and will then scale back down again when the load reduces.
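The scaling behavior just described can be sketched as a toy model: provision a new unit only when every existing unit is busy, throttle at a limit, and release idle units when load falls. This is a deliberate simplification for illustration, assuming nothing about how Lambda is actually implemented.

```python
# Toy model of self auto-scaling: one "unit" (container) per concurrent
# event, provisioned on demand and released when load drops.

class ToyPlatform:
    def __init__(self, limit=10):
        self.limit = limit       # real platforms also impose scaling limits
        self.busy = 0            # events currently being processed
        self.provisioned = 0     # units currently allocated

    def event_arrives(self):
        if self.busy == self.provisioned:
            if self.provisioned >= self.limit:
                raise RuntimeError("throttled: concurrency limit reached")
            self.provisioned += 1   # auto-provision another unit
        self.busy += 1

    def event_finishes(self):
        self.busy -= 1

    def idle_sweep(self):
        # Scale back down: release units no longer needed for current load.
        self.provisioned = self.busy

platform = ToyPlatform()
platform.event_arrives()        # first event: unit #1 is provisioned
platform.event_arrives()        # overlapping event: unit #2 is provisioned
print(platform.provisioned)     # 2
platform.event_finishes()
platform.event_finishes()
platform.idle_sweep()
print(platform.provisioned)     # 0
```

The point of the sketch is who does this work: with Serverless, the loop above belongs entirely to the vendor, not to us.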
Of course, we have auto-scaling systems outside of Serverless architectures too, but the joy of Serverless is that we don't have to do any work to manage auto-scaling: it is provided entirely by the vendor, and, as we'll see in the next part, it is often more precise than traditional instance-based auto-scaling.
Furthermore, Serverless enables auto-scaling for types of component that have typically been very hard to auto-scale ourselves, such as databases.
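DynamoDB is one example: in its on-demand capacity mode we specify no throughput at all, and the table scales with request load. Below is a sketch of the request parameters one would pass to boto3's `create_table`; the table and key names are hypothetical, and the actual AWS call is left commented out since it requires credentials.

```python
# Request parameters for a DynamoDB table in on-demand capacity mode.
# Table and attribute names below are hypothetical examples.
table_params = {
    "TableName": "orders",
    "KeySchema": [{"AttributeName": "order_id", "KeyType": "HASH"}],
    "AttributeDefinitions": [
        {"AttributeName": "order_id", "AttributeType": "S"}
    ],
    # PAY_PER_REQUEST selects on-demand mode: no read/write capacity to
    # plan, allocate, or provision ourselves.
    "BillingMode": "PAY_PER_REQUEST",
}

# With AWS credentials configured, this would be:
# import boto3
# boto3.client("dynamodb").create_table(**table_params)
print(table_params["BillingMode"])
```

Contrast this with running and sharding a database cluster ourselves, where capacity planning is one of the hardest parts of the job.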
Self auto-scaling and auto-provisioning save time and labor costs because they remove resource management activities we would otherwise perform ourselves. This is especially valuable when we start developing a new product or component, because it reduces time-to-market. In many contexts they will also reduce infrastructure costs, thanks to scaling that is more precise and more automatic than most other resource management approaches.