Defining Serverless — Part 4
Considering the performance needs of Serverless applications
This is Part 4 of a series defining Serverless Services in terms of five common traits, which John Chapin and I introduced in our free ebook from O'Reilly — What is Serverless?. If this is the first part you've come to I advise taking a look at Part 1 first, for context. In this part we discuss the fourth trait common to Serverless services.
Trait #4 — Performance capabilities defined in terms other than host size / count
Back when we looked at trait #1 we talked about how, with Serverless, we are no longer concerned with server hosts. We no longer think about the number of server host instances we are using for a given component, nor are we able to define what type or size of server host instances our application is to run on. This has a corollary: any performance capabilities that a Serverless service exposes are not defined in terms of host size or count. Given this, how are we to define the performance requirements that we need in order to run our Serverless applications?
In some ways this is not a new problem. With virtualization we never get the full capacity of a physical server machine within a virtual machine (VM). When we divide up a physical machine into several VMs it's rare that we will peg one VM to a discrete set of physical resources that no other VM can use. Even if we do split up CPU cores in such a way it's likely that several VMs will share network resources, for instance.
With containerization this abstraction becomes even more ingrained. When we use something like Kubernetes we can specify how many cores of CPU we want per ‘pod’, and we can also define how much memory we need. But there's nothing here to specify what type of CPU we want, nor can we specify what amount of network capacity we need.
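To make the Kubernetes case concrete, here is a minimal sketch in Python that builds a pod manifest as a plain dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The pod name, image name, and the `pod_manifest` helper are hypothetical, chosen only for illustration; the point is that the performance-related part of the manifest is expressed as CPU and memory quantities rather than as a host type.

```python
import json


def pod_manifest(name: str, image: str, cpu: str, memory: str) -> dict:
    """Build a minimal Pod manifest with CPU and memory requests and limits.

    `name` and `image` are placeholders for this sketch, not real resources.
    """
    resources = {
        # What the scheduler should reserve for the container.
        "requests": {"cpu": cpu, "memory": memory},
        # The ceiling the container is allowed to use.
        "limits": {"cpu": cpu, "memory": memory},
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {"name": name, "image": image, "resources": resources}
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical values: 2 CPU cores and 512 MiB of memory.
    manifest = pod_manifest("example-pod", "example/image:latest", cpu="2", memory="512Mi")
    print(json.dumps(manifest, indent=2))
```

Notice that the performance request here is a quantity of CPU and memory, with no field for a particular CPU model.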
Serverless services have a few different ways of handling this.
With AWS Lambda we have one performance dial: RAM. At time of writing we can specify the amount of RAM we need, from 128MB up to 1.5GB. This is a nice, simple measure that is also fairly easy to implement on the vendor side. However, what we also get with Lambda is a ‘proportional’ increase in the amount of CPU and networking resource our function has available. A 1.5GB Lambda function will, roughly, get 12 times as much CPU resource as a 128MB function. That doesn't mean it's getting 12 times as many cores (all Lambda functions get 2 vCPUs), but it does mean that you'll get roughly 12 times as many CPU time slices. I say ‘roughly’ because this is not guaranteed, and in fact the actual CPU performance you get can vary significantly with Lambda. I recommend John Chapin's article for more on this subject.
What you can't do at all with AWS Lambda is say what speed of CPU you want, that you want more cores, that you want a different amount of local disk storage, etc.
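The ‘12 times’ figure for Lambda is just the ratio of the two memory settings (1.5GB is 1536MB, and 1536 / 128 = 12). A small Python sketch of that arithmetic follows. It assumes a strictly linear relationship between memory and CPU share, which is a simplification for illustration and not a guarantee; the function name `relative_cpu_share` is my own and is not part of any AWS API.

```python
# Illustrative sketch only: assumes CPU share scales linearly with configured
# memory, which is approximate in practice and not guaranteed.
BASELINE_MEMORY_MB = 128  # smallest Lambda memory setting mentioned in the text


def relative_cpu_share(memory_mb: int, baseline_mb: int = BASELINE_MEMORY_MB) -> float:
    """Return the approximate CPU share relative to the baseline memory size."""
    if memory_mb < baseline_mb:
        raise ValueError("memory_mb must be at least the baseline size")
    return memory_mb / baseline_mb


if __name__ == "__main__":
    for memory_mb in (128, 512, 1024, 1536):
        share = relative_cpu_share(memory_mb)
        print(f"{memory_mb} MB -> roughly {share:.0f}x the CPU time of a 128 MB function")
```

Treat the result as a rough planning number: measured CPU performance can differ noticeably from this ratio.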
Backend-as-a-Service (BaaS) products all have very different ways of approaching configuration of performance. DynamoDB lets you specify the throughput that you need. Kinesis lets you configure the number of shards, and consequently the amount of traffic that can be handled (and other capabilities). Even S3 allows one coarse-grained performance switch: ‘Standard’ vs ‘Glacier’ storage classes.
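As an example of sizing in these terms, here is a rough Python sketch that estimates how many Kinesis shards a given write load would need. The per-shard limits used (1 MB per second and 1,000 records per second for writes) are assumptions for this sketch and should be checked against the current Kinesis documentation; the `shards_needed` function is hypothetical and is not part of any AWS SDK.

```python
import math

# Assumed per-shard write limits. These are assumptions for this sketch;
# verify them against the current Kinesis Data Streams documentation.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000


def shards_needed(mb_per_sec: float, records_per_sec: float) -> int:
    """Estimate the shard count needed for a given peak write load."""
    if mb_per_sec < 0 or records_per_sec < 0:
        raise ValueError("load figures must be non-negative")
    by_bytes = math.ceil(mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # A stream always has at least one shard; otherwise the tighter limit wins.
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    # Hypothetical workload: 3.5 MB/s of data made up of 2,000 records/s.
    print(shards_needed(mb_per_sec=3.5, records_per_sec=2000))
```

The thing we size here is a throughput figure, and the shard count falls out of it; at no point do we choose a host.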
However some BaaS services have no performance configuration at all, and what you see is what you get. If necessary you may need to re-architect your system to split up the load that you make to a BaaS service in order to satisfy your own performance needs.
One place where this topic can get a little fuzzy, though, is where you do actually specify the more concrete resources that your Serverless service will use. Microsoft Azure Functions, for example, lets you run your Functions-as-a-Service (FaaS) functions on dedicated instances using Azure App Service Plans. Further, if you are using an open-source implementation of a FaaS platform and running it on your own hardware, or on Infrastructure-as-a-Service (IaaS) hosts, then you consequently have a lot more control over performance. I would argue, however, that these are specific cases and not ‘absolute’ Serverless situations, if you are responsible for managing both the underlying hosts and the application running on them. As soon as you are concerned about the state of long-lived components of any kind you are stepping away from the benefits of not having to tend to such components.
Finally, I think this is also an area where Serverless shows its immaturity in comparison with IaaS. AWS EC2 has (at time of writing) 76 different instance types. These support specialization across a vast number of use cases, offering variability of CPU capacity, RAM, disk space, IO performance and specialist hardware like GPUs and FPGAs. Very little of this customization is available to Lambda functions — does this mean that Lambda functions don't need such customization? No, it just means Lambda is still in early days. I fully expect that Lambda, and other FaaS and BaaS platforms, will evolve over the coming years to offer flexibility of performance needs, and will do so mostly in different ways to IaaS due to the abstraction of hosts. For instance I think it's likely that GPU and FPGA resources will be available elastically to attach to Lambda functions, as we do already see with Elastic GPUs being able to be attached to non-specialized EC2 instances.
In summary, since with Serverless we are abstracted from thinking about server hosts, Serverless services don't typically offer performance capabilities in terms of host size or count. However many Serverless services offer alternative ways to configure performance requirements, and I expect that these types of configuration will expand in the future.