So what is serverless?

“Serverless? But there is a physical server underneath”

You might have encountered some cynical remarks on the term “Serverless”. I believe this actually comes from some confusion about which type of server is being removed, and also from people coming from different cloud backgrounds.

Example
Let’s say you have an app that exposes a REST API. It runs all the time, waiting for requests and processing them as they arrive. This daemon may expose a dozen REST endpoints. It likely has a main loop that waits for HTTP requests from clients, runs the corresponding business logic, and returns a JSON response or some HTTP error.
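
To make this concrete, here is a minimal sketch of such an always-on app, using only the Python standard library. The endpoint path, the in-memory CUSTOMERS store and the function names are assumptions made up for illustration, not taken from any real app.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "database", for illustration only.
CUSTOMERS = {}


def create_customer(payload):
    """Business logic: store a customer and return it with its new id."""
    customer_id = len(CUSTOMERS) + 1
    customer = {"id": customer_id, "name": payload["name"]}
    CUSTOMERS[customer_id] = customer
    return customer


def list_customers(_payload):
    """Business logic: return all known customers."""
    return list(CUSTOMERS.values())


# Routing table: (HTTP method, path) -> business logic function.
ROUTES = {
    ("POST", "/api/v1/customers"): create_customer,
    ("GET", "/api/v1/customers"): list_customers,
}


def dispatch(method, path, body):
    """Find the business logic for a request; return (status, JSON text)."""
    func = ROUTES.get((method, path))
    if func is None:
        return 404, json.dumps({"error": "not found"})
    try:
        payload = json.loads(body) if body else {}
        return 200, json.dumps(func(payload))
    except (KeyError, TypeError, ValueError):
        return 400, json.dumps({"error": "bad request"})


class ApiHandler(BaseHTTPRequestHandler):
    """The "server" layer: reads HTTP requests and writes HTTP responses."""

    def _handle(self):
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length).decode("utf-8") if length else ""
        status, text = dispatch(self.command, self.path, body)
        data = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    do_GET = _handle
    do_POST = _handle


def run_server(port=8080):
    """The always-on main loop: blocks forever, waiting for requests."""
    HTTPServer(("", port), ApiHandler).serve_forever()
```

Calling run_server() starts the daemon. Everything except create_customer() and list_customers() is plumbing that has to stay up, and be paid for, even when no request arrives.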

This app acts as a server, as in “client/server” architecture.
Serverless is about removing that server layer (the always-on “main loop”), which means wiring client requests directly to the business logic functions, without constantly paying for an up-and-running server. You “pay-as-you-go” for the actual resource consumption of your business logic code, not for reserving capacity by the hour.

How does it work?

Technically, if you convert the above app to serverless, you will provide:

  1. The “business logic” functions
    Instead of packaging your entire app, you package each of your inner business logic functions, and upload (only) them to your function service (sometimes called FaaS / Function as a Service), for example to the AWS Lambda service.
  2. The REST glue
    You declare a mapping between HTTP endpoints (and methods) and the functions, so calling the endpoint triggers the function. For example, a POST to myapp.com/api/v1/customers may invoke the create_customer() function. The details vary between cloud providers, but on AWS you set it using the Amazon API Gateway service.
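
As a rough sketch of the two pieces above: the function below follows the (event, context) handler shape that AWS Lambda uses for Python, and assumes the request arrives in an API Gateway proxy-style event with a JSON string under "body". The ROUTES table and simulate_gateway() are hypothetical local stand-ins so you can try the function on your laptop; on AWS the real mapping is declared in API Gateway configuration, not in your code.

```python
import json


def _response(status, payload):
    """Build a proxy-style HTTP response dict."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def create_customer(event, context):
    """Business logic function, assumed to be wired to POST /api/v1/customers."""
    try:
        payload = json.loads(event.get("body") or "{}")
        name = payload["name"]
    except (KeyError, TypeError, ValueError):
        return _response(400, {"error": "bad request"})
    # A real function would persist the customer in an external store here,
    # since the function instance itself keeps no state between calls.
    return _response(201, {"name": name})


# Hypothetical stand-in for the declared endpoint-to-function mapping.
ROUTES = {("POST", "/api/v1/customers"): create_customer}


def simulate_gateway(method, path, body):
    """Local stand-in for the provider's routing layer (illustration only)."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return _response(404, {"error": "not found"})
    event = {"httpMethod": method, "path": path, "body": body}
    return handler(event, None)
```

Note what is missing: there is no main loop and no HTTP parsing. You only ship create_customer(), and the provider takes care of getting the request to it.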

Behind the scenes, your cloud provider will transparently create servers to run your functions – for example, creating JVMs with their own “main()” for Java functions. It will automatically add or remove such servers based on the current number of concurrent executions of your functions, reuse those servers to save startup time, stop them all when a function has not been called for a while, and so on.
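
To get a feel for this behavior, here is a toy model of a pool of function instances. It is only a sketch of the idea (add instances under concurrency, reuse warm ones, drop idle ones); the class name, the idle timeout value and the reuse policy are all assumptions, and real providers do not publish their exact algorithms.

```python
class ToyFunctionPool:
    """Toy model of a provider's pool of instances for a single function."""

    def __init__(self, idle_timeout=300.0):
        self.idle_timeout = idle_timeout  # seconds an idle instance stays warm
        self.idle_since = []   # one entry per idle instance: when it went idle
        self.busy = 0          # instances currently running an invocation
        self.cold_starts = 0   # how many times a new instance had to be created

    def _reap(self, now):
        """Stop instances that have been idle for longer than the timeout."""
        self.idle_since = [
            t for t in self.idle_since if now - t <= self.idle_timeout
        ]

    def start(self, now):
        """Begin an invocation at time `now`; return True on a cold start."""
        self._reap(now)
        if self.idle_since:
            self.idle_since.pop()   # warm start: reuse an idle instance
            cold = False
        else:
            self.cold_starts += 1   # cold start: create a new instance
            cold = True
        self.busy += 1
        return cold

    def finish(self, now):
        """End an invocation; its instance stays warm for later reuse."""
        self.busy -= 1
        self.idle_since.append(now)

    def instances(self, now):
        """Number of instances alive (busy or warm) at time `now`."""
        self._reap(now)
        return self.busy + len(self.idle_since)
```

In this model, two overlapping calls force two cold starts, a later call reuses a warm instance, and after a long quiet period instances() drops back to zero.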

For the curious – there are plenty of technical serverless benchmarks of all sorts. Here are some recent ones – with isolated / CPU-intensive workloads and with a more realistic workload interacting with a persistence service (both posts also compare the three major cloud providers). There are also many other ways to invoke functions, outside the scope of this example.

Serverless vs. code on IaaS (unmanaged code)

If that app was previously running directly on cloud instances, you used to:

  • Provision an instance with your app, paying for it for as long as it runs…
  • No, better provision two instances for HA…
  • Of course, also provision a load balancer… and set auto scaling…
  • And test all this setup… and tune it… and monitor it… and patch the O/S and runtime as needed… etc etc

With serverless, you start by dropping all that complexity. Your functions will immediately auto-scale based on their current usage pattern, giving you great elasticity, without the explicit setup.

The other visible difference is the cost. When you ran on cloud instances, you paid for the resource reservation – for example, two cores, 8GB of RAM and some disk reserved for you by the hour. Switching to functions, you pay by the number of requests and by the resources used only during your function execution. Since typically servers have a low average CPU utilization (like 10%-20%), it can lead to a dramatic cost saving. Also, when a function is not called, you do not pay anything at all (sometimes called “scale to zero”), entirely removing the fixed cost of running your code.

Technically, at the time of writing, the pricing on AWS Lambda is 20 cents per million calls, plus a small fraction per “GB-second”, meaning how long the function ran (in 100ms increments) times the amount of RAM you asked for (in 64MB increments, starting from 128MB). On AWS, the CPU horsepower you get (CPU time share) is proportional to the amount of RAM you asked for, so it allows some cost-latency tuning.
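
Here is how that formula could be sketched in code. The per-request price and the rounding rules come from the paragraph above; the per-GB-second rate is an assumed, illustrative value (the text only says “a small fraction”), so check the provider's pricing page before relying on it. The sketch also rounds the average duration, whereas real billing rounds each invocation separately.

```python
import math

PRICE_PER_MILLION_REQUESTS = 0.20   # USD, as stated in the text
PRICE_PER_GB_SECOND = 0.00001667    # USD, assumed rate for illustration only


def billed_memory_mb(requested_mb):
    """Round the requested RAM up to a 64MB step, with a 128MB minimum."""
    return max(128, math.ceil(requested_mb / 64) * 64)


def billed_duration_s(duration_ms):
    """Round the run time up to the next 100ms increment, in seconds."""
    return math.ceil(duration_ms / 100) * 100 / 1000


def lambda_cost(requests, avg_duration_ms, memory_mb,
                price_per_gb_second=PRICE_PER_GB_SECOND):
    """Estimate the cost in USD: request charge plus GB-second charge."""
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    gb_seconds = (
        requests
        * billed_duration_s(avg_duration_ms)
        * billed_memory_mb(memory_mb) / 1024
    )
    return request_cost + gb_seconds * price_per_gb_second
```

For example, lambda_cost(1_000_000, 120, 512) bills each call as 200ms at 0.5GB, which under the assumed rate comes to roughly $1.87 for a million calls. Asking for more RAM raises the GB-second charge but may shorten the run time, which is the cost-latency tuning mentioned above.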

Serverless vs. code on PaaS (managed code)

Some cloud-native apps are using managed app runtime services such as AWS Elastic Beanstalk or Google App Engine. You publish your app to these services, picking a specific runtime version (O/S and programming language), and they take care of deploying on instances, auto-scaling, monitoring etc.

Those services have already provided the reduction of complexity discussed above, but not the cost benefit – you still pay for the underlying instances “by the hour”. You can look at serverless as taking these services a (big) step forward.

Summary

With serverless, you pay only for the actual resources you have consumed, and not for a reserved capacity, dramatically reducing costs.

To run your code in a serverless mode, you remove the always-on “server” part of your code, and instead wire the client requests directly to your code functions. Functions / Lambdas are a type of managed runtime, which might be a nice surprise if you came from an IaaS background (basically allowing you to leapfrog a cloud architecture generation).

Let’s wrap it up with an example of a stateful serverless service. Amazon DynamoDB is a popular KV-store service. Its standard pricing (now called “provisioned capacity mode“) is based on the actual storage used, plus a reserved capacity – how many read and write operations per second a table should support. In other words, you need to know the peak workload in advance, and you pay hourly to reserve that capacity.
So far, that is a classical PaaS offering, not serverless.

Recently, Amazon have added a second pricing model for DynamoDB called “on-demand capacity mode“. In this serverless mode, you just pay per request, without a fixed hourly cost. As expected, Amazon will automatically ramp up or down the infrastructure that supports your table based on its current access pattern to guarantee low latency access. So, similar to our code example, you should get better elasticity, while paying only for your actual data access and data storage, without any fixed cost.
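
The difference between the two modes can be sketched with a very simplified cost model. It ignores storage and the distinction between reads and writes, and every price passed in is a hypothetical parameter rather than an actual DynamoDB rate.

```python
def provisioned_cost(peak_ops_per_sec, hours, price_per_op_capacity_hour):
    """Reserved capacity: you pay for the peak, whether you use it or not."""
    return peak_ops_per_sec * hours * price_per_op_capacity_hour


def on_demand_cost(total_requests, price_per_million_requests):
    """Pay per request: no traffic means no charge for data access."""
    return total_requests / 1_000_000 * price_per_million_requests


def cheaper_mode(peak_ops_per_sec, hours, total_requests,
                 price_per_op_capacity_hour, price_per_million_requests):
    """Return the name of the cheaper mode under this simplified model."""
    reserved = provisioned_cost(
        peak_ops_per_sec, hours, price_per_op_capacity_hour
    )
    per_request = on_demand_cost(total_requests, price_per_million_requests)
    return "on-demand" if per_request < reserved else "provisioned"
```

With made-up prices, a table sized for a high peak but mostly idle comes out cheaper on-demand, while a table that runs near its peak all month can be cheaper with reserved capacity.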
