Why AWS Lambda and .zip is a recipe for serverless success
Over the past few weeks I’ve been involved in numerous discussions, mainly on Twitter but also elsewhere, about when and how serverless is going to support containers instead of just “code”.
It’s mainly container folks who are absolutely desperate to bring their container fu to the serverless world. They want to make use of event-driven provisioning, and to avoid some of the issues around provisioning their own containers and the management that comes with that.
The conversation often boils down to…
“Well you’re using containers behind the scenes, so why can’t I just give you my container and you run that instead?”
So let’s try and answer that by explaining why .zip files are awesome.
How AWS Lambda runs your code
It’s relatively simple. You create an AWS Lambda function and specify a .zip file with the code in it. Yes, there are various runtimes and you need to get the .zip file structure correct, but the package received is simply a .zip file.
It’s just a .zip file.
And .zip files are simply a directory structure of files and folders that has been compressed into a standard format.
And that’s been around for a long time (1989 believe it or not).
That .zip file with code in it gets put into S3 and linked to the Lambda function. At some point after it’s been uploaded, the Lambda function is invoked and a cold start happens.
Then the magic happens (well it’s not actual magic, but it’s pretty clever).
In the background, we run an optimised execution environment for the runtime and version your function has specified, and we load your code into it from the .zip file.
Then we execute your code (invoke the function) with the data in the event payload that has been sent to the function.
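To make that concrete, the code inside the .zip can be as small as a single Python file. This is only a minimal sketch: the file name `handler.py`, the function name `lambda_handler` and the `name` field in the event are assumptions made for the example.

```python
# handler.py: a hypothetical, minimal function for the Python runtime.


def lambda_handler(event, context):
    """Receive the event payload and return a response."""
    # `event` is the payload sent to the function. We assume here that it is
    # a dict that may carry a "name" key (an assumption for this example).
    name = event.get("name", "world")
    return {"message": f"Hello, {name}!"}
```

Invoking it with the payload `{"name": "zip"}` would hand that dictionary to the function as `event`.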
Simple, isn’t it?
- Create function code
- Zip the function code up
- Create an AWS Lambda function
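The first two steps can be sketched with nothing but the Python standard library. The function source, the `handler.py` file name and the `build_package` helper are hypothetical names chosen for illustration, and the archive is built in memory rather than on disk to keep the example short.

```python
import io
import zipfile

# Hypothetical function source; in a real project this would be a file on disk.
HANDLER_SOURCE = '''
def lambda_handler(event, context):
    return {"message": "Hello, " + event.get("name", "world") + "!"}
'''


def build_package(files):
    """Return the bytes of a .zip archive containing the given files.

    `files` maps a path inside the archive to its text content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


# The resulting bytes are the whole deployment package: just a .zip file.
package = build_package({"handler.py": HANDLER_SOURCE})
```

Those bytes are what you would hand over in the third step, whether through the console, the AWS CLI or an SDK.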
Now you have a function. What happens on an event?
- Download the function code
- Start an execution environment
- Execution environment gets function code (the data in the .zip file) and bootstraps the runtime
- Execute the code with the event data payload
That’s a cold start. A more in-depth explanation can be found in the Become a Serverless Blackbelt video on YouTube.
Note: slow cold starts are often due to overly complex frameworks and dependencies. See https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html#function-code for more info. For example, in Java, prefer simpler dependency injection frameworks like Dagger or Guice over more complex ones like Spring Framework.
A warm start is even simpler. It’s basically just step 4, as steps 1 to 3 have already been done. The optimised execution environment is kept around for a while, and if another event arrives for that function, the function is sent the new event payload and invoked.
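A toy model may make the difference between the two clearer. This is a sketch under loud assumptions: the `ToyEnvironment` class, its `cold_starts` counter and the in-memory package are all invented for illustration, and none of it describes how AWS actually implements the execution environment.

```python
import importlib.util
import io
import os
import tempfile
import zipfile


def make_package():
    """Build a tiny, hypothetical function package in memory."""
    source = "def lambda_handler(event, context):\n    return {'echo': event}\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("handler.py", source)
    return buffer.getvalue()


class ToyEnvironment:
    """A toy stand-in for an execution environment (not AWS's implementation)."""

    def __init__(self, package_bytes):
        self.package_bytes = package_bytes
        self.handler = None   # populated by the first (cold) invocation
        self.cold_starts = 0
        self._workdir = None

    def _cold_start(self):
        # Steps 1 to 3: get the code, unpack it and load it into the runtime.
        self._workdir = tempfile.TemporaryDirectory()
        with zipfile.ZipFile(io.BytesIO(self.package_bytes)) as archive:
            archive.extractall(self._workdir.name)
        path = os.path.join(self._workdir.name, "handler.py")
        spec = importlib.util.spec_from_file_location("toy_function", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        self.handler = module.lambda_handler
        self.cold_starts += 1

    def invoke(self, event):
        if self.handler is None:
            self._cold_start()
        # Step 4: execute the code with the event payload.
        return self.handler(event, None)


env = ToyEnvironment(make_package())
first = env.invoke({"n": 1})   # cold start: unpack, load, then run
second = env.invoke({"n": 2})  # warm start: only run
```

The second call skips everything except running the handler, which is the whole point of keeping the environment around.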
So it’s a “container”
There’s an execution environment of some sort behind the scenes running your code, and AWS talked about that at the very beginning, when Lambda was in preview (back then it was called a container).
https://aws.amazon.com/blogs/compute/container-reuse-in-lambda
With AWS Fargate you can build an actual container and run that, instead of just providing code in a .zip file, but you need to provision it yourself.
On the surface it seems like you’d get a better deal by building the container for a serverless scenario as you can control more factors.
You get to choose the exact version of the operating system if you really want. (As an example, you can create ‘FROM scratch’ Docker images: https://docs.docker.com/develop/develop-images/baseimages/)
You get to choose the exact libraries you want.
You get to choose the exact build versions and security patches.
You get to choose the security inside it as well.
You can scale with AWS Fargate as well and while you’re not managing servers, you’re still managing containers. And when scaling you’ll still have to consider the problem of cold starts. How fast does your container start up?
Warm starts are less of an issue with containers, though, as they are already running. But then you are paying for idle, which doesn’t happen with AWS Lambda, where you only pay for invocations.
But you also need to be able to manage and maintain the containers themselves. If somebody decides to build a badly designed container (inadvertently or due to inexperience) then the cold start on scaling may be compromised.
You also have to take into account that every single time a container needs a security patch, everybody has to patch their own containers.
Remember Meltdown?
Adrian Cockcroft had it right. You had to patch your containers and your instances/servers, but you didn’t have to patch Lambda functions.
Why?
Because you don’t control the AWS Lambda execution environment or server.
AWS controls it.
AWS patched it all for you.
Think about it for a moment.
You only gave them code in a .zip file.
Now with AWS Lambda, you still have to worry about library code that you put into functions needing security patches, but that’s a whole lot easier than having to worry about what actually executes that code.
And if you were going to have the same code as in the Lambda function anyway, but run it in your own container, then letting AWS run it in a Lambda function removes the ops burden of managing the container.
But why .zip files?
Because .zip files are easy to create, easy to read (programmatically) and easy to work with.
In fact, they are ridiculously simple. Every developer can write code and make a zip file in several different ways.
But not only that, they are quick to extract and use.
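As a small illustration of “easy to read (programmatically)”, here is a sketch that lists the contents of an archive and pulls a single file out of it. The file names are made up, and the archive is created in memory only so the example stands on its own.

```python
import io
import zipfile

# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("handler.py", "def lambda_handler(event, context):\n    return event\n")
    archive.writestr("lib/util.py", "VERSION = '1.0'\n")

# Reading it back: list the entries, then read one file without unpacking the rest.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    names = archive.namelist()
    util_source = archive.read("lib/util.py").decode("utf-8")
```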
And thinking about the build and deploy process, it’s far, far easier to have a build process that looks like:
- Write code
- Test code
- .zip code
- Deploy code
Compare that with “creating a Docker container”: you have to write your Dockerfile, which is essentially a series of commands to generate a server to run your code. While you can boilerplate this to an extent, the Dockerfile becomes a part of the system that needs to be maintained and managed, introducing potential security issues as well as likely errors and bugs (especially if deployed over a period of time).
In terms of .zip files, step 3 is trivial, well understood, and very stable (nearly 30 years old). It is a step that is automatable by junior developers, and that makes development processes easier to understand, easier to teach, and very difficult to get wrong.
Assuming that the code is the same in both, why introduce the complexity of the container unless you have to?
Using .zip files as the unit of deployment allows developers to focus on what they should be focusing on: the business logic and the value it provides to the business.
Serverless needs containers, but we don’t need to worry about them
You know the joke about “Serverless — You know there are still servers right?”
There’s another one…
“Serverless — you know there are still containers right?”
You could say, we’re “containerless” just as much as we’re serverless.
Part of the purpose of serverless as a concept is that the whole team can stop worrying about the complexity and management of deployment and simply focus on development.
Containers definitely have their place and value, and please don’t think that we in the serverless world believe you’re all wrong and don’t know what you’re talking about. The serverless world wouldn’t exist without containers, and it’s really important to understand that and to realise that we know it too.
But the thing is that with AWS Lambda and serverless, we don’t even need to worry about what a container is, or why it matters. AWS does the creation, the optimisation, the securing, the provisioning, the scaling and the patching of those execution environments for us.
It also means that AWS could build and deploy improvements to that technology and give customers those improvements without customers having to make any changes.
Because it’s just a .zip file.
Serverless people simply don’t have to worry about what a container is, or how it really works.
Containers just don’t matter in the serverless world.
Because we write code.
And it runs on demand.
And it scales.
And it’s quick.
And if we ever needed a container, we could figure it out and run it on AWS Fargate.
Disclaimer
I currently work for AWS as a Senior Developer Advocate for Serverless based in the UK and working in EMEA.
Opinions expressed in this blog are mine and may or may not reflect the opinions of my employer.