Shave 99.93% off your Lambda bill with this one weird trick

AWS Solutions Architects hate him.

Unprovisioned

AWS launched Provisioned Concurrency for Lambda at re:Invent 2019 last week — essentially a way to keep warm Lambdas provisioned for you so you don’t experience any cold-start latency on your function invocations. It may also save you money if you happen to have the ideal workload for it, as it’s priced at $0.05/hr (for 1 GB of memory) instead of the usual $0.06/hr.

Fast Boot

The thing I noticed with Provisioned Concurrency Lambdas was related to the global work done outside of the function handler, before it’s invoked — let’s call it the init stage. For Provisioned Lambdas, this is executed in the background whenever you configure your provisioned concurrency settings, and then every hour or so after that. Work done during this stage seemed to execute at the same performance as work done in the handler on invocation — and you’re also charged for this time. That would be unsurprising if it weren’t for something I was reminded of: normal Lambda “containers”, unlike Provisioned Lambdas, actually get a performance boost during the init stage. This is presumably to aid cold starts, especially in runtimes like Java and .NET that typically have slow process start times and large class assemblies to load.
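
The original timing code isn’t reproduced here, but a minimal sketch of the idea looks like this (Python; the busywork function and names are my own illustration): run the same CPU-bound loop once at module load and once inside the handler, and compare the durations.

```python
import time

def cpu_work(n=5_000_000):
    # Arbitrary CPU-bound busywork to benchmark against.
    total = 0
    for i in range(n):
        total += i * i
    return total

# Module scope == the init stage: this runs before the handler is invoked.
_start = time.monotonic()
cpu_work()
INIT_SECONDS = time.monotonic() - _start

def handler(event, context):
    start = time.monotonic()
    cpu_work()
    handler_seconds = time.monotonic() - start
    return {"initSeconds": INIT_SECONDS, "handlerSeconds": handler_seconds}
```

Running that at various memory settings gives results like these: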

- No difference at 3008 MB
- No difference at 1792 MB
- Half the performance at 896 MB
- Almost exactly 1/14th at 128 MB

128 MB init = 1792 MB performance

So essentially, we’ve established that the init stage has the same performance as a 1792 MB Lambda, even if we’re only running a 128 MB one — which tracks, given 1792 MB is the memory setting at which a Lambda gets a full vCPU (and 1792 / 128 = 14).

But how

Hang on a minute, I hear you cry. Firstly, how are we supposed to do all of our work outside the handler, if any subsequent time we invoke that Lambda, it’s already warm and that code won’t even run? Secondly, how are we supposed to pass anything to the init stage if only the handler receives events? And finally, 1/14th is “only” a 92.86% cost saving, not the 99.93% you promised 💸

Always cold

Let’s tackle that first point. There are some basic ways to ensure we always hit a cold Lambda, such as modifying any aspect of the function’s configuration that would render existing warm containers out-of-date. We were doing exactly that when we were fiddling with the memory settings above — each time we change that number and invoke, fresh containers get hit. Modifying environment variables, deploying new code, or changing other function config settings would achieve the same thing. The APIs to do this are probably rate-limited fairly strictly though, so YMMV.
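
As a sketch of the environment-variable trick (assuming a deployed function named `init-worker` — the name and the `CACHE_BUST` variable are placeholders of mine):

```python
import time
import boto3

lambda_client = boto3.client("lambda")

def invoke_cold(function_name, payload=b"{}"):
    # Touch an otherwise-unused environment variable so existing warm
    # containers are invalidated and the next invoke is a cold start.
    # (Note: this replaces the function's entire environment.)
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": {"CACHE_BUST": str(time.time())}},
    )
    # Config updates are asynchronous -- wait until the function settles.
    lambda_client.get_waiter("function_updated").wait(FunctionName=function_name)
    return lambda_client.invoke(FunctionName=function_name, Payload=payload)

print(invoke_cold("init-worker")["StatusCode"])
```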

Getting data in and out

To the second point: you basically can’t pass events in outside of the handler. If you’re just doing some sort of fixed job that doesn’t require events, this isn’t a problem. You could try passing data in via environment variables, I guess, but you’d need to modify the function’s config with each invocation.
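
A sketch of what that could look like on the function side (`JOB_INPUT` is a hypothetical variable you’d set via `update_function_configuration` on each invocation — which conveniently also forces the cold start we need):

```python
import json
import os

# The init stage doubles as the real "handler": read the input from an
# environment variable and do all the work at module load, where a
# 128 MB function gets ~1792 MB performance.
job_input = json.loads(os.environ.get("JOB_INPUT", "{}"))
n = job_input.get("n", 1_000_000)
RESULT = sum(i * i for i in range(n))

def handler(event, context):
    # Nothing left to do -- just return what init already computed.
    return {"n": n, "result": RESULT}
```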

100x developers

Alright, here’s where it gets even more far-fetched. If you actually ran the code from earlier, you may have noticed another interesting thing: the billed duration didn’t match the entire duration of work done. In fact, the init duration isn’t included in the billed duration at all. The init stage is free.
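
And that’s where the headline number comes from. The init stage is limited to around 10 seconds, so the best case is roughly 10 s of free 1792 MB-class compute per cold invoke, while the handler itself can be billed as little as 100 ms at 128 MB (the minimum billing increment at the time). Back-of-the-envelope:

```python
PRICE_PER_GB_SECOND = 0.0000166667  # 2019 on-demand Lambda pricing

# ~10 s of work billed honestly at 1792 MB:
honest = 10 * (1792 / 1024) * PRICE_PER_GB_SECOND

# The same work smuggled into the free init stage; we only pay the
# 100 ms minimum billed duration at 128 MB:
sneaky = 0.1 * (128 / 1024) * PRICE_PER_GB_SECOND

print(f"{1 - sneaky / honest:.2%}")  # => 99.93%
```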

Responsible disclosure

I first noticed this in Jan 2018 and I was a little worried as it wasn’t documented anywhere and I thought it may be a resource abuse vulnerability. I contacted AWS security (aws-security@amazon.com), was told the relevant teams would be contacted to investigate, and heard no more.

Takeaways

Obviously you shouldn’t code your app like this. It’s a proof of concept that involves lots of hoop-jumping and who knows, you may very well get a slap on the wrist from AWS if you start abusing it.

VP Research Engineering at Bustle, AWS Serverless Hero, creator of LambCI. github.com/mhart