Blambda!

A couple of weeks ago, I made a todo list for the summer. One of the items on there was to create an AWS Lambda custom runtime for Babashka. I actually accomplished that a few days later, and today I want to walk through the what, the why, and the how of that project.

Let's start by answering the question of what a custom runtime is. For anyone not familiar with AWS Lambda, it is basically a way to run a function in the AWS cloud without having to worry about how or where the function is actually executed. For me, cloud functions are the next natural step along the path of computing without caring about machines. First came servers that you had to host yourself, then came VMs that you could run on servers you hosted yourself, then came EC2, which gave you a VM hosted by somebody else, then came containers, then came managed container environments like Kubernetes, then came function as a service, which allowed you to provide a zip file that just ran somewhere. You can nitpick the order if you want, but this is more or less accurate. I'm also not claiming that serverless is right for all workloads, but when it is, it's pretty great.

So having explained what Lambda is, I'll crack on with explaining custom runtimes. Lambda comes out of the box with runtimes that support a great variety of programming languages: Python, Golang, .Net, Ruby, JavaScript, and Java. Those last two are of interest to Clojure programmers, since they allow us to write ClojureScript functions and execute them on the NodeJS runtime, or Clojure proper on the Java runtime (you could technically also execute Clojure programs on the .Net runtime using Clojure CLR, but I doubt many people are doing that). This is great, unless you need predictably low-ish latency, because the first time you invoke a lambda function, AWS need to spin up an execution environment, then execute your function. This is called a "cold start", and for the JVM, it can take a few thousand milliseconds, and that's before starting the Clojure runtime, which can take a few thousand more.

Clojure programmers have long known about this JVM startup delay, of course, which is why we tend not to write command line utilities in Clojure, since it is quite annoying for your utility to take 2-3 seconds just to tell you that you've misspelled one of the options (was it --dry-run or --dryrun?). And of course we've had ways around that for awhile as well, mostly based on ClojureScript (Lumo is one example). So one could write lambdas in ClojureScript and run them on the NodeJS runtime and not have to wait around for the JVM to start up (the NodeJS runtime has a cold start of a few hundred milliseconds instead of a few thousand for the JVM), though there are a few drawbacks with that as well:

In order for the NodeJS runtime to execute ClojureScript, it needs to be transpiled to JavaScript, which means you can't edit it in the lambda console, which makes troubleshooting those annoying problems that only seem to happen when you deploy the thing harder, since you have to add your println statements locally and then compile and then upload and then try again and then realise you need another println somewhere else... ugh! To be fair, JVM Clojure has the same issue.
You have to use NodeJS. Yuck.

Luckily, AWS has provided the ability to specify a custom runtime, which can be written in any language and just needs to be executable on a Linux system and implement a simple local invocation API. This means that you can now write lambda functions in any language!

Getting back to Clojure, the wonderful borkdude realised that if one could compile a Clojure program using GraalVM, it would start up fast, thus enabling command line programs in Clojure that didn't make you want to pull your hair out. "But why stop there?" borkdude presumably thought to himself. "Writing shell scripts in Bash kinda sucks, so what about writing them in Clojure instead? All I'd have to do is write a program that can interpret Clojure and compile it with GraalVM and then it could execute Clojure scripts or Clojure code passed on the command line and then my life would be complete." And this magical program, my friends, is called Babashka.

Since Babashka starts fast and can evaluate Clojure source code, we can build a custom runtime for Lambda that uses Babashka to execute our lambda functions, thus gaining the ability to edit source code in the lambda console and use Clojure instead of ClojureScript, both of which are very important to me.

This was the motivation behind building Blambda!, which is a custom runtime that can be deployed as a Lambda layer. Let's talk about how it works.

A custom runtime requires only one thing: an executable named bootstrap in the root level of your lambda function's archive. When your function is invoked for the first time, the Lambda runtime executes the bootstrap function, which is then expected to call Lambda's next invocation API, which returns the request body of whatever called your lambda function, which the runtime customarily hands off to the actual code implementing your lambda function and then feeds the return value of that to the Lambda invocation response API, and then waits for the next request and does the same thing all over again.

In the case of Blambda!, the custom runtime consists of three parts:

The statically linked Babashka executable itself
A Clojure program, bootstrap.clj, that implements the request handling loop described above. I borrowed this from an existing Babashka runtime, bb-lambda, which I decided not to use because it uses Docker, which makes me almost as sad as NodeJS. ;)
A bootstrap shell script which uses Babashka to evaluate the above Clojure program

Packaging this as a layer is as simple as downloading Babashka, then zipping it into an archive with the other two files, which you can see in the build task of Blambda!'s bb.edn.

To use Blambda!, you can build and deploy the custom runtime layer by cloning the repo and running:

bb deploy

You then create a lambda function that uses the "provided" runtime, includes the "blambda" layer that was created by the bb deploy command, and sets the handler to whatever function in your namespace that will handle function invocations. For example, if you have a namespace like this:

(ns hello)

(defn hello [{:keys [name] :or {name "Blambda"} :as event} context]
  (prn {:msg "Invoked with event",
        :data {:event event}})
  {:greeting (str "Hello " name "!")})

you can create a lambda function like this:

AWS Lambda console create function page showing the setting of runtime to
'provide your own bootstrap on Amazon Linux 2'

AWS Lambda console add layer page showing selecting a custom layer named 'blambda'

AWS Lambda console edit runtime settings page showing setting the handler to 'hello/hello'

and then test it with an event like this:

{
  "name": "jmglov"
}

The Lambda console will display something like this:

Test Event Name
hello

Response
{
  "greeting": "Hello jmglov!"
}

Function Logs
START RequestId: 4288f5e7-f4c9-41b2-a26f-b5d688c146ec Version: $LATEST
Loading babashka lambda handler:  hello/hello
Starting babashka lambda event loop
{:msg "Invoked with event", :data {:event {:name "jmglov"}}}
END RequestId: 4288f5e7-f4c9-41b2-a26f-b5d688c146ec
REPORT RequestId: 4288f5e7-f4c9-41b2-a26f-b5d688c146ec	Duration: 240.41 ms	Billed Duration: 669 ms	Memory Size: 128 MB	Max Memory Used: 97 MB	Init Duration: 427.73 ms

Request ID
4288f5e7-f4c9-41b2-a26f-b5d688c146ec

Stay tuned for future posts on Blambda! when I try to actually use it for something real. ;)

jmglov's blog

Blambda!