Learning Lambda — Part 4

Input, Output, Context

Mike Roberts
Mar 14, 2017

This is Part 4 of Learning Lambda, a tutorial series about engineering using AWS Lambda. To see the other articles in this series please visit the series home page. To be alerted about future installments subscribe to our newsletter, follow Symphonia on twitter @symphoniacloud, and our blog at The Symphonium.

Welcome to Part 4 of Learning Lambda! If you didn’t read Part 3 you’ll probably want to do that before continuing here.

Last time we explored a little of what happens when you execute a Lambda — the lifecycle, how logging works, and the execution environment.

In this installment you’ll learn about what gets passed into Lambdas — both event input and context — and what we can return. To start off with we’ll look at input / output. I’m going to assume that by now you’re confident with building, setting up, configuring, deploying and testing a Lambda. In the examples I give here I’ll be showing code, example input and expected output, but mostly won’t be adding any more on the mechanics of how to build and run those Lambdas. I’ll also be leaving out null handling / error checking code, which you should consider for your own production Lambda functions.

Input / Output

When a Lambda is executed it is always passed some form of input. In our tests so far we’ve been hand coding a JSON value (specifically a string) in the AWS web console, which is passed as an argument every time we execute the Lambda.

In real use cases, the input to the Lambda function will be a JSON object that represents an event from some other component or system. For example, a representation of the details of an HTTP request, or an upload to the S3 storage service. We’ll be looking in detail at tying event sources to Lambda functions later in the series.

The JSON that we create in our test events, or that comes from event sources, is passed to the Java Lambda Runtime. In most use-cases we will have the Runtime deserialize this JSON payload for us, and we have several options of how to steer this.

As you may remember from Part 1 we can invoke a Lambda in two ways — synchronously and asynchronously. If we’re invoking it synchronously we can return some useful output from the Lambda function too.

There are various Java types we can use when defining the input and output of our Lambda function. So far we’ve seen that we can use a String for both input and output. For the simplest forms of I/O the JVM Lambda Runtime lets us use this and other types. Let’s look at these now.

Basic types

Here’s a class with three different Lambda handler functions:
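
A minimal sketch of such a class might look like the following. The class name BasicTypesLambda and the method names handlerString, handlerInt and handlerBoolean are assumptions made for illustration, as is the choice of adding 100 in the integer handler.

```java
// Hypothetical class and method names; only the behaviour follows the article.
public class BasicTypesLambda {

    // String in, String out: "world" becomes "Hello, world"
    public String handlerString(String input) {
        return "Hello, " + input;
    }

    // int in, int out (assumed arithmetic: input plus 100)
    public int handlerInt(int input) {
        return 100 + input;
    }

    // boolean in, boolean out: returns the inverse of the input
    public boolean handlerBoolean(boolean input) {
        return !input;
    }
}
```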

The first of these is what we’ve seen before. We can test it with the JSON string "world", which returns "Hello, world". The second takes an integer (either a JSON integer, or a number in a JSON string, e.g. 5 or "5") and returns an integer (105 for that example input). Finally, the third takes a boolean (any of the JSON values true, false, "true", or "false") and returns a boolean, in this case the inverse of the input.

We don’t need to match the output type to be the same as the input type — I’m just doing so here for ease of example. Also I’ve used the primitive types for integers and booleans, but you can use the boxed types of either if you prefer.

What’s happening in all of these cases is that the JVM Lambda runtime is deserializing the JSON input to a simple type on our behalf. If the event that is passed can’t be deserialized to the specified parameter type you’ll get a failure, with a message that starts as follows:

An error occurred during JSON parsing: java.lang.RuntimeException
…

Strings, integers and booleans are the only basic types that are explicitly documented as being supported, but with some experimentation we see other basic types, such as doubles and floats, are also included.

Lists and Maps

JSON also includes arrays and objects. The JVM Lambda Runtime will automatically deserialize those to Java Lists and Maps respectively, and will also serialize output Lists and Maps to JSON arrays and objects.

Calling the handler handlerList() with the JSON array [ 1, 2, 3 ] returns [ 101, 102, 103 ]. Calling the handler handlerMap() with the JSON object { "a" : "x", "b" : "y" } returns { "New Map -> a" : "x", "New Map -> b" : "y" }.

Furthermore you can use nested collections, as you would expect, e.g. calling handlerNestedCollection() with

[
    { "m" : 1, "n" : 2 },
    { "x" : 8, "y" : 9 }
]

returns

{
    "Nested at position 0": { "m" : 1, "n" : 2},
    "Nested at position 1": { "x": 8, "y" : 9}
}
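
The three collection handlers described above could be sketched as follows. The method names come from the text; the class name ListMapLambda and the element types (Integer list items, String map values) are assumptions that happen to fit the sample payloads.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; element types are assumed from the sample payloads.
public class ListMapLambda {

    // JSON array in, JSON array out: adds 100 to every element
    public List<Integer> handlerList(List<Integer> input) {
        List<Integer> result = new ArrayList<>();
        for (Integer value : input) {
            result.add(100 + value);
        }
        return result;
    }

    // JSON object in, JSON object out: prefixes every key
    public Map<String, String> handlerMap(Map<String, String> input) {
        Map<String, String> result = new LinkedHashMap<>();
        input.forEach((key, value) -> result.put("New Map -> " + key, value));
        return result;
    }

    // JSON array of objects in, JSON object of objects out
    public Map<String, Map<String, Integer>> handlerNestedCollection(
            List<Map<String, Integer>> input) {
        Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
        for (int i = 0; i < input.size(); i++) {
            result.put("Nested at position " + i, input.get(i));
        }
        return result;
    }
}
```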

Finally, you can also just use java.lang.Object as the type of the input parameter. While not particularly useful in production (unless you really don’t care about the input argument’s value), this can be handy at development time if you don’t know the precise format of an event. For example, you can call .getClass() on the argument to find out what type it really is, print out its .toString() value, and so on.
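
As a rough illustration of that development-time trick, the sketch below accepts any input and reports its runtime type and string form. The class name ObjectLambda and the method name handlerObject are hypothetical, and like the other examples it skips null handling.

```java
// Hypothetical names; a throwaway handler for inspecting unknown event shapes.
public class ObjectLambda {

    // Returns the runtime class of the deserialized input plus its toString() value
    public String handlerObject(Object input) {
        return input.getClass().getName() + ": " + input;
    }
}
```

You might deploy something like this temporarily, invoke it with a sample event, and read the returned string to see what the runtime actually handed you.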

POJOs and Ecosystem Types

The previous input types work well for fairly simple inputs. An alternative for more complex types is to use the Lambda Runtime’s automatic POJO (‘Plain Old Java Object’) serialization. Here’s an example where we use this for both input and output.
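
A minimal sketch of such a handler might look like this. The names handlerPojo, PojoInput and PojoResponse come from the surrounding text; the outer class name PojoLambda is an assumption, and the property names a and b are inferred from the sample JSON.

```java
// Outer class name is hypothetical; the nested class names follow the article.
public class PojoLambda {

    public PojoResponse handlerPojo(PojoInput input) {
        return new PojoResponse("Input was " + input.getA());
    }

    // Input POJO: implicit empty constructor, mutable, setter named after the JSON field "a"
    public static class PojoInput {
        private String a;

        public String getA() {
            return a;
        }

        public void setA(String a) {
            this.a = a;
        }
    }

    // Output POJO: immutable, serialized through its getter as the JSON field "b"
    public static class PojoResponse {
        private final String b;

        public PojoResponse(String b) {
            this.b = b;
        }

        public String getB() {
            return b;
        }
    }
}
```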

Obviously this is a very simple case, but it shows POJO serialization in action. We can execute this Lambda with the input { "a" : "Hello Lambda" }, and it returns { "b" : "Input was Hello Lambda" }. Let’s look a little more at the code.

First of all we have our handler function, handlerPojo(). This takes as input the type PojoInput, which is a POJO class we’ve defined. POJO input classes can be static nested classes, as we’ve done here, or regular (outer) classes. The important thing is that they need to have an empty constructor, and can have field setters that follow the naming of the expected fields to be deserialized from the input JSON. If no JSON field is found with the same name as a setter (or field, see below) then the POJO field will be left null. Input POJO classes need to be mutable since the runtime will update them after they’ve been instantiated.

Our handler function interrogates the POJO object and creates a new instance of the PojoResponse class, which we pass back to the Lambda Runtime. The Lambda Runtime serializes it to JSON by reflecting over all the ‘get…’ methods. There are fewer limitations on POJO output classes — since they are not created or mutated by the Lambda runtime you are free to construct them as you please, and to make them immutable. And like input classes, POJO output classes can be static nested classes or regular (outer) classes.

For both POJO input and output classes you can nest further POJO classes, using the same rules, to serialize / deserialize nested JSON objects. Further, you can mix up POJOs and the collections types we discussed (Lists and Maps) in your input and output too.

The example we’ve given above follows most of the documentation you’ll see online: using a Java Bean convention for fields. However if you don’t want to use setters in your input class, or getters in your output class, you’re free to also use public fields. For instance, here’s another example:
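
One way this could look is sketched below. All of the names here (PojoFieldsLambda, handlerPojoFields, FieldsInput, FieldsResponse) are hypothetical; only the field names c and d are taken from the sample JSON that follows.

```java
// All class and method names are hypothetical; fields "c" and "d" match the sample JSON.
public class PojoFieldsLambda {

    public FieldsResponse handlerPojoFields(FieldsInput input) {
        FieldsResponse response = new FieldsResponse();
        response.d = "Input was " + input.c;
        return response;
    }

    // Input POJO using a public field instead of a setter
    public static class FieldsInput {
        public String c;
    }

    // Output POJO using a public field instead of a getter
    public static class FieldsResponse {
        public String d;
    }
}
```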

We can execute this Lambda with the input { "c" : "Hello Lambda" }, and it returns { "d" : "Input was Hello Lambda" }.

One of the main uses for POJO input deserialization is when you tie your Lambda function to one of the AWS ecosystem Lambda event sources. Here’s an example of a handler function that would process the event of an object being uploaded to the S3 storage service:

S3Event is a type that we can access from an AWS library dependency.

Note that the suggestion we had earlier about using an Object input type and calling .getClass() won’t help in finding the AWS event type, like this one — it will just give a combination of basic Java types. We’re going to go into detail about AWS event types later in this series.

Streams

The input / output types we’ve covered so far will be useful for you in many, and possibly all, of your uses of Lambda in the real world. But what if you have a fairly dynamic and/or complicated structure that you can’t, or don’t want to, use any of the previous deserialization methods for?

The answer is the final input type supported by the JVM Lambda Runtime — a java.io.InputStream. This gives you access to the raw bytes passed to your Lambda function.

The signature for a Lambda using an InputStream is a little different in that it has a void return type, even for a function to be invoked synchronously. If you take an InputStream as an input parameter, you must also take a java.io.OutputStream as the second parameter. To return a result from such a handler function, you need to write to the OutputStream.

Here’s an example of a handler that can process streams.
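
A minimal sketch of a stream handler, assuming the class name StreamLambda and method name handlerStream (both hypothetical) and assuming the payload is UTF-8 text: it reads the whole input, upper-cases it, and writes the result back. It sticks to Java 8 APIs and leaves out the stream hygiene mentioned below.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical names; assumes the raw event bytes are UTF-8 text.
public class StreamLambda {

    public void handlerStream(InputStream inputStream, OutputStream outputStream)
            throws IOException {
        // Read the raw event bytes into memory
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int read;
        while ((read = inputStream.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }

        // Upper-case the text and write it out as the function's result
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        outputStream.write(text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
    }
}
```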

If we execute this handler with the input "Hello World" it will write "HELLO WORLD" to the output stream.

You may well want to use your own JSON manipulation code if you’re using an InputStream, but we’ll leave that as an exercise to the reader. You should also practice good stream hygiene — error checking, closing, etc.

For the official documentation on input / output types in handler functions see http://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-req-resp.html#java-handler-io-type-stream

Context

So far in this series the only input we’ve taken for a Lambda handler function is that of the event that occurred. But that’s not the only information the handler can receive when it wants to do some processing. Additionally you can add a com.amazonaws.services.lambda.runtime.Context parameter to the end of any handler parameter list, and the runtime will pass in an interesting object that you can use. Let’s look at an example.
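
The sketch below shows the general shape of such a handler: it copies a few values from the context into a Map and returns it. To keep the snippet compilable without the AWS library, it declares a small stand-in interface, InvocationContext, which is purely hypothetical and only mirrors a handful of the real Context method names. In a real function you would delete the stand-in and declare the second parameter as com.amazonaws.services.lambda.runtime.Context instead. The class name ContextLambda is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class; InvocationContext is a stand-in for the real AWS Context type.
public class ContextLambda {

    // Stand-in only: mirrors a subset of the methods on
    // com.amazonaws.services.lambda.runtime.Context so the sketch compiles on its own.
    public interface InvocationContext {
        String getAwsRequestId();
        String getFunctionName();
        String getFunctionVersion();
        int getMemoryLimitInMB();
        int getRemainingTimeInMillis();
    }

    // The context is always the last parameter, after the event input
    public Map<String, Object> handler(Object input, InvocationContext context) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("getFunctionName", context.getFunctionName());
        result.put("getFunctionVersion", context.getFunctionVersion());
        result.put("getAwsRequestId", context.getAwsRequestId());
        result.put("getMemoryLimitInMB", String.valueOf(context.getMemoryLimitInMB()));
        result.put("getRemainingTimeInMillis",
                String.valueOf(context.getRemainingTimeInMillis()));
        return result;
    }
}
```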

The Context object gives us information about the current Lambda invocation. We can use this information during the processing of a Lambda event.

Ignoring the brief mention of S3Event earlier, this is the first time we’ve explicitly used an AWS Library type. That means we need to add a new library to our classpath. Assuming you’re still using the Maven project we set up before, you can add the following to your <dependencies> section:
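
A dependency entry along these lines should do it. The groupId, artifactId and provided scope match the library discussed in this article; the version number shown is an assumption, so substitute whichever release is current for you.

```xml
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-lambda-java-core</artifactId>
  <!-- Version is an assumption; use the release that suits your project -->
  <version>1.1.0</version>
  <scope>provided</scope>
</dependency>
```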

This will allow you to use the Context type.

When we execute the example handler we’ll get something like the following as a result:

{
  "getFunctionName":"ParamsLambda",
  "getLogStreamName":"2017/03/13/[$LATEST]af019...",
  "getInvokedFunctionArn":"arn:aws:lambda:us-west-2:39296...:function:ParamsLambda",
  "getIdentity":{
    "identityId":"","identityPoolId":""
  },
  "getRemainingTimeInMillis":"2988",
  "getLogGroupName":"/aws/lambda/ParamsLambda",
  "getFunctionVersion":"$LATEST",
  "getMemoryLimitInMB":"512",
  "getClientContext":null,
  "getAwsRequestId":"fcee7..."
}

All the different Context fields are described in the documentation at http://docs.aws.amazon.com/lambda/latest/dg/java-context-object.html, but I’m going to mention just a few that are worth taking special notice of.

FunctionName and FunctionVersion tell you how this particular instance of the handler function has been configured. We’ll get into more about invocation later, but for now it’s worth knowing that the same handler code can be deployed as several separately configured Lambda functions, and these two fields will tell you which one is being invoked at any particular time.

Most of these fields will stay the same whenever you call them during the processing of a particular event, but RemainingTimeInMillis is a notable exception. Lambda functions are subject to a configurable timeout (up to a maximum of 5 minutes at the time of writing), and calling this method will tell you how much time you have left at any given point during execution. This is useful if you are writing a fairly long-lived Lambda and want to save any state before the timeout occurs.
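
One way to act on the remaining time is to check it before each unit of work and stop early once it drops below a safety margin. The sketch below is an illustration under assumptions: the TimeBudget class, the processUntilDeadline method and the 500 ms margin are all hypothetical, and the remaining time is supplied through an IntSupplier so the snippet runs without the AWS library.

```java
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical helper; the margin value is an arbitrary choice for illustration.
public class TimeBudget {

    static final int SAFETY_MARGIN_MILLIS = 500;

    // Runs tasks in order until the remaining time falls below the safety margin.
    // Returns how many tasks completed, so the caller can save state for the rest.
    public static int processUntilDeadline(List<Runnable> tasks,
                                           IntSupplier remainingTimeInMillis) {
        int completed = 0;
        for (Runnable task : tasks) {
            if (remainingTimeInMillis.getAsInt() < SAFETY_MARGIN_MILLIS) {
                break;
            }
            task.run();
            completed++;
        }
        return completed;
    }
}
```

Inside a real handler you could pass context::getRemainingTimeInMillis as the supplier, then persist whatever work is left over once the method returns.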

One quick note: when we defined the Maven dependency above on the aws-lambda-java-core library we specified it as <scope>provided</scope>. This is a small optimization that tells Maven not to include the library in the output package, and instead only to use the library at build time. Why would we do that? Because AWS automatically supplies this library on the JVM Lambda classpath, so we don’t need to include it in our distribution artifact.

Next time

That brings us to the end of Part 4 of Learning Lambda. We’ve taken a pretty exhaustive look at the various input and output options you have when programming a Lambda, as well as looking at a mechanism for getting more detail about the environment of a specific Lambda execution.

Next time we’re going to work on improving our toolbox, by looking at command line tools. You can read Part 5 here.

Update 2017–06–15

Added detail about being able to use fields rather than setters / getters for POJO types
Added ‘provided’ scope for aws-lambda-java-core library
Fixed up ‘smart’ quoting
