Learning Lambda — Part 6

Invocation and Event Sources

Symphonia
Mike Roberts
Aug 18, 2017

This is Part 6 of Learning Lambda, a tutorial series about engineering using AWS Lambda. To see the other articles in this series, please visit the series home page. To be alerted about future installments, subscribe to our newsletter, follow Symphonia on Twitter at @symphoniacloud, and follow our blog, The Symphonium.

image

Welcome back to Learning Lambda! The first five parts gave a solid introduction to programming AWS Lambda, using the Java language, and now we'll start expanding our scope a little.

So far we've only looked at invoking Lambda functions directly, in a synchronous way. On its own, that is not enough to build the kinds of architectures I described in Part 1, such as web APIs and file processing. So in this part we'll look at how you take Lambda functions and form them into applications by using other AWS services as event sources. But first of all we need to get a little more clarity around how Lambda functions can be invoked.

Synchronous vs Asynchronous Invocation

As I just mentioned, so far we've only looked at synchronous invocation, and we've done that by specifying the RequestResponse invocation-type when using the aws lambda invoke CLI command. This is also the form of invocation used when we test a Lambda function from the web console. When we invoke a Lambda function as RequestResponse, the Lambda runtime waits until the handler function has finished executing, and then passes the value that the function returns back to the original caller.

With asynchronous invocation the Lambda runtime does not wait for the function to complete, and instead returns immediately to the caller. We call a Lambda function asynchronously by using the Event invocation-type. Let's look at this in action.

First of all, let's reset our development environment. We'll create a new Lambda function named MyLambda:
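A minimal sketch of such a function is shown here. It is consistent with the handler name and the "Hello, world!" output used in the commands that follow, but the log message and the parameter name are assumptions made for illustration:

```java
// In the project this class is declared in package io.symphonia
// (add "package io.symphonia;" at the top of MyLambda.java) so that it
// matches the --handler value io.symphonia.MyLambda::handler.
public class MyLambda {

    // Takes a String payload and returns a greeting to the caller.
    public String handler(String name) {
        // Written to Standard Output, which ends up in CloudWatch Logs.
        // The exact message is an assumption for illustration.
        System.out.println("Received: " + name);
        return "Hello, " + name + "!";
    }
}
```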

Now we'll build and deploy this function. I restate the command to get the IAM role in case you've created a new terminal session since last time we did this:

$ mvn package

$ LAMBDA_ROLE_ARN=`aws iam get-role --role-name \
lambda_basic_execution --query Role.Arn --output text`

$ aws lambda create-function --function-name MyLambda \
--runtime java8 --role $LAMBDA_ROLE_ARN \
--handler io.symphonia.MyLambda::handler \
--zip-file fileb://target/learning-lambda-1.0-SNAPSHOT.jar \
--memory-size 512

And now let's invoke this as we've done before:

$ aws lambda invoke --invocation-type RequestResponse --function-name MyLambda --payload \"world\" outputfile.txt

This gives us the following on Standard Output (200 being the HTTP status code for ‘ok’):

{
    "StatusCode": 200
}

And we can look at the return value from the function as follows:

$ cat outputfile.txt
"Hello, world!"

Now let's invoke the function with the Event invocation type:

$ aws lambda invoke --invocation-type Event --function-name MyLambda --payload \"world\" outputfile.txt
{
    "StatusCode": 202
}

We see a different status code: 202 (Accepted) rather than 200. And if we look at the output file, we'll see it's empty. If we look at the CloudWatch logs for this invocation, though, we'll still be able to see the output we wrote to Standard Output.

So now we've called our function asynchronously, and you'll notice that we made no code changes, and we didn't even need to build or deploy our code in a different way. Our code still returns a value, but when the function is invoked asynchronously the runtime discards it. If we know that a function will only be used asynchronously we can use a void return type:
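As a rough sketch, assuming the same MyLambda class and a greeting written to Standard Output (the message text is an assumption), the void version might look like this:

```java
// In the project this class is declared in package io.symphonia, as before.
public class MyLambda {

    // A void return type: there is no return value for the runtime to discard.
    public void handler(String name) {
        // The only observable result is this line in the function's logs.
        System.out.println("Hello, " + name + "!");
    }
}
```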

Then we package and deploy our code:

$ mvn package; aws lambda update-function-code --function-name MyLambda --zip-file fileb://target/learning-lambda-1.0-SNAPSHOT.jar

With this new code, we can still invoke our function asynchronously with the Event invocation type.

We mentioned in Part 1 that when building applications you use RequestResponse invocation where the caller needs a value generated by the function. A good example of this is a server component that handles HTTP GET requests. On the other hand, you use Event invocation when the caller does not need the function to return a value to it. This is the premise of event-driven applications, hence the name of the invocation type. In the rest of this installment we're going to look at a way of using Event invocation with an AWS event source. We'll look at RequestResponse invocation to implement an HTTP API later in the series.

Tying Lambdas to Event Sources

So far we've called Lambda functions directly. While this is great for testing, and for some production scenarios, it is not the typical ‘real world’ Lambda use case.

Instead, most of the time in production we tie our Lambda functions to an upstream AWS service as an Event Source, otherwise known as a Trigger. Amazon have a large number of different event sources you can use, which are listed here. These range from object/file storage (S3), through message buses (SNS, Kinesis), to HTTP servers (API Gateway), and more.

When you want to write a Lambda function tied to an event source you need to complete the following steps:

  1. Find the format of the event object that you will receive from the trigger
  2. Define a Lambda handler function, parsing the input event
  3. Write the handler function code to process the event
  4. If you're using a synchronous event source then complete your code by returning an appropriate object
  5. Build, deploy, test
  6. Configure your handler function to tie it to the event source

Let's see how this process works using S3 as an example.

Event format

First of all we need to know the format of the JSON that will be passed as input to our Lambda function. Annoyingly, there's no one canonical place in the AWS documentation to find this out. You may choose to bypass this step by using AWS’ Java event library, but we don't recommend it: at time of writing at least, it hasn't been updated for a long time, doesn't include all event sources, and brings in a few too many other libraries for our liking. Those extra libraries, in turn, increase cold-start time, which I hope to cover in a future installment. The library can still be used as a source of knowledge, though.

So what do we do to figure out event formats? The best place to start is the collection of example test events in the Lambda Web console. To see these, click on any existing Lambda function, then click on the Actions drop-down, and then click Configure Test Event. For our example we're interested in files that have been created in S3, so find the S3 Put example event. What you'll then see is a JSON object that gives all the fields you might receive when an object is placed in an S3 bucket that you have configured to invoke a Lambda function.

image

Let's imagine a super-simple example of a Lambda function: whenever an object is PUT in an S3 bucket, we log its path and name. In S3 parlance this is the key of the object. If we look at the example JSON (just like the example above) we can see that our Lambda function will receive an object containing a list named Records, each element of which is itself an object. Each record contains an object named s3, which contains a sub-object named (confusingly) object, and that object contains a field of type String named key. This is the value that we'll want to capture in our Lambda function.

Define a Lambda handler function

Now that we know the data we want from the input event, we need to parse the object to retrieve it.

As we discussed in Part 4 (Input, Output, Context), there are various ways to capture inputs in Java Lambdas, but the most common is to use the runtime's inbuilt POJO deserialization. In our example, therefore, we can program a POJO class which represents the S3 PUT Event. But again, from what we learned in Part 4, we don't need a POJO to define the entire input object; we can just define the part of it that we need to access.

Once we've defined the POJO we can use it within the definition of our Lambda handler function, and in our case the function and the POJO look as follows:
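A minimal sketch of that shape follows. The class names (S3Lambda, S3Event, S3EventRecord, S3Entity, S3Object) are hypothetical, and the sketch assumes the runtime populates public fields whose names mirror the JSON keys, including the capitalised Records:

```java
import java.util.List;

// Hypothetical class name. In the project it would be declared in a package
// such as io.symphonia and deployed with a matching --handler value.
public class S3Lambda {

    // S3 invokes the function asynchronously, so the handler returns void.
    public void handler(S3Event event) {
        // Processing of the parsed event goes here (covered in the next section).
    }

    // Only the parts of the S3 PUT event that we need. Field names mirror
    // the JSON keys: Records -> s3 -> object -> key.
    public static class S3Event {
        public List<S3EventRecord> Records;
    }

    public static class S3EventRecord {
        public S3Entity s3;
    }

    public static class S3Entity {
        public S3Object object;
    }

    public static class S3Object {
        public String key;
    }
}
```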

With this code in place the Java Lambda runtime will parse the S3 PUT event for us.

Write the handler function code to process the event

Now that we've parsed the input event, we can write code to process it. In our case we want to log the filename, or key as it's more accurately named in S3. This will be as follows:
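Here is a sketch of what that could look like, using hypothetical class names and an assumed log message. Records is a list, so the handler logs the key of every record it receives:

```java
import java.util.List;

// Hypothetical class name; declared in a package such as io.symphonia in the project.
public class S3Lambda {

    public void handler(S3Event event) {
        // Walk Records -> s3 -> object -> key for each record and log the key.
        for (S3EventRecord record : event.Records) {
            // The message wording is an assumption for illustration.
            System.out.println("Received S3 PUT for key: " + record.s3.object.key);
        }
    }

    // Partial POJOs covering only the fields read above.
    public static class S3Event {
        public List<S3EventRecord> Records;
    }

    public static class S3EventRecord {
        public S3Entity s3;
    }

    public static class S3Entity {
        public S3Object object;
    }

    public static class S3Object {
        public String key;
    }
}
```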

This is fairly simple code at this point, since the only complicated part is parsing the JSON object. We just need to read the field value out of the POJO instance.

Obviously real-world Lambda functions will typically be much more complicated, referring to multiple fields within the input event, referring to the Context object, calling other external resources, etc.

If using a synchronous event source, complete your code by returning an appropriate object

In our example we're using S3 as an event source. This invokes our Lambda function asynchronously, and so we don't need to return a value. If we were using something like API Gateway, on the other hand, we would need to return an object, in the correct format for the event source. Discovering the format of this object is a little hit and miss. Again, there's no one place for documentation on this.

What is true is that you'll typically want to return a JSON object, and you can do this using the output serialization techniques we described in Part 4. I'll show an example of this using API Gateway later in the series.

Build, deploy, test

So now we have our Lambda function coded, and it's time to build, deploy and test it. Building is no different to what we've done before. When deploying for the first time with an event source I also recommend doing that just as we've done before, that is without initially configuring the event source, but as you get more confident with the Lambda platform you may deploy and configure the event source in one step. For now we'll keep those separated.

With the function deployed you can test it using the Lambda Web Console test event templates I showed above. This is a great way of debugging your input parsing, for example.

image

Configure your handler function to tie it to the event source

OK, now the final step — configuring our function to be tied to an event source. We're going to do this using the Web Console to start off with, but this can be automated.

Go to the Triggers tab of your function's configuration, and click Add Trigger. Click on the empty box, and select the system your event is coming from. In our case, that's S3. With that done, AWS will ask for secondary configuration. For S3 you'll always need to specify the source bucket, and then you'll also need to specify the Event Type. For our example this is PUT, under Object Created. Optionally you can add some path configuration too, but we'll ignore that for this example.

image

Choose whether you want to enable the trigger (we do), and then click Submit. That completes tying our function to the event source.

Of course, we'll want to test this has worked!

In our case we can upload a file using the S3 console. Then we can switch to the logs for our Lambda by using the friendly link on the Monitoring tab of the Lambda Web Console. If everything has worked we should see a message about our file upload:

image

Like this!

As I said, this can be automated, but honestly it's kind of a pain using the low-level AWS CLI, and so I'm not going to cover it in this installment. One of the reasons that it's complicated is the Push Model of most AWS event sources, including S3. The Push Model refers partly to the fact that the link between the source system and Lambda is owned by the source system, not Lambda. This means that you need to configure the source system (not Lambda), and set up the necessary cross-service permissions. The web console abstracts all, or most, of this multi-service configuration from you, but the automated tooling does not. To learn more about push versus pull invocation modes take a look at the AWS documentation here.

You can see the necessary steps for configuring an S3 event source through the CLI here. Alternatively you can use the Serverless Application Model, which is a whole other topic, but if you want to dive in then the S3 documentation for that is here.

Next time

That brings us to the end of Part 6 of Learning Lambda. In this article we've explored the different ways of invoking Lambda functions, and have also examined the extremely important ability to tie Lambda functions to AWS event sources, using an S3 PUT trigger as an example.

So far we've been living in a happy place where our functions always work correctly, but of course that's a utopia that doesn't exist in the real world. So in Part 7 we look at error handling, which you can read here. You can keep up to date with progress at https://blog.symphonia.io/ and on our Twitter!

Need help with Lambda, or other Serverless technologies? We're the experts! Contact us at Symphonia for expert advice, architectural review, training and on-team development.

Update 2017-11-14

  • Simplified S3 Example — Java Runtime now seems to be able to handle deserializing JSON keys that start with capital letters.