
Batch Processing

The batch processing utility provides a way to handle partial failures when processing batches of messages from SQS queues, SQS FIFO queues, Kinesis Streams, or DynamoDB Streams.

stateDiagram-v2
    direction LR
    BatchSource: Amazon SQS <br/><br/> Amazon Kinesis Data Streams <br/><br/> Amazon DynamoDB Streams <br/><br/>
    LambdaInit: Lambda invocation
    BatchProcessor: Batch Processor
    RecordHandler: Record Handler function
    YourLogic: Your logic to process each batch item
    LambdaResponse: Lambda response
    BatchSource --> LambdaInit
    LambdaInit --> BatchProcessor
    BatchProcessor --> RecordHandler
    state BatchProcessor {
        [*] --> RecordHandler: Your function
        RecordHandler --> YourLogic
    }
    RecordHandler --> BatchProcessor: Collect results
    BatchProcessor --> LambdaResponse: Report items that failed processing

Key Features

  • Reports batch item failures to reduce the number of retries for a record upon errors
  • Simple interface to process each batch record
  • Integrates with Java Events library and the deserialization module
  • Build your own batch processor by extending primitives

Background

When using SQS, Kinesis Data Streams, or DynamoDB Streams as a Lambda event source, your Lambda functions are triggered with a batch of messages. If your function fails to process any message from the batch, the entire batch returns to your queue or stream. This same batch is then retried until one of the following happens first: a) your Lambda function returns a successful response, b) a record reaches its maximum retry attempts, or c) the records expire.

journey
  section Conditions
    Successful response: 5: Success
    Maximum retries: 3: Failure
    Records expired: 1: Failure

This behavior changes when you enable the Report Batch Item Failures feature in your Lambda function event source configuration:

  • SQS queues. Only messages reported as failure will return to the queue for a retry, while successful ones will be deleted.
  • Kinesis data streams and DynamoDB streams. A single reported failure will use its sequence number as the stream checkpoint. Multiple reported failures will use the lowest sequence number as the checkpoint.
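
The checkpoint rule for streams can be sketched with the JDK alone. The class and method names below are hypothetical and not part of powertools-batch or Lambda, and the sketch assumes that sequence numbers can be ordered by comparing them as large integers:

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: models which sequence number becomes the checkpoint
// when several stream records are reported as failed.
class StreamCheckpointSketch {

    // Sequence numbers are long decimal strings, so this sketch compares them
    // as BigInteger values (an assumption) instead of as long or as text.
    static Optional<String> checkpointFor(List<String> failedSequenceNumbers) {
        return failedSequenceNumbers.stream()
                .min((a, b) -> new BigInteger(a).compareTo(new BigInteger(b)));
    }

    public static void main(String[] args) {
        List<String> failed = Arrays.asList(
                "49545115243490985018280067714973144582180062593244200962",
                "49545115243490985018280067714973144582180062593244200961");
        // Prints the lowest of the failed sequence numbers.
        System.out.println(checkpointFor(failed).orElse("no failures"));
    }
}
```

Under this rule, records after the lowest failed sequence number are delivered again, which is one reason idempotent processing matters for streams.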

With this utility, batch records are processed individually: only messages that failed to be processed return to the queue or stream for a further retry. You simply build a BatchMessageHandler in your Lambda handler and return the result of its processBatch call from handleRequest. Exceptions thrown while processing a message are handled internally, and an appropriate partial response for the message source is returned to Lambda for you.
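
To make the idea concrete, here is a minimal JDK-only model of per-record processing. It is not the library's implementation; the Message type and the process method are stand-ins invented for this sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of per-record processing; not part of powertools-batch.
class PartialBatchSketch {

    // Stand-in for a queue message; real SQS messages carry more fields.
    static final class Message {
        final String id;
        final String body;

        Message(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    // Runs the handler on every message and returns the ids of those that threw.
    static List<String> process(List<Message> batch, Consumer<Message> handler) {
        List<String> failedIds = new ArrayList<>();
        for (Message message : batch) {
            try {
                handler.accept(message);
            } catch (RuntimeException e) {
                // One bad message does not stop the rest of the batch.
                failedIds.add(message.id);
            }
        }
        return failedIds;
    }

    public static void main(String[] args) {
        List<Message> batch = Arrays.asList(
                new Message("m-1", "42"),
                new Message("m-2", "not a number"),
                new Message("m-3", "7"));
        List<String> failed = process(batch, m -> Integer.parseInt(m.body));
        System.out.println(failed); // prints [m-2]
    }
}
```

Only the ids collected here would be reported back, so the successfully processed messages are not retried.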

Warning

While this utility lowers the chance of processing messages more than once, it is still not guaranteed. We recommend implementing processing logic in an idempotent manner wherever possible, for instance, by taking advantage of the idempotency module. More details on how Lambda works with SQS can be found in the AWS documentation.

Install

We simply add powertools-batch to our build dependencies. Note: if you are using other Powertools modules that require code-weaving, such as powertools-core, you will need to configure that as well.

<dependencies>
    ...
    <dependency>
        <groupId>software.amazon.lambda</groupId>
        <artifactId>powertools-batch</artifactId>
        <version>1.18.0</version>
    </dependency>
    ...
</dependencies>
    repositories {
        mavenCentral()
    }

    dependencies {
        implementation 'software.amazon.lambda:powertools-batch:1.18.0'
    }

Getting Started

For this feature to work, you need to (1) configure your Lambda function event source to use ReportBatchItemFailures, and (2) return a specific response to report which records failed to be processed.

You can use your preferred deployment framework to set the correct configuration, while the powertools-batch module handles generating the response, which simply needs to be returned as the result of your Lambda handler.
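
For orientation, the partial batch response that Lambda expects is a JSON document with a batchItemFailures list, where each entry carries an itemIdentifier (a message id for SQS, a sequence number for streams). The sketch below renders that shape by hand with a hypothetical helper; in a real handler you would return the response object produced by the module rather than build JSON yourself:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that renders the partial batch response shape.
class BatchItemFailuresSketch {

    // Assumes identifiers contain no characters that need JSON escaping.
    static String toJson(List<String> failedIdentifiers) {
        String items = failedIdentifiers.stream()
                .map(id -> "{\"itemIdentifier\":\"" + id + "\"}")
                .collect(Collectors.joining(","));
        return "{\"batchItemFailures\":[" + items + "]}";
    }

    public static void main(String[] args) {
        // One failed SQS message, identified by its messageId.
        System.out.println(toJson(Arrays.asList("d9144555-9a4f-4ec3-99a0-34ce359b4b54")));
    }
}
```

An empty batchItemFailures list tells Lambda that the whole batch succeeded.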

A complete Serverless Application Model example can be found here covering all of the batch sources.

For more information on configuring ReportBatchItemFailures, see the details for SQS, Kinesis, and DynamoDB Streams.

You do not need any additional IAM permissions to use this utility, except for what each event source requires.

Processing messages from SQS

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SQSBatchResponse;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;
import software.amazon.lambda.powertools.batch.BatchMessageHandlerBuilder;
import software.amazon.lambda.powertools.batch.handler.BatchMessageHandler;

public class SqsBatchHandler implements RequestHandler<SQSEvent, SQSBatchResponse> {

    private final BatchMessageHandler<SQSEvent, SQSBatchResponse> handler;

    public SqsBatchHandler() {
        handler = new BatchMessageHandlerBuilder()
                .withSqsBatchHandler()
                .buildWithMessageHandler(this::processMessage, Product.class);
    }

    @Override
    public SQSBatchResponse handleRequest(SQSEvent sqsEvent, Context context) {
        return handler.processBatch(sqsEvent, context);
    }


    private void processMessage(Product p, Context c) {
        // Process the product
    }

}
public class Product {
    private long id;

    private String name;

    private double price;

    public Product() {
    }

    public Product(long id, String name, double price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }
}
    {
        "Records": [
        {
            "messageId": "d9144555-9a4f-4ec3-99a0-34ce359b4b54",
            "receiptHandle": "13e7f7851d2eaa5c01f208ebadbf1e72==",
            "body": "{\n  \"id\": 1234,\n  \"name\": \"product\",\n  \"price\": 42\n}",
            "attributes": {
                "ApproximateReceiveCount": "1",
                "SentTimestamp": "1601975706495",
                "SenderId": "AROAIFU437PVZ5L2J53F5",
                "ApproximateFirstReceiveTimestamp": "1601975706499"
            },
            "messageAttributes": {
            },
            "md5OfBody": "13e7f7851d2eaa5c01f208ebadbf1e72",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:eu-central-1:123456789012:TestLambda",
            "awsRegion": "eu-central-1"
        },
        {
            "messageId": "e9144555-9a4f-4ec3-99a0-34ce359b4b54",
            "receiptHandle": "13e7f7851d2eaa5c01f208ebadbf1e72==",
            "body": "{\n  \"id\": 12345,\n  \"name\": \"product5\",\n  \"price\": 45\n}",
            "attributes": {
                "ApproximateReceiveCount": "1",
                "SentTimestamp": "1601975706495",
                "SenderId": "AROAIFU437PVZ5L2J53F5",
                "ApproximateFirstReceiveTimestamp": "1601975706499"
            },
            "messageAttributes": {
            },
            "md5OfBody": "13e7f7851d2eaa5c01f208ebadbf1e72",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:eu-central-1:123456789012:TestLambda",
            "awsRegion": "eu-central-1"
        }]
    }

Processing messages from Kinesis Streams

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import com.amazonaws.services.lambda.runtime.events.StreamsEventResponse;
import software.amazon.lambda.powertools.batch.BatchMessageHandlerBuilder;
import software.amazon.lambda.powertools.batch.handler.BatchMessageHandler;

public class KinesisBatchHandler implements RequestHandler<KinesisEvent, StreamsEventResponse> {

    private final BatchMessageHandler<KinesisEvent, StreamsEventResponse> handler;

    public KinesisBatchHandler() {
        handler = new BatchMessageHandlerBuilder()
                .withKinesisBatchHandler()
                .buildWithMessageHandler(this::processMessage, Product.class);
    }

    @Override
    public StreamsEventResponse handleRequest(KinesisEvent kinesisEvent, Context context) {
        return handler.processBatch(kinesisEvent, context);
    }

    private void processMessage(Product p, Context c) {
        // process the product
    }

}
public class Product {
    private long id;

    private String name;

    private double price;

    public Product() {
    }

    public Product(long id, String name, double price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }
}
    {
      "Records": [
        {
          "kinesis": {
            "partitionKey": "partitionKey-03",
            "kinesisSchemaVersion": "1.0",
            "data": "eyJpZCI6MTIzNCwgIm5hbWUiOiJwcm9kdWN0IiwgInByaWNlIjo0Mn0=",
            "sequenceNumber": "49545115243490985018280067714973144582180062593244200961",
            "approximateArrivalTimestamp": 1428537600,
            "encryptionType": "NONE"
          },
          "eventSource": "aws:kinesis",
          "eventID": "shardId-000000000000:49545115243490985018280067714973144582180062593244200961",
          "invokeIdentityArn": "arn:aws:iam::EXAMPLE",
          "eventVersion": "1.0",
          "eventName": "aws:kinesis:record",
          "eventSourceARN": "arn:aws:kinesis:EXAMPLE",
          "awsRegion": "eu-central-1"
        },
        {
          "kinesis": {
            "partitionKey": "partitionKey-03",
            "kinesisSchemaVersion": "1.0",
            "data": "eyJpZCI6MTIzNDUsICJuYW1lIjoicHJvZHVjdDUiLCAicHJpY2UiOjQ1fQ==",
            "sequenceNumber": "49545115243490985018280067714973144582180062593244200962",
            "approximateArrivalTimestamp": 1428537600,
            "encryptionType": "NONE"
          },
          "eventSource": "aws:kinesis",
          "eventID": "shardId-000000000000:49545115243490985018280067714973144582180062593244200962",
          "invokeIdentityArn": "arn:aws:iam::EXAMPLE",
          "eventVersion": "1.0",
          "eventName": "aws:kinesis:record",
          "eventSourceARN": "arn:aws:kinesis:EXAMPLE",
          "awsRegion": "eu-central-1"
        }
      ]
    }
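
In Lambda's Kinesis events the data field is base64-encoded. The sketch below uses only the JDK to show what the first record's payload in the example event looks like once decoded; it does not use powertools-batch, and the class name is made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for inspecting the example event by hand.
class KinesisDataDecodeSketch {

    // Decodes the base64 "data" field of a Kinesis record into UTF-8 text.
    static String decode(String base64Data) {
        byte[] raw = Base64.getDecoder().decode(base64Data);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "data" value copied from the first record of the example event.
        String data = "eyJpZCI6MTIzNCwgIm5hbWUiOiJwcm9kdWN0IiwgInByaWNlIjo0Mn0=";
        System.out.println(decode(data)); // {"id":1234, "name":"product", "price":42}
    }
}
```

The decoded JSON has the same fields as the Product class, which is what the deserialized message handler in the Kinesis example receives.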

Processing messages from DynamoDB Streams

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
import com.amazonaws.services.lambda.runtime.events.StreamsEventResponse;
import software.amazon.lambda.powertools.batch.BatchMessageHandlerBuilder;
import software.amazon.lambda.powertools.batch.handler.BatchMessageHandler;

public class DynamoDBStreamBatchHandler implements RequestHandler<DynamodbEvent, StreamsEventResponse> {

    private final BatchMessageHandler<DynamodbEvent, StreamsEventResponse> handler;

    public DynamoDBStreamBatchHandler() {
        handler = new BatchMessageHandlerBuilder()
                .withDynamoDbBatchHandler()
                .buildWithRawMessageHandler(this::processMessage);
    }

    @Override
    public StreamsEventResponse handleRequest(DynamodbEvent ddbEvent, Context context) {
        return handler.processBatch(ddbEvent, context);
    }

    private void processMessage(DynamodbEvent.DynamodbStreamRecord dynamodbStreamRecord, Context context) {
        // Process the change record
    }
}
    {
      "Records": [
        {
          "eventID": "c4ca4238a0b923820dcc509a6f75849b",
          "eventName": "INSERT",
          "eventVersion": "1.1",
          "eventSource": "aws:dynamodb",
          "awsRegion": "eu-central-1",
          "dynamodb": {
            "Keys": {
              "Id": {
                "N": "101"
              }
            },
            "NewImage": {
              "Message": {
                "S": "New item!"
              },
              "Id": {
                "N": "101"
              }
            },
            "ApproximateCreationDateTime": 1428537600,
            "SequenceNumber": "4421584500000000017450439091",
            "SizeBytes": 26,
            "StreamViewType": "NEW_AND_OLD_IMAGES"
          },
          "eventSourceARN": "arn:aws:dynamodb:eu-central-1:123456789012:table/ExampleTableWithStream/stream/2015-06-27T00:48:05.899",
          "userIdentity": {
            "principalId": "dynamodb.amazonaws.com",
            "type": "Service"
          }
        },
        {
          "eventID": "c81e728d9d4c2f636f067f89cc14862c",
          "eventName": "MODIFY",
          "eventVersion": "1.1",
          "eventSource": "aws:dynamodb",
          "awsRegion": "eu-central-1",
          "dynamodb": {
            "Keys": {
              "Id": {
                "N": "101"
              }
            },
            "NewImage": {
              "Message": {
                "S": "This item has changed"
              },
              "Id": {
                "N": "101"
              }
            },
            "OldImage": {
              "Message": {
                "S": "New item!"
              },
              "Id": {
                "N": "101"
              }
            },
            "ApproximateCreationDateTime": 1428537600,
            "SequenceNumber": "4421584500000000017450439092",
            "SizeBytes": 59,
            "StreamViewType": "NEW_AND_OLD_IMAGES"
          },
          "eventSourceARN": "arn:aws:dynamodb:eu-central-1:123456789012:table/ExampleTableWithStream/stream/2015-06-27T00:48:05.899"
        }
      ]
    }

Handling Messages

Raw message and deserialized message handlers

You must provide either a raw message handler, or a deserialized message handler. The raw message handler receives the envelope record type relevant for the particular event source - for instance, the SQS event source provides SQSMessage instances. The deserialized message handler extracts the body from this envelope, and deserializes it to a user-defined type. Note that deserialized message handlers are not relevant for the DynamoDB provider, as the format of the inner message is fixed by DynamoDB.

In general, the deserialized message handler should be used unless you need access to information on the envelope.
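
One way to picture the relationship between the two handler kinds: a deserialized handler behaves like a raw handler that first extracts the body and converts it to your type. The JDK-only sketch below models that with a hypothetical Envelope type and a deserializing adapter; none of these names come from powertools-batch, and the library's real deserialization is not shown:

```java
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical model of raw versus deserialized handlers.
class HandlerAdapterSketch {

    // Stand-in for an event-source envelope such as an SQS message.
    static final class Envelope {
        final String messageId;
        final String body;

        Envelope(String messageId, String body) {
            this.messageId = messageId;
            this.body = body;
        }
    }

    // Wraps a typed handler so it can be used where a raw handler is expected:
    // the body is extracted from the envelope and deserialized first.
    static <T> Consumer<Envelope> deserializing(Function<String, T> deserializer,
                                                Consumer<T> typedHandler) {
        return envelope -> typedHandler.accept(deserializer.apply(envelope.body));
    }

    public static void main(String[] args) {
        Function<String, Integer> parse = Integer::valueOf;
        Consumer<Integer> typed = n -> System.out.println("got " + n);
        Consumer<Envelope> raw = deserializing(parse, typed);
        raw.accept(new Envelope("m-1", "41")); // prints "got 41"
    }
}
```

The typed handler never sees messageId, which is why a raw handler is the better fit when you need envelope information.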

public void setup() {
    BatchMessageHandler<SQSEvent, SQSBatchResponse> handler = new BatchMessageHandlerBuilder()
            .withSqsBatchHandler()
            .buildWithRawMessageHandler(this::processRawMessage);
}

private void processRawMessage(SQSEvent.SQSMessage sqsMessage) {
    // Do something with the raw message
}
public void setup() {
    BatchMessageHandler<SQSEvent, SQSBatchResponse> handler = new BatchMessageHandlerBuilder()
            .withSqsBatchHandler()
            .buildWithMessageHandler(this::processMessage, Product.class);
}

private void processMessage(Product product) {
    // Do something with the deserialized message
}

Success and failure handlers

You can register a success or failure handler which will be invoked as each message is processed by the batch module. This may be useful for reporting - for instance, writing metrics or logging failures.

These handlers are optional. Batch failures are handled by the module regardless of whether or not you provide a custom failure handler.

Handlers can be provided when building the batch processor and are available for all event sources. For instance for DynamoDB:

BatchMessageHandler<DynamodbEvent, StreamsEventResponse> handler = new BatchMessageHandlerBuilder()
            .withDynamoDbBatchHandler()
            .withSuccessHandler((m) -> {
                // Success handler receives the raw message
                LOGGER.info("Message with sequenceNumber {} was successfully processed",
                    m.getDynamodb().getSequenceNumber());
            })
            .withFailureHandler((m, e) -> {
                // Failure handler receives the raw message and the exception thrown
                LOGGER.info("Message with sequenceNumber {} failed to be processed: {}",
                    m.getDynamodb().getSequenceNumber(), e);
            })
            .buildWithRawMessageHandler(this::processMessage);

Info

If the success handler throws an exception, the item it is processing will be marked as failed by the batch processor. If the failure handler throws, the batch processing will continue; the item it is processing has already been marked as failed.
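
These rules can be modeled in a few lines of plain Java. The sketch below is not the module's code and all names in it are hypothetical. It also makes one choice the text above does not specify: when the success handler throws, this sketch invokes the failure handler for that item.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical model of success and failure handler semantics.
class HandlerSemanticsSketch {

    static List<String> process(List<String> items,
                                Consumer<String> handler,
                                Consumer<String> onSuccess,
                                BiConsumer<String, Exception> onFailure) {
        List<String> failed = new ArrayList<>();
        for (String item : items) {
            try {
                handler.accept(item);
                // A throw from the success handler also lands in the catch below,
                // so the item ends up marked as failed.
                onSuccess.accept(item);
            } catch (RuntimeException e) {
                failed.add(item);
                try {
                    onFailure.accept(item, e);
                } catch (RuntimeException ignored) {
                    // The item is already marked as failed; continue with the batch.
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = process(
                Arrays.asList("a", "b", "c"),
                item -> { if (item.equals("b")) throw new IllegalStateException("handler"); },
                item -> { if (item.equals("c")) throw new IllegalStateException("success"); },
                (item, e) -> { throw new IllegalStateException("failure"); });
        System.out.println(failed); // prints [b, c]
    }
}
```

Item "b" fails in the message handler and item "c" fails in the success handler; both are reported, and the throwing failure handler does not interrupt the batch.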

Lambda Context

Both raw and deserialized message handlers can choose to take the Lambda context as an argument if they need it, or not:

    public class ClassWithHandlers {

        private void processMessage(Product product) {
            // Do something with the deserialized message
        }

        private void processMessageWithContext(Product product, Context context) {
            // Do something with the deserialized message and the Lambda Context
        }
    }