Parser (Pydantic)

The Parser utility simplifies data parsing and validation using Pydantic. It allows you to define data models in pure Python classes, parse and validate incoming events, and extract only the data you need.

Key features

  • Define data models using Python classes
  • Parse and validate Lambda event payloads
  • Built-in support for common AWS event sources
  • Runtime type checking with user-friendly error messages
  • Compatible with Pydantic v2.x

Getting started

Install

Powertools only supports Pydantic v2, so make sure to install the required dependencies for Pydantic v2 before using the Parser.

pip install aws-lambda-powertools[parser]

This is not necessary if you're installing Powertools for AWS Lambda (Python) via Lambda Layer/SAR

You can also add it as a dependency in your preferred tool, e.g. requirements.txt, pyproject.toml, etc.

Data Model with Parser

You can define models by inheriting from BaseModel, or use any other type supported by TypeAdapter, to parse incoming events. Pydantic then validates the data, ensuring that all fields conform to the specified types and maintaining data integrity.

Info

The TypeAdapter feature provides a flexible way to perform validation and serialization based on a Python type. Read more in the Pydantic documentation.
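
For example, you can pass a non-BaseModel type directly to the parser; a minimal sketch, assuming parse() accepts any type supported by TypeAdapter (the Order model and the list payload are hypothetical):

from typing import List

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import parse


class Order(BaseModel):
    id: int
    description: str


def lambda_handler(event: list, context):
    # Hypothetical payload: a JSON array of orders, e.g. [{"id": 1, "description": "laptop"}]
    # Assumes parse() wraps non-BaseModel types such as List[Order] in a TypeAdapter internally
    orders = parse(model=List[Order], event=event)
    return {"statusCode": 200, "body": f"Parsed {len(orders)} order(s)"}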

Event parser

The @event_parser decorator automatically parses the incoming event into the specified Pydantic model MyEvent. If the input doesn't match the model's structure or type requirements, it raises a ValidationError directly from Pydantic.

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import event_parser


class MyEvent(BaseModel):
    id: int
    name: str


@event_parser(model=MyEvent)
def lambda_handler(event: MyEvent, context):
    # if your model is valid, you can return
    return {"statusCode": 200, "body": f"Hello {event.name}, your ID is {event.id}"}

Sample event payload:

{
    "id": "12345",
    "name": "Jane Doe"
}

Parse function

You can use the parse() function when you need flexibility with different event formats, custom pre-parsing logic, or finer-grained exception handling.

from pydantic import BaseModel, ValidationError

from aws_lambda_powertools.utilities.parser import parse


# Define a Pydantic model for the expected structure of the input
class MyEvent(BaseModel):
    id: int
    name: str


def lambda_handler(event: dict, context):
    try:
        # Manually parse the incoming event into MyEvent model
        parsed_event: MyEvent = parse(model=MyEvent, event=event)
        return {"statusCode": 200, "body": f"Hello {parsed_event.name}, your ID is {parsed_event.id}"}
    except ValidationError as e:
        # Catch validation errors and return a 400 response
        return {"statusCode": 400, "body": f"Validation error: {str(e)}"}

Sample event payload:

{
    "id": "12345",
    "name": "Jane Doe"
}

Key differences between parse and event_parser

The parse() function offers more flexibility and control:

  • It allows parsing different parts of an event using multiple models.
  • You can conditionally handle events before parsing them (see the sketch after this list).
  • It's useful for integrating with complex workflows where a decorator might not be sufficient.
  • It provides more control over the validation process and handling exceptions.

The @event_parser decorator is ideal for:

  • Fail-fast scenarios where you want to immediately stop execution if the event payload is invalid.
  • Simplifying your code by automatically parsing and validating the event at the function entry point.
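
A minimal sketch of conditionally selecting a model before parsing; the OrderCreated and OrderCancelled models and the event_type routing key are hypothetical:

from pydantic import BaseModel, ValidationError

from aws_lambda_powertools.utilities.parser import parse


class OrderCreated(BaseModel):
    order_id: int


class OrderCancelled(BaseModel):
    order_id: int
    reason: str


def lambda_handler(event: dict, context):
    # Inspect the event before parsing and pick the model that applies
    model = OrderCreated if event.get("event_type") == "created" else OrderCancelled
    try:
        parsed = parse(model=model, event=event)
    except ValidationError as e:
        return {"statusCode": 400, "body": str(e)}
    return {"statusCode": 200, "body": f"Parsed {model.__name__} for order {parsed.order_id}"}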

Built-in models

You can use pre-built models to work with events from AWS services, so you don’t need to create them yourself. We’ve already done that for you!

from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.parser.models import SqsModel
from aws_lambda_powertools.utilities.typing import LambdaContext


def lambda_handler(event: dict, context: LambdaContext) -> list:
    parsed_event = parse(model=SqsModel, event=event)

    results = []
    for record in parsed_event.Records:
        results.append(
            {
                "message_id": record.messageId,
                "body": record.body,
            },
        )
    return results

Sample event payload:

{
    "Records": [
        {
            "messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
            "receiptHandle": "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a...",
            "body": "Test message hello!",
            "attributes": {
                "ApproximateReceiveCount": "1",
                "SentTimestamp": "1545082649183",
                "SenderId": "AIDAIENQZJOLO23YVJ4VO",
                "ApproximateFirstReceiveTimestamp": "1545082649185"
            },
            "messageAttributes": {
                "testAttr": {
                    "stringValue": "100",
                    "binaryValue": "base64Str",
                    "dataType": "Number"
                }
            },
            "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
            "awsRegion": "us-east-2"
        }
    ]
}

The example above uses SqsModel. Other built-in models can be found below.

Model name Description
AlbModel Lambda Event Source payload for Amazon Application Load Balancer
APIGatewayProxyEventModel Lambda Event Source payload for Amazon API Gateway
ApiGatewayAuthorizerToken Lambda Event Source payload for Amazon API Gateway Lambda Authorizer with Token
ApiGatewayAuthorizerRequest Lambda Event Source payload for Amazon API Gateway Lambda Authorizer with Request
APIGatewayProxyEventV2Model Lambda Event Source payload for Amazon API Gateway v2 payload
ApiGatewayAuthorizerRequestV2 Lambda Event Source payload for Amazon API Gateway v2 Lambda Authorizer
BedrockAgentEventModel Lambda Event Source payload for Bedrock Agents
CloudFormationCustomResourceCreateModel Lambda Event Source payload for AWS CloudFormation CREATE operation
CloudFormationCustomResourceUpdateModel Lambda Event Source payload for AWS CloudFormation UPDATE operation
CloudFormationCustomResourceDeleteModel Lambda Event Source payload for AWS CloudFormation DELETE operation
CloudwatchLogsModel Lambda Event Source payload for Amazon CloudWatch Logs
DynamoDBStreamModel Lambda Event Source payload for Amazon DynamoDB Streams
EventBridgeModel Lambda Event Source payload for Amazon EventBridge
KafkaMskEventModel Lambda Event Source payload for AWS MSK payload
KafkaSelfManagedEventModel Lambda Event Source payload for self managed Kafka payload
KinesisDataStreamModel Lambda Event Source payload for Amazon Kinesis Data Streams
KinesisFirehoseModel Lambda Event Source payload for Amazon Kinesis Firehose
KinesisFirehoseSqsModel Lambda Event Source payload for SQS messages wrapped in Kinesis Firehose records
LambdaFunctionUrlModel Lambda Event Source payload for Lambda Function URL payload
S3BatchOperationModel Lambda Event Source payload for Amazon S3 Batch Operation
S3EventNotificationEventBridgeModel Lambda Event Source payload for Amazon S3 Event Notification to EventBridge.
S3Model Lambda Event Source payload for Amazon S3
S3ObjectLambdaEvent Lambda Event Source payload for Amazon S3 Object Lambda
S3SqsEventNotificationModel Lambda Event Source payload for S3 event notifications wrapped in SQS event (S3->SQS)
SesModel Lambda Event Source payload for Amazon Simple Email Service
SnsModel Lambda Event Source payload for Amazon Simple Notification Service
SqsModel Lambda Event Source payload for Amazon SQS
VpcLatticeModel Lambda Event Source payload for Amazon VPC Lattice
VpcLatticeV2Model Lambda Event Source payload for Amazon VPC Lattice v2 payload
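
The built-in models also work with the @event_parser decorator; a minimal sketch using SqsModel, mirroring the parse() example above:

from aws_lambda_powertools.utilities.parser import event_parser
from aws_lambda_powertools.utilities.parser.models import SqsModel
from aws_lambda_powertools.utilities.typing import LambdaContext


@event_parser(model=SqsModel)
def lambda_handler(event: SqsModel, context: LambdaContext) -> list:
    # Same record fields as the parse() example shown earlier
    return [{"message_id": record.messageId, "body": record.body} for record in event.Records]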

Extending built-in models

You can extend built-in models with your own fields or nested models, and still have all other known fields parsed along the way.

Tip

For Mypy users, we only allow type override for fields where payload is injected e.g. detail, body, etc.

Example: custom data model with Amazon EventBridge

Use the model to validate and extract relevant information from the incoming event. This can be useful when you need to handle events with a specific structure or when you want to ensure that the event data conforms to certain rules.

from pydantic import Field, ValidationError

from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.parser.models import EventBridgeModel


# Define a custom EventBridge model by extending the built-in EventBridgeModel
class MyCustomEventBridgeModel(EventBridgeModel):  # type: ignore[override]
    detail_type: str = Field(alias="detail-type")
    source: str
    detail: dict


def lambda_handler(event: dict, context):
    try:
        # Manually parse the incoming event into the custom model
        parsed_event: MyCustomEventBridgeModel = parse(model=MyCustomEventBridgeModel, event=event)

        return {"statusCode": 200, "body": f"Event from {parsed_event.source}, type: {parsed_event.detail_type}"}
    except ValidationError as e:
        return {"statusCode": 400, "body": f"Validation error: {str(e)}"}

Sample event payload:

{
    "version": "0",
    "id": "abcd-1234-efgh-5678",
    "detail-type": "order.created",
    "source": "my.order.service",
    "account": "123456789012",
    "time": "2023-09-10T12:00:00Z",
    "region": "us-west-2",
    "resources": [],
    "detail": {
        "orderId": "O-12345",
        "amount": 100.0
    }
}

Advanced

Envelopes

You can use Envelopes to extract specific portions of complex, nested JSON structures. This is useful when your actual payload is wrapped inside a known structure, for example Lambda event sources like EventBridge.

Envelopes can be used via the envelope parameter, available in both the parse function and the event_parser decorator.

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import envelopes, event_parser
from aws_lambda_powertools.utilities.typing import LambdaContext


class UserModel(BaseModel):
    username: str
    parentid_1: str
    parentid_2: str


@event_parser(model=UserModel, envelope=envelopes.EventBridgeEnvelope)
def lambda_handler(event: UserModel, context: LambdaContext):
    if event.parentid_1 != event.parentid_2:
        return {"statusCode": 400, "body": "Parent ids do not match"}

    # If parentids match, proceed with user registration

    return {"statusCode": 200, "body": f"User {event.username} registered successfully"}

Sample event payload:

{
    "version": "0",
    "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
    "detail-type": "CustomerSignedUp",
    "source": "CustomerService",
    "account": "111122223333",
    "time": "2020-10-22T18:43:48Z",
    "region": "us-west-1",
    "resources": [
        "some_additional_"
    ],
    "detail": {
        "username": "universe",
        "parentid_1": "12345",
        "parentid_2": "6789"
    }
}
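
The same envelope also works with the parse function; a minimal sketch redefining UserModel from the example above for completeness:

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import envelopes, parse


class UserModel(BaseModel):
    username: str
    parentid_1: str
    parentid_2: str


def lambda_handler(event: dict, context):
    # The envelope extracts the EventBridge detail before validating it against UserModel
    user = parse(event=event, model=UserModel, envelope=envelopes.EventBridgeEnvelope)
    return {"statusCode": 200, "body": f"User {user.username} extracted from the EventBridge detail"}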

Built-in envelopes

You can use pre-built envelopes provided by the Parser to extract and parse specific parts of complex event structures.

Envelope name Behaviour Return
DynamoDBStreamEnvelope 1. Parses data using DynamoDBStreamModel. 2. Parses records in `NewImage` and `OldImage` keys using your model. 3. Returns a list with a dictionary containing NewImage and OldImage keys List[Dict[str, Optional[Model]]]
EventBridgeEnvelope 1. Parses data using EventBridgeModel. 2. Parses `detail` key using your model and returns it. Model
SqsEnvelope 1. Parses data using SqsModel. 2. Parses records in `body` key using your model and return them in a list. List[Model]
CloudWatchLogsEnvelope 1. Parses data using CloudwatchLogsModel which will base64 decode and decompress it. 2. Parses records in `message` key using your model and return them in a list. List[Model]
KinesisDataStreamEnvelope 1. Parses data using KinesisDataStreamModel which will base64 decode it. 2. Parses records in `Records` key using your model and returns them in a list. List[Model]
KinesisFirehoseEnvelope 1. Parses data using KinesisFirehoseModel which will base64 decode it. 2. Parses records in `Records` key using your model and returns them in a list. List[Model]
SnsEnvelope 1. Parses data using SnsModel. 2. Parses records in `body` key using your model and return them in a list. List[Model]
SnsSqsEnvelope 1. Parses data using SqsModel. 2. Parses SNS records in `body` key using `SnsNotificationModel`. 3. Parses data in Message key using your model and return them in a list. List[Model]
ApiGatewayEnvelope 1. Parses data using APIGatewayProxyEventModel. 2. Parses `body` key using your model and returns it. Model
ApiGatewayV2Envelope 1. Parses data using APIGatewayProxyEventV2Model. 2. Parses `body` key using your model and returns it. Model
LambdaFunctionUrlEnvelope 1. Parses data using LambdaFunctionUrlModel. 2. Parses `body` key using your model and returns it. Model
KafkaEnvelope 1. Parses data using KafkaRecordModel. 2. Parses `value` key using your model and returns it. Model
VpcLatticeEnvelope 1. Parses data using VpcLatticeModel. 2. Parses `value` key using your model and returns it. Model
BedrockAgentEnvelope 1. Parses data using BedrockAgentEventModel. 2. Parses `inputText` key using your model and returns it. Model
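
For example, the SqsEnvelope listed above parses each record's body with your model and hands you a list back; a minimal sketch assuming a hypothetical Order model sent as the SQS message body:

from typing import List

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import envelopes, event_parser
from aws_lambda_powertools.utilities.typing import LambdaContext


class Order(BaseModel):
    order_id: int
    amount: float


@event_parser(model=Order, envelope=envelopes.SqsEnvelope)
def lambda_handler(event: List[Order], context: LambdaContext):
    # Each SQS record body has been JSON-decoded and validated as an Order
    return {"statusCode": 200, "body": f"Processed {len(event)} order(s)"}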

Bringing your own envelope

You can create your own Envelope model and logic by inheriting from BaseEnvelope, and implementing the parse method.

Here's a snippet of how the EventBridge envelope we demonstrated previously is implemented.

import json
from typing import Any, Dict, Optional, Type, TypeVar, Union

from pydantic import BaseModel

from aws_lambda_powertools.utilities.parser import BaseEnvelope, event_parser
from aws_lambda_powertools.utilities.parser.models import EventBridgeModel
from aws_lambda_powertools.utilities.typing import LambdaContext

Model = TypeVar("Model", bound=BaseModel)


class EventBridgeEnvelope(BaseEnvelope):
    def parse(self, data: Optional[Union[Dict[str, Any], Any]], model: Type[Model]) -> Optional[Model]:
        if data is None:
            return None

        parsed_envelope = EventBridgeModel.model_validate(data)
        return self._parse(data=parsed_envelope.detail, model=model)


class OrderDetail(BaseModel):
    order_id: str
    amount: float
    customer_id: str


@event_parser(model=OrderDetail, envelope=EventBridgeEnvelope)
def lambda_handler(event: OrderDetail, context: LambdaContext):
    try:
        # Process the order
        print(f"Processing order {event.order_id} for customer {event.customer_id}")
        print(f"Order amount: ${event.amount:.2f}")

        # Your business logic here
        # For example, you might save the order to a database or trigger a payment process

        return {
            "statusCode": 200,
            "body": json.dumps(
                {
                    "message": f"Order {event.order_id} processed successfully",
                    "order_id": event.order_id,
                    "amount": event.amount,
                    "customer_id": event.customer_id,
                },
            ),
        }
    except Exception as e:
        print(f"Error processing order: {str(e)}")
        return {"statusCode": 500, "body": json.dumps({"error": "Internal server error"})}

Sample event payload:

{
    "version": "0",
    "id": "12345678-1234-1234-1234-123456789012",
    "detail-type": "Order Placed",
    "source": "com.mycompany.orders",
    "account": "123456789012",
    "time": "2023-05-03T12:00:00Z",
    "region": "us-west-2",
    "resources": [],
    "detail": {
        "order_id": "ORD-12345",
        "amount": 99.99,
        "customer_id": "CUST-6789"
    }
}

What's going on here, you might ask:

  • EventBridgeEnvelope: extracts the detail field from EventBridge events.
  • OrderDetail Model: defines and validates the structure of order data.
  • @event_parser: decorator automates parsing and validation of incoming events using the specified model and envelope.

Data model validation

Warning

This is radically different from the Validator utility which validates events against JSON Schema.

You can use Pydantic's validator for deep inspection of object values and complex relationships.

There are two types of class method decorators you can use:

  • field_validator - Useful to quickly validate an individual field and its value
  • model_validator - Useful to validate the entire model's data

Keep the following in mind regardless of which decorator you end up using:

  • You must raise either ValueError, TypeError, or AssertionError when value is not compliant
  • You must return the value(s) itself if compliant

Field Validator

Quick validation using the field_validator decorator to verify whether the field message has the value hello world.

field_validator.py
from pydantic import BaseModel, field_validator

from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.typing import LambdaContext


class HelloWorldModel(BaseModel):
    message: str

    @field_validator("message")
    def is_hello_world(cls, v):
        if v != "hello world":
            raise ValueError("Message must be hello world!")
        return v


def lambda_handler(event: dict, context: LambdaContext):
    try:
        parsed_event = parse(model=HelloWorldModel, event=event)
        return {"statusCode": 200, "body": f"Received message: {parsed_event.message}"}
    except ValueError as e:
        return {"statusCode": 400, "body": str(e)}

If you run this with a test event {"message": "hello universe"}, you should expect a validation error containing the message we provided in our exception, similar to:

1 validation error for HelloWorldModel
message
  Value error, Message must be hello world! [type=value_error, input_value='hello universe', input_type=str]

Model validator

model_validator can help when you have a complex validation mechanism, for example checking whether data has been omitted or comparing field values against each other.

model_validator.py
from pydantic import BaseModel, model_validator

from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.typing import LambdaContext


class UserModel(BaseModel):
    username: str
    parentid_1: str
    parentid_2: str

    @model_validator(mode="after")  # (1)!
    def check_parents_match(self):
        # In "after" mode the validator receives the fully validated model instance
        if self.parentid_1 != self.parentid_2:
            raise ValueError("Parent ids do not match")
        return self


def lambda_handler(event: dict, context: LambdaContext):
    try:
        parsed_event = parse(model=UserModel, event=event)
        return {
            "statusCode": 200,
            "body": f"Received parent id from: {parsed_event.username}",
        }
    except ValueError as e:
        return {
            "statusCode": 400,
            "body": str(e),
        }
  1. The keyword argument mode='after' will cause the validator to be called after all field-level validation and parsing has been completed.

Info

You can read more about validating list items, reusing validators, validating raw inputs, and a lot more in Pydantic's documentation.

String fields that contain JSON data

Wrap these fields with Pydantic's Json Type. This approach allows Pydantic to properly parse and validate the JSON content, ensuring type safety and data integrity.

from __future__ import annotations

from typing import TYPE_CHECKING, Any

from pydantic import BaseModel, Json

from aws_lambda_powertools.utilities.parser import BaseEnvelope, event_parser
from aws_lambda_powertools.utilities.parser.functions import (
    _parse_and_validate_event,
    _retrieve_or_set_model_from_cache,
)
from aws_lambda_powertools.utilities.typing import LambdaContext

if TYPE_CHECKING:
    from aws_lambda_powertools.utilities.parser.types import T


class CancelOrder(BaseModel):
    order_id: int
    reason: str


class CancelOrderModel(BaseModel):
    body: Json[CancelOrder]


class CustomEnvelope(BaseEnvelope):
    def parse(self, data: dict[str, Any] | Any | None, model: type[T]):
        adapter = _retrieve_or_set_model_from_cache(model=model)
        return _parse_and_validate_event(data=data, adapter=adapter)


@event_parser(model=CancelOrderModel, envelope=CustomEnvelope)
def lambda_handler(event: CancelOrderModel, context: LambdaContext):
    cancel_order: CancelOrder = event.body

    assert cancel_order.order_id is not None

    # Process the cancel order request
    print(f"Cancelling order {cancel_order.order_id} for reason: {cancel_order.reason}")

    return {
        "statusCode": 200,
        "body": f"Order {cancel_order.order_id} cancelled successfully",
    }

Sample event payload:

{
    "body": "{\"order_id\": 12345, \"reason\": \"Changed my mind\"}"
}

Serialization

Models in Pydantic offer more than direct attribute access. They can be transformed, serialized, and exported in various formats.

Pydantic's definition of serialization is broader than usual. It includes converting structured objects to simpler Python types, not just data to strings or bytes. This reflects the close relationship between these processes in Pydantic.

Read more at Serialization for Pydantic documentation.

serialization_parser.py
from pydantic import BaseModel

from aws_lambda_powertools.logging import Logger
from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()


class UserModel(BaseModel):
    username: str
    parentid_1: str
    parentid_2: str


def validate_user(event):
    try:
        user = parse(model=UserModel, event=event)
        return {"statusCode": 200, "body": user.model_dump_json()}
    except Exception as e:
        logger.exception("Validation error")
        return {"statusCode": 400, "body": str(e)}


@logger.inject_lambda_context
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.info("Received event", extra={"event": event})

    result = validate_user(event)

    if result["statusCode"] == 200:
        user = UserModel.model_validate_json(result["body"])
        logger.info("User validated successfully", extra={"username": user.username})

        # Example of serialization
        user_dict = user.model_dump()
        user_json = user.model_dump_json()

        logger.debug("User serializations", extra={"dict": user_dict, "json": user_json})

    return result

Info

There are a number of advanced use cases well documented in Pydantic's documentation, such as creating immutable models and declaring fields with dynamic values.
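
As a small taste of those features, here is a minimal sketch of an immutable model and a field with a dynamic default using plain Pydantic (the AuditRecord model is hypothetical):

from datetime import datetime, timezone

from pydantic import BaseModel, ConfigDict, Field


class AuditRecord(BaseModel):
    model_config = ConfigDict(frozen=True)  # immutable: assigning to fields after creation raises an error

    event_id: str
    # dynamic value: computed at instantiation time rather than at import time
    received_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))


record = AuditRecord(event_id="abc-123")
# record.event_id = "other"  # would raise a ValidationError because the model is frozen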

FAQ

When should I use parser vs data_classes utility?

Use the data classes utility when you're after autocomplete, self-documented attributes, and helpers to extract data from common event sources.

Parser is best suited when you want to define your own models for deep validation, parsing, and autocomplete, and are willing to bring in an additional dependency in exchange.
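
For contrast, a minimal sketch of the data classes utility handling an SQS event, trading deep validation for typed attribute access without an extra dependency:

from aws_lambda_powertools.utilities.data_classes import SQSEvent, event_source
from aws_lambda_powertools.utilities.typing import LambdaContext


@event_source(data_class=SQSEvent)
def lambda_handler(event: SQSEvent, context: LambdaContext) -> list:
    # Attributes are resolved lazily from the raw dict; no runtime validation is performed
    return [{"message_id": record.message_id, "body": record.body} for record in event.records]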

How do I import X from Pydantic?

We recommend importing directly from Pydantic to access all features and stay up-to-date with the latest Pydantic updates. For example:

from pydantic import BaseModel, Field, ValidationError

While we export some common Pydantic classes and utilities through the parser for convenience (e.g., from aws_lambda_powertools.utilities.parser import BaseModel), importing directly from Pydantic ensures you have access to all features and the most recent updates.