Validation

This utility provides JSON Schema validation for events and responses, including JMESPath support to unwrap events before validation.

Key features

  • Validate incoming events and outgoing responses
  • JMESPath support to unwrap events before validation is applied
  • Built-in envelopes to unwrap payloads from popular event sources

Getting started

Tip

All examples shared in this documentation are available within the project repository.

You can validate inbound and outbound events using the validator decorator.

You can also use the standalone validate function, if you want more control over the validation process such as handling a validation error.

Tip: Using JSON Schemas for the first time?

Check this step-by-step tour in the official JSON Schema website.

We support any JSON Schema draft supported by the fastjsonschema library.

Warning

Both the validator decorator and the standalone validate function expect your JSON Schema to be a dictionary, not a filename.
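If your schema lives in a JSON file, one option is to load it into a dictionary yourself and pass that dictionary along. The sketch below uses only the standard library. The file name schema.json and the helper load_schema are hypothetical and not part of Powertools.

```python
import json
import tempfile
from pathlib import Path


def load_schema(path: Path) -> dict:
    """Read a JSON Schema file and return it as a dictionary (hypothetical helper)."""
    with path.open(encoding="utf-8") as schema_file:
        schema = json.load(schema_file)
    if not isinstance(schema, dict):
        raise TypeError(f"Expected a JSON object in {path}, got {type(schema).__name__}")
    return schema


# Demo: write a tiny schema to a temporary file, then load it back as a dict.
with tempfile.TemporaryDirectory() as tmp_dir:
    schema_path = Path(tmp_dir) / "schema.json"  # hypothetical file name
    schema_path.write_text('{"type": "object", "required": ["user_id"]}', encoding="utf-8")
    INPUT = load_schema(schema_path)

# INPUT is now a plain dict, the shape the schema parameters expect.
```

Loading the file once at module import time, rather than inside the handler, avoids re-reading it on every invocation.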

Install

This is not necessary if you're installing Powertools for AWS Lambda (Python) via Lambda Layer/SAR

Add aws-lambda-powertools[validation] as a dependency in your preferred tool: e.g., requirements.txt, pyproject.toml. This will ensure you have the required dependencies before using Validation.

Validator decorator

The validator decorator is typically used to validate either the inbound event or the function's response.

It fails fast with a SchemaValidationError exception if the event or response doesn't conform to the given JSON Schema.

from dataclasses import dataclass, field
from uuid import uuid4

import getting_started_validator_decorator_schema as schemas

from aws_lambda_powertools.utilities import parameters
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.utilities.validation import validator

# we can get list of allowed IPs from AWS Parameter Store using Parameters Utility
# See: https://awslabs.github.io/aws-lambda-powertools-python/latest/utilities/parameters/
ALLOWED_IPS = parameters.get_parameter("/lambda-powertools/allowed_ips")


class UserPermissionsError(Exception):
    ...


@dataclass
class User:
    ip: str
    permissions: list
    user_id: str = field(default_factory=lambda: f"{uuid4()}")
    name: str = "Project Lambda Powertools"


# using a decorator to validate input and output data
@validator(inbound_schema=schemas.INPUT, outbound_schema=schemas.OUTPUT)
def lambda_handler(event, context: LambdaContext) -> dict:

    try:
        user_details: dict = {}

        # get permissions by user_id and project
        if (
            event.get("user_id") == "0d44b083-8206-4a3a-aa95-5d392a99be4a"
            and event.get("project") == "powertools"
            and event.get("ip") in ALLOWED_IPS
        ):
            user_details = User(ip=event.get("ip"), permissions=["read", "write"]).__dict__

        # the body must be an object to match the OUTPUT schema; otherwise outbound validation fails
        return {"body": user_details or None, "statusCode": 200 if user_details else 204}
    except Exception as e:
        raise UserPermissionsError(str(e))

getting_started_validator_decorator_schema.py
INPUT = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "title": "Sample schema",
    "description": "The root schema comprises the entire JSON document.",
    "examples": [{"user_id": "0d44b083-8206-4a3a-aa95-5d392a99be4a", "project": "powertools", "ip": "192.168.0.1"}],
    "required": ["user_id", "project", "ip"],
    "properties": {
        "user_id": {
            "$id": "#/properties/user_id",
            "type": "string",
            "title": "The user_id",
            "examples": ["0d44b083-8206-4a3a-aa95-5d392a99be4a"],
            "maxLength": 50,
        },
        "project": {
            "$id": "#/properties/project",
            "type": "string",
            "title": "The project",
            "examples": ["powertools"],
            "maxLength": 30,
        },
        "ip": {
            "$id": "#/properties/ip",
            "type": "string",
            "title": "The ip",
            "format": "ipv4",
            "examples": ["192.168.0.1"],
            "maxLength": 30,
        },
    },
}

OUTPUT = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "title": "Sample outgoing schema",
    "description": "The root schema comprises the entire JSON document.",
    "examples": [{"statusCode": 200, "body": {}}],
    "required": ["statusCode", "body"],
    "properties": {
        "statusCode": {
            "$id": "#/properties/statusCode",
            "type": "integer",
            "title": "The statusCode",
            "examples": [200],
            "maxLength": 3,
        },
        "body": {
            "$id": "#/properties/body",
            "type": "object",
            "title": "The body",
            "examples": [
                '{"ip": "192.168.0.1", "permissions": ["read", "write"], "user_id": "7576b683-295e-4f69-b558-70e789de1b18", "name": "Project Lambda Powertools"}'  # noqa E501
            ],
        },
    },
}

Sample event:
{
    "user_id": "0d44b083-8206-4a3a-aa95-5d392a99be4a",
    "project": "powertools",
    "ip": "192.168.0.1"
}
Note

You don't have to validate both the inbound and outbound schemas. You can use either one, or both.

Validate function

The standalone validate function is typically used within the Lambda handler, or in any other method that performs data validation.

You can also gracefully handle schema validation errors by catching the SchemaValidationError exception.

import getting_started_validator_standalone_schema as schemas

from aws_lambda_powertools.utilities import parameters
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.utilities.validation import SchemaValidationError, validate

# we can get list of allowed IPs from AWS Parameter Store using Parameters Utility
# See: https://awslabs.github.io/aws-lambda-powertools-python/latest/utilities/parameters/
ALLOWED_IPS = parameters.get_parameter("/lambda-powertools/allowed_ips")


def lambda_handler(event, context: LambdaContext) -> dict:
    try:
        user_authenticated: str = ""

        # using standalone function to validate input data only
        validate(event=event, schema=schemas.INPUT)

        if (
            event.get("user_id") == "0d44b083-8206-4a3a-aa95-5d392a99be4a"
            and event.get("project") == "powertools"
            and event.get("ip") in ALLOWED_IPS
        ):
            user_authenticated = "Allowed"

        # in this example the body can be of any type because we are not validating the OUTPUT
        return {"body": user_authenticated, "statusCode": 200 if user_authenticated else 204}
    except SchemaValidationError as exception:
        # SchemaValidationError indicates where a data mismatch is
        return {"body": str(exception), "statusCode": 400}

getting_started_validator_standalone_schema.py
INPUT = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "title": "Sample schema",
    "description": "The root schema comprises the entire JSON document.",
    "examples": [{"user_id": "0d44b083-8206-4a3a-aa95-5d392a99be4a", "project": "powertools", "ip": "192.168.0.1"}],
    "required": ["user_id", "project", "ip"],
    "properties": {
        "user_id": {
            "$id": "#/properties/user_id",
            "type": "string",
            "title": "The user_id",
            "examples": ["0d44b083-8206-4a3a-aa95-5d392a99be4a"],
            "maxLength": 50,
        },
        "project": {
            "$id": "#/properties/project",
            "type": "string",
            "title": "The project",
            "examples": ["powertools"],
            "maxLength": 30,
        },
        "ip": {
            "$id": "#/properties/ip",
            "type": "string",
            "title": "The ip",
            "format": "ipv4",
            "examples": ["192.168.0.1"],
            "maxLength": 30,
        },
    },
}

Sample event:
{
    "user_id": "0d44b083-8206-4a3a-aa95-5d392a99be4a",
    "project": "powertools",
    "ip": "192.168.0.1"
}

Unwrapping events prior to validation

You might want to validate only a portion of your event. This is what the envelope parameter is for.

Envelopes are JMESPath expressions to extract a portion of JSON you want before applying JSON Schema validation.

Here is a sample custom EventBridge event, where we only validate what's inside the detail key:

import boto3
import getting_started_validator_unwrapping_schema as schemas

from aws_lambda_powertools.utilities.data_classes.event_bridge_event import (
    EventBridgeEvent,
)
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.utilities.validation import validator

s3_client = boto3.resource("s3")


# we use the 'envelope' parameter to extract the payload inside the 'detail' key before validating
@validator(inbound_schema=schemas.INPUT, envelope="detail")
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    my_event = EventBridgeEvent(event)
    data = my_event.detail.get("data", {})
    s3_bucket, s3_key = data.get("s3_bucket"), data.get("s3_key")

    try:
        s3_object = s3_client.Object(bucket_name=s3_bucket, key=s3_key)
        payload = s3_object.get()["Body"]
        content = payload.read().decode("utf-8")

        return {"message": process_data_object(content), "success": True}
    except s3_client.meta.client.exceptions.NoSuchBucket as exception:
        return return_error_message(str(exception))
    except s3_client.meta.client.exceptions.NoSuchKey as exception:
        return return_error_message(str(exception))


def return_error_message(message: str) -> dict:
    return {"message": message, "success": False}


def process_data_object(content: str) -> str:
    # insert logic here
    return "Data OK"

getting_started_validator_unwrapping_schema.py
INPUT = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://example.com/object1660222326.json",
    "type": "object",
    "title": "Sample schema",
    "description": "The root schema comprises the entire JSON document.",
    "examples": [
        {
            "data": {
                "s3_bucket": "aws-lambda-powertools",
                "s3_key": "event.txt",
                "file_size": 200,
                "file_type": "text/plain",
            }
        }
    ],
    "required": ["data"],
    "properties": {
        "data": {
            "$id": "#root/data",
            "title": "Root",
            "type": "object",
            "required": ["s3_bucket", "s3_key", "file_size", "file_type"],
            "properties": {
                "s3_bucket": {
                    "$id": "#root/data/s3_bucket",
                    "title": "The S3 Bucket",
                    "type": "string",
                    "default": "",
                    "examples": ["aws-lambda-powertools"],
                    "pattern": "^.*$",
                },
                "s3_key": {
                    "$id": "#root/data/s3_key",
                    "title": "The S3 Key",
                    "type": "string",
                    "default": "",
                    "examples": ["folder/event.txt"],
                    "pattern": "^.*$",
                },
                "file_size": {
                    "$id": "#root/data/file_size",
                    "title": "The file size",
                    "type": "integer",
                    "examples": [200],
                    "default": 0,
                },
                "file_type": {
                    "$id": "#root/data/file_type",
                    "title": "The file type",
                    "type": "string",
                    "default": "",
                    "examples": ["text/plain"],
                    "pattern": "^.*$",
                },
            },
        }
    },
}

Sample event:
{
    "id": "cdc73f9d-aea9-11e3-9d5a-835b769c0d9c",
    "detail-type": "CustomEvent",
    "source": "mycompany.service",
    "account": "123456789012",
    "time": "1970-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "data": {
            "s3_bucket": "aws-lambda-powertools",
            "s3_key": "folder/event.txt",
            "file_size": 200,
            "file_type": "text/plain"
        }
    }
}

This is quite powerful because you can use the JMESPath query language to extract records from arrays and to combine pipe and function expressions.

When combined, these features allow you to extract what you need before validating the actual payload.

Built-in envelopes

We provide built-in envelopes to easily extract the payload from popular event sources.

import boto3
import unwrapping_popular_event_source_schema as schemas
from botocore.exceptions import ClientError

from aws_lambda_powertools.utilities.data_classes.event_bridge_event import (
    EventBridgeEvent,
)
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.utilities.validation import envelopes, validator


# extracting detail from EventBridge custom event
# see: https://awslabs.github.io/aws-lambda-powertools-python/latest/utilities/jmespath_functions/#built-in-envelopes
@validator(inbound_schema=schemas.INPUT, envelope=envelopes.EVENTBRIDGE)
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    my_event = EventBridgeEvent(event)
    ec2_client = boto3.resource("ec2", region_name=my_event.region)

    try:
        instance_id = my_event.detail.get("instance_id")
        instance = ec2_client.Instance(instance_id)
        instance.stop()

        return {"message": f"Successfully stopped {instance_id}", "success": True}
    except ClientError as exception:
        return {"message": str(exception), "success": False}

unwrapping_popular_event_source_schema.py
INPUT = {
    "definitions": {},
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://example.com/object1660233148.json",
    "title": "Root",
    "type": "object",
    "required": ["instance_id", "region"],
    "properties": {
        "instance_id": {
            "$id": "#root/instance_id",
            "title": "Instance_id",
            "type": "string",
            "default": "",
            "examples": ["i-042dd005362091826"],
            "pattern": "^.*$",
        },
        "region": {
            "$id": "#root/region",
            "title": "Region",
            "type": "string",
            "default": "",
            "examples": ["us-east-1"],
            "pattern": "^.*$",
        },
    },
}

Sample event:
{
    "id": "cdc73f9d-aea9-11e3-9d5a-835b769c0d9c",
    "detail-type": "Scheduled Event",
    "source": "aws.events",
    "account": "123456789012",
    "time": "1970-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:events:us-east-1:123456789012:rule/ExampleRule"
    ],
    "detail": {
        "instance_id": "i-042dd005362091826",
        "region": "us-east-2"
    }
}

Here is a handy table with built-in envelopes along with their JMESPath expressions in case you want to build your own.

Envelope                     JMESPath expression
API_GATEWAY_HTTP             powertools_json(body)
API_GATEWAY_REST             powertools_json(body)
CLOUDWATCH_EVENTS_SCHEDULED  detail
CLOUDWATCH_LOGS              awslogs.powertools_base64_gzip(data) | powertools_json(@).logEvents[*]
EVENTBRIDGE                  detail
KINESIS_DATA_STREAM          Records[*].kinesis.powertools_json(powertools_base64(data))
SNS                          Records[0].Sns.Message | powertools_json(@)
SQS                          Records[*].powertools_json(body)
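
To make the expressions in the table more concrete, here is a rough plain-Python equivalent of what the SQS and SNS envelopes extract. It is a sketch using only the standard library, not how Powertools implements envelopes. The sample events are trimmed and the function names are hypothetical. Note that this code raises KeyError on missing keys, whereas JMESPath is more forgiving.

```python
import json

# Trimmed sample events; real SQS and SNS events carry many more fields.
sqs_event = {
    "Records": [
        {"body": '{"order_id": 1}'},
        {"body": '{"order_id": 2}'},
    ]
}
sns_event = {"Records": [{"Sns": {"Message": '{"order_id": 3}'}}]}


def unwrap_sqs(event: dict) -> list:
    """Rough equivalent of `Records[*].powertools_json(body)`."""
    # Project over every record and deserialize its JSON string body.
    return [json.loads(record["body"]) for record in event["Records"]]


def unwrap_sns(event: dict) -> dict:
    """Rough equivalent of `Records[0].Sns.Message | powertools_json(@)`."""
    # Take the first record's message, then deserialize the JSON string.
    return json.loads(event["Records"][0]["Sns"]["Message"])


sqs_payloads = unwrap_sqs(sqs_event)
sns_payload = unwrap_sns(sns_event)
```

The SQS expression yields a list with one item per record, so a schema used with that envelope needs to describe an array rather than a single object.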

Advanced

Validating custom formats

Note

JSON Schema Draft 7 has many new built-in formats such as date, time, and specifically a regex format, which might be a better replacement for a custom format if you have control over the schema.

JSON Schemas with custom formats like awsaccountid will fail validation by default. If you have these, you can tell the validator how to check them using the formats parameter:

custom_json_schema_type_format.json
{
    "accountid": {
        "format": "awsaccountid",
        "type": "string"
    }
}

The formats parameter takes a dictionary keyed by format name. For each key, you must provide a regex, or a function that returns a boolean, to instruct the validator on how to proceed when encountering that format.

import json
import re

import boto3
import custom_format_schema as schemas

from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.utilities.validation import SchemaValidationError, validate

# awsaccountid must have 12 digits
custom_format = {"awsaccountid": lambda value: bool(re.match(r"^(\d{12})$", value))}


def lambda_handler(event, context: LambdaContext) -> dict:
    try:
        # validate input using custom json format
        validate(event=event, schema=schemas.INPUT, formats=custom_format)

        client_organization = boto3.client("organizations", region_name=event.get("region"))
        account_data = client_organization.describe_account(AccountId=event.get("accountid"))

        return {
            "account": json.dumps(account_data.get("Account"), default=str),
            "message": "Success",
            "statusCode": 200,
        }
    except SchemaValidationError as exception:
        return return_error_message(str(exception))
    except Exception as exception:
        return return_error_message(str(exception))


def return_error_message(message: str) -> dict:
    return {"account": None, "message": message, "statusCode": 400}

custom_format_schema.py
INPUT = {
    "definitions": {},
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://example.com/object1660245931.json",
    "title": "Root",
    "type": "object",
    "required": ["accountid", "region"],
    "properties": {
        "accountid": {
            "$id": "#root/accountid",
            "title": "The accountid",
            "type": "string",
            "format": "awsaccountid",
            "default": "",
            "examples": ["123456789012"],
        },
        "region": {
            "$id": "#root/region",
            "title": "The region",
            "type": "string",
            "default": "",
            "examples": ["us-east-1"],
            "pattern": "^.*$",
        },
    },
}

Sample event:
{
    "accountid": "200984112386",
    "region": "us-east-1"
}
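
As a rough sketch of the two ways to describe a custom format, the snippet below builds one formats dictionary with a regex string and another with a boolean function. The names is_aws_account_id, formats_with_regex, and formats_with_function are hypothetical. Either dictionary is meant to be passed as the formats parameter.

```python
import re

AWS_ACCOUNT_ID = re.compile(r"\d{12}")


def is_aws_account_id(value: str) -> bool:
    """Return True only when the value is a string of exactly 12 digits."""
    return isinstance(value, str) and AWS_ACCOUNT_ID.fullmatch(value) is not None


# Option 1: map the format name to a regex string.
formats_with_regex = {"awsaccountid": r"^\d{12}$"}

# Option 2: map the format name to a function that returns a boolean.
formats_with_function = {"awsaccountid": is_aws_account_id}
```

A named function is easier to unit test than an inline lambda, and fullmatch rejects a trailing newline that a `$` anchor would otherwise tolerate.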

Built-in JMESPath functions

You might have events or responses that carry JSON as an encoded string, which you need to decode before validating.

You can use our built-in JMESPath functions within your expressions to do exactly that: deserialize JSON strings, decode base64, and decompress gzip data.

Info

We use these in the built-in envelopes to easily decode and unwrap events from sources like Kinesis, CloudWatch Logs, etc.
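
To illustrate the three decoding steps, here is a standard-library sketch that mirrors what the CLOUDWATCH_LOGS expression `awslogs.powertools_base64_gzip(data) | powertools_json(@).logEvents[*]` extracts. It is an approximation for explanation only, not the Powertools implementation. The sample payload and the function name unwrap_cloudwatch_logs are hypothetical.

```python
import base64
import gzip
import json

# Build a CloudWatch Logs style event: JSON -> gzip -> base64.
log_payload = {"logEvents": [{"id": "1", "message": "hello"}]}
encoded = base64.b64encode(gzip.compress(json.dumps(log_payload).encode("utf-8"))).decode("utf-8")
event = {"awslogs": {"data": encoded}}


def unwrap_cloudwatch_logs(event: dict) -> list:
    """Rough equivalent of the CLOUDWATCH_LOGS envelope expression."""
    compressed = base64.b64decode(event["awslogs"]["data"])  # decode base64
    raw_json = gzip.decompress(compressed).decode("utf-8")  # decompress gzip
    return json.loads(raw_json)["logEvents"]  # deserialize the JSON string


log_events = unwrap_cloudwatch_logs(event)
```

The result is the list of log events, which is the data a schema would be validated against when this envelope is used.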


Last update: 2023-06-08