Automation
Continuous integration practices
We adhere to industry recommendations from the OSSF Scorecard project, among others.
Since all code changes require a pull request (PR) along with one or more reviewers, we automate quality and security checks before, during, and after a PR is merged to trunk (develop).
We use a combination of tools coupled with peer review to increase their compound effect in detecting issues early.
This is a snapshot of our automated checks at a glance.
Pre-commit checks
Pre-commit checks are crucial for a fast feedback loop while ensuring security practices at the individual change level.
To prevent scenarios where these checks are intentionally omitted on the client side, we also run them at the CI level.
These run locally, and only for changed files.
- Merge conflict check. Checks each individual change for merge conflict strings accidentally left unresolved to prevent breakage (a sketch of the idea follows this list).
- Code linting. Linter checks for industry quality standards and known bad practices that could lead to abuse.
- CloudFormation linting. `cfn-lint` ensures best practices in our documentation examples.
- Markdown linting. Primarily industry markdown practices at this stage.
- GitHub Actions linting. `actionlint` ensures workflows follow GitHub Actions security practices. It enforces numerous leading practices to prevent common configuration mistakes, insecure inline scripts, among many others.
- Terraform linting. As of now, largely formatting until we increase our Terraform coverage in documentation examples.
- Secrets linting. Detects industry credentials that might be accidentally leaked in source code.
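To make the merge conflict check concrete, here is a minimal sketch of the idea in Python. It is illustrative only and not our actual pre-commit hook; the marker list and invocation style are assumptions.

```python
# Minimal, illustrative sketch of a merge-conflict-marker check.
# Not the actual pre-commit hook; markers and invocation style are assumptions.
import sys
from pathlib import Path

# Markers git leaves behind when a conflict is not fully resolved.
CONFLICT_MARKERS = ("<<<<<<< ", "=======", ">>>>>>> ")


def has_conflict_markers(path: Path) -> bool:
    """Return True if any line in the file starts with a merge conflict marker."""
    try:
        text = path.read_text(encoding="utf-8", errors="ignore")
    except OSError:
        return False
    return any(line.startswith(CONFLICT_MARKERS) for line in text.splitlines())


if __name__ == "__main__":
    # Pre-commit style invocation: changed file names are passed as arguments.
    offending = [name for name in sys.argv[1:] if has_conflict_markers(Path(name))]
    for name in offending:
        print(f"Unresolved merge conflict markers found in: {name}")
    sys.exit(1 if offending else 0)
```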
Pre-Pull Request checks
For an improved contributing experience, most of our checks can run locally. For maintainers, this also means increased focus on reviewing actual value instead of standards and security malpractices that can be caught earlier.
These are in addition to pre-commit checks.
- Static typing analysis. `mypy` checks static typing annotations to prevent common bugs in Python that may or may not lead to abuse (a sketch of the kind of bug it catches follows this list).
- Tests. We run unit, functional, and performance tests (see our definition). Besides breaking changes, we are investing in mutation testing to find additional sources of bugs and potential abuse.
- Security baseline. `bandit` detects common security issues defined by the Python Code Quality Authority (PyCQA).
- Complexity baseline. We run a series of maintainability and cyclomatic complexity checks to reduce code and logic complexity. This lowers reviewers' cognitive overhead and helps long-term maintainers revisiting legacy code at a later date.
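As an illustration of the class of bug that static typing analysis catches early, consider the hypothetical snippet below; the names and values are invented and not taken from our codebase.

```python
# Hypothetical example of a bug caught by static typing analysis before any test runs.
from typing import Optional


def get_limit(config: dict) -> Optional[int]:
    """Read an optional numeric limit from a configuration mapping."""
    raw = config.get("limit")
    return int(raw) if raw is not None else None


def is_within_limit(config: dict, count: int) -> bool:
    limit = get_limit(config)
    # mypy rejects this comparison because `limit` may be None;
    # at runtime it would only fail (TypeError) when the key is missing.
    return count <= limit
```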
Pull Request checks
While we trust that contributors and maintainers go through pre-commit and pre-pull request due diligence, we verify them at the CI level.
Checks described earlier are omitted to improve reading experience.
- Semantic PR title. We enforce that PR titles follow semantic naming, for example `chore(category): change`. This benefits contributors with a lower entry bar, since semantic commits are not required. It also benefits everyone looking for a useful changelog message on what changed and where (a sketch of such a check follows this list).
- Related issue check. Every change requires an issue describing its needs. This check enforces that a PR has a related issue by blocking merge operations if one is missing.
- Acknowledgment check. Ensures the PR template is used and every contributor is aware of code redistribution.
- Code coverage diff. Educates contributors and maintainers about code coverage differences for a given change.
- Contribution size check. Suggests that contributors and maintainers break up large changes (100-499 LOC) into smaller PRs. It helps reduce overlooking security and other practices due to increased cognitive overhead.
- Dependency vulnerability check. Verifies any dependency changes for common vulnerabilities and exposures (CVEs), in addition to our daily check on any dependencies used (e.g., Python, Docker, Go, etc.).
- GitHub Actions security check. Enforces the use of immutable third-party GitHub Actions (e.g., `actions/checkout@<git-SHA>`) to prevent abuse. Upgrades are handled by a separate automated process that includes a maintainer review to also prevent unexpected behavior changes.
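To illustrate the semantic PR title check, here is a minimal sketch of how such a validation could be expressed. The actual enforcement runs in CI; the allowed types and the regular expression below are assumptions.

```python
# Illustrative sketch of a semantic PR title check, e.g. "chore(category): change".
# The allowed types and pattern are assumptions, not our exact CI configuration.
import re

SEMANTIC_TITLE = re.compile(
    r"^(feat|fix|docs|chore|refactor|perf|test|ci|build|revert)"  # type
    r"(\([a-z0-9_-]+\))?"                                         # optional (category)
    r"(!)?: .+"                                                   # optional breaking marker, then description
)


def is_semantic_title(title: str) -> bool:
    """Return True if the PR title follows 'type(category): change' naming."""
    return SEMANTIC_TITLE.match(title) is not None


assert is_semantic_title("chore(category): change")
assert not is_semantic_title("Fixed stuff")
```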
After merge checks
Checks described earlier are omitted to improve reading experience.
We strike a balance between security and contribution experience. These automated checks take several minutes to complete. Failures are reviewed by an on-call maintainer and before a release.
- End-to-end tests. We run E2E tests with a high degree of parallelization. While they are designed to also run locally, they may incur AWS charges to contributors. For additional security, all infrastructure is ephemeral per change and per Python version (an illustration follows this list).
- SAST check. GitHub CodeQL runs roughly 30 minutes of static analysis across the entire codebase.
- Security posture check. OSSF Scorecard runs numerous automated checks upon changes, and raises security alerts if OSSF security practices are no longer followed.
- Rebuild Changelog. We rebuild our entire changelog upon changes and create a PR for maintainers. This has the added benefit of keeping the branch protected while removing error-prone tasks from maintainers.
- Stage documentation. We rebuild and deploy changes to the documentation to a staged version. This gives us safety that our docs can always be rebuilt, and ready to release to production when needed.
- Update draft release. We use Release Drafter to generate a portion of our release notes and to always keep a fresh draft upon changes. You can read our thoughts on good quality release notes here (human readable changes + automation).
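As an illustration of what "ephemeral per change and per Python version" can look like, the hypothetical naming scheme below shows how short-lived E2E stacks could be scoped so parallel runs never collide. The class, names, and format are assumptions rather than our actual implementation.

```python
# Hypothetical illustration of ephemeral E2E infrastructure naming,
# scoped per change (commit) and per Python version. Not our actual code.
from dataclasses import dataclass


@dataclass(frozen=True)
class E2EStackName:
    commit_sha: str      # the change under test
    python_version: str  # e.g. "3.12"

    def render(self) -> str:
        # Unique stack name so parallel runs never collide and each stack
        # can be torn down independently once the test session completes.
        short_sha = self.commit_sha[:8]
        py = self.python_version.replace(".", "")
        return f"e2e-py{py}-{short_sha}"


print(E2EStackName("0a1b2c3d4e5f", "3.12").render())  # e2e-py312-0a1b2c3d
```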
Continuous deployment practices
We adhere to industry recommendations from the OSSF Scorecard project, among others.
Releases are triggered by maintainers along with a reviewer - detailed info here. In addition to checks that run for every code change, our pipeline requires a manual approval before releasing.
We use a combination of provenance and signed attestation for our builds, source code sealing, SAST scanners, Python-specific static code analysis, ephemeral credentials that last only for a given job step, and more.
This is a snapshot of our automated checks at a glance.
Lambda layer pipeline
A Lambda Layer is a .zip file archive that can contain additional code, pre-packaged dependencies, data, or configuration files. It provides a way to efficiently include libraries and other resources in your Lambda functions, promoting code reusability and reducing deployment package sizes.
To build and deploy the Lambda Layers, we run a pipeline with the following steps:
- We fetch the latest PyPI release and use it as the source for our layer.
- We build layers for Python versions 3.8 through 3.13, for both x86_64 and arm64 architectures. This is necessary because we use pre-compiled libraries like Pydantic and Cryptography, which require specific Python versions for each layer.
- We provide layer distributions for both the x86_64 and arm64 architectures.
- For each Python version, we create a single CDK package containing both x86_64 and arm64 assets to optimize deployment performance (a minimal CDK sketch follows this list).
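To make the packaging step concrete, here is a minimal CDK (Python) sketch of a single package for one Python version carrying both architecture assets. The asset paths, construct IDs, and single-stack layout are assumptions, not our actual pipeline code.

```python
# Hypothetical CDK sketch: one package per Python version, containing
# both x86_64 and arm64 layer assets. Paths and names are assumptions.
from aws_cdk import App, Stack
from aws_cdk import aws_lambda as lambda_
from constructs import Construct


class LayerStackPy312(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Pre-built, version-specific assets (e.g. compiled Pydantic/Cryptography)
        # are bundled per architecture within the same CDK package.
        for arch_name, arch in (
            ("x86_64", lambda_.Architecture.X86_64),
            ("arm64", lambda_.Architecture.ARM_64),
        ):
            lambda_.LayerVersion(
                self,
                f"PowertoolsLayer-{arch_name}",
                code=lambda_.Code.from_asset(f"dist/py312/{arch_name}"),  # assumed path
                compatible_runtimes=[lambda_.Runtime.PYTHON_3_12],
                compatible_architectures=[arch],
                description=f"Example layer build for Python 3.12 ({arch_name})",
            )


app = App()
LayerStackPy312(app, "LayerStackPy312")
app.synth()
```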
Next, we deploy these CDK Assets to the beta account across all AWS regions. Once the beta deployment is complete, we run:
- Canary tests: run thorough canary tests to assess stability and functionality.
- Successful? Deploy the previously built CDK assets to production across all regions.
- Failure? Halt the pipeline to investigate and remediate issues before redeploying.
graph LR
Fetch[Fetch PyPi release] --> P38[<strong>Python 3.8</strong>]
Fetch --> P39[<strong>Python 3.9</strong>]
Fetch --> P310[<strong>Python 3.10</strong>]
Fetch --> P311[<strong>Python 3.11</strong>]
Fetch --> P312[<strong>Python 3.12</strong>]
Fetch --> P313[<strong>Python 3.13</strong>]
subgraph build ["LAYER BUILD"]
P38 --> P38x86[build x86_64]
P38 --> P38arm64[build arm64]
P39 --> P39x86[build x86_64]
P39 --> P39arm64[build arm64]
P310 --> P310x86[build x86_64]
P310 --> P310arm64[build arm64]
P311 --> P311x86[build x86_64]
P311 --> P311arm64[build arm64]
P312 --> P312x86[build x86_64]
P312 --> P312arm64[build arm64]
P313 --> P313x86[build x86_64]
P313 --> P313arm64[build arm64]
P38x86 --> CDKP1[CDK Package]
P38arm64 --> CDKP1[CDK Package]
P39x86 --> CDKP2[CDK Package]
P39arm64 --> CDKP2[CDK Package]
P310x86 --> CDKP3[CDK Package]
P310arm64 --> CDKP3[CDK Package]
P311x86 --> CDKP4[CDK Package]
P311arm64 --> CDKP4[CDK Package]
P312x86 --> CDKP5[CDK Package]
P312arm64 --> CDKP5[CDK Package]
P313x86 --> CDKP6[CDK Package]
P313arm64 --> CDKP6[CDK Package]
end
subgraph beta ["BETA (all regions)"]
CDKP1 --> DeployBeta[Deploy to Beta]
CDKP2 --> DeployBeta
CDKP3 --> DeployBeta
CDKP4 --> DeployBeta
CDKP5 --> DeployBeta
CDKP6 --> DeployBeta
DeployBeta --> RunBetaCanary["Beta canary tests<br> <i>(all packages)</i>"]
end
subgraph prod ["PROD (all regions)"]
RunBetaCanary---|<strong>If successful</strong>|DeployProd[Deploy to Prod]
DeployProd --> RunProdCanary["Prod canary tests<br> <i>(all packages)</i>"]
end