Building Bank-Grade Infrastructure on AWS

When you're building a platform that connects to people's bank accounts, infrastructure isn't an afterthought — it's the product. Every architectural decision we make at Caspian starts with one question: would a bank trust this?

This post walks through the infrastructure choices behind Caspian — a family financial platform built entirely on AWS serverless primitives — and why we made them.

The Constraints That Shaped Everything

Open Banking in the UK comes with real regulatory weight. We process transaction data, store account balances, and hold consent tokens that grant ongoing access to financial data. That means:

Data must be encrypted at rest and in transit — no exceptions
Infrastructure must be auditable and reproducible — no clicking around in the AWS console
Secrets must never touch disk or source control — they live in managed stores only
The blast radius of any failure must be contained — one bad deploy can't take down the whole platform

These aren't nice-to-haves. They're table stakes for handling financial data responsibly.

Serverless by Default

Our entire API runs on AWS Lambda behind API Gateway v2. No EC2 instances. No ECS clusters. No servers to patch, scale, or babysit at 3am.

Lambda gives us something that's hard to replicate with containers: automatic scaling to zero. When nobody is using the platform, we're not paying for idle compute. When thousands of transactions come in during a morning bank sync, Lambda scales horizontally without us lifting a finger.

We use a single Lambda function running the Hono web framework via Lambda Web Adapter. The same framework serves our oRPC endpoints, webhook handlers, and AI chat routes: one function, many routes, and none of the cold-start surprises that come with function-per-route architectures.
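To make the "one function, many routes" idea concrete, here is a minimal, dependency-free sketch. It is a stand-in for what a framework such as Hono provides, not Hono's API, and every name in it (Req, Res, routes, handle, the path prefixes) is an assumption made for illustration.

```typescript
// Dependency-free sketch of "one function, many routes".
// This is NOT Hono's API; all names and paths are hypothetical.

type Req = { method: string; path: string; body?: string };
type Res = { status: number; body: string };
// A handler receives the path remainder after its prefix, plus the request.
type Handler = (rest: string, req: Req) => Res;

// Each route group owns a path prefix, mirroring the three groups in the post.
const routes: Array<{ prefix: string; handler: Handler }> = [
  { prefix: "/rpc/", handler: (rest) => ({ status: 200, body: `rpc:${rest}` }) },
  { prefix: "/webhooks/", handler: (rest) => ({ status: 202, body: `webhook:${rest}` }) },
  { prefix: "/chat/", handler: (rest) => ({ status: 200, body: `chat:${rest}` }) },
];

// The single entry point: every request reaches the same function,
// and routing happens in-process rather than across separate functions.
function handle(req: Req): Res {
  const route = routes.find((r) => req.path.startsWith(r.prefix));
  if (route === undefined) {
    return { status: 404, body: "not found" };
  }
  return route.handler(req.path.slice(route.prefix.length), req);
}
```

Because all route groups share one execution environment, a request to any route keeps the environment warm for the others, which is the property the paragraph above relies on.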

The Database: Aurora Serverless v2

Financial data demands a proper relational database. We chose Amazon Aurora Serverless v2 (PostgreSQL), and it's been one of our best decisions.

Aurora Serverless gives us:

Auto-scaling compute — capacity adjusts in fine-grained increments based on load, from a floor of 0.5 ACU up to the maximum we configure
Encryption at rest via AWS KMS, enabled by default — every byte on disk is encrypted with keys we control
Automated backups with point-in-time recovery — we can restore to any second within our retention window
VPC isolation — the database sits in private subnets with no public endpoint, so nothing on the public internet can connect to it directly

We access Aurora via the RDS Data API, which means our Lambda functions don't need to manage connection pools or sit inside a VPC. The Data API handles connection management for us, and communicates over HTTPS — encrypted in transit by default.
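As an illustration of what a Data API call involves, the sketch below assembles the input for an ExecuteStatement request: a SQL string with named :placeholders plus typed parameters. The field names (resourceArn, secretArn, stringValue, longValue and so on) follow the Data API's request shape. The helper names, ARNs, table, and columns are assumptions made for the example.

```typescript
// Shapes mirroring the ExecuteStatement request of the RDS Data API.
type Field =
  | { stringValue: string }
  | { longValue: number }
  | { doubleValue: number }
  | { booleanValue: boolean }
  | { isNull: true };

type SqlParameter = { name: string; value: Field };

type Target = {
  resourceArn: string; // ARN of the Aurora cluster
  secretArn: string;   // ARN of the Secrets Manager secret with DB credentials
  database: string;
};

type ExecuteStatementInput = Target & { sql: string; parameters: SqlParameter[] };

type Value = string | number | boolean | null;

// Hypothetical helper: map a plain JS value onto the matching Data API field.
function toField(v: Value): Field {
  if (v === null) return { isNull: true };
  if (typeof v === "string") return { stringValue: v };
  if (typeof v === "boolean") return { booleanValue: v };
  return Number.isInteger(v) ? { longValue: v } : { doubleValue: v };
}

// Hypothetical helper: build the request for a statement with :named parameters.
function buildStatement(
  target: Target,
  sql: string,
  values: Record<string, Value>,
): ExecuteStatementInput {
  const parameters = Object.keys(values).map((name) => ({
    name,
    value: toField(values[name]),
  }));
  return { ...target, sql, parameters };
}

// Example with placeholder ARNs and a made-up table.
const exampleInput = buildStatement(
  {
    resourceArn: "arn:aws:rds:eu-west-2:123456789012:cluster:example",
    secretArn: "arn:aws:secretsmanager:eu-west-2:123456789012:secret:example",
    database: "app",
  },
  "SELECT id FROM transactions WHERE account_id = :accountId AND amount_minor >= :minAmount",
  { accountId: "acc_123", minAmount: 500 },
);
```

In a real function, an object like exampleInput would be passed to ExecuteStatementCommand from @aws-sdk/client-rds-data. Binding values as parameters rather than interpolating them into the SQL string also guards against injection.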

Secrets That Stay Secret

We store all application secrets in AWS Secrets Manager and SSM Parameter Store. API keys for our banking provider, AI services, and third-party integrations never appear in environment variables, config files, or source control.

At runtime, our Lambda function loads secrets from Secrets Manager on cold start and caches them in memory for the lifetime of the execution environment. This means:

Secrets are fetched over encrypted channels from a managed store
Rotating a secret needs no redeploy: update it in Secrets Manager and the next cold start picks it up, while warm execution environments keep the cached value until they are recycled
IAM policies control exactly which functions can access which secrets — principle of least privilege, enforced at the infrastructure level
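The load-on-cold-start, cache-in-memory pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not our production code: createSecretCache and SecretFetcher are hypothetical names, and the fetcher is injected so the sketch runs without the AWS SDK. In a real function it would wrap a Secrets Manager GetSecretValue call.

```typescript
// Hypothetical fetcher signature; a real one would call Secrets Manager.
type SecretFetcher = (secretId: string) => Promise<string>;

function createSecretCache(fetchSecret: SecretFetcher) {
  // Lives as long as the execution environment: filled on first use
  // (typically the cold start) and reused by every warm invocation after it.
  const cache = new Map<string, Promise<string>>();

  return function getSecret(secretId: string): Promise<string> {
    let pending = cache.get(secretId);
    if (pending === undefined) {
      // Cache the promise itself so concurrent lookups share one fetch.
      pending = fetchSecret(secretId).catch((err) => {
        cache.delete(secretId); // never cache a failed lookup
        throw err;
      });
      cache.set(secretId, pending);
    }
    return pending;
  };
}
```

A handler would call something like createSecretCache(...) once at module scope and await getSecret("some-secret-id") per request. Only the first call pays for the network round trip, which is also why a rotated value is not seen until a fresh environment starts.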

Infrastructure as Code with Terraform

Every piece of our infrastructure is defined in Terraform. The VPC, subnets, security groups, Lambda functions, Aurora cluster, API Gateway, CloudFront distributions, DNS records — all of it lives in version-controlled HCL files.

This gives us:

Reproducibility — we can spin up an identical staging environment with a single command
Auditability — every infrastructure change goes through code review, just like application code
Drift detection — if something changes outside of Terraform, the next plan reports the difference
Environment parity — staging and production use the same Terraform modules with different variable files

Zero-Downtime Deployments

Our deployment pipeline runs through AWS CodePipeline and CodeBuild. When code is merged, the pipeline builds the application, runs type checks and linting, then deploys to Lambda.

We use Lambda's alias and versioning system to achieve zero-downtime deploys. Every deployment publishes a new Lambda version and updates the live alias to point at it. API Gateway routes to the alias, not the function directly — so the cutover is atomic. If something goes wrong, rolling back is as simple as pointing the alias back to the previous version.
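The publish-then-repoint flow can be sketched like this. LambdaControl is a hypothetical minimal interface over three Lambda API operations (PublishVersion, GetAlias, UpdateAlias), introduced so the example runs without the AWS SDK. The function name and the "live" alias default are likewise illustrative.

```typescript
// Hypothetical wrapper over PublishVersion, GetAlias and UpdateAlias.
interface LambdaControl {
  publishVersion(functionName: string): Promise<string>; // new version number
  getAliasVersion(functionName: string, alias: string): Promise<string>;
  updateAlias(functionName: string, alias: string, version: string): Promise<void>;
}

type DeployResult = { previous: string; current: string };

// Publish an immutable version, then move the alias to it in one call.
// Traffic routed through the alias switches at that single update.
async function deploy(
  lambda: LambdaControl,
  functionName: string,
  alias = "live",
): Promise<DeployResult> {
  // Remember where the alias pointed so a rollback has a target.
  const previous = await lambda.getAliasVersion(functionName, alias);
  const current = await lambda.publishVersion(functionName);
  await lambda.updateAlias(functionName, alias, current);
  return { previous, current };
}

// Rolling back is the same alias update, aimed at the earlier version.
async function rollback(
  lambda: LambdaControl,
  functionName: string,
  result: DeployResult,
  alias = "live",
): Promise<void> {
  await lambda.updateAlias(functionName, alias, result.previous);
}
```

Because published versions are immutable, the previous version is still there unchanged when the alias is pointed back at it. No rebuild is needed.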

Edge Caching with CloudFront

Our API responses pass through CloudFront, not just for caching but also for TLS termination, DDoS protection, and geographic distribution. The marketing site you're reading this on is a static export served entirely from CloudFront's edge network.

For the API, CloudFront gives us automatic AWS Shield Standard protection and the ability to add WAF rules if needed — all without changing a line of application code.

Background Jobs Without Servers

Financial platforms need background processing — categorising transactions, detecting recurring payments, computing runway projections. We use Lambda-native workflow orchestration for all background jobs.

Long-running workflows (like processing a full bank sync) are broken into discrete steps that execute as separate Lambda invocations — so we never hit timeout limits and each step is independently retryable. Failed steps are automatically retried with exponential backoff, and workflow progress is tracked in the database so we always know exactly where things stand.
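Here is a rough sketch of the step-based pattern: each step is retried with exponential backoff, and completed steps are recorded so a restarted workflow resumes where it left off. For brevity it runs the steps in one loop, whereas the setup described above runs each step as its own Lambda invocation. Step, ProgressStore, RetryOptions, and runWorkflow are hypothetical names, and the sleep function is injected so the sketch has no runtime dependencies.

```typescript
type Step = { name: string; run: () => Promise<void> };

// Hypothetical persistence interface standing in for the database table
// that tracks workflow progress.
interface ProgressStore {
  isDone(workflowId: string, step: string): Promise<boolean>;
  markDone(workflowId: string, step: string): Promise<void>;
}

type RetryOptions = {
  maxAttempts: number;                   // total tries per step
  baseDelayMs: number;                   // delay before the first retry
  sleep: (ms: number) => Promise<void>;  // injected, e.g. a setTimeout wrapper
};

async function runWorkflow(
  workflowId: string,
  steps: Step[],
  store: ProgressStore,
  opts: RetryOptions,
): Promise<void> {
  for (const step of steps) {
    // Skip work that an earlier run already finished.
    if (await store.isDone(workflowId, step.name)) continue;

    for (let attempt = 1; ; attempt++) {
      try {
        await step.run();
        break;
      } catch (err) {
        // Give up once the attempt budget is spent.
        if (attempt >= opts.maxAttempts) throw err;
        // Exponential backoff: base, 2x base, 4x base, ...
        await opts.sleep(opts.baseDelayMs * 2 ** (attempt - 1));
      }
    }
    await store.markDone(workflowId, step.name);
  }
}
```

A step can run more than once, either on retry or if a crash lands between run and markDone, so each step should be idempotent.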

What This Adds Up To

The result is an infrastructure that is:

Encrypted everywhere — at rest (KMS), in transit (TLS), and in managed secret stores
Fully serverless — no servers to patch, automatic scaling, pay-per-use
Reproducible — every resource defined in Terraform, every change code-reviewed
Resilient — isolated blast radius, zero-downtime deploys, instant rollbacks
Cost-efficient — scales to zero when idle, no minimum spend on infrastructure we're not using

We're not a bank. But when people trust us with access to their financial data, we owe them infrastructure that takes that trust seriously. Every decision documented here is in service of that principle.

If you're building in fintech and want to compare notes, we'd love to hear from you — reach out at [email protected].