AWS Staging Infrastructure Plan

Purpose

This document defines the staging account infrastructure needed to validate the ECS migration before production cutover.

The staging environment should be realistic enough to prove the deployment model, but simpler and less resilient than production so it is cheaper and easier to reason about during migration.

Staging Objectives

  • Validate ECS task definitions and service discovery.
  • Validate container image builds and ECR publishing.
  • Validate secrets and configuration injection.
  • Validate app startup, health checks, and smoke tests.
  • Validate database connectivity with the production-style schema.
  • Validate Redis connectivity and caching behavior.
  • Validate deployment rollback.

Staging Scope

Staging should include:

  • ECS Fargate services for the application runtime
  • ALB for public traffic
  • single EC2 host that provides PostgreSQL, TimescaleDB, PostGIS, and staging Valkey
  • CloudWatch logs and alarms
  • Secrets Manager and Parameter Store
  • ECR repositories
  • private Anisette ECS service with EFS-backed persistent state

Staging should not include:

  • Patroni clustering
  • Multi-node database failover
  • Production traffic
  • Long-lived manual configuration on the host

Account Layout

Staging AWS Account

Use an AWS account for staging that is separate from the production account.

  • Account ID: 802732539686
  • Production account counterpart: 951665295205
  • Note: the staging account is shared with other projects, so this project needs its own VPC and isolated state.

That account should mirror production in:

  • region
  • network layout
  • IAM Identity Center access model
  • naming conventions
  • deployment workflow

But it should differ in:

  • database resilience
  • scale
  • traffic volume
  • cost footprint

The staging account should remain isolated from production at the account boundary. Within the staging account, this project should live in its own VPC so it does not overlap with other projects.

Network Design

VPC

Terraform should create a dedicated VPC for this project within the staging account with:

  • public subnets for ALB
  • private subnets for ECS tasks
  • private subnets or restricted access for the database host
  • routing rules that prevent direct public access to the database and Redis
  • VPC boundaries that do not overlap with other projects in the same staging account

The exact subnet sizing and address allocation are implementation details for Terraform rather than pre-work the team needs to solve by hand.
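
As a rough illustration of the non-overlap requirement, the sketch below uses Python's standard `ipaddress` module to check a candidate VPC range against other projects and to split it into tiers. Every CIDR here is a made-up placeholder, and the /20 split is only one possible choice; the real values would live in Terraform.

```python
import ipaddress

# Hypothetical address plan; the real CIDRs belong in Terraform variables.
PROJECT_VPC = ipaddress.ip_network("10.42.0.0/16")
OTHER_PROJECT_VPCS = [
    ipaddress.ip_network("10.40.0.0/16"),
    ipaddress.ip_network("10.41.0.0/16"),
]


def overlapping(candidate, existing):
    """Return the existing networks that overlap the candidate VPC range."""
    return [net for net in existing if candidate.overlaps(net)]


def carve_subnets(vpc, new_prefix=20):
    """Split the VPC into equal blocks and assign them by tier."""
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    return {
        "public_alb": blocks[0:2],   # an ALB needs subnets in two zones
        "private_ecs": blocks[2:4],  # ECS tasks, one block per zone
        "private_db": blocks[4:5],   # the single database host
    }


layout = carve_subnets(PROJECT_VPC)
```

An empty result from `overlapping(PROJECT_VPC, OTHER_PROJECT_VPCS)` means the candidate range is safe to use alongside the listed projects.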

Security Groups

Define security groups for:

  • ALB
  • ECS application services
  • ECS worker services
  • database host
  • Redis service access

Rules should allow only the minimum traffic required for service operation.
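
One way to reason about minimum traffic is to write the allowed flows down as data and treat everything else as denied. The sketch below does that in plain Python. The group names and the port numbers are assumptions for illustration (443 for HTTPS, 8000 as a placeholder application port, 5432 for PostgreSQL, 6379 for Valkey), not values taken from the actual services.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    source: str
    target: str
    port: int


# Assumed flows only; anything not listed here is treated as denied.
ALLOWED = {
    Rule("internet", "alb", 443),
    Rule("alb", "ecs-app", 8000),
    Rule("ecs-app", "db-host", 5432),
    Rule("ecs-worker", "db-host", 5432),
    Rule("ecs-app", "redis-access", 6379),
    Rule("ecs-worker", "redis-access", 6379),
}


def is_allowed(source, target, port):
    """Return True only when the exact flow is on the allow list."""
    return Rule(source, target, port) in ALLOWED
```

A matrix like this can be reviewed alongside the Terraform security group rules to confirm that no rule grants more than the list permits.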

DNS and TLS

Use the same domain management pattern as production where practical, but keep staging separate.

Recommended pattern:

  • tracker.staging.glimpse.technology
  • separate certificates from production
  • separate ALB listeners or host rules

Because glimpse.technology is hosted on Cloudflare, ACM can still be used for the staging ALB certificate by requesting the certificate in AWS and publishing the DNS validation CNAME records in Cloudflare.
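
The hand-off between ACM and Cloudflare can be sketched as a small mapping step. The function below assumes its input is the `Certificate` object returned by ACM's `DescribeCertificate` call, and it produces payloads in the shape Cloudflare's DNS record API generally expects (`type`, `name`, `content`, `ttl`, `proxied`). Stripping the trailing dot and the choice of TTL are assumptions; the actual API calls are left out.

```python
def cloudflare_validation_records(acm_certificate):
    """Map ACM DNS validation records to Cloudflare DNS record payloads."""
    records = []
    for option in acm_certificate.get("DomainValidationOptions", []):
        rr = option.get("ResourceRecord")
        if rr is None:
            # ACM may not have generated the record for this domain yet.
            continue
        records.append({
            "type": rr["Type"],
            "name": rr["Name"].rstrip("."),
            "content": rr["Value"].rstrip("."),
            "ttl": 1,          # Cloudflare treats 1 as "automatic"
            "proxied": False,  # validation CNAMEs must resolve directly
        })
    return records
```

Keeping the record unproxied matters because ACM has to resolve the CNAME itself in order to issue and later renew the certificate.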

Compute Layout

ECS Cluster

Create a dedicated ECS cluster in the staging account.

Use Fargate for:

  • API
  • frontend
  • admin panel
  • worker services

Task Definitions

Each deployable service should get its own task definition unless two containers must share a lifecycle and network namespace.

Recommended staging task groups:

  • api
  • frontend
  • admin
  • tracker-fetcher
  • unified-geofence
  • notification-service
  • materialized-view-service
  • anisette as a private support service that uses EFS-backed persistent storage
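
To make the one-task-definition-per-service idea concrete, here is a minimal sketch that builds the dictionary ECS accepts when registering a Fargate task definition. The family prefix, log group path, CPU and memory sizes, and every argument value are assumptions; only the field names follow the ECS task definition format.

```python
def fargate_task_definition(service, image, region, execution_role_arn,
                            secrets=None, cpu="256", memory="512"):
    """Build a minimal Fargate task definition for one staging service."""
    secrets = secrets or {}
    return {
        "family": f"tracker-staging-{service}",  # assumed naming scheme
        "networkMode": "awsvpc",                 # required for Fargate
        "requiresCompatibilities": ["FARGATE"],
        "cpu": cpu,
        "memory": memory,
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [{
            "name": service,
            "image": image,
            "essential": True,
            # Each secret maps an environment variable to a secret ARN.
            "secrets": [
                {"name": name, "valueFrom": arn}
                for name, arn in sorted(secrets.items())
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": f"/ecs/tracker-staging/{service}",
                    "awslogs-region": region,
                    "awslogs-stream-prefix": service,
                },
            },
        }],
    }
```

Generating every service from one function like this keeps the task groups consistent, which makes rollback between revisions easier to compare.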

Database Plan

Staging Database

Use a single EC2 host in staging for the database runtime.

The staging database still needs:

  • TimescaleDB
  • PostGIS
  • application schema compatibility
  • a local Valkey service for cache/queue behavior

PostgreSQL should be self-managed on that single EC2 instance.

This is the simplest option that supports the required extensions and lets staging mirror the production schema more closely. Because this is staging, it is acceptable for the database host to be built only in the staging account.

Staging Database Requirements

  • persistent storage
  • backup/snapshot capability
  • access only from the ECS private subnets
  • enough capacity to validate queries and background workers

Anisette Storage Requirement

Because Anisette is part of the staging scope, Terraform should create a persistent filesystem for it rather than a disposable container volume.

Recommended shape:

  • EFS-backed /data mount
  • private-only ECS service
  • no public ALB exposure
  • service discovery for the internal Anisette hostname
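
The EFS-backed /data mount corresponds to two fragments of an ECS task definition: a volume entry and a container mount point. The sketch below builds both. The volume name and the identifiers passed in are hypothetical; the field names follow the ECS task definition format.

```python
def anisette_storage(file_system_id, access_point_id=None):
    """Return the (volume, mount_point) fragments for the Anisette task."""
    efs = {
        "fileSystemId": file_system_id,
        "transitEncryption": "ENABLED",  # needed when using an access point
    }
    if access_point_id:
        efs["authorizationConfig"] = {"accessPointId": access_point_id}

    volume = {
        "name": "anisette-data",  # assumed volume name
        "efsVolumeConfiguration": efs,
    }
    mount_point = {
        "sourceVolume": "anisette-data",
        "containerPath": "/data",
        "readOnly": False,
    }
    return volume, mount_point
```

The volume goes in the task definition's `volumes` list and the mount point in the container's `mountPoints` list, so the state survives task replacement.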

Redis Plan

Staging Redis

Use a self-hosted Valkey service on the staging database host.

Production is running AWS MemoryDB Valkey, but staging is explicitly using the lower-cost EC2-hosted Valkey option.

The staging Valkey configuration should keep the following close enough to the production deployment model for application validation:

  • Redis hostnames
  • TLS behavior
  • authentication behavior
  • cluster mode behavior

Staging Redis Requirements

  • TLS enabled wherever the production MemoryDB service requires it
  • authentication enabled
  • private network access only
  • environment-specific credentials
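
On the application side, these requirements usually reduce to a connection URL. The sketch below builds one using the `redis://` and `rediss://` schemes that Redis-compatible clients such as redis-py accept, where `rediss` selects TLS. The host, port, and password values are placeholders, and whether the application takes a URL at all is an assumption.

```python
from urllib.parse import quote


def valkey_url(host, port, password, tls=True, db=0):
    """Build a Redis-style connection URL for the staging Valkey service."""
    scheme = "rediss" if tls else "redis"
    # Percent-encode the password so characters like "@" or "/" are safe.
    encoded = quote(password, safe="")
    return f"{scheme}://:{encoded}@{host}:{port}/{db}"
```

The password itself should still come from Secrets Manager at runtime rather than being written into configuration.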

Secrets and Configuration

Secrets Manager

Store sensitive values in Secrets Manager:

  • POSTGRES_PASSWORD
  • SECRET_KEY
  • REDIS_PASSWORD

Parameter Store

Store non-sensitive runtime configuration in Parameter Store:

  • API URLs
  • database hostnames
  • database ports
  • Redis hostnames
  • Redis ports
  • token expiry settings
  • algorithm names
  • cluster flags

Staging Naming Convention

Use a staging prefix consistently, such as:

  • tracker/staging/secret-key
  • tracker/staging/postgres-password
  • tracker/staging/redis-password
  • tracker/staging/api-url
  • tracker/staging/postgres-host
  • tracker/staging/redis-host
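
A small helper can keep these names consistent across environments. The sketch below derives names from an environment and a key; the function names are hypothetical. One caveat worth checking against the AWS documentation: Parameter Store appears to require hierarchical parameter names to begin with a forward slash, so the sketch adds one for parameters while leaving Secrets Manager names as listed above.

```python
PROJECT = "tracker"


def env_to_key(env_var):
    """Convert an environment variable name to the dashed key style."""
    return env_var.lower().replace("_", "-")


def secret_name(env, key):
    """Secrets Manager name, e.g. tracker/staging/secret-key."""
    return f"{PROJECT}/{env}/{key}"


def parameter_name(env, key):
    """Parameter Store name; the leading slash is an assumption to verify."""
    return f"/{PROJECT}/{env}/{key}"
```

With this, `secret_name("staging", env_to_key("POSTGRES_PASSWORD"))` yields the same string as the convention listed above, and switching `"staging"` for another environment changes only the prefix.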

Logging and Monitoring

CloudWatch Logs

Send container stdout/stderr to CloudWatch.

Alarms

Create alarms for:

  • ECS task failures
  • ALB 5xx responses
  • unhealthy target groups
  • database connectivity failures
  • Redis connectivity failures
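
Two of these alarms map directly onto standard Application Load Balancer metrics. The sketch below builds the keyword arguments that CloudWatch's `PutMetricAlarm` operation accepts. The alarm names, thresholds, and evaluation windows are assumptions to tune, and the dimension values are placeholders for the identifiers Terraform would output.

```python
def alb_5xx_alarm(load_balancer, threshold=5):
    """Alarm on 5xx responses generated by the load balancer itself."""
    return {
        "AlarmName": "tracker-staging-alb-5xx",
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_ELB_5XX_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no traffic is not a failure
    }


def unhealthy_targets_alarm(load_balancer, target_group):
    """Alarm when any target in the group is reported unhealthy."""
    return {
        "AlarmName": "tracker-staging-unhealthy-targets",
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [
            {"Name": "TargetGroup", "Value": target_group},
            {"Name": "LoadBalancer", "Value": load_balancer},
        ],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 3,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
```

Errors returned by the application containers are reported under a separate metric, `HTTPCode_Target_5XX_Count`, which may deserve its own alarm.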

Useful Validation Signals

During staging validation, confirm:

  • ECS service events are clean
  • logs appear for every service
  • smoke tests pass after deploy
  • worker services process jobs
  • rollback succeeds without manual repair

Deployment Flow

Staging Deployment Steps

  1. Build images in CI.
  2. Push to ECR.
  3. Deploy to staging ECS.
  4. Run smoke tests.
  5. Verify logs and alarms.
  6. Validate worker processing.
  7. Validate rollback.
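
Step 4 can start as something very small. The sketch below is a minimal smoke test runner using only the Python standard library. The paths passed to it are hypothetical, and it treats any status other than 200 as a failure, which may be too strict for endpoints that redirect.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered with an error status
    except OSError:
        return None      # DNS failure, refused connection, or timeout


def run_smoke_tests(base_url, paths, fetch=http_status):
    """Check each path and return a dict of path -> status for failures."""
    failures = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status != 200:
            failures[path] = status
    return failures
```

A CI job could call `run_smoke_tests("https://tracker.staging.glimpse.technology", ["/health"])` and fail the deployment when the returned dictionary is not empty, which is also a natural trigger for the rollback check in step 7.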

Staging Exit Criteria

Staging is ready when:

  • the application deploys from CI and starts consistently
  • the database and Redis are reachable
  • the main API passes smoke tests
  • workers process expected jobs
  • rollback works cleanly

Known Staging Tradeoffs

  • The staging database will not have production failover.
  • Staging should cost less than production.
  • Some scaling behavior will remain untested until production-like load is available.
  • Any extensions or host-level assumptions must be verified before ECS rollout.
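
The extension check in the last item can be partly automated. The sketch below compares the rows returned by a catalog query against the extensions this plan requires. It assumes the rows come from whatever PostgreSQL client the team already uses, as tuples with the extension name in the first column; the client call itself is left out.

```python
REQUIRED_EXTENSIONS = {"timescaledb", "postgis"}

# Run this against the staging database with any PostgreSQL client.
EXTENSION_QUERY = "SELECT extname FROM pg_extension;"


def missing_extensions(installed_rows, required=REQUIRED_EXTENSIONS):
    """Return the required extensions absent from the query result."""
    installed = {row[0] for row in installed_rows}
    return sorted(required - installed)
```

An empty list means the host provides every extension the schema depends on; anything else should block the ECS rollout.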