AWS Staging Infrastructure Plan
Purpose
This document defines the staging account infrastructure needed to validate the ECS migration before production cutover.
The staging environment should be realistic enough to prove the deployment model, but simpler and less resilient than production so it is cheaper and easier to reason about during migration.
Staging Objectives
- Validate ECS task definitions and service discovery.
- Validate container image builds and ECR publishing.
- Validate secrets and configuration injection.
- Validate app startup, health checks, and smoke tests.
- Validate database connectivity with the production-style schema.
- Validate Redis connectivity and caching behavior.
- Validate deployment rollback.
Staging Scope
Staging should include:
- ECS Fargate services for the application runtime
- ALB for public traffic
- single EC2 host that provides PostgreSQL, TimescaleDB, PostGIS, and staging Valkey
- CloudWatch logs and alarms
- Secrets Manager and Parameter Store
- ECR repositories
- private Anisette ECS service with EFS-backed persistent state
Staging should not include:
- Patroni clustering
- Multi-node database failover
- Production traffic
- Long-lived manual configuration on the host
Account Layout
Staging AWS Account
Use a dedicated AWS account for staging.
- Account ID: 802732539686
- Production account counterpart: 951665295205
- Note: the staging account is shared with other projects, so this project needs its own VPC and isolated state.
That account should mirror production in:
- region
- network layout
- IAM Identity Center access model
- naming conventions
- deployment workflow
But it should differ in:
- database resilience
- scale
- traffic volume
- cost footprint
The staging account should remain isolated from production at the account boundary. Within the staging account, this project should live in its own VPC so it does not overlap with other projects.
Network Design
VPC
Terraform should create a dedicated VPC for this project within the staging account with:
- public subnets for ALB
- private subnets for ECS tasks
- private subnets or restricted access for the database host
- routing rules that prevent direct public access to the database and Redis
- VPC boundaries that do not overlap with other projects in the same staging account
The exact subnet sizing and address allocation are implementation details for Terraform rather than pre-work the team needs to solve by hand.
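Although the final allocation belongs in Terraform, a short sketch can show the shape of the layout and how to check for overlap with other projects in the shared account. The CIDR 10.42.0.0/16, the two availability zones, the /20 subnet size, and the tier names are all illustrative assumptions, not values from this plan.

```python
import ipaddress

# Hypothetical CIDR for this project's VPC. The real range must be chosen so
# it does not overlap other VPCs in the shared staging account.
VPC_CIDR = ipaddress.ip_network("10.42.0.0/16")

# Tiers follow the list above; the names, AZ count, and /20 size are assumptions.
TIERS = ["public-alb", "private-ecs", "private-db"]
AZ_COUNT = 2


def plan_subnets(vpc, tiers, az_count, new_prefix=20):
    """Assign one non-overlapping subnet to each (tier, AZ index) pair."""
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for az in range(az_count):
            plan[(tier, az)] = next(blocks)
    return plan


def overlaps_any(candidate, existing):
    """Return True if the candidate network overlaps any existing network."""
    return any(candidate.overlaps(other) for other in existing)


for (tier, az), block in plan_subnets(VPC_CIDR, TIERS, AZ_COUNT).items():
    print(f"{tier}-az{az}: {block}")
```

The same overlaps_any check can be run against the CIDRs of the other projects' VPCs before a range is committed to Terraform.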
Security Groups
Define security groups for:
- ALB
- ECS application services
- ECS worker services
- database host
- Redis service access
Rules should allow only the minimum traffic required for service operation.
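One way to keep the rules minimal is to describe them as data and check them before they are translated into Terraform. The sketch below is an assumption-laden illustration: the group names are hypothetical, port 8000 for the application is a placeholder, and 5432 and 6379 are simply the PostgreSQL and Valkey defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    target: str  # security group that receives the traffic
    source: str  # another group name, or a CIDR for public entry points
    port: int


# Hypothetical rule set. 443 = HTTPS, 8000 = placeholder app port,
# 5432 = PostgreSQL default, 6379 = Valkey/Redis default.
RULES = [
    IngressRule("alb", "0.0.0.0/0", 443),
    IngressRule("ecs-app", "alb", 8000),
    IngressRule("database-host", "ecs-app", 5432),
    IngressRule("database-host", "ecs-worker", 5432),
    IngressRule("valkey-access", "ecs-app", 6379),
    IngressRule("valkey-access", "ecs-worker", 6379),
]

# Only the ALB is expected to accept traffic from a CIDR range.
PUBLIC_ALLOWED = {"alb"}


def public_exposure_violations(rules):
    """Return rules that open a non-ALB group to a CIDR source."""
    return [r for r in rules if "/" in r.source and r.target not in PUBLIC_ALLOWED]


print(public_exposure_violations(RULES))
```

Referencing source groups by name rather than by CIDR mirrors how security groups can reference each other, which keeps the database and Valkey reachable only from the ECS services.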
DNS and TLS
Use the same domain management pattern as production where practical, but keep staging separate.
Recommended pattern:
- tracker.staging.glimpse.technology
- separate certificates from production
- separate ALB listeners or host rules
Because glimpse.technology is hosted on Cloudflare, ACM can still be used for the staging ALB certificate by requesting the certificate in AWS and publishing the DNS validation CNAME records in Cloudflare.
Compute Layout
ECS Cluster
Create a dedicated ECS cluster in the staging account.
Use Fargate for:
- API
- frontend
- admin panel
- worker services
Task Definitions
Each deployable service should get its own task definition unless two containers must share a lifecycle and network namespace.
Recommended staging task groups:
- api
- frontend
- admin
- tracker-fetcher
- unified-geofence
- notification-service
- materialized-view-service
- anisette as a private support service that uses EFS-backed persistent storage
Database Plan
Staging Database
Use a single EC2 host in staging for the database runtime.
The staging database still needs:
- TimescaleDB
- PostGIS
- application schema compatibility
- a local Valkey service for cache/queue behavior
Recommended Deployment Option
Use one self-managed PostgreSQL host on a single EC2 instance for staging.
This is the simplest option that supports the required extensions and lets staging mirror the production schema more closely. Because this is staging, it is acceptable for the database host to be built only in the staging account.
Staging Database Requirements
- persistent storage
- backup/snapshot capability
- access only from the ECS private subnets
- enough capacity to validate queries and background workers
Anisette Storage Requirement
Anisette is in scope for staging (see Staging Scope), so Terraform should create a persistent filesystem for it rather than a disposable container volume.
Recommended shape:
- EFS-backed /data mount
- private-only ECS service
- no public ALB exposure
- service discovery for the internal Anisette hostname
Redis Plan
Staging Redis
Use a self-hosted Valkey service on the staging database host.
Production runs Amazon MemoryDB with the Valkey engine, but staging deliberately uses the lower-cost EC2-hosted Valkey option.
That keeps the following close enough to the production deployment model for application validation:
- Redis hostnames
- TLS behavior
- authentication behavior
- cluster mode behavior
Staging Redis Requirements
- TLS enabled wherever the production MemoryDB deployment uses it, so clients exercise the same connection settings
- authentication enabled
- private network access only
- environment-specific credentials
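The requirements above mostly surface in the connection string the application uses. As a rough sketch, the helper below builds a URL with the rediss scheme (the TLS variant of redis that common client libraries accept) and a percent-encoded password. The function name, the hostname, and the example password are hypothetical, and the application's real client configuration may take these settings in a different form.

```python
from urllib.parse import quote


def valkey_url(host, port=6379, password=None, tls=True):
    """Build a Redis-style connection URL for the staging Valkey service."""
    scheme = "rediss" if tls else "redis"
    # Percent-encode the password so characters like '@' or '/' stay unambiguous.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}"


# Hypothetical hostname and password, for illustration only.
print(valkey_url("valkey.staging.internal", password="p@ss/word"))
```

In practice the password would come from Secrets Manager at task start rather than being written into configuration.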
Secrets and Configuration
Secrets Manager
Store sensitive values in Secrets Manager:
- POSTGRES_PASSWORD
- SECRET_KEY
- REDIS_PASSWORD
Parameter Store
Store non-sensitive runtime configuration in Parameter Store:
- API URLs
- database hostnames
- database ports
- Redis hostnames
- Redis ports
- token expiry settings
- algorithm names
- cluster flags
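The split between the two stores can be made explicit in tooling so a sensitive value never lands in Parameter Store by accident. The sketch below assumes a flat mapping of environment-style keys. The three sensitive keys come from the Secrets Manager list, while the other key names and values are hypothetical examples.

```python
# Keys this plan assigns to Secrets Manager; everything else is treated as
# non-sensitive runtime configuration for Parameter Store.
SENSITIVE_KEYS = frozenset({"POSTGRES_PASSWORD", "SECRET_KEY", "REDIS_PASSWORD"})


def split_config(config):
    """Split a flat config mapping into (secrets, parameters)."""
    secrets = {k: v for k, v in config.items() if k in SENSITIVE_KEYS}
    parameters = {k: v for k, v in config.items() if k not in SENSITIVE_KEYS}
    return secrets, parameters


def leaked_secrets(parameters):
    """Return sensitive keys that were wrongly placed in the parameter set."""
    return sorted(k for k in parameters if k in SENSITIVE_KEYS)


# Hypothetical values, for illustration only.
example = {
    "POSTGRES_PASSWORD": "example-only",
    "POSTGRES_HOST": "db.staging.internal",
    "REDIS_PORT": "6379",
}
secrets, parameters = split_config(example)
print(sorted(secrets), sorted(parameters))
```

An explicit allow-list is used here instead of pattern matching on names, because a non-sensitive setting such as a token expiry would otherwise be misclassified.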
Staging Naming Convention
Use a staging prefix consistently, such as:
- tracker/staging/secret-key
- tracker/staging/postgres-password
- tracker/staging/redis-password
- tracker/staging/api-url
- tracker/staging/postgres-host
- tracker/staging/redis-host
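To apply the prefix consistently, names can be derived from the environment-style keys instead of being typed by hand. This is a minimal sketch: the helper name is hypothetical, and it assumes every key is upper-case with underscores, as POSTGRES_PASSWORD and SECRET_KEY are.

```python
import re

PREFIX = "tracker/staging"


def parameter_name(env_key, prefix=PREFIX):
    """Map a key such as POSTGRES_PASSWORD to tracker/staging/postgres-password."""
    if not re.fullmatch(r"[A-Z][A-Z0-9_]*", env_key):
        raise ValueError(f"unexpected key format: {env_key!r}")
    return f"{prefix}/{env_key.lower().replace('_', '-')}"


for key in ("SECRET_KEY", "POSTGRES_PASSWORD", "REDIS_HOST"):
    print(parameter_name(key))
```

Passing a different prefix would let the same helper produce production names, which keeps the two environments aligned on everything except the prefix.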
Logging and Monitoring
CloudWatch Logs
Send container stdout/stderr to CloudWatch.
Alarms
Create alarms for:
- ECS task failures
- ALB 5xx responses
- unhealthy target groups
- database connectivity failures
- Redis connectivity failures
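A small inventory can confirm that each alarm topic above has a metric behind it. In this sketch the two ALB metrics are standard AWS/ApplicationELB metrics. The Tracker/Staging namespace and the three custom metric names are hypothetical placeholders for signals the project would have to publish or derive itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSpec:
    topic: str
    namespace: str
    metric: str
    custom: bool  # True when the metric must be published by project tooling


ALARMS = [
    AlarmSpec("ALB 5xx responses", "AWS/ApplicationELB", "HTTPCode_ELB_5XX_Count", False),
    AlarmSpec("unhealthy target groups", "AWS/ApplicationELB", "UnHealthyHostCount", False),
    # The entries below use a hypothetical namespace and metric names.
    AlarmSpec("ECS task failures", "Tracker/Staging", "EcsTaskFailures", True),
    AlarmSpec("database connectivity failures", "Tracker/Staging", "DatabaseConnectFailures", True),
    AlarmSpec("Redis connectivity failures", "Tracker/Staging", "RedisConnectFailures", True),
]

REQUIRED_TOPICS = [
    "ECS task failures",
    "ALB 5xx responses",
    "unhealthy target groups",
    "database connectivity failures",
    "Redis connectivity failures",
]


def missing_topics(alarms, required=REQUIRED_TOPICS):
    """Return required alarm topics that have no alarm defined, in list order."""
    covered = {a.topic for a in alarms}
    return [t for t in required if t not in covered]


print(missing_topics(ALARMS))
```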
Useful Validation Signals
During staging validation, confirm:
- ECS service events are clean
- logs appear for every service
- smoke tests pass after deploy
- worker services process jobs
- rollback succeeds without manual repair
Deployment Flow
Staging Deployment Steps
- Build images in CI.
- Push to ECR.
- Deploy to staging ECS.
- Run smoke tests.
- Verify logs and alarms.
- Validate worker processing.
- Validate rollback.
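The steps above are sequential, and a later step only makes sense if the earlier ones passed. A minimal runner, assuming each step can be wrapped in a callable that returns True on success, might look like the sketch below. The step callables are stand-ins, and the real commands live in the CI system.

```python
def run_pipeline(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Returns (completed_step_names, failed_step_name_or_None).
    """
    completed = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder actions; each lambda stands in for a real CI job.
STEPS = [
    ("build images", lambda: True),
    ("push to ECR", lambda: True),
    ("deploy to staging ECS", lambda: True),
    ("run smoke tests", lambda: True),
    ("verify logs and alarms", lambda: True),
    ("validate worker processing", lambda: True),
    ("validate rollback", lambda: True),
]

completed, failed = run_pipeline(STEPS)
print(completed, failed)
```

Returning the name of the failed step makes it clear which stage needs attention and which earlier stages do not need to be repeated.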
Staging Exit Criteria
Staging is ready when:
- the application starts consistently from CI
- the database and Redis are reachable
- the main API passes smoke tests
- workers process expected jobs
- rollback works cleanly
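The exit criteria can be tracked per deploy so that "consistently" has a concrete meaning. The sketch below treats staging as ready only after several consecutive deploys meet every criterion. The threshold of three and the field names are assumptions for illustration, not values agreed in this plan.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DeployResult:
    app_started: bool
    db_reachable: bool
    redis_reachable: bool
    smoke_tests_passed: bool
    workers_processed_jobs: bool
    rollback_clean: bool


# Assumed threshold for "starts consistently from CI".
REQUIRED_CONSECUTIVE = 3


def unmet_criteria(result):
    """Return the names of criteria that a single deploy did not meet."""
    return [name for name, ok in asdict(result).items() if not ok]


def staging_ready(history, required=REQUIRED_CONSECUTIVE):
    """True when the most recent `required` deploys met every criterion."""
    recent = history[-required:]
    return len(recent) == required and all(not unmet_criteria(r) for r in recent)


good = DeployResult(True, True, True, True, True, True)
print(staging_ready([good, good, good]))
```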
Known Staging Tradeoffs
- The staging database will not have production failover.
- Staging cost should be lower than production.
- Some scaling behavior will remain untested until production-like load is available.
- Any extensions or host-level assumptions must be verified before ECS rollout.