AWS ECS Staging Rollout Sequence and Migration Order

This document captures the intended order for bringing the staging ECS stack online. The goal is to keep schema creation explicit, keep application runtime separate from infrastructure bootstrap, and make the deployment path easy to automate in CI.

Summary

The staging rollout should happen in this order:

Build the AWS foundation with Terraform.
Build and push immutable container images.
Run database migrations as a one-off ECS task.
Start the long-running app services.
Start worker and support services.
Smoke test the environment.
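
The ordering above can be sketched as a small driver that runs each stage in sequence and stops at the first failure. This is only an illustration for a CI script: the stage names and the `run_rollout` helper are hypothetical, and each stage callable stands in for whatever actually performs that step.

```python
from typing import Callable, List, Tuple

# Hypothetical stage identifiers mirroring the rollout order in this document.
ROLLOUT_ORDER = [
    "terraform-foundation",
    "build-and-push-images",
    "run-migrations",
    "start-app-services",
    "start-worker-and-support-services",
    "smoke-test",
]

Stage = Tuple[str, Callable[[], bool]]


def run_rollout(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that reports failure.

    Returns the names of the completed stages when every stage succeeds.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            raise RuntimeError(
                f"stage failed: {name}; completed so far: {completed}"
            )
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder steps that always succeed; real steps would call
    # Terraform, the image build, the ECS API, and so on.
    stages = [(name, lambda: True) for name in ROLLOUT_ORDER]
    print(run_rollout(stages))
```

Stopping at the first failed stage means a failed migration never leads to services being started against a missing or partial schema.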

Why Migrations Are Separate

The database schema must exist before the application services and workers can start reliably.

Alembic migrations therefore run as a dedicated ECS task using the API image and the command alembic upgrade head.

That keeps the schema step:

tied to the same image artifact as the app
reproducible in CI
isolated from long-running services
easy to rerun or inspect if it fails
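
As a rough sketch of what "a dedicated ECS task using the API image" might look like from CI, the function below builds the arguments for the ECS RunTask call with the container command overridden to `alembic upgrade head`. The function name, the Fargate launch type, the private-subnet networking, and every identifier passed in are assumptions for illustration; none of them come from the actual staging configuration.

```python
from typing import Any, Dict, List


def migration_run_task_kwargs(
    cluster: str,
    task_definition: str,
    container_name: str,
    subnets: List[str],
    security_groups: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for an ECS RunTask call that runs migrations.

    The task definition is assumed to reference the API image, so the
    migration runs from the same artifact as the app.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",  # assumption: the doc does not name a launch type
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",  # assumption: private subnets
            }
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": container_name,
                    # Replace the image's default command for this one run.
                    "command": ["alembic", "upgrade", "head"],
                }
            ]
        },
    }


# With boto3 installed and credentials configured, these arguments
# could be passed along as:
#   boto3.client("ecs").run_task(**migration_run_task_kwargs(...))
```

Keeping the argument construction separate from the API call makes the migration step easy to inspect and to rerun with identical inputs.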

Deployment Order

1. Build The Foundation

Terraform creates the infrastructure that everything else depends on:

VPC, subnets, security groups, and VPC endpoints
ECR repositories
ECS cluster
database EC2 host
Redis or Valkey layer
EFS for shared worker storage
ALB, CloudWatch logs, KMS, and secrets

At this stage the platform exists, but the app is not yet running.

2. Build And Push Images

Build the deployable images and push them to ECR:

api
frontend
admin
services
anisette when enabled

Use immutable tags as the release identity for every image. Add a human-readable convenience tag only if the repository policy (for example, its tag immutability setting) allows it.
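
One common way to get an immutable release identity is to derive the tag from the commit being built. The sketch below assumes that convention; the `sha-` prefix, the 12-character truncation, and the `release_image_uri` helper are illustrative choices, not something this rollout prescribes.

```python
import re


def release_image_uri(registry: str, repository: str, git_sha: str) -> str:
    """Return an image URI whose tag is derived from a git commit SHA.

    The tag never needs to move, because a new commit yields a new tag.
    """
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a git SHA: {git_sha!r}")
    # Hypothetical tag format: "sha-" plus the first 12 hex characters.
    return f"{registry}/{repository}:sha-{git_sha[:12]}"


if __name__ == "__main__":
    # Placeholder account ID and region, for illustration only.
    registry = "123456789012.dkr.ecr.eu-west-1.amazonaws.com"
    sha = "0123456789abcdef0123456789abcdef01234567"
    for repo in ["api", "frontend", "admin", "services"]:
        print(release_image_uri(registry, repo, sha))
```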

3. Run The Migration Task

Run a one-off ECS task from the API image with:

command: alembic upgrade head
purpose: create or update the database schema

This task should run before any long-running application service starts.
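
To enforce that ordering in automation, the pipeline can wait for the migration task to stop and then inspect its exit code before starting any service. The sketch below checks a response shaped like the ECS DescribeTasks output (`tasks`, `lastStatus`, `containers`, `exitCode`); the `migration_succeeded` helper and the container name are hypothetical.

```python
from typing import Any, Dict


def migration_succeeded(
    describe_tasks_response: Dict[str, Any], container_name: str
) -> bool:
    """Return True only if the single migration task stopped with exit code 0."""
    tasks = describe_tasks_response.get("tasks", [])
    if len(tasks) != 1:
        return False
    task = tasks[0]
    # A task that is still running has not finished applying migrations.
    if task.get("lastStatus") != "STOPPED":
        return False
    for container in task.get("containers", []):
        if container.get("name") == container_name:
            # exitCode is absent if the container never started.
            return container.get("exitCode") == 0
    return False


if __name__ == "__main__":
    sample = {
        "tasks": [
            {
                "lastStatus": "STOPPED",
                "containers": [{"name": "api", "exitCode": 0}],
            }
        ]
    }
    print(migration_succeeded(sample, "api"))
```

Treating a missing exit code as a failure is the conservative choice here: services only start when the migration is known to have completed cleanly.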

4. Start Long-Running App Services

Once the schema is in place, start the app-facing services:

api
frontend
admin

These should now be able to connect to the database without needing any special bootstrap logic.

5. Start Worker And Support Services

Start the background and support services after the core app is healthy:

tracker-fetcher-2
unified-geofence
notification-service
materialized-view-service
anisette when required by the fetcher workflow

tracker-fetcher-2 depends on anisette and on shared state under /data, so enable it only after anisette is running and /data is available.
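
The dependency rule can be made explicit with a small ordering helper. The sketch below uses Python's standard `graphlib` (Python 3.9+) to compute a start order in which dependencies come first. Only the tracker-fetcher-2 to anisette edge comes from this document; the other services are assumed to have no inter-service dependencies, and the `start_order` helper is hypothetical. The /data requirement is not modelled here.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each service maps to the services that must be running before it starts.
WORKER_DEPENDENCIES: Dict[str, Set[str]] = {
    "anisette": set(),
    "tracker-fetcher-2": {"anisette"},
    "unified-geofence": set(),          # assumption: no service dependencies
    "notification-service": set(),      # assumption: no service dependencies
    "materialized-view-service": set(), # assumption: no service dependencies
}


def start_order(dependencies: Dict[str, Set[str]], enabled: Set[str]) -> List[str]:
    """Return enabled services in an order that starts dependencies first."""
    for service in sorted(enabled):
        missing = dependencies.get(service, set()) - enabled
        if missing:
            raise ValueError(
                f"{service} is enabled but requires {sorted(missing)}"
            )
    graph = {svc: dependencies.get(svc, set()) for svc in sorted(enabled)}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(start_order(WORKER_DEPENDENCIES, set(WORKER_DEPENDENCIES)))
```

Enabling tracker-fetcher-2 without anisette raises an error instead of silently starting a service whose support path is not ready.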

6. Smoke Test

Confirm the staging stack is usable before treating the rollout as complete:

API health endpoint returns healthy
Login works with a known staging user
Database queries succeed
Worker services start successfully
Fetcher-to-anisette communication works when enabled
CloudWatch logs are visible
ALB target groups are healthy
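
A checklist like this is easy to automate as a set of named checks that each return pass or fail. The sketch below is a minimal runner under that assumption; `run_smoke_checks`, `http_ok`, and the URL in the usage comment are hypothetical, and only the HTTP check is shown because the other checks depend on staging-specific details.

```python
import urllib.request
from typing import Callable, Dict


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to the URL answers with HTTP 200.

    urlopen raises for 4xx/5xx responses and for connection errors;
    run_smoke_checks below records those as failures.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status == 200


def run_smoke_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every check, recording failures instead of stopping at the first."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # Any exception (timeout, HTTP error, etc.) counts as a failed check.
            results[name] = False
    return results


def all_passed(results: Dict[str, bool]) -> bool:
    """Return True only when at least one check ran and none failed."""
    return bool(results) and all(results.values())


# Example wiring (the URL is a placeholder, not the real staging endpoint):
#   results = run_smoke_checks({
#       "api-health": lambda: http_ok("https://staging.example.invalid/health"),
#   })
#   if not all_passed(results):
#       raise SystemExit(f"smoke test failed: {results}")
```

Running every check rather than stopping at the first failure gives a complete picture of which parts of the stack are unhealthy.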

Operational Rule

Keep migrations and service rollout separate:

schema changes are applied first
services are rolled forward second
support services are enabled only once their dependencies are ready

That keeps the staging deployment predictable and makes failures easier to isolate.