Tracker REST API — Terraform Deployment Guide
Single source of truth for the complete AWS infrastructure lifecycle: from first-time bootstrap through daily operations, container rollouts, database migrations, and seeding.
Table of Contents
- Architecture Overview
- Directory Layout
- Module Reference
- Prerequisites
- First-Time Bootstrap
- Daily Developer Workflow
- Deploying Container Updates
- Running Database Migrations
- Seeding Default Users
- Destroying the Environment
- Troubleshooting
1. Architecture Overview
All services run inside a single VPC (10.40.0.0/24) in eu-west-2 (London). There are no publicly routable IPs on any service; all external traffic enters through the ALB. The database host is reached exclusively via SSM Session Manager.
This guide describes the current staging-shaped stack. For the account-separated staging and production requirements, see Terraform Environment Compatibility.
Internet
│
▼
Cloudflare DNS
(tracker.staging.glimpse.technology)
│
▼
WAF WebACL
│
▼
Application Load Balancer (public subnets)
├── /api/* → ECS API service (port 8000)
├── tracker-admin.* → ECS Admin service (port 80)
└── default → ECS Frontend service (port 80)
│
▼ (private subnets)
ECS Fargate Cluster
├── api (FastAPI, 512 CPU / 1024 MB)
├── frontend (Nginx SPA, 256 CPU / 512 MB)
├── admin (Nginx SPA, 256 CPU / 512 MB)
├── anisette-v3 * (Anisette server, 256 CPU / 512 MB)
├── tracker-fetcher-2 *
├── unified-geofence *
├── notification-service *
└── materialized-view-service *
* Conditional on enable_anisette / enable_workers
EFS (persistent NFS)
└── /anisette → anisette-v3 service
└── /data → worker services
EC2 Database Host (private subnet, t3.medium)
├── PostgreSQL 16 + TimescaleDB + PostGIS (port 5432)
└── Valkey (port 6379)
Key Properties
- No SSH: The database EC2 instance has no SSH; access is via `aws ssm start-session`.
- No Public IPs: ECS tasks and the database host run with `assignPublicIp = DISABLED`.
- IMDSv2 enforced: EC2 metadata service requires signed tokens (hop limit 2).
- Immutable image tags: ECR is configured with `imageTagMutability = IMMUTABLE`. Every push requires a new tag.
- Single KMS key: One CMK encrypts EBS, ECR, EFS, S3 (ALB logs), CloudWatch Logs, Secrets Manager.
- Managed secrets: The PostgreSQL password, Redis/Valkey password, and the app `SECRET_KEY` are randomly generated by Terraform and stored in Secrets Manager. They are never in `.tfvars` or environment files.
2. Directory Layout
infra/
├── terraform-guide.md ← this file
├── infrastructure.md ← module reference
├── README.md ← quick-start summary
├── envs/
│ └── staging/
│ ├── main.tf # Module composition — the "wiring"
│ ├── locals.tf # resource_prefix, common_tags
│ ├── variables.tf # All input variable declarations
│ ├── outputs.tf # All stack outputs
│ ├── providers.tf # AWS provider (region, profile, default tags)
│ ├── versions.tf # terraform >= 1.7, aws ~> 5.0, random ~> 3.6
│ ├── backend.tf # S3 backend stub (config loaded from backend.hcl)
│ ├── backend.hcl.example # Template — copy to backend.hcl, do not commit
│ ├── terraform.tfvars # Non-secret variable values
│ ├── terraform.tfvars.example # Template for the above
│ └── image-tags.auto.tfvars.json # Auto-updated container image tag map
└── modules/
├── kms/ # Customer-managed KMS key
├── network/ # VPC, subnets, IGW, NAT, flow logs
├── security/ # Security groups (ALB, ECS, DB, Cache, EFS)
├── ecr/ # ECR repositories (api, frontend, admin, services, anisette)
├── acm/ # ACM TLS certificates (DNS validated)
├── alb/ # Application Load Balancer, listeners, target groups
├── waf/ # WAFv2 with AWSManagedRulesCommonRuleSet
├── database/ # EC2 host running PostgreSQL 16 + Valkey (bootstrapped via user-data)
├── ecs/ # ECS Fargate cluster, task definitions, services, job definitions
├── efs/ # EFS file systems for Anisette and worker persistent storage
├── logs/ # CloudWatch log groups for all services and jobs
└── secrets/ # Secrets Manager secrets with randomly generated passwords
3. Module Reference
3.1 KMS (modules/kms/)
Creates one CMK for the entire stack.
- Alias: `alias/tracker-restapi-staging`
- Annual automatic rotation enabled
- Key policy grants access to: root account, CloudWatch Logs service, ELB log delivery service
- Output `key_arn` is passed to every other module that encrypts data at rest
3.2 Network (modules/network/)
VPC CIDR 10.40.0.0/24 split across two AZs:
| Subnet | CIDR | AZ |
|---|---|---|
| Public AZ-a | 10.40.0.0/26 | eu-west-2a |
| Public AZ-b | 10.40.0.64/26 | eu-west-2b |
| Private AZ-a | 10.40.0.128/26 | eu-west-2a |
| Private AZ-b | 10.40.0.192/26 | eu-west-2b |
- NAT Gateway in Public AZ-a (single NAT for cost; upgrade to per-AZ for HA)
- VPC Flow Logs → CloudWatch (`/aws/vpc/tracker-restapi-staging-flow-logs`, 365-day retention)
- Interface VPC endpoints for ECR, Secrets Manager, CloudWatch Logs, SSM (keeps traffic inside the AWS network)
3.3 Security (modules/security/)
Five security groups with least-privilege rules:
| Group | Inbound | Outbound |
|---|---|---|
| ALB | TCP 80, 443 from 0.0.0.0/0 | TCP 80 → ECS SG; TCP 8000 → ECS SG |
| ECS | Service ports from ALB; port 6969 from ECS (Anisette internal) | DB 5432; Cache 6379; EFS 2049; HTTPS 443 |
| Database | TCP 5432 from ECS SG | TCP 80, 443 (apt repos) |
| Cache/MemoryDB | TCP 6379 from ECS SG | None |
| EFS | TCP 2049 from ECS SG | None |

Note: All groups have `ignore_changes = [ingress, egress]` — manual console changes are preserved across `terraform apply`.
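In module code this corresponds to a `lifecycle` block on each security-group resource — a sketch, assuming the resource is named `this` (the actual name inside `modules/security/` may differ):

```hcl
resource "aws_security_group" "this" {
  # ... name, description, vpc_id, rules ...

  lifecycle {
    # Keep manual console edits to rules from being reverted on apply
    ignore_changes = [ingress, egress]
  }
}
```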
3.4 ECR (modules/ecr/)
Five repositories, KMS-encrypted, scan-on-push, immutable tags, lifecycle: retain 30 most recent images.
| Logical Key | Repository |
|---|---|
| `api` | `tracker-api` |
| `frontend` | `tracker-frontend` |
| `admin` | `tracker-admin` |
| `services` | `tracker-services` |
| `anisette` | `tracker-anisette` |
3.5 ACM (modules/acm/)
Two certificates via DNS validation:
- `tracker.staging.glimpse.technology`
- `tracker-admin.staging.glimpse.technology`

Terraform outputs `acm_validation_records` — these CNAME records must be added to Cloudflare before the certificates can be issued. The `aws_acm_certificate_validation` resource waits for validation to complete before proceeding.
3.6 WAF (modules/waf/)
WAFv2 WebACL (regional, attached to ALB):
| Priority | Rule | Blocks |
|---|---|---|
| 10 | AWSManagedRulesCommonRuleSet | SQLi, XSS, bad user-agents |
| 20 | AWSManagedRulesKnownBadInputsRuleSet | Log4Shell, SSRF, malformed input |

Logs to `aws-waf-logs-tracker-restapi-staging`, 365-day retention.
3.7 ALB (modules/alb/)
Internet-facing ALB in both public subnets:
- Port 80 → redirect 301 to HTTPS
- Port 443 → route by host/path:
| Priority | Condition | Target Group | Container Port |
|---|---|---|---|
| 10 | Path `/api/*`, `/api/v1/*`, `/docs`, `/redoc`, `/openapi.json` | api | 8000 |
| 15 | Host `tracker-admin.staging.glimpse.technology` | admin | 80 |
| 20 | Path `/admin*`, `/health*` | admin | 80 |
| — | Default (all others) | frontend | 80 |
Access logs stored in S3 (`glimpse-tracker-restapi-staging-alb-logs-{account-id}`), 90-day expiry, KMS-encrypted. Deletion protection enabled.
3.8 Database (modules/database/)
Single EC2 instance (t3.medium) in the first private subnet. No RDS — the host bootstraps via user-data on first boot.
Software installed via user-data:
- PostgreSQL 16 + `postgresql-contrib`
- TimescaleDB 2 for PostgreSQL 16
- PostGIS 3
- Valkey (Redis-compatible, from the Redis 7 lineage)

Bootstrap sequence:
- Installs packages from the official PostgreSQL apt repo
- Fetches `postgres_password` and `redis_password` from Secrets Manager
- Writes `postgresql.conf`:
  - `shared_preload_libraries = 'timescaledb,pg_stat_statements'`
  - `max_connections = 100`
  - `shared_buffers = 256MB`
  - `effective_cache_size = 768MB`
- Writes `pg_hba.conf` — scram-sha-256 auth, VPC CIDR only
- Creates role `tracker` and database `tracker`
- Enables extensions: `timescaledb`, `postgis`, `postgis_topology`, `pg_stat_statements`
- Configures Valkey: binds `0.0.0.0`, port 6379, password auth, AOF persistence
Forcing a re-bootstrap: Increment `bootstrap_revision` in `terraform.tfvars`. This replaces the EC2 instance (new AMI + re-runs user-data). Use with caution — all data is on the instance's EBS volume and will be lost unless backed up first.
Access: No SSH. Use SSM:
aws ssm start-session \
--profile glimpse-staging \
--region eu-west-2 \
--target $(terraform -chdir=infra/envs/staging output -raw database_instance_id)
3.9 ECS (modules/ecs/)
Fargate cluster with Container Insights enabled.
IAM roles:
- Execution role — pulls images from ECR, fetches secrets from Secrets Manager, writes logs to CloudWatch
- Task role — SSM exec access, EFS mount permissions (when enabled)
Long-running services:
| Service | Image | CPU | Mem | Port | ALB |
|---|---|---|---|---|---|
| api | `tracker-api:{tag}` | 512 | 1024 MB | 8000 | Yes |
| frontend | `tracker-frontend:{tag}` | 256 | 512 MB | 80 | Yes |
| admin | `tracker-admin:{tag}` | 256 | 512 MB | 80 | Yes |
| anisette * | `tracker-anisette:{tag}` | 256 | 512 MB | 6969 | No (private DNS) |
| tracker-fetcher-2 * | `tracker-services:{tag}` | 256 | 512 MB | — | No |
| unified-geofence * | `tracker-services:{tag}` | 256 | 512 MB | — | No |
| notification-service * | `tracker-services:{tag}` | 256 | 512 MB | — | No |
| materialized-view-service * | `tracker-services:{tag}` | 256 | 512 MB | — | No |
* Conditional on enable_anisette / enable_workers
One-off job task definitions (pre-created, not auto-run):
| Job | Image | Command |
|---|---|---|
| migrations | `tracker-api:{tag}` | `bash ./scripts/run_migrations.sh` |
| seed-users | `tracker-api:{tag}` | `python ./scripts/seed_default_users.py` |
Jobs use the same ECS cluster, security group, and subnets as services. Their task definitions are updated whenever the api image tag changes.
Secrets injected as environment variables (via Secrets Manager `valueFrom`):
- `POSTGRES_PASSWORD` — fetched at task start, not baked into the image
- `SECRET_KEY` — app signing key
- `REDIS_PASSWORD` — Valkey auth token
3.10 EFS (modules/efs/)
Created when `enable_workers || enable_anisette`:

| Instance | Root Dir | Used By |
|---|---|---|
| anisette_storage | /anisette | anisette-v3 service |
| worker_storage | /data | tracker-fetcher-2 service |
Access points use UID/GID 1000, permissions 0750. Mount targets in both private subnets. KMS-encrypted.
3.11 Logs (modules/logs/)
CloudWatch log group per service and per job, all KMS-encrypted, 365-day retention.
| Group | Path |
|---|---|
| api | /aws/ecs/tracker-restapi-staging/api |
| frontend | /aws/ecs/tracker-restapi-staging/frontend |
| admin | /aws/ecs/tracker-restapi-staging/admin |
| anisette | /aws/ecs/tracker-restapi-staging/anisette |
| migrations | /aws/ecs/tracker-restapi-staging/migrations |
| seed-users | /aws/ecs/tracker-restapi-staging/seed-users |
| tracker-fetcher-2 | /aws/ecs/tracker-restapi-staging/tracker-fetcher-2 |
| unified-geofence | /aws/ecs/tracker-restapi-staging/unified-geofence |
| notification-service | /aws/ecs/tracker-restapi-staging/notification-service |
| materialized-view-service | /aws/ecs/tracker-restapi-staging/materialized-view-service |
3.12 Secrets (modules/secrets/)
Three secrets created with randomly generated 32-character passwords (mixed case, digits, specials):
| Logical Key | Secret Name in Secrets Manager |
|---|---|
| `postgres_password` | `tracker/staging/postgres-password` |
| `secret_key` | `tracker/staging/secret-key` |
| `redis_password` | `tracker/staging/redis-password` |

Passwords are generated once by Terraform. They are never stored in `.tfvars`, `.env`, or any repository file. Applications and the database bootstrap script fetch them from Secrets Manager at runtime.
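For debugging, a secret can be read back with the standard Secrets Manager CLI call — note this prints the plaintext value, so use it sparingly:

```shell
# Read the generated PostgreSQL password (prints the secret — handle with care)
aws secretsmanager get-secret-value \
  --profile glimpse-staging \
  --region eu-west-2 \
  --secret-id tracker/staging/postgres-password \
  --query SecretString \
  --output text
```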
4. Prerequisites
AWS CLI & Profile
Configure an AWS CLI profile named glimpse-staging:
aws configure --profile glimpse-staging
# AWS Access Key ID: ...
# AWS Secret Access Key: ...
# Default region: eu-west-2
# Default output: json
Verify access:
aws sts get-caller-identity --profile glimpse-staging
Terraform
Version >= 1.7.0. Install via tfenv or directly:
tfenv install 1.7.0
tfenv use 1.7.0
terraform --version
Docker Buildx
Required for multi-platform builds:
docker buildx version
# If missing: docker buildx install
S3 Backend Bootstrap (one-time, before first terraform init)
The S3 bucket and DynamoDB lock table must exist before Terraform can store state. Create them manually or with a bootstrap script:
aws s3api create-bucket \
--profile glimpse-staging \
--region eu-west-2 \
--bucket glimpse-staging-tracker-restapi-tfstate \
--create-bucket-configuration LocationConstraint=eu-west-2
aws s3api put-bucket-versioning \
--profile glimpse-staging \
--bucket glimpse-staging-tracker-restapi-tfstate \
--versioning-configuration Status=Enabled
aws dynamodb create-table \
--profile glimpse-staging \
--region eu-west-2 \
--table-name glimpse-staging-tracker-restapi-tflock \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
5. First-Time Bootstrap
Step 1 — Prepare local config files
cd infra/envs/staging
# Backend config (never commit this file — contains account-specific paths)
cp backend.hcl.example backend.hcl
Edit backend.hcl:
bucket = "glimpse-staging-tracker-restapi-tfstate"
key = "staging/terraform.tfstate"
region = "eu-west-2"
profile = "glimpse-staging"
dynamodb_table = "glimpse-staging-tracker-restapi-tflock"
encrypt = true
Edit or verify terraform.tfvars:
project_name = "tracker-restapi"
environment = "staging"
aws_region = "eu-west-2"
vpc_cidr = "10.40.0.0/24"
public_hostname = "tracker.staging.glimpse.technology"
admin_hostname = "tracker-admin.staging.glimpse.technology"
enable_anisette = true
enable_workers = true
secret_names = {
postgres_password = "tracker/staging/postgres-password"
secret_key = "tracker/staging/secret-key"
redis_password = "tracker/staging/redis-password"
}
Step 2 — Set placeholder image tags
Before any images exist in ECR, use the bootstrap tag so task definitions are valid:
cat > image-tags.auto.tfvars.json <<'EOF'
{
"image_tags": {
"api": "sha-bootstrap",
"frontend": "sha-bootstrap",
"admin": "sha-bootstrap",
"services": "sha-bootstrap",
"anisette": "sha-bootstrap"
}
}
EOF
Step 3 — Initialise Terraform
terraform init -backend-config=backend.hcl
Step 4 — Plan and apply
terraform plan -out=tfplan
terraform apply tfplan
The first apply provisions everything: KMS, VPC, subnets, security groups, ECR, ACM certificates, WAF, ALB, the database EC2 instance, EFS, log groups, Secrets Manager secrets, and ECS task definitions. Services do not start successfully until images exist in ECR (see next steps).
Expected apply time: 12–18 minutes (ACM DNS validation is the longest step).
Step 5 — Add DNS validation records to Cloudflare
After apply completes, get the validation records:
terraform output -json acm_validation_records
terraform output -json acm_admin_validation_records
Each record has name, type (CNAME), and value. Add both to Cloudflare for glimpse.technology. ACM validates automatically once the records propagate (usually < 5 minutes).
Step 6 — Add ALB hostname to Cloudflare
terraform output -raw alb_dns_name
Create a CNAME in Cloudflare:
- `tracker.staging` → `<alb_dns_name>`
- `tracker-admin.staging` → `<alb_dns_name>` (or a CNAME of the above)
Set proxy mode to DNS only (grey cloud) to avoid Cloudflare interfering with ALB SSL termination.
Step 7 — Authenticate with ECR and build images
For a shorter developer-oriented summary of this workflow, see Deployment Pipeline.
make staging-build
This command:
- builds the current commit once
- pushes the immutable tag to the staging ECR
- mirrors the same tag into the production ECR
- writes `infra/envs/staging/image-tags.auto.tfvars.json`
Or use the script directly:
python scripts/build_staging_images.py \
--push \
--aws-profile glimpse-staging \
--mirror-aws-profile glimpse-prod \
--target api
If you need the lower-level `docker buildx` command, the scripts pass these variables through: they set the registry prefix to the target environment's ECR host and pin the image with the immutable `sha-<commit>` tag.
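As a sketch only — the real invocation lives in `scripts/build_staging_images.py`, and the registry host, platform, and target below are illustrative assumptions:

```shell
REGISTRY="<account-id>.dkr.ecr.eu-west-2.amazonaws.com"  # assumption: your staging ECR host
TAG="sha-$(git rev-parse --short HEAD)"                  # immutable commit-pinned tag

docker buildx build \
  --platform linux/amd64 \
  --tag "${REGISTRY}/tracker-api:${TAG}" \
  --push \
  .
```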
Step 8 — Update image tags and re-apply
Apply Terraform in staging after the build script has refreshed the tag map:
terraform -chdir=infra/envs/staging apply -auto-approve
This registers new task definition revisions pointing at the real image tags. ECS rolling-deploys each service automatically (50% min healthy, 200% max).
Step 9 — Promote to production
Once staging is validated, promote the exact same tag map to production:
make prod-promote
terraform -chdir=infra/envs/prod apply -auto-approve
This keeps production on the same commit-tested image set as staging without rebuilding.
Step 10 — Direct production operations
make prod-build, make prod-plan, and make prod-apply remain available for manual production workflows, but the normal release path is staging-first promotion.
Step 11 — Run database migrations
On first deploy, run migrations before traffic reaches the API:
# Get required values from Terraform outputs
CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
TASK_DEF=$(terraform -chdir=infra/envs/staging output -raw migration_task_definition_arn)
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)
aws ecs run-task \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"
Monitor the migration logs:
aws logs tail /aws/ecs/tracker-restapi-staging/migrations \
--profile glimpse-staging \
--region eu-west-2 \
--follow
Step 12 — Seed default users
SEED_TASK_DEF=$(terraform -chdir=infra/envs/staging output -json ecs_task_definition_arns | jq -r '."seed-users"')
aws ecs run-task \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--task-definition "$SEED_TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"
Monitor:
aws logs tail /aws/ecs/tracker-restapi-staging/seed-users \
--profile glimpse-staging \
--region eu-west-2 \
--follow
6. Daily Developer Workflow
Check current state
terraform -chdir=infra/envs/staging show | grep -A2 "image_tag"
# or
cat infra/envs/staging/image-tags.auto.tfvars.json
# or
cat infra/envs/prod/image-tags.auto.tfvars.json
View service logs
# API service
aws logs tail /aws/ecs/tracker-restapi-staging/api \
--profile glimpse-staging --region eu-west-2 --follow
# Worker (swap group name as needed)
aws logs tail /aws/ecs/tracker-restapi-staging/tracker-fetcher-2 \
--profile glimpse-staging --region eu-west-2 --follow
Connect to a running container (ECS Exec)
CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
# Find the task ID
TASK_ID=$(aws ecs list-tasks \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--service-name tracker-restapi-staging-api \
--query 'taskArns[0]' --output text | awk -F/ '{print $NF}')
aws ecs execute-command \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--task "$TASK_ID" \
--container api \
--command "/bin/bash" \
--interactive
Bootstrap an Anisette account with ECS Exec
Use ./scripts/bootstrap_anisette_account.sh when you need to create or refresh the stored Anisette account for the worker stack.
The script starts a temporary Fargate task from the tracker-fetcher-2 task definition with ECS Exec enabled, opens an interactive shell in the container, and then runs the bootstrap command inside that container.
By default it targets the staging Terraform root. If you export AWS_PROFILE=glimpse-prod, it will default to infra/envs/prod unless you override TERRAFORM_DIR explicitly.
For production use, the prod Terraform root must already have the worker services and Anisette enabled so the script can read ecs_task_definition_arns, ecs_security_group_id, private_subnet_ids, and anisette_service_url from state.
./scripts/bootstrap_anisette_account.sh
What the script does:
- Reads the staging Terraform outputs from `infra/envs/staging`.
- Starts a one-off `tracker-fetcher-2` task with `--enable-execute-command`.
- Waits for the ECS Exec agent to become ready.
- Opens `/bin/sh` in the running container.
- Inside the shell, exports `ANISETTE_SERVER` and `ACCOUNT_STORE_PATH`.
- Runs `python scripts/authenticate_findmy.py`.
The default account file path is /data/account.json. If the bootstrap succeeds, the task can optionally trigger a fresh deployment of the worker service so it picks up the new account state.
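Run by hand inside the temporary worker container, the final steps reduce to the following (the server URL and account path are the defaults described above):

```shell
# Inside the ECS Exec shell of the temporary tracker-fetcher-2 task
export ANISETTE_SERVER="http://anisette-v3.anisette-v3.local:6969"
export ACCOUNT_STORE_PATH="/data/account.json"
python scripts/authenticate_findmy.py
```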
Requirements:
- `aws`, `jq`, and `terraform` available on the machine running the script
- AWS credentials with permission to run ECS tasks, use ECS Exec, and stop the temporary task
- ECS Exec enabled for the task definition and cluster
- `enable_anisette = true` in the target Terraform environment
If you need to run the same steps manually, use the same temporary worker task pattern rather than the long-running API service. The bootstrap script depends on the container image that includes `scripts/authenticate_findmy.py`.
Connect to the database host
DB_INSTANCE=$(terraform -chdir=infra/envs/staging output -raw database_instance_id)
aws ssm start-session \
--profile glimpse-staging \
--region eu-west-2 \
--target "$DB_INSTANCE"
# Once inside:
sudo -u postgres psql -d tracker
Check service health
# List all running tasks
aws ecs list-tasks \
--profile glimpse-staging \
--region eu-west-2 \
--cluster tracker-restapi-staging
# Describe specific service
aws ecs describe-services \
--profile glimpse-staging \
--region eu-west-2 \
--cluster tracker-restapi-staging \
--services tracker-restapi-staging-api
Apply infrastructure changes (no image changes)
terraform -chdir=infra/envs/staging plan
terraform -chdir=infra/envs/staging apply
7. Deploying Container Updates
Build and push a new image
make staging-build
Update the image tag map
cat infra/envs/staging/image-tags.auto.tfvars.json
Or edit infra/envs/staging/image-tags.auto.tfvars.json directly:
{
"image_tags": {
"api": "sha-abc1234",
"frontend": "sha-abc1234",
"admin": "sha-abc1234",
"services": "sha-abc1234",
"anisette": "sha-bootstrap"
}
}
Apply — registers new task definition revisions and triggers rolling deploy
terraform -chdir=infra/envs/staging apply -auto-approve
ECS rolls each service to the new task definition. Deployment settings: 50% minimum healthy, 200% maximum — so new tasks start before old ones stop.
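If you ever need to restart tasks on the current image tag without a Terraform change, the standard ECS force-redeploy call works (service name shown for the API; adjust per service):

```shell
aws ecs update-service \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster tracker-restapi-staging \
  --service tracker-restapi-staging-api \
  --force-new-deployment
```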
If the new image requires a schema migration
Run migrations before or immediately after applying the new image tag. The migration job uses the same image as the API service:
# (re-run Step 11 from the bootstrap section)
Roll back to a previous image tag
ECR tags are immutable, but previously pushed images remain in the repository (the lifecycle policy retains the 30 most recent). Update `image-tags.auto.tfvars.json` to the previous tag and re-apply:
jq '.image_tags.api = "sha-previoussha"' \
infra/envs/staging/image-tags.auto.tfvars.json > /tmp/t.json \
&& mv /tmp/t.json infra/envs/staging/image-tags.auto.tfvars.json
terraform -chdir=infra/envs/staging apply -auto-approve
8. Running Database Migrations
Migrations run as a Fargate one-off task using the migrations job task definition. The task runs bash ./scripts/run_migrations.sh inside the tracker-api container.
When to run
- After the first deploy (bootstrap)
- After any deploy that includes a schema-altering change
- After manually rolling back to a previous API version (check if a down-migration is needed)
Run command
CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
TASK_DEF=$(terraform -chdir=infra/envs/staging output -raw migration_task_definition_arn)
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)
aws ecs run-task \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"
Monitor
aws logs tail /aws/ecs/tracker-restapi-staging/migrations \
--profile glimpse-staging \
--region eu-west-2 \
--follow
Verify success
Check the task exit code:
aws ecs describe-tasks \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--tasks <TASK_ARN>
A `stopCode` of `EssentialContainerExited` with `exitCode: 0` means success.
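To pull those fields out of the `describe-tasks` response with `jq` — the JSON below is a trimmed sample illustrating the relevant response shape:

```shell
# Trimmed sample of a describe-tasks response (illustrative)
RESPONSE='{"tasks":[{"stopCode":"EssentialContainerExited","containers":[{"name":"api","exitCode":0}]}]}'

# Extract the stop code and the container exit code
echo "$RESPONSE" | jq -r '.tasks[0].stopCode'
echo "$RESPONSE" | jq -r '.tasks[0].containers[0].exitCode'
```

In practice, pipe the real `aws ecs describe-tasks` output instead of the sample variable.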
9. Seeding Default Users
Seeds initial users/roles needed for the application to be usable. Runs python ./scripts/seed_default_users.py inside the tracker-api container.
When to run
- Once after the first deploy
- When adding a new environment that needs baseline data
Run command
CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
SEED_TASK_DEF=$(terraform -chdir=infra/envs/staging output -json ecs_task_definition_arns | jq -r '."seed-users"')
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)
aws ecs run-task \
--profile glimpse-staging \
--region eu-west-2 \
--cluster "$CLUSTER" \
--task-definition "$SEED_TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"
Monitor
aws logs tail /aws/ecs/tracker-restapi-staging/seed-users \
--profile glimpse-staging \
--region eu-west-2 \
--follow
10. Destroying the Environment
The ALB has deletion protection enabled. Terraform will fail if you try to destroy without first disabling it.
# Step 1: Remove ALB deletion protection
terraform -chdir=infra/envs/staging apply \
-target=module.alb.aws_lb.this \
-var="enable_deletion_protection=false"
# Step 2: Destroy everything
terraform -chdir=infra/envs/staging destroy
Warning: Destroying the database module permanently deletes the EC2 instance and its EBS volume. Take a manual snapshot or pg_dump backup first if the data is needed.
11. Troubleshooting
ECS service stuck in PENDING or tasks keep stopping
- Check the service events:
aws ecs describe-services \
--profile glimpse-staging --region eu-west-2 \
--cluster tracker-restapi-staging \
--services tracker-restapi-staging-api \
--query 'services[0].events[0:5]'
- Check container logs for the stopped task — tasks log even when they exit immediately.
- Common causes:
  - Image tag doesn't exist in ECR → push the image or fix the tag in `image-tags.auto.tfvars.json`
  - Secret ARN wrong → `terraform output secret_arns` and compare with the task definition
  - Not enough capacity in subnets → unlikely with Fargate, but check VPC endpoints
ACM certificate stuck in PENDING_VALIDATION
DNS CNAME records haven't been added to Cloudflare, or haven't propagated yet. Run:
dig CNAME <validation_record_name>
If no answer, check Cloudflare. ACM checks every few minutes once records are present.
Database bootstrap failed (instance running but PostgreSQL not available)
SSM into the instance and check the user-data log:
sudo cat /var/log/cloud-init-output.log | tail -100
If the aws secretsmanager get-secret-value call failed (IAM not ready at boot time), increment bootstrap_revision in terraform.tfvars and re-apply. This replaces the instance and re-runs user-data.
Terraform plan shows unexpected replacements
If Terraform wants to replace the database EC2 instance unexpectedly, check:
- `bootstrap_revision` hasn't changed accidentally
- The AMI data source hasn't resolved to a new AMI (pin the AMI ID to prevent this)
Can't connect to database from ECS
- Verify security group rules (ECS SG → DB SG on port 5432)
- Check the `POSTGRES_SERVER` environment variable in the task definition matches the `database_private_dns_name` output
- SSM into the DB host and verify PostgreSQL is listening:
sudo ss -tlnp | grep 5432
Worker services not processing tasks
- Check Anisette is running (workers depend on it for Apple auth):
aws ecs describe-services \
--profile glimpse-staging --region eu-west-2 \
--cluster tracker-restapi-staging \
--services tracker-restapi-staging-anisette
- Verify the `ANISETTE_SERVER` env var in the worker task definition points to `http://anisette-v3.anisette-v3.local:6969`
- Check the EFS mount is healthy — if the `/data` EFS mount fails, the task won't start
Viewing all Terraform outputs
terraform -chdir=infra/envs/staging output
# or a specific value:
terraform -chdir=infra/envs/staging output -raw alb_dns_name