
Tracker REST API — Terraform Deployment Guide

Single source of truth for the complete AWS infrastructure lifecycle: from first-time bootstrap through daily operations, container rollouts, database migrations, and seeding.


Table of Contents

  1. Architecture Overview
  2. Directory Layout
  3. Module Reference
  4. Prerequisites
  5. First-Time Bootstrap
  6. Daily Developer Workflow
  7. Deploying Container Updates
  8. Running Database Migrations
  9. Seeding Default Users
  10. Destroying the Environment
  11. Troubleshooting

1. Architecture Overview

All services run inside a single VPC (10.40.0.0/24) in eu-west-2 (London). There are no publicly routable IPs on any service; all external traffic enters through the ALB. The database host is reached exclusively via SSM Session Manager.

This guide describes the current staging-shaped stack. For the account-separated staging and production requirements, see Terraform Environment Compatibility.

Internet
   │
   ▼
Cloudflare DNS
(tracker.staging.glimpse.technology)
   │
   ▼
WAF WebACL
   │
   ▼
Application Load Balancer (public subnets)
├── /api/* → ECS API service (port 8000)
├── tracker-admin.* → ECS Admin service (port 80)
└── default → ECS Frontend service (port 80)
   │
   ▼ (private subnets)
ECS Fargate Cluster
├── api              (FastAPI, 512 CPU / 1024 MB)
├── frontend         (Nginx SPA, 256 CPU / 512 MB)
├── admin            (Nginx SPA, 256 CPU / 512 MB)
├── anisette-v3 *    (Anisette server, 256 CPU / 512 MB)
├── tracker-fetcher-2 *
├── unified-geofence *
├── notification-service *
└── materialized-view-service *

* Conditional on enable_anisette / enable_workers

EFS (persistent NFS)
├── /anisette  → anisette-v3 service
└── /data      → worker services

EC2 Database Host (private subnet, t3.medium)
├── PostgreSQL 16 + TimescaleDB + PostGIS  (port 5432)
└── Valkey                                  (port 6379)

Key Properties

  • No SSH: The database EC2 instance has no SSH access; connect via aws ssm start-session.
  • No Public IPs: ECS tasks and the database host run with assignPublicIp = DISABLED.
  • IMDSv2 enforced: the EC2 metadata service requires session tokens (hop limit 2).
  • Immutable image tags: ECR is configured with imageTagMutability = IMMUTABLE. Every push requires a new tag.
  • Single KMS key: One CMK encrypts EBS, ECR, EFS, S3 (ALB logs), CloudWatch Logs, Secrets Manager.
  • Managed secrets: PostgreSQL password, Redis/Valkey password, and the app SECRET_KEY are randomly generated by Terraform and stored in Secrets Manager. They are never in .tfvars or environment files.

2. Directory Layout

infra/
├── terraform-guide.md          ← this file
├── infrastructure.md           ← module reference
├── README.md                   ← quick-start summary
├── envs/
│   └── staging/
│       ├── main.tf                        # Module composition — the "wiring"
│       ├── locals.tf                      # resource_prefix, common_tags
│       ├── variables.tf                   # All input variable declarations
│       ├── outputs.tf                     # All stack outputs
│       ├── providers.tf                   # AWS provider (region, profile, default tags)
│       ├── versions.tf                    # terraform >= 1.7, aws ~> 5.0, random ~> 3.6
│       ├── backend.tf                     # S3 backend stub (config loaded from backend.hcl)
│       ├── backend.hcl.example            # Template — copy to backend.hcl, do not commit
│       ├── terraform.tfvars               # Non-secret variable values
│       ├── terraform.tfvars.example       # Template for the above
│       └── image-tags.auto.tfvars.json    # Auto-updated container image tag map
└── modules/
    ├── kms/          # Customer-managed KMS key
    ├── network/      # VPC, subnets, IGW, NAT, flow logs
    ├── security/     # Security groups (ALB, ECS, DB, Cache, EFS)
    ├── ecr/          # ECR repositories (api, frontend, admin, services, anisette)
    ├── acm/          # ACM TLS certificates (DNS validated)
    ├── alb/          # Application Load Balancer, listeners, target groups
    ├── waf/          # WAFv2 with AWSManagedRulesCommonRuleSet
    ├── database/     # EC2 host running PostgreSQL 16 + Valkey (bootstrapped via user-data)
    ├── ecs/          # ECS Fargate cluster, task definitions, services, job definitions
    ├── efs/          # EFS file systems for Anisette and worker persistent storage
    ├── logs/         # CloudWatch log groups for all services and jobs
    └── secrets/      # Secrets Manager secrets with randomly generated passwords

3. Module Reference

3.1 KMS (modules/kms/)

Creates one CMK for the entire stack.

  • Alias: alias/tracker-restapi-staging
  • Annual automatic rotation enabled
  • Key policy grants access to: root account, CloudWatch Logs service, ELB log delivery service
  • Output key_arn is passed to every other module that encrypts data at rest

3.2 Network (modules/network/)

VPC CIDR 10.40.0.0/24 split across two AZs:

Subnet         CIDR             AZ
Public AZ-a    10.40.0.0/26     eu-west-2a
Public AZ-b    10.40.0.64/26    eu-west-2b
Private AZ-a   10.40.0.128/26   eu-west-2a
Private AZ-b   10.40.0.192/26   eu-west-2b
  • NAT Gateway in Public AZ-a (single NAT for cost; upgrade to per-AZ for HA)
  • VPC Flow Logs → CloudWatch (/aws/vpc/tracker-restapi-staging-flow-logs, 365-day retention)
  • Interface VPC endpoints for ECR, Secrets Manager, CloudWatch Logs, SSM (keeps traffic inside AWS network)
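As a sanity check on the layout: a /24 holds exactly four /26 blocks, at offsets 0, 64, 128 and 192 in the final octet. A quick sketch:

```shell
# Enumerate the four /26 sub-blocks of 10.40.0.0/24 (64 addresses each).
SUBNET_LIST=$(for offset in 0 64 128 192; do echo "10.40.0.${offset}/26"; done)
echo "$SUBNET_LIST"
```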

3.3 Security (modules/security/)

Five security groups with least-privilege rules:

Group            Inbound                                                Outbound
ALB              TCP 80, 443 from 0.0.0.0/0                             TCP 80 → ECS SG; TCP 8000 → ECS SG
ECS              Service ports from ALB; TCP 6969 from ECS (Anisette)   DB 5432; Cache 6379; EFS 2049; HTTPS 443
Database         TCP 5432 from ECS SG                                   TCP 80, 443 (apt repos)
Cache/MemoryDB   TCP 6379 from ECS SG                                   None
EFS              TCP 2049 from ECS SG                                   None

Note: All groups have ignore_changes = [ingress, egress] — manual console changes are preserved across terraform apply.

3.4 ECR (modules/ecr/)

Five repositories, KMS-encrypted, scan-on-push, immutable tags, lifecycle: retain 30 most recent images.

Logical Key Repository
api tracker-api
frontend tracker-frontend
admin tracker-admin
services tracker-services
anisette tracker-anisette

3.5 ACM (modules/acm/)

Two certificates via DNS validation:

  • tracker.staging.glimpse.technology
  • tracker-admin.staging.glimpse.technology

Terraform outputs acm_validation_records — these CNAME records must be added to Cloudflare before the certificate can be issued. The aws_acm_certificate_validation resource waits for validation to complete before proceeding.

3.6 WAF (modules/waf/)

WAFv2 WebACL (regional, attached to ALB):

Priority Rule Blocks
10 AWSManagedRulesCommonRuleSet SQLi, XSS, bad user-agents
20 AWSManagedRulesKnownBadInputsRuleSet Log4Shell, SSRF, malformed input

Logs to aws-waf-logs-tracker-restapi-staging, 365-day retention.

3.7 ALB (modules/alb/)

Internet-facing ALB in both public subnets:

  • Port 80 → redirect 301 to HTTPS
  • Port 443 → route by host/path:
Priority   Condition                                               Target Group   Container Port
10         Path /api/*, /api/v1/*, /docs, /redoc, /openapi.json    api            8000
15         Host tracker-admin.staging.glimpse.technology           admin          80
20         Path /admin*, /health*                                  admin          80
Default    (all others)                                            frontend       80

Access logs stored in S3 (glimpse-tracker-restapi-staging-alb-logs-{account-id}), 90-day expiry, KMS-encrypted. Deletion protection enabled.

3.8 Database (modules/database/)

Single EC2 instance (t3.medium) in the first private subnet. No RDS — the host bootstraps via user-data on first boot.

Software installed via user-data:

  • PostgreSQL 16 + postgresql-contrib
  • TimescaleDB 2 for PostgreSQL 16
  • PostGIS 3
  • Valkey (Redis-compatible, from Redis 7 lineage)

Bootstrap sequence:

  1. Installs packages from the official PostgreSQL apt repo
  2. Fetches postgres_password and redis_password from Secrets Manager
  3. Writes postgresql.conf:
     • shared_preload_libraries = 'timescaledb,pg_stat_statements'
     • max_connections = 100
     • shared_buffers = 256MB
     • effective_cache_size = 768MB
  4. Writes pg_hba.conf — scram-sha-256 auth, VPC CIDR only
  5. Creates role tracker and database tracker
  6. Enables extensions: timescaledb, postgis, postgis_topology, pg_stat_statements
  7. Configures Valkey: binds 0.0.0.0, port 6379, password auth, AOF persistence

Forcing a re-bootstrap: Increment bootstrap_revision in terraform.tfvars. This replaces the EC2 instance (new AMI + re-runs user-data). Use with caution — all data is on the instance's EBS volume and will be lost unless backed up first.
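The exact wiring lives in modules/database, but the usual pattern behind a bootstrap_revision variable looks roughly like the sketch below. Resource and variable names here are assumptions, not read from the module:

```hcl
# Sketch only: bumping var.bootstrap_revision changes the rendered user-data,
# and user_data_replace_on_change makes that change replace the instance.
resource "aws_instance" "database" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.medium"

  user_data = templatefile("${path.module}/user-data.sh.tftpl", {
    bootstrap_revision = var.bootstrap_revision
  })

  user_data_replace_on_change = true
}
```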

Access: No SSH. Use SSM:

aws ssm start-session \
  --profile glimpse-staging \
  --region eu-west-2 \
  --target $(terraform -chdir=infra/envs/staging output -raw database_instance_id)

3.9 ECS (modules/ecs/)

Fargate cluster with Container Insights enabled.

IAM roles:

  • Execution role — pulls images from ECR, fetches secrets from Secrets Manager, writes logs to CloudWatch
  • Task role — SSM exec access, EFS mount permissions (when enabled)

Long-running services:

Service                       Image                    CPU   Mem       Port   ALB
api                           tracker-api:{tag}        512   1024 MB   8000   Yes
frontend                      tracker-frontend:{tag}   256   512 MB    80     Yes
admin                         tracker-admin:{tag}      256   512 MB    80     Yes
anisette *                    tracker-anisette:{tag}   256   512 MB    6969   No (private DNS)
tracker-fetcher-2 *           tracker-services:{tag}   256   512 MB    n/a    No
unified-geofence *            tracker-services:{tag}   256   512 MB    n/a    No
notification-service *        tracker-services:{tag}   256   512 MB    n/a    No
materialized-view-service *   tracker-services:{tag}   256   512 MB    n/a    No

* Conditional on enable_anisette / enable_workers

One-off job task definitions (pre-created, not auto-run):

Job Image Command
migrations tracker-api:{tag} bash ./scripts/run_migrations.sh
seed-users tracker-api:{tag} python ./scripts/seed_default_users.py

Jobs use the same ECS cluster, security group, and subnets as services. Their task definitions are updated whenever the api image tag changes.

Secrets injected as environment variables (via Secrets Manager valueFrom):

  • POSTGRES_PASSWORD — fetched at task start, not baked into image
  • SECRET_KEY — app signing key
  • REDIS_PASSWORD — Valkey auth token
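For reference, the valueFrom wiring inside a task definition typically looks like the sketch below; variable names are assumed, not taken from modules/ecs:

```hcl
# Sketch only: `valueFrom` makes ECS fetch the secret from Secrets Manager
# at task start, so the value never appears in the image or the task definition.
container_definitions = jsonencode([{
  name  = "api"
  image = "${var.ecr_repo_url}:${var.image_tags["api"]}"
  secrets = [
    { name = "POSTGRES_PASSWORD", valueFrom = var.secret_arns["postgres_password"] },
    { name = "SECRET_KEY",        valueFrom = var.secret_arns["secret_key"] },
    { name = "REDIS_PASSWORD",    valueFrom = var.secret_arns["redis_password"] },
  ]
}])
```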

3.10 EFS (modules/efs/)

Created when enable_workers || enable_anisette:

Instance Root Dir Used By
anisette_storage /anisette anisette-v3 service
worker_storage /data tracker-fetcher-2 service

Access points use UID/GID 1000, permissions 0750. Mount targets in both private subnets. KMS-encrypted.

3.11 Logs (modules/logs/)

CloudWatch log group per service and per job, all KMS-encrypted, 365-day retention.

Group Path
api /aws/ecs/tracker-restapi-staging/api
frontend /aws/ecs/tracker-restapi-staging/frontend
admin /aws/ecs/tracker-restapi-staging/admin
anisette /aws/ecs/tracker-restapi-staging/anisette
migrations /aws/ecs/tracker-restapi-staging/migrations
seed-users /aws/ecs/tracker-restapi-staging/seed-users
tracker-fetcher-2 /aws/ecs/tracker-restapi-staging/tracker-fetcher-2
unified-geofence /aws/ecs/tracker-restapi-staging/unified-geofence
notification-service /aws/ecs/tracker-restapi-staging/notification-service
materialized-view-service /aws/ecs/tracker-restapi-staging/materialized-view-service

3.12 Secrets (modules/secrets/)

Three secrets created with randomly generated 32-character passwords (mixed case, digits, specials):

Logical Key Secret Name in Secrets Manager
postgres_password tracker/staging/postgres-password
secret_key tracker/staging/secret-key
redis_password tracker/staging/redis-password

Passwords are generated once by Terraform. They are never stored in .tfvars, .env, or any repository file. Applications and the database bootstrap script fetch them from Secrets Manager at runtime.
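A minimal sketch of the generate-and-store pattern, assuming typical resource names (the module's actual names may differ):

```hcl
# Sketch only: Terraform generates the password once and writes it straight
# into a KMS-encrypted Secrets Manager secret; it never touches a tfvars file.
resource "random_password" "postgres" {
  length  = 32
  special = true
}

resource "aws_secretsmanager_secret" "postgres_password" {
  name       = "tracker/staging/postgres-password"
  kms_key_id = var.kms_key_arn
}

resource "aws_secretsmanager_secret_version" "postgres_password" {
  secret_id     = aws_secretsmanager_secret.postgres_password.id
  secret_string = random_password.postgres.result
}
```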


4. Prerequisites

AWS CLI & Profile

Configure an AWS CLI profile named glimpse-staging:

aws configure --profile glimpse-staging
# AWS Access Key ID: ...
# AWS Secret Access Key: ...
# Default region: eu-west-2
# Default output: json

Verify access:

aws sts get-caller-identity --profile glimpse-staging

Terraform

Version >= 1.7.0. Install via tfenv or directly:

tfenv install 1.7.0
tfenv use 1.7.0
terraform --version

Docker Buildx

Required for multi-platform builds:

docker buildx version
# If missing: docker buildx install

S3 Backend Bootstrap (one-time, before first terraform init)

The S3 bucket and DynamoDB lock table must exist before Terraform can store state. Create them manually or with a bootstrap script:

aws s3api create-bucket \
  --profile glimpse-staging \
  --region eu-west-2 \
  --bucket glimpse-staging-tracker-restapi-tfstate \
  --create-bucket-configuration LocationConstraint=eu-west-2

aws s3api put-bucket-versioning \
  --profile glimpse-staging \
  --bucket glimpse-staging-tracker-restapi-tfstate \
  --versioning-configuration Status=Enabled

aws dynamodb create-table \
  --profile glimpse-staging \
  --region eu-west-2 \
  --table-name glimpse-staging-tracker-restapi-tflock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST

5. First-Time Bootstrap

Step 1 — Prepare local config files

cd infra/envs/staging

# Backend config (never commit this file — contains account-specific paths)
cp backend.hcl.example backend.hcl

Edit backend.hcl:

bucket         = "glimpse-staging-tracker-restapi-tfstate"
key            = "staging/terraform.tfstate"
region         = "eu-west-2"
profile        = "glimpse-staging"
dynamodb_table = "glimpse-staging-tracker-restapi-tflock"
encrypt        = true

Edit or verify terraform.tfvars:

project_name    = "tracker-restapi"
environment     = "staging"
aws_region      = "eu-west-2"
vpc_cidr        = "10.40.0.0/24"
public_hostname = "tracker.staging.glimpse.technology"
admin_hostname  = "tracker-admin.staging.glimpse.technology"

enable_anisette = true
enable_workers  = true

secret_names = {
  postgres_password = "tracker/staging/postgres-password"
  secret_key        = "tracker/staging/secret-key"
  redis_password    = "tracker/staging/redis-password"
}

Step 2 — Set placeholder image tags

Before any images exist in ECR, use the bootstrap tag so task definitions are valid:

cat > image-tags.auto.tfvars.json <<'EOF'
{
  "image_tags": {
    "api":      "sha-bootstrap",
    "frontend": "sha-bootstrap",
    "admin":    "sha-bootstrap",
    "services": "sha-bootstrap",
    "anisette": "sha-bootstrap"
  }
}
EOF

Step 3 — Initialise Terraform

terraform init -backend-config=backend.hcl

Step 4 — Plan and apply

terraform plan -out=tfplan
terraform apply tfplan

The first apply provisions everything: KMS, VPC, subnets, security groups, ECR, ACM certificates, WAF, ALB, the database EC2 instance, EFS, log groups, Secrets Manager secrets, and ECS task definitions. Services do not start successfully until images exist in ECR (see next steps).

Expected apply time: 12–18 minutes (ACM DNS validation is the longest step).

Step 5 — Add DNS validation records to Cloudflare

After apply completes, get the validation records:

terraform output -json acm_validation_records
terraform output -json acm_admin_validation_records

Each record has name, type (CNAME), and value. Add both to Cloudflare for glimpse.technology. ACM validates automatically once the records propagate (usually < 5 minutes).
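If you want the records in a paste-ready form, jq can flatten the JSON output. The record below is an invented placeholder to show the shape:

```shell
# Real input comes from: terraform output -json acm_validation_records
# The record here is a placeholder, not a real validation record.
RECORDS='[{"name":"_abc123.tracker.staging.glimpse.technology.","type":"CNAME","value":"_def456.acm-validations.aws."}]'
PASTE_READY=$(echo "$RECORDS" | jq -r '.[] | "\(.name) \(.type) \(.value)"')
echo "$PASTE_READY"
```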

Step 6 — Add ALB hostname to Cloudflare

terraform output -raw alb_dns_name

Create a CNAME in Cloudflare:

  • tracker.staging → <alb_dns_name>
  • tracker-admin.staging → <alb_dns_name> (or a CNAME of the above)

Set proxy mode to DNS only (grey cloud) to avoid Cloudflare interfering with ALB SSL termination.

Step 7 — Authenticate with ECR and build images

For a shorter developer-oriented summary of this workflow, see Deployment Pipeline.

make staging-build

This command:

  • builds the current commit once
  • pushes the immutable tag to the staging ECR
  • mirrors the same tag into the production ECR
  • writes infra/envs/staging/image-tags.auto.tfvars.json

Or use the script directly:

python scripts/build_staging_images.py \
  --push \
  --aws-profile glimpse-staging \
  --mirror-aws-profile glimpse-prod \
  --target api

If you need the lower-level docker buildx invocation, the build scripts wrap it for you: they set the registry prefix to the target environment's ECR host and pin the image with the immutable sha-<commit> tag.
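As an illustration only (the commit and account ID below are placeholders), the tag scheme and the underlying command look roughly like:

```shell
# Illustration only: COMMIT and the registry account ID are placeholders.
COMMIT="abc1234"                                  # normally: git rev-parse --short=7 HEAD
TAG="sha-${COMMIT}"                               # immutable tag scheme
REGISTRY="<account-id>.dkr.ecr.eu-west-2.amazonaws.com"
echo "docker buildx build --platform linux/amd64 --push -t ${REGISTRY}/tracker-api:${TAG} ."
```

Because ECR tags are immutable, re-pushing an existing tag is rejected, which is why every build derives a fresh sha-<commit> tag.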

Step 8 — Update image tags and re-apply

Apply Terraform in staging after the build script has refreshed the tag map:

terraform -chdir=infra/envs/staging apply -auto-approve

This registers new task definition revisions pointing at the real image tags. ECS rolling-deploys each service automatically (50% min healthy, 200% max).

Step 9 — Promote to production

Once staging is validated, promote the exact same tag map to production:

make prod-promote
terraform -chdir=infra/envs/prod apply -auto-approve

This keeps production on the same commit-tested image set as staging without rebuilding.

Step 10 — Direct production operations

make prod-build, make prod-plan, and make prod-apply remain available for manual production workflows, but the normal release path is staging-first promotion.

Step 11 — Run database migrations

On first deploy, run migrations before traffic reaches the API:

# Get required values from Terraform outputs
CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
TASK_DEF=$(terraform -chdir=infra/envs/staging output -raw migration_task_definition_arn)
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)

aws ecs run-task \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --task-definition "$TASK_DEF" \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"

Monitor the migration logs:

aws logs tail /aws/ecs/tracker-restapi-staging/migrations \
  --profile glimpse-staging \
  --region eu-west-2 \
  --follow

Step 12 — Seed default users

SEED_TASK_DEF=$(terraform -chdir=infra/envs/staging output -json ecs_task_definition_arns | jq -r '."seed-users"')

aws ecs run-task \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --task-definition "$SEED_TASK_DEF" \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"

Monitor:

aws logs tail /aws/ecs/tracker-restapi-staging/seed-users \
  --profile glimpse-staging \
  --region eu-west-2 \
  --follow

6. Daily Developer Workflow

Check current state

terraform -chdir=infra/envs/staging show | grep -A2 "image_tag"
# or
cat infra/envs/staging/image-tags.auto.tfvars.json
# or
cat infra/envs/prod/image-tags.auto.tfvars.json

View service logs

# API service
aws logs tail /aws/ecs/tracker-restapi-staging/api \
  --profile glimpse-staging --region eu-west-2 --follow

# Worker (swap group name as needed)
aws logs tail /aws/ecs/tracker-restapi-staging/tracker-fetcher-2 \
  --profile glimpse-staging --region eu-west-2 --follow

Connect to a running container (ECS Exec)

CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)

# Find the task ID
TASK_ID=$(aws ecs list-tasks \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --service-name tracker-restapi-staging-api \
  --query 'taskArns[0]' --output text | awk -F/ '{print $NF}')

aws ecs execute-command \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --task "$TASK_ID" \
  --container api \
  --command "/bin/bash" \
  --interactive

Bootstrap an Anisette account with ECS Exec

Use ./scripts/bootstrap_anisette_account.sh when you need to create or refresh the stored Anisette account for the worker stack.

The script starts a temporary Fargate task from the tracker-fetcher-2 task definition with ECS Exec enabled, opens an interactive shell in the container, and then runs the bootstrap command inside that container.

By default it targets the staging Terraform root. If you export AWS_PROFILE=glimpse-prod, it will default to infra/envs/prod unless you override TERRAFORM_DIR explicitly.

For production use, the prod Terraform root must already have the worker services and Anisette enabled so the script can read ecs_task_definition_arns, ecs_security_group_id, private_subnet_ids, and anisette_service_url from state.

./scripts/bootstrap_anisette_account.sh

What the script does:

  1. Reads the staging Terraform outputs from infra/envs/staging.
  2. Starts a one-off tracker-fetcher-2 task with --enable-execute-command.
  3. Waits for the ECS Exec agent to become ready.
  4. Opens /bin/sh in the running container.
  5. Inside the shell, exports:
     • ANISETTE_SERVER
     • ACCOUNT_STORE_PATH
  6. Runs python scripts/authenticate_findmy.py

The default account file path is /data/account.json. If the bootstrap succeeds, the task can optionally trigger a fresh deployment of the worker service so it picks up the new account state.

Requirements:

  • aws, jq, and terraform available on the machine running the script
  • AWS credentials with permission to run ECS tasks, use ECS Exec, and stop the temporary task
  • ECS Exec enabled for the task definition and cluster
  • enable_anisette = true in the target Terraform environment

If you need to run the same steps manually, use the same temporary worker task pattern rather than the long-running API service. The bootstrap script depends on the container image that includes scripts/authenticate_findmy.py.

Connect to the database host

DB_INSTANCE=$(terraform -chdir=infra/envs/staging output -raw database_instance_id)

aws ssm start-session \
  --profile glimpse-staging \
  --region eu-west-2 \
  --target "$DB_INSTANCE"

# Once inside:
sudo -u postgres psql -d tracker

Check service health

# List all running tasks
aws ecs list-tasks \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster tracker-restapi-staging

# Describe specific service
aws ecs describe-services \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster tracker-restapi-staging \
  --services tracker-restapi-staging-api

Apply infrastructure changes (no image changes)

terraform -chdir=infra/envs/staging plan
terraform -chdir=infra/envs/staging apply

7. Deploying Container Updates

Build and push a new image

make staging-build

Update the image tag map

cat infra/envs/staging/image-tags.auto.tfvars.json

Or edit infra/envs/staging/image-tags.auto.tfvars.json directly:

{
  "image_tags": {
    "api": "sha-abc1234",
    "frontend": "sha-abc1234",
    "admin": "sha-abc1234",
    "services": "sha-abc1234",
    "anisette": "sha-bootstrap"
  }
}

Apply — registers new task definition revisions and triggers rolling deploy

terraform -chdir=infra/envs/staging apply -auto-approve

ECS rolls each service to the new task definition. Deployment settings: 50% minimum healthy, 200% maximum — so new tasks start before old ones stop.

If the new image requires a schema migration

Run migrations before or immediately after applying the new image tag. The migration job uses the same image as the API service:

# (re-run Step 11 from the bootstrap section)

Roll back to a previous image tag

ECR tags are immutable but old revisions remain. Update image-tags.auto.tfvars.json to the previous tag and re-apply:

jq '.image_tags.api = "sha-previoussha"' \
  infra/envs/staging/image-tags.auto.tfvars.json > /tmp/t.json \
  && mv /tmp/t.json infra/envs/staging/image-tags.auto.tfvars.json

terraform -chdir=infra/envs/staging apply -auto-approve

8. Running Database Migrations

Migrations run as a Fargate one-off task using the migrations job task definition. The task runs bash ./scripts/run_migrations.sh inside the tracker-api container.

When to run

  • After the first deploy (bootstrap)
  • After any deploy that includes a schema-altering change
  • After manually rolling back to a previous API version (check if a down-migration is needed)

Run command

CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
TASK_DEF=$(terraform -chdir=infra/envs/staging output -raw migration_task_definition_arn)
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)

aws ecs run-task \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --task-definition "$TASK_DEF" \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"

Monitor

aws logs tail /aws/ecs/tracker-restapi-staging/migrations \
  --profile glimpse-staging \
  --region eu-west-2 \
  --follow

Verify success

Check the task exit code:

aws ecs describe-tasks \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --tasks <TASK_ARN>

A stopCode of EssentialContainerExited with exitCode: 0 means success.
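A small jq sketch of that check, run against a trimmed, hypothetical describe-tasks response:

```shell
# Hypothetical response trimmed to the fields that matter; the real one
# comes from `aws ecs describe-tasks` as shown above.
RESPONSE='{"tasks":[{"stopCode":"EssentialContainerExited","containers":[{"name":"migrations","exitCode":0}]}]}'
STOP_CODE=$(echo "$RESPONSE" | jq -r '.tasks[0].stopCode')
EXIT_CODE=$(echo "$RESPONSE" | jq -r '.tasks[0].containers[0].exitCode')
[ "$STOP_CODE" = "EssentialContainerExited" ] && [ "$EXIT_CODE" = "0" ] && echo "migration succeeded"
```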


9. Seeding Default Users

Seeds initial users/roles needed for the application to be usable. Runs python ./scripts/seed_default_users.py inside the tracker-api container.

When to run

  • Once after the first deploy
  • When adding a new environment that needs baseline data

Run command

CLUSTER=$(terraform -chdir=infra/envs/staging output -raw ecs_cluster_arn)
SEED_TASK_DEF=$(terraform -chdir=infra/envs/staging output -json ecs_task_definition_arns | jq -r '."seed-users"')
SUBNETS=$(terraform -chdir=infra/envs/staging output -json private_subnet_ids | jq -r 'join(",")')
SG=$(terraform -chdir=infra/envs/staging output -raw ecs_security_group_id)

aws ecs run-task \
  --profile glimpse-staging \
  --region eu-west-2 \
  --cluster "$CLUSTER" \
  --task-definition "$SEED_TASK_DEF" \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SG],assignPublicIp=DISABLED}"

Monitor

aws logs tail /aws/ecs/tracker-restapi-staging/seed-users \
  --profile glimpse-staging \
  --region eu-west-2 \
  --follow

10. Destroying the Environment

The ALB has deletion protection enabled. Terraform will fail if you try to destroy without first disabling it.

# Step 1: Remove ALB deletion protection
terraform -chdir=infra/envs/staging apply \
  -target=module.alb.aws_lb.this \
  -var="enable_deletion_protection=false"

# Step 2: Destroy everything
terraform -chdir=infra/envs/staging destroy

Warning: Destroying the database module permanently deletes the EC2 instance and its EBS volume. Take a manual snapshot or pg_dump backup first if the data is needed.


11. Troubleshooting

ECS service stuck in PENDING or tasks keep stopping

  1. Check the service events:

aws ecs describe-services \
  --profile glimpse-staging --region eu-west-2 \
  --cluster tracker-restapi-staging \
  --services tracker-restapi-staging-api \
  --query 'services[0].events[0:5]'

  2. Check container logs for the stopped task — tasks log even when they exit immediately.

  3. Common causes:
     • Image tag doesn't exist in ECR → push the image or fix the tag in image-tags.auto.tfvars.json
     • Secret ARN wrong → terraform output secret_arns and compare with the task definition
     • Not enough capacity in subnets → unlikely with Fargate, but check VPC endpoints

ACM certificate stuck in PENDING_VALIDATION

DNS CNAME records haven't been added to Cloudflare, or haven't propagated yet. Run:

dig CNAME <validation_record_name>

If no answer, check Cloudflare. ACM checks every few minutes once records are present.

Database bootstrap failed (instance running but PostgreSQL not available)

SSM into the instance and check the user-data log:

sudo cat /var/log/cloud-init-output.log | tail -100

If the aws secretsmanager get-secret-value call failed (IAM not ready at boot time), increment bootstrap_revision in terraform.tfvars and re-apply. This replaces the instance and re-runs user-data.

Terraform plan shows unexpected replacements

If Terraform wants to replace the database EC2 instance unexpectedly, check:

  • bootstrap_revision hasn't changed accidentally
  • AMI data source hasn't resolved to a new AMI (update the AMI ID to pin it)

Can't connect to database from ECS

  1. Verify security group rules (ECS SG → DB SG on port 5432)
  2. Check the POSTGRES_SERVER environment variable in the task definition matches database_private_dns_name output
  3. SSM into the DB host and verify PostgreSQL is listening: sudo ss -tlnp | grep 5432

Worker services not processing tasks

  1. Check Anisette is running (workers depend on it for Apple auth):

aws ecs describe-services \
  --profile glimpse-staging --region eu-west-2 \
  --cluster tracker-restapi-staging \
  --services tracker-restapi-staging-anisette

  2. Verify the ANISETTE_SERVER env var in the worker task definition points to http://anisette-v3.anisette-v3.local:6969

  3. Check the EFS mount is healthy — if the /data EFS mount fails, the task won't start

Viewing all Terraform outputs

terraform -chdir=infra/envs/staging output
# or a specific value:
terraform -chdir=infra/envs/staging output -raw alb_dns_name