AWS ECS Shape Decision

Purpose

This document defines the target ECS layout for the tracker system migration.

It captures the decisions that are already known, the shape that follows from those decisions, and the places where we still need to make implementation choices before building the staging environment.

Already Decided

Production Data Plane

  • PostgreSQL runs as a Patroni cluster on AWS.
  • The database includes TimescaleDB and PostGIS.
  • Redis uses the AWS-managed Redis equivalent.

Staging Data Plane

  • Staging uses a single-instance database host instead of Patroni.
  • The staging database still needs TimescaleDB and PostGIS support.
  • Staging uses a self-hosted Valkey service on the staging database EC2 host.

Given these decisions, ECS replaces only the application runtime and worker orchestration. The core database topology in production stays as it is.

Target ECS Shape

Launch Model

Use ECS on Fargate for the first pass.

Reasoning:

  • The services are mostly stateless containers.
  • Fargate removes host management from the migration.
  • It fits the goal of moving from EC2 Docker host management to a managed control plane.
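As a rough illustration of what the Fargate choice implies for each task definition, the sketch below builds the keyword arguments that the ECS RegisterTaskDefinition call expects. The function name, image name, and port are hypothetical, the size table is only a subset of the CPU/memory pairs Fargate accepts, and a real definition would also need an execution role and logging configuration.

```python
# Subset of the CPU (units) -> memory (MiB) pairs accepted by Fargate.
# Illustrative only; larger sizes exist and are omitted here.
FARGATE_SIZES = {
    256: {512, 1024, 2048},
    512: {1024, 2048, 3072, 4096},
    1024: {2048, 3072, 4096, 5120, 6144, 7168, 8192},
}


def fargate_task_definition(family, image, cpu=256, memory=512, port=None):
    """Return a minimal task definition dict for a single-container Fargate task."""
    if memory not in FARGATE_SIZES.get(cpu, set()):
        raise ValueError(f"unsupported Fargate size in this sketch: cpu={cpu}, memory={memory}")

    container = {"name": family, "image": image, "essential": True}
    if port is not None:
        container["portMappings"] = [{"containerPort": port, "protocol": "tcp"}]

    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # the only network mode Fargate supports
        "cpu": str(cpu),          # task-level sizes are passed as strings
        "memory": str(memory),
        "containerDefinitions": [container],
    }
```

A call such as `fargate_task_definition("api", "example/api:latest", port=8000)` would produce a dict that could be passed to a client as keyword arguments once the remaining fields are filled in.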

Service Boundaries

Keep the application split into separate ECS services rather than one large task.

Recommended service layout:

| Component | ECS Role | Exposure | Notes |
| --- | --- | --- | --- |
| API | ECS service | Public via ALB | Primary application entrypoint |
| Frontend | ECS service or static hosting | Public via ALB | Keep separate from API so it can scale independently |
| Admin Panel | ECS service or static hosting | Public via ALB, restricted if possible | Prefer its own deployment boundary |
| Tracker Fetcher | ECS service | Private | Background worker |
| Unified Geofence | ECS service | Private | Background worker |
| Notification Service | ECS service | Private | Background listener/processor |
| Materialized View Service | ECS service | Private | Background maintenance worker |
| Docs | Prefer CI-published static site | Not required in ECS | Only keep in ECS if you need runtime generation |
| ReDoc | Prefer CI-published static artifact | Not required in ECS | Better generated during build than at runtime |
| Anisette | ECS service | Private | Private support service with EFS-backed persistent state and service discovery |
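The layout above can also be kept as data so that deployment tooling and reviews work from one list. The sketch below is one possible encoding, not an agreed schema: the `Component` type, the `role` and `exposure` values, and the helper names are assumptions made for illustration, and the component names reuse the internal target names from the Networking Model section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    role: str      # "ecs", "ecs-or-static", or "static" (hypothetical labels)
    exposure: str  # "public", "private", or "none"


LAYOUT = [
    Component("api", "ecs", "public"),
    Component("frontend", "ecs-or-static", "public"),
    Component("admin", "ecs-or-static", "public"),
    Component("tracker-fetcher", "ecs", "private"),
    Component("unified-geofence", "ecs", "private"),
    Component("notification-service", "ecs", "private"),
    Component("materialized-view-service", "ecs", "private"),
    Component("docs", "static", "none"),
    Component("redoc", "static", "none"),
    Component("anisette-v3", "ecs", "private"),
]


def alb_targets(layout):
    """Names of components that receive traffic through the ALB."""
    return [c.name for c in layout if c.exposure == "public"]


def private_services(layout):
    """Names of ECS services that should never be reachable from outside."""
    return [c.name for c in layout if c.exposure == "private"]
```

Keeping Docs and ReDoc in the list with an exposure of `"none"` records the preference to publish them from CI while leaving room to revisit that choice.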

Grouping Strategy

Do not force unrelated services into a single ECS task just to match Compose.

Use these rules:

  • Group containers only when they need localhost communication and share the same lifecycle.
  • Otherwise keep them as separate ECS services.
  • Prefer separate scaling for the API and worker services.
  • Keep frontend and admin separate from the API unless there is a strong operational reason to combine them.

Public

  • API behind an Application Load Balancer
  • Frontend behind an Application Load Balancer
  • Admin panel behind an Application Load Balancer, ideally with tighter access controls

Private

  • Tracker fetcher worker
  • Unified geofence worker
  • Notification service
  • Materialized view service
  • Anisette support service
  • Database
  • Redis equivalent

Networking Model

Internal Service Names

Inside ECS, internal service-to-service traffic should use DNS names or ECS Service Connect names, not Compose service names.

Recommended internal targets:

  • api
  • frontend
  • admin
  • tracker-fetcher
  • unified-geofence
  • notification-service
  • materialized-view-service
  • anisette-v3
  • db
  • redis
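One way to keep application code independent of where a target lives is to resolve hostnames from the environment and fall back to the internal name. The sketch below assumes a hypothetical `<TARGET>_HOST` variable convention and an optional DNS namespace such as `tracker.internal`; neither is defined by this document.

```python
import os

INTERNAL_TARGETS = (
    "api", "frontend", "admin", "tracker-fetcher", "unified-geofence",
    "notification-service", "materialized-view-service", "anisette-v3",
    "db", "redis",
)


def internal_host(target, env=None, namespace=None):
    """Resolve the hostname for an internal target.

    Order of preference: explicit environment override, then the target name
    qualified by a DNS namespace, then the bare target name.
    """
    if target not in INTERNAL_TARGETS:
        raise KeyError(f"unknown internal target: {target}")
    env = os.environ if env is None else env

    # Hypothetical convention: "tracker-fetcher" -> "TRACKER_FETCHER_HOST".
    env_key = target.upper().replace("-", "_") + "_HOST"
    if env.get(env_key):
        return env[env_key]
    return f"{target}.{namespace}" if namespace else target
```

With this shape, staging could point `REDIS_HOST` at the Valkey instance on the database host while production points it at the managed endpoint, without any change to the calling code.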

Access Pattern

  • External traffic enters through the ALB.
  • ECS services in private subnets talk to the database and Redis over internal networking.
  • Background services do not need public IPs.
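The access pattern above maps onto the `networkConfiguration` argument of an ECS service. The following sketch builds that structure with public IP assignment disabled; the function name and the subnet and security group IDs in the usage example are placeholders. Tasks placed this way still need a NAT gateway or VPC endpoints to pull images, which is not shown here.

```python
def private_network_configuration(subnet_ids, security_group_ids):
    """Build the awsvpc network configuration for a service in private subnets."""
    subnet_ids = list(subnet_ids)
    security_group_ids = list(security_group_ids)
    if not subnet_ids:
        raise ValueError("at least one private subnet is required")
    return {
        "awsvpcConfiguration": {
            "subnets": subnet_ids,
            "securityGroups": security_group_ids,
            # Background services do not need public IPs.
            "assignPublicIp": "DISABLED",
        }
    }
```

For example, `private_network_configuration(["subnet-aaa", "subnet-bbb"], ["sg-workers"])` returns a dict that can be supplied when creating or updating a worker service.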

What Changes From Compose

Docker-Only Assumptions To Remove

  • Bind mounts for application source code.
  • extra_hosts compatibility hacks.
  • Host port mappings as a deployment mechanism.
  • Shared local volumes for runtime state.

Config That Should Stay

  • Environment-driven hostnames
  • Secret injection through the deployment platform
  • Separate settings for staging and production
  • Health endpoints for app and worker readiness
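A small settings loader shows how the hostname and environment rules can be kept in one place. This is a sketch under assumed variable names (`APP_ENV`, `DB_HOST`, `REDIS_HOST`); the actual application may already use different names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    environment: str
    db_host: str
    redis_host: str


def load_settings(env=None):
    """Read settings from the environment and fail fast on missing values."""
    env = os.environ if env is None else env

    environment = env.get("APP_ENV", "")
    if environment not in {"staging", "production"}:
        raise ValueError(f"APP_ENV must be 'staging' or 'production', got {environment!r}")

    missing = [key for key in ("DB_HOST", "REDIS_HOST") if not env.get(key)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))

    return Settings(environment, env["DB_HOST"], env["REDIS_HOST"])
```

Failing at startup when a hostname is missing surfaces configuration mistakes during deployment rather than on the first database call.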

Staging Shape

Staging Goal

Staging should prove the ECS deployment model with less infrastructure resilience than production, but enough realism to catch application and networking problems.

Staging Layout

  • ECS Fargate for application services.
  • Single-instance PostgreSQL host for the database.
  • Staging Valkey on the database host for cache/queue support.
  • ALB for public app traffic.
  • Private subnets for the database and worker services.

Staging Constraints

  • No Patroni cluster.
  • No multi-node database failover.
  • No production traffic.
  • Production-like image build and deploy flow.

Production Shape

Production Layout

  • ECS Fargate for application services.
  • Patroni cluster with TimescaleDB and PostGIS on AWS.
  • AWS Redis equivalent.
  • ALB in front of public-facing services.
  • Private subnets for workers and data services.
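Writing the staging and production layouts side by side makes it easy to check that they differ only where intended. The keys and value labels below are illustrative shorthand for the bullets in this document, not identifiers from any tool.

```python
SHAPES = {
    "staging": {
        "runtime": "ecs-fargate",
        "database": "single-instance-postgresql",
        "cache": "self-hosted-valkey",
        "ingress": "alb",
    },
    "production": {
        "runtime": "ecs-fargate",
        "database": "patroni-cluster",
        "cache": "aws-managed-redis-equivalent",
        "ingress": "alb",
    },
}


def differences(left, right):
    """Return {key: (left_value, right_value)} for every key whose values differ."""
    return {key: (left[key], right[key]) for key in left if left[key] != right[key]}
```

Under this encoding the two environments differ only in the data plane, which matches the intent that staging exercises the same ECS deployment model as production.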

Production Constraints

  • Deployment process must support rollback.
  • Database migrations must be backward compatible where possible.
  • Secrets and config must be environment-specific.
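For the rollback requirement, one option on ECS rolling deployments is the deployment circuit breaker, which can return a service to its last steady state when new tasks fail to stabilize. The sketch below builds the `deploymentConfiguration` structure for that option; whether this mechanism is the one adopted is still an assumption, and the percentages are example values.

```python
def deployment_configuration(minimum_healthy_percent=100, maximum_percent=200):
    """Build a rolling-deployment configuration with automatic rollback enabled."""
    if maximum_percent <= minimum_healthy_percent:
        raise ValueError("maximum_percent must exceed minimum_healthy_percent")
    return {
        "deploymentCircuitBreaker": {"enable": True, "rollback": True},
        "minimumHealthyPercent": minimum_healthy_percent,
        "maximumPercent": maximum_percent,
    }
```

This only rolls back the ECS service revision. It does not undo schema changes, which is why the constraint on backward-compatible migrations matters: the previous application version has to keep working against the already-migrated database.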

Open Decisions

Frontend and Admin

We still need to decide whether the frontend and admin panel should:

  • stay as ECS services
  • move to static hosting
  • be combined with the API stack behind one ALB

Docs and ReDoc

We still need to decide whether to:

  • keep them as ECS services
  • generate them in CI and publish them as static artifacts
  • serve them through the main web stack

Worker Packaging

We still need to decide whether worker services should:

  • each get their own ECS service
  • share a smaller number of ECS task definitions by domain
  • be grouped by shared scaling profile

My recommendation is to start with one service per worker role, then merge only if operational pressure makes that necessary.

Decision Summary

If we translate the current system directly into ECS with the fewest surprises, the shape should be:

  • ECS Fargate for application runtime
  • ALB for public entrypoints
  • AWS-managed Redis equivalent in production
  • Self-hosted Valkey on the staging database host
  • Private Anisette ECS service with EFS-backed /data
  • Patroni/Timescale/PostGIS in production
  • Single-instance database host in staging
  • Separate private ECS services for the worker processes
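The EFS-backed `/data` directory for Anisette corresponds to two fragments of a task definition: a volume entry and a container mount point. The sketch below builds both; the volume name and the file system ID in the usage example are placeholders, and access points or IAM authorization are left out.

```python
def anisette_storage(file_system_id, volume_name="anisette-data"):
    """Return (volume, mount_point) fragments for an EFS-backed /data directory."""
    volume = {
        "name": volume_name,
        "efsVolumeConfiguration": {
            "fileSystemId": file_system_id,
            "transitEncryption": "ENABLED",
        },
    }
    mount_point = {
        "sourceVolume": volume_name,  # must match the volume name above
        "containerPath": "/data",
        "readOnly": False,
    }
    return volume, mount_point
```

The volume goes in the task definition's `volumes` list and the mount point in the Anisette container's `mountPoints` list, for example `volume, mount = anisette_storage("fs-0123456789abcdef0")`.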