TimescaleDB Features
This project uses TimescaleDB to efficiently store and query time-series data. TimescaleDB is an extension to PostgreSQL that provides specialized features for time-series data.
Overview
TimescaleDB is used in this project for:
- Hypertables: Partitioning time-series data for efficient storage and querying
- Continuous Aggregates: Pre-computing aggregated data for faster queries
- Materialized Views: Storing pre-computed query results for efficient access
- Geospatial Functions: Working with location data using PostGIS integration
Database Schema
The following tables use TimescaleDB features:
location_reports: Raw location data from trackers, stored in a hypertablelocation_history_hourly: Hourly aggregated location data (continuous aggregate)location_history_daily: Daily aggregated location data (continuous aggregate)location_history: Materialized view combining recent hourly data with older daily data
Applying TimescaleDB Features
Some TimescaleDB features (like continuous aggregates and materialized views) cannot be created inside a transaction. These features are not included in the regular Alembic migrations and need to be applied separately.
In ECS, the database migration task now runs scripts/run_migrations.sh, which applies alembic upgrade head and then executes scripts/apply_timescaledb_features.sh so the schema bootstrap is fully automated.
Prerequisites
Before applying TimescaleDB features, ensure that:
- The TimescaleDB extension is installed in your PostgreSQL database
- The regular database migrations have been applied
- PostgreSQL client tools (
psql) are installed if you're using the provided script
Using the Script
A script is provided to apply TimescaleDB features:
# With Docker (requires PostgreSQL client tools in the API container)
docker compose exec api ./scripts/apply_timescaledb_features.sh
# Without Docker (requires PostgreSQL client tools)
./scripts/apply_timescaledb_features.sh
If PostgreSQL client tools are not available in your environment, you can run the SQL script directly in the PostgreSQL container:
# With Docker, run directly in the PostgreSQL container
docker compose exec db psql -U postgres -d tracker -f /path/to/scripts/create_timescaledb_views.sql
For AWS ECS staging, the repeatable path is the helper script:
./scripts/run_staging_migrations.sh
That wrapper runs the staging migrations ECS task, waits for it to stop, and prints the CloudWatch logs for the migration container.
What the Script Does
The script performs the following operations:
- Creates a hypertable for the
location_reportstable using thetimestampcolumn for partitioning - Creates a continuous aggregate for hourly location data (
location_history_hourly) - Creates a continuous aggregate for daily location data (
location_history_daily) - Creates a view that combines recent hourly data with older daily data (
location_history_view) - Creates a materialized view based on that view (
location_history) - Creates functions for geofence matching and materialized view refreshing
Best Practices
The database schema follows these best practices:
- Using
TEXTinstead ofVARCHARfor string columns - In PostgreSQL, there's no performance difference between TEXT and VARCHAR(n)
- Both use the same internal storage mechanism
- TEXT is more flexible if requirements change (no need to alter column length)
- VARCHAR(n) is appropriate when there's a specific need to enforce a length constraint
- Using
TIMESTAMPTZ(timestamp with time zone) instead ofTIMESTAMPfor timestamp columns - TIMESTAMPTZ stores the time zone information, which is important for accurate time-series data
- This prevents issues with daylight saving time and time zone conversions
- Including the partitioning column (
timestamp) in the primary key of hypertables - This is required by TimescaleDB for efficient partitioning
- It allows for better query performance on time-based queries
- Using continuous aggregates for efficient querying of aggregated time-series data
- Pre-computed aggregates significantly improve query performance
- TimescaleDB automatically updates these aggregates as new data arrives
- Using materialized views for efficient access to pre-computed query results
- Materialized views store the results of complex queries
- They can be refreshed periodically to include new data
Maintenance
Refreshing Materialized Views
The materialized views need to be refreshed periodically to include new data. A function is provided for this purpose:
SELECT refresh_location_history();
This function can be called manually or scheduled to run periodically using a cron job or similar mechanism.
Geofence Matching
A function is provided for matching tracker locations with geofences:
SELECT match_geofences();
This function:
- Finds trackers with location history in the last 24 hours
- Checks if they are within a storage or delivery location geofence
- Updates the tracker status if they have been in a geofence for a sufficient amount of time
- Adds a status history entry for the tracker
This function can also be called manually or scheduled to run periodically.
Troubleshooting
Common Issues
-
"function create_hypertable does not exist": This error indicates that the TimescaleDB extension is not installed or not enabled in the database. Ensure that the extension is installed and enabled.
-
"relation does not exist": This error indicates that the tables haven't been created yet. Run the Alembic migrations first to create the tables.
-
"CREATE MATERIALIZED VIEW ... WITH DATA cannot run inside a transaction block": This error occurs when trying to create a continuous aggregate or materialized view inside a transaction. Use the provided script to apply these features outside of a transaction.
Checking TimescaleDB Status
To check if TimescaleDB is installed and enabled:
SELECT * FROM pg_extension WHERE extname = 'timescaledb';
To check the version of TimescaleDB:
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
To list all hypertables:
SELECT * FROM timescaledb_information.hypertables;
To list all continuous aggregates:
SELECT * FROM timescaledb_information.continuous_aggregates;