TimescaleDB Features
This project uses TimescaleDB to efficiently store and query time-series data. TimescaleDB is an extension to PostgreSQL that provides specialized features for time-series data.
Overview
TimescaleDB is used in this project for:
- Hypertables: Partitioning time-series data for efficient storage and querying
- Continuous Aggregates: Pre-computing aggregated data for faster queries
- Materialized Views: Storing pre-computed query results for efficient access
- Geospatial Functions: Working with location data using PostGIS integration
Database Schema
The following tables use TimescaleDB features:
location_reports: Raw location data from trackers, stored in a hypertablelocation_history_hourly: Hourly aggregated location data (continuous aggregate)location_history_daily: Daily aggregated location data (continuous aggregate)location_history: Materialized view combining recent hourly data with older daily data
Applying TimescaleDB Features
Some TimescaleDB features (like continuous aggregates and materialized views) cannot be created inside a transaction. These features are not included in the regular Alembic migrations and need to be applied separately.
Prerequisites
Before applying TimescaleDB features, ensure that:
- The TimescaleDB extension is installed in your PostgreSQL database
- The regular database migrations have been applied
- PostgreSQL client tools (
psql) are installed if you're using the provided script
Using the Script
A script is provided to apply TimescaleDB features:
# With Docker (requires PostgreSQL client tools in the API container)
docker compose exec api ./scripts/apply_timescaledb_features.sh
# Without Docker (requires PostgreSQL client tools)
./scripts/apply_timescaledb_features.sh
If PostgreSQL client tools are not available in your environment, you can run the SQL script directly in the PostgreSQL container:
# With Docker, run directly in the PostgreSQL container
docker compose exec db psql -U postgres -d tracker -f /path/to/scripts/create_timescaledb_views.sql
What the Script Does
The script performs the following operations:
- Creates a hypertable for the
location_reportstable using thetimestampcolumn for partitioning - Creates a continuous aggregate for hourly location data (
location_history_hourly) - Creates a continuous aggregate for daily location data (
location_history_daily) - Creates a view that combines recent hourly data with older daily data (
location_history_view) - Creates a materialized view based on that view (
location_history) - Creates functions for geofence matching and materialized view refreshing
Best Practices
The database schema follows these best practices:
- Using
TEXTinstead ofVARCHARfor string columns - In PostgreSQL, there's no performance difference between TEXT and VARCHAR(n)
- Both use the same internal storage mechanism
- TEXT is more flexible if requirements change (no need to alter column length)
- VARCHAR(n) is appropriate when there's a specific need to enforce a length constraint
- Using
TIMESTAMPTZ(timestamp with time zone) instead ofTIMESTAMPfor timestamp columns - TIMESTAMPTZ stores the time zone information, which is important for accurate time-series data
- This prevents issues with daylight saving time and time zone conversions
- Including the partitioning column (
timestamp) in the primary key of hypertables - This is required by TimescaleDB for efficient partitioning
- It allows for better query performance on time-based queries
- Using continuous aggregates for efficient querying of aggregated time-series data
- Pre-computed aggregates significantly improve query performance
- TimescaleDB automatically updates these aggregates as new data arrives
- Using materialized views for efficient access to pre-computed query results
- Materialized views store the results of complex queries
- They can be refreshed periodically to include new data
Maintenance
Refreshing Materialized Views
The materialized views need to be refreshed periodically to include new data. A function is provided for this purpose:
SELECT refresh_location_history();
This function can be called manually or scheduled to run periodically using a cron job or similar mechanism.
Geofence Matching
A function is provided for matching tracker locations with geofences:
SELECT match_geofences();
This function:
- Finds trackers with location history in the last 24 hours
- Checks if they are within a storage or delivery location geofence
- Updates the tracker status if they have been in a geofence for a sufficient amount of time
- Adds a status history entry for the tracker
This function can also be called manually or scheduled to run periodically.
Troubleshooting
Common Issues
-
"function create_hypertable does not exist": This error indicates that the TimescaleDB extension is not installed or not enabled in the database. Ensure that the extension is installed and enabled.
-
"relation does not exist": This error indicates that the tables haven't been created yet. Run the Alembic migrations first to create the tables.
-
"CREATE MATERIALIZED VIEW ... WITH DATA cannot run inside a transaction block": This error occurs when trying to create a continuous aggregate or materialized view inside a transaction. Use the provided script to apply these features outside of a transaction.
Checking TimescaleDB Status
To check if TimescaleDB is installed and enabled:
SELECT * FROM pg_extension WHERE extname = 'timescaledb';
To check the version of TimescaleDB:
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
To list all hypertables:
SELECT * FROM timescaledb_information.hypertables;
To list all continuous aggregates:
SELECT * FROM timescaledb_information.continuous_aggregates;