Continuous Aggregates Recovery

This document explains how to recover the TimescaleDB continuous aggregates that were lost during database restoration.

Background

The tracker application uses TimescaleDB continuous aggregates to efficiently store and query location history data. These continuous aggregates are:

  1. location_history_hourly - Aggregates location data by hour
  2. location_history_daily - Aggregates hourly data by day
  3. location_history - A materialized view that combines data from both hourly and daily views
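The hourly aggregate above can be sketched roughly as follows. This is an illustration only: the authoritative definitions live in scripts/recreate_continuous_aggregates.sql, and the source hypertable name (location_updates) and the column names are assumptions, not the real schema.

```python
# Illustrative SQL for a TimescaleDB hourly continuous aggregate, kept as a
# string the way the Python runner script would carry it. The hypertable
# name (location_updates) and columns are assumptions; see
# scripts/recreate_continuous_aggregates.sql for the real definition.
HOURLY_AGGREGATE_SQL = """
CREATE MATERIALIZED VIEW location_history_hourly
WITH (timescaledb.continuous) AS
SELECT
    tracker_id,
    mac_address,
    time_bucket('1 hour', recorded_at) AS bucket,
    avg(latitude)  AS avg_latitude,
    avg(longitude) AS avg_longitude,
    count(*)       AS sample_count
FROM location_updates
GROUP BY tracker_id, mac_address, bucket;
"""
```

The `WITH (timescaledb.continuous)` clause is what makes this a continuous aggregate rather than an ordinary materialized view; the daily aggregate would follow the same shape, bucketing the hourly view by `'1 day'`.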

When the database is restored from a backup, these continuous aggregates may be lost, resulting in errors like:

Error refreshing views: relation "location_history_hourly" does not exist

Solution

Two scripts have been created to help recreate these continuous aggregates:

  1. scripts/recreate_continuous_aggregates.sql - SQL script that defines the continuous aggregates
  2. scripts/run_recreate_continuous_aggregates.py - Python script that executes the SQL and refreshes the views

The scripts have been updated to include the mac_address field that was added to the trackers table.

How to Use

To recreate the continuous aggregates, run the following command:

python scripts/run_recreate_continuous_aggregates.py

This will:

  1. Create the location_history_hourly and location_history_daily continuous aggregates
  2. Set up the appropriate policies for refreshing and retention
  3. Create the location_history_view that combines data from both aggregates
  4. Create the necessary functions and triggers for refreshing the views
  5. Refresh the continuous aggregates with recent data

Verification

After running the script, you can verify that the continuous aggregates were created successfully by running:

python scripts/verify_continuous_aggregates.py

This script will:

  1. Check if all continuous aggregates exist
  2. Verify that they contain data
  3. Confirm that the mac_address field is included in the views
  4. Check if the continuous aggregate policies are set up correctly
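The existence check in step 1 amounts to comparing the expected view names against TimescaleDB's `timescaledb_information.continuous_aggregates` catalog view. The helper below is a sketch of that comparison, not the verification script itself:

```python
# Names taken from this document; the catalog query is standard TimescaleDB 2.x.
EXPECTED_AGGREGATES = {"location_history_hourly", "location_history_daily"}

LIST_AGGREGATES_SQL = (
    "SELECT view_name FROM timescaledb_information.continuous_aggregates;"
)

def missing_aggregates(existing_view_names):
    """Return the expected continuous aggregates absent from the database."""
    return sorted(EXPECTED_AGGREGATES - set(existing_view_names))

# e.g. if only the hourly view survived the restore:
#   missing_aggregates(["location_history_hourly"])
# returns ["location_history_daily"]
```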

Environment Configuration

Both scripts use the standard .env file for database connection settings. The following environment variables are required:

POSTGRES_SERVER=db
POSTGRES_USER=tracker
POSTGRES_PASSWORD=your_password
POSTGRES_DB=tracker
POSTGRES_PORT=5432
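The variables above map onto connection parameters roughly as follows. The helper and its defaults are illustrative; only the variable names come from this document:

```python
def connection_params(env):
    """Assemble DB-API connection keyword arguments from .env-style settings.

    Defaults mirror the sample values above and are assumptions, except
    POSTGRES_PASSWORD, which must always be provided.
    """
    return {
        "host": env.get("POSTGRES_SERVER", "db"),
        "user": env.get("POSTGRES_USER", "tracker"),
        "password": env["POSTGRES_PASSWORD"],
        "dbname": env.get("POSTGRES_DB", "tracker"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
    }
```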

If you're running in a test environment, the scripts will automatically use the .env.test file instead.

Docker Environment Detection

The scripts automatically detect whether they are running inside a Docker container or on the host machine:

  • When running inside a Docker container, the scripts will use the db hostname as specified in the .env file.
  • When running outside of Docker (on the host machine), the scripts will automatically use localhost instead of db for the database hostname.

This automatic detection allows the scripts to work both inside and outside of Docker without requiring any configuration changes.

If you need to override this behavior, you can set the RUNNING_IN_DOCKER environment variable:

# Force the script to use the Docker hostname even when running outside Docker
RUNNING_IN_DOCKER=true python scripts/run_recreate_continuous_aggregates.py
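The detection logic described above can be sketched like this. The `RUNNING_IN_DOCKER` override and the `db`/`localhost` switch come from this document; using `/.dockerenv` (a file Docker creates inside every container) as the detection signal is an assumption about how the scripts decide:

```python
import os

def resolve_db_host(configured_host, env=None, in_container=None):
    """Pick the database hostname: the override wins, then container detection."""
    env = os.environ if env is None else env
    # Explicit override takes precedence over any detection.
    if env.get("RUNNING_IN_DOCKER", "").lower() in ("1", "true", "yes"):
        return configured_host
    # Assumed detection signal: Docker creates /.dockerenv in containers.
    if in_container is None:
        in_container = os.path.exists("/.dockerenv")
    return configured_host if in_container else "localhost"
```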

Troubleshooting

If you encounter any issues, check the logs for error messages. Common issues include:

  1. Permission errors: Ensure the database user has the necessary permissions to create and modify tables and views.
  2. Missing TimescaleDB extension: Ensure the TimescaleDB extension is installed and enabled in the database.
  3. Transaction errors: Some TimescaleDB operations cannot be run inside a transaction block. The script uses autocommit=True to avoid this issue.
  4. Connection issues: Verify that the database connection settings in the .env file are correct.

If you continue to experience problems, you may need to manually run the SQL commands in scripts/recreate_continuous_aggregates.sql one by one to identify the specific issue.

Maintenance

The continuous aggregates are automatically refreshed according to the policies set in the SQL script:

  • location_history_hourly is refreshed every hour
  • location_history_daily is refreshed every day

You can manually refresh the aggregates by running the run_recreate_continuous_aggregates.py script again.
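A lighter-weight alternative to re-running the full script is refreshing a single aggregate directly in SQL via TimescaleDB's standard `refresh_continuous_aggregate` procedure. The statement below, kept as a string the way the Python scripts would carry it, is an example only; the 7-day window is an arbitrary choice:

```python
# Refresh only the hourly aggregate over the last week (window is illustrative).
MANUAL_REFRESH_SQL = (
    "CALL refresh_continuous_aggregate("
    "'location_history_hourly', now() - INTERVAL '7 days', now());"
)
```

Note that `CALL` statements, like the aggregate creation itself, must run outside a transaction block (autocommit).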