Skip to content

Location Aggregation Regression Fix

Problem Description

A code regression was identified in the location aggregation system where the database continuous aggregates reverted to using the old logic instead of the previously implemented fixes. This regression affected two critical aspects:

  1. Position Aggregation Method: The system reverted to using centroid averaging (ST_Centroid(ST_Union())) instead of using the most recent location for each tracker
  2. China Location Filtering: The geographic filter to exclude China locations (where trackers are manufactured) was missing

Impact

  • Incorrect Position Data: Trackers with multiple location reports were showing averaged positions instead of their actual last known location
  • Manufacturing Noise: Location data from China (where trackers are manufactured) was being included in the aggregated data, creating misleading location history

Root Cause

The regression was caused by the scripts/recreate_continuous_aggregates.sql file containing the old logic. This script likely got executed during database maintenance or restoration, overwriting the previously applied fixes.

Files Affected

Fixed Files

  1. scripts/recreate_continuous_aggregates.sql - Updated with correct logic
  2. scripts/verify_location_aggregation_fixes.sql - New verification script

Files with Correct Logic (Already Fixed)

  • scripts/apply_location_aggregation_fix.sql
  • scripts/create_timescaledb_views.sql
  • scripts/fix_database_views_simple_join.sql
  • scripts/fix_orphaned_location_reports_cleanup.sql
  • scripts/fix_orphaned_location_reports_final.sql
  • scripts/truncate_location_reports.sql

Technical Details

Before Fix (Incorrect Logic)

-- OLD: Centroid averaging - WRONG
ST_Centroid(ST_Union(lr.location::geometry)) AS location

-- Missing China filter
WHERE lr.confidence >= 2

After Fix (Correct Logic)

-- NEW: Most recent location - CORRECT
(array_agg(lr.location ORDER BY lr.timestamp DESC))[1] AS location

-- With China exclusion filter
WHERE lr.confidence >= 2
    -- Exclude China (approximate bounding box)
    -- China bounds: 73.6°E to 134.8°E longitude, 18.2°N to 53.5°N latitude
    AND NOT (
        ST_X(lr.location::geometry) BETWEEN 73.6 AND 134.8
        AND ST_Y(lr.location::geometry) BETWEEN 18.2 AND 53.5
    )

Verification Steps

1. Check Current Database State

Run the verification script to check if the fixes are properly applied:

PGPASSWORD=SuperSecretKey psql -U tracker --host 192.168.100.1 --port 5432 -d tracker -f scripts/verify_location_aggregation_fixes.sql

2. Apply Fixes if Needed

If the verification script shows errors, apply the comprehensive fix:

PGPASSWORD=SuperSecretKey psql -U tracker --host 192.168.100.1 --port 5432 -d tracker -f scripts/apply_location_aggregation_fix.sql

3. Verify Fix Application

Re-run the verification script to confirm all fixes are applied:

PGPASSWORD=SuperSecretKey psql -U tracker --host 192.168.100.1 --port 5432 -d tracker -f scripts/verify_location_aggregation_fixes.sql

Expected Verification Output

When fixes are properly applied, you should see:

NOTICE:  GOOD: China exclusion filter is present in location_history_hourly
NOTICE:  GOOD: Most recent location logic is present in location_history_hourly
NOTICE:  GOOD: Old centroid averaging logic is not present in location_history_hourly
NOTICE:  SUCCESS: location_history_hourly has all required fixes
NOTICE:  GOOD: Most recent location logic is present in location_history_daily
NOTICE:  GOOD: Old centroid averaging logic is not present in location_history_daily
NOTICE:  SUCCESS: location_history_daily has all required fixes
NOTICE:  DATA CHECK: Total location reports with confidence >= 2: [number]
NOTICE:  DATA CHECK: China location reports (should be excluded): [number]
NOTICE:  SAMPLE: Most recent location - Tracker: [id], Time: [timestamp], Coords: ([lon], [lat])
NOTICE:  GOOD: Sample location is outside China bounds
NOTICE:  === VERIFICATION COMPLETE ===

Prevention Measures

1. Script Consistency

All database scripts that create or recreate continuous aggregates must use the correct logic. The following scripts have been verified and updated:

  • scripts/recreate_continuous_aggregates.sql - FIXED
  • scripts/create_timescaledb_views.sql - Already correct
  • scripts/apply_location_aggregation_fix.sql - Already correct

2. Regular Verification

Include the verification script in regular database health checks:

# Add to monitoring/health check scripts
./scripts/verify_location_aggregation_fixes.sql

3. Code Review Process

  • All database scripts that modify location_history_* views must be reviewed for:
  • Use of most recent location logic: (array_agg(lr.location ORDER BY lr.timestamp DESC))[1]
  • China exclusion filter: ST_X(lr.location::geometry) BETWEEN 73.6 AND 134.8
  • Absence of centroid averaging: No ST_Centroid(ST_Union())

Database Objects Affected

Continuous Aggregates

  • location_history_hourly - Hourly aggregated location data
  • location_history_daily - Daily aggregated location data

Views

  • location_history_view - Combined view of hourly and daily data
  • location_history - Materialized view for API consumption

Functions

  • refresh_location_history() - Function to refresh the materialized view

API Impact

The location aggregation fixes affect the following API endpoints:

  • GET /api/v1/trackers/{id}/location-history - Returns tracker location history
  • GET /api/v1/trackers/{id} - Includes location history in tracker details
  • GET /api/v1/storage-locations/{id}/trackers - Shows trackers near storage locations
  • GET /api/v1/delivery-locations/{id}/trackers - Shows trackers near delivery locations

Testing

After applying fixes, test the following scenarios:

  1. Single Location Tracker: Verify location matches the actual report
  2. Multiple Location Tracker: Verify location is the most recent, not averaged
  3. China Manufacturing Data: Verify China locations are excluded from history
  4. API Responses: Verify API endpoints return correct location data

Monitoring

Monitor the following metrics to detect future regressions:

  • Location history data consistency
  • Absence of China coordinates in aggregated data
  • Tracker positions matching most recent reports (not averaged positions)