Root Cause: Geofence Event Type Bug

Problem Summary

Tracker 2645 has hundreds of ENTRY events at storage location 85 but no corresponding status_history records. The initial hypothesis is that these events were created via _create_geofence_event_only() instead of _update_tracker_status_atomically().

Evidence from Database

Events WITHOUT corresponding status history:
- 349289: entry at STORAGE 85 on 2025-11-20 13:19:20 - NO status_history
- 348184: entry at STORAGE 85 on 2025-11-20 11:48:21 - NO status_history
... (hundreds more)

Events WITH corresponding status history:
- 425497: exit on 2025-11-29 22:38:59 → status_history: IN_TRANSIT ✅
- 383737: exit on 2025-11-28 08:29:48 → status_history: IN_TRANSIT ✅
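
A check like the one above can be reproduced with an anti-join. The following is a minimal sketch using an in-memory SQLite database. The table and column names, and the idea of matching events to history rows by tracker and timestamp, are assumptions for illustration and not the real production schema.

```python
import sqlite3

# Hypothetical minimal schema: names are assumptions, not the production schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE geofence_events (
        id INTEGER PRIMARY KEY,
        tracker_id INTEGER,
        event_type TEXT,
        timestamp TEXT
    );
    CREATE TABLE status_history (
        id INTEGER PRIMARY KEY,
        tracker_id INTEGER,
        status TEXT,
        timestamp TEXT
    );
""")

# A few of the rows quoted in the evidence above.
conn.executemany(
    "INSERT INTO geofence_events VALUES (?, ?, ?, ?)",
    [
        (349289, 2645, "entry", "2025-11-20 13:19:20"),
        (348184, 2645, "entry", "2025-11-20 11:48:21"),
        (425497, 2645, "exit", "2025-11-29 22:38:59"),
    ],
)
conn.execute(
    "INSERT INTO status_history (tracker_id, status, timestamp) VALUES (?, ?, ?)",
    (2645, "IN_TRANSIT", "2025-11-29 22:38:59"),
)
conn.commit()


def events_without_history(conn, tracker_id):
    """Return ids of events that have no status_history row at the same timestamp."""
    rows = conn.execute(
        """
        SELECT e.id
        FROM geofence_events AS e
        LEFT JOIN status_history AS h
            ON h.tracker_id = e.tracker_id AND h.timestamp = e.timestamp
        WHERE e.tracker_id = ? AND h.id IS NULL
        ORDER BY e.id
        """,
        (tracker_id,),
    ).fetchall()
    return [row[0] for row in rows]


orphans = events_without_history(conn, 2645)
```

With the sample rows, only the two entry events come back as orphans; the exit event is matched by its IN_TRANSIT history row.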

The Bug

In _determine_event_type()

def _determine_event_type(
    self, old_status: Optional[TrackerStatus], new_status: TrackerStatus
) -> GeofenceEventType:
    if old_status in [None, TrackerStatus.CREATED, TrackerStatus.IN_TRANSIT]:
        if new_status in [TrackerStatus.DELIVERED, TrackerStatus.IN_STORAGE]:
            return GeofenceEventType.ENTRY  # ✅ Correct
    elif old_status in [TrackerStatus.DELIVERED, TrackerStatus.IN_STORAGE]:
        if new_status == TrackerStatus.IN_TRANSIT:
            return GeofenceEventType.EXIT  # ✅ Correct
        elif new_status in [TrackerStatus.DELIVERED, TrackerStatus.IN_STORAGE]:
            return GeofenceEventType.ENTRY  # ❌ BUG!

    return GeofenceEventType.ENTRY  # ❌ DEFAULT IS WRONG

The Problem:

  • When old_status = IN_STORAGE and new_status = IN_STORAGE (no change)
  • The function returns GeofenceEventType.ENTRY
  • This causes _create_geofence_event_only() to create an ENTRY event
  • But it should create a DWELL event since the status didn't change!
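
If unchanged statuses were meant to map to DWELL (or TRANSIT when the tracker stays outside any geofence), the mapping could look roughly like this. It is only a sketch: the enums are stand-ins for the real ones, and returning TRANSIT for transitions such as None → IN_TRANSIT is an assumption, not something the service is known to do.

```python
from enum import Enum
from typing import Optional


# Stand-in enums for illustration; the real definitions live in the service.
class TrackerStatus(Enum):
    CREATED = "created"
    IN_TRANSIT = "in_transit"
    IN_STORAGE = "in_storage"
    DELIVERED = "delivered"


class GeofenceEventType(Enum):
    ENTRY = "entry"
    EXIT = "exit"
    DWELL = "dwell"
    TRANSIT = "transit"


# Statuses that mean the tracker is inside a geofence.
IN_GEOFENCE = {TrackerStatus.DELIVERED, TrackerStatus.IN_STORAGE}


def determine_event_type(
    old_status: Optional[TrackerStatus], new_status: TrackerStatus
) -> GeofenceEventType:
    # No status change: stay in the same state, never report a new ENTRY.
    if old_status == new_status:
        if new_status in IN_GEOFENCE:
            return GeofenceEventType.DWELL
        return GeofenceEventType.TRANSIT
    # Status changed and the tracker is now inside a geofence.
    if new_status in IN_GEOFENCE:
        return GeofenceEventType.ENTRY
    # Status changed and the tracker left a geofence.
    if old_status in IN_GEOFENCE:
        return GeofenceEventType.EXIT
    # Assumption: anything else (e.g. None/CREATED -> IN_TRANSIT) is plain transit.
    return GeofenceEventType.TRANSIT
```

The key difference from the excerpt above is that there is no catch-all ENTRY default, so IN_STORAGE → IN_STORAGE can no longer produce an entry event.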

In _create_geofence_event_only()

def _create_geofence_event_only(...):
    # Determine event type based on whether we're in a geofence
    if delivery_location_id or storage_location_id:
        # Staying in same geofence (DWELL)
        event_type = GeofenceEventType.DWELL  # ✅ Correct
    else:
        # Staying in transit (TRANSIT)
        event_type = GeofenceEventType.TRANSIT  # ✅ Correct

This function correctly creates DWELL/TRANSIT events, but it's only called when status DOESN'T change.

The Flow That Created The Bug

  1. Location report arrives for tracker 2645 at storage location 85
  2. Status determination: Tracker is in storage 85, determines new_status = IN_STORAGE
  3. Status comparison: current_status = IN_STORAGE, new_status = IN_STORAGE
  4. Decision: Status hasn't changed, call _handle_no_status_change()
  5. Event creation: _create_geofence_event_only() creates DWELL event ✅
  6. BUT WAIT: The geofence events show "entry" not "dwell"!

Wait... Let me re-check

Looking at the actual data more carefully:

Events WITHOUT corresponding status history:
event_id | tracker_id | event_type | timestamp | location
349289   | 2645       | entry      | 2025-11-20 13:19:20 | STORAGE: 85

These say event_type = 'entry' but have no status_history!

This means:

  1. The service determined status was changing (IN_TRANSIT → IN_STORAGE)
  2. It called _update_tracker_status_atomically()
  3. The function created the geofence event
  4. But the status_history creation FAILED
  5. The transaction rolled back status_history but NOT the geofence event

THIS VIOLATES ATOMICITY!

The Actual Bug: Non-Atomic Transaction

Looking at _update_tracker_status_atomically():

try:
    tracker.current_status = new_status
    tracker.current_state_start = location_report.timestamp

    # Create status history entry
    status_history.create(db, obj_in=status_create)

    # Create geofence event
    self._create_geofence_event(...)

    # Commit all changes atomically
    db.commit()
    return True
except Exception as e:
    db.rollback()
    return False

The bug must be:

  • The geofence event gets created and committed
  • But somehow the status and status_history don't get committed
  • This suggests the events are being created in a DIFFERENT transaction or session
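
To sanity-check that reasoning: within a single session, a failed write should take the earlier write down with it on rollback. Here is a minimal sketch with SQLite standing in for the real database. The two tables and the record_transition() helper are hypothetical, and the event is deliberately inserted first so the failure happens after it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE geofence_events (id INTEGER PRIMARY KEY, event_type TEXT NOT NULL);
    CREATE TABLE status_history (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
""")
conn.commit()


def record_transition(conn, event_type, status):
    """Write the event and the history row in one transaction; all or nothing."""
    try:
        conn.execute(
            "INSERT INTO geofence_events (event_type) VALUES (?)", (event_type,)
        )
        conn.execute("INSERT INTO status_history (status) VALUES (?)", (status,))
        conn.commit()
        return True
    except sqlite3.Error:
        # Undo the event insert as well, since it shares the transaction.
        conn.rollback()
        return False


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# status=None violates NOT NULL, so the second insert fails.
failed = record_transition(conn, "entry", None)
events_after_failure = count(conn, "geofence_events")

# A valid call commits both rows together.
succeeded = record_transition(conn, "entry", "IN_STORAGE")
events_after_success = count(conn, "geofence_events")
history_after_success = count(conn, "status_history")
```

After the failing call no geofence event is left behind, which is why an orphaned event points to a separate transaction or a code path that never tried to write status_history at all.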

Location Report Processing

Looking at process_location_reports_batch():

# Process each report in its own transaction to isolate failures
for report in location_reports:
    try:
        with get_db_context() as report_db:
            report_db.add(report)
            result = await self._process_single_report(report_db, report)
    except Exception as e:
        # Continue processing other reports
        continue

AH HA! Each report gets its own database session!

If the geofence_event was created in one session/transaction, and then the code tried to update the tracker in a different session, that could cause this!

But wait, they're both in the same _process_single_report() call, so same session...

The Real Culprit: Deferred Reports

Looking at the timestamps:

  • Event timestamp: 2025-11-20 13:19:20
  • Event created_at: 2025-11-27 01:00:46

ALMOST 7 DAYS LATER!
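
The gap between the two timestamps above can be computed directly. A small sketch follows; the one-hour threshold for calling an event "backfilled" is an arbitrary assumption, not a value taken from the service.

```python
from datetime import datetime, timedelta

event_timestamp = datetime.fromisoformat("2025-11-20 13:19:20")
created_at = datetime.fromisoformat("2025-11-27 01:00:46")

# How long after the report's own timestamp the event row was written.
lag = created_at - event_timestamp
lag_days = lag.total_seconds() / 86400

# Hypothetical threshold: anything written more than an hour late counts as backfill.
BACKFILL_THRESHOLD = timedelta(hours=1)
is_backfilled = lag > BACKFILL_THRESHOLD
```

Here lag works out to 6 days, 11 hours, 41 minutes and 26 seconds, so the event was clearly not written in real time.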

These events come from backfill/deferred processing! When reports are processed this late, the tracker's status may already have changed to something else.

So:

  1. Location report arrives on 2025-11-20 showing tracker at storage 85
  2. Service isn't running or fails to process
  3. 7 days later, backfill runs
  4. By now, tracker is IN_TRANSIT (exited storage days ago)
  5. System correctly determines: "this old report shows storage, but tracker is now in transit"
  6. System creates ENTRY event to record historical fact
  7. But doesn't update current status because tracker has moved on
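
One way a guard like steps 5-7 could work is to compare the report timestamp against tracker.current_state_start. The source does not show this check, so everything below is an assumption: the Tracker dataclass, the apply_report() helper, its return values, and the exit date used in the usage example are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tracker:
    current_status: str
    current_state_start: datetime


def apply_report(tracker: Tracker, report_status: str, report_timestamp: datetime) -> str:
    """Decide what a location report is allowed to change.

    Returns "event_only" for reports older than the tracker's current state,
    "no_change" when the status is the same, and "status_updated" otherwise.
    """
    if report_timestamp < tracker.current_state_start:
        # Stale report: record the historical event, leave current status alone.
        return "event_only"
    if report_status == tracker.current_status:
        return "no_change"
    tracker.current_status = report_status
    tracker.current_state_start = report_timestamp
    return "status_updated"


# Illustrative dates only: the tracker left storage before the backfill ran.
tracker = Tracker("IN_TRANSIT", datetime(2025, 11, 25, 8, 0, 0))
action = apply_report(tracker, "IN_STORAGE", datetime(2025, 11, 20, 13, 19, 20))
```

Under this assumption the old storage report yields an event without touching the tracker's current status, which matches the pattern of ENTRY events with no status_history rows.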

This is actually correct behavior for backfill! The bug is that the original real-time processing failed.

Summary

Not a code bug - the geofence service code is working as designed!

The root cause: Service downtime or processing failures caused reports to be processed days/weeks late via backfill, which correctly creates events but doesn't update "current" status for old data.

Solution: The SQL sync script fixes the current inconsistency, and we need to:

  1. Ensure the service stays running (monitoring/alerting)
  2. Process location reports in real time to avoid backfill
  3. Run periodic SQL sync as a safety net