How CARN detects missing persons from the air using tri-model AI fusion, agentic tool orchestration, and SAR doctrine.
CARN runs a tri-model fusion pipeline on every video frame. A COCO-pretrained YOLOv8m base model detects all 9 SAR-relevant object classes, while two specialist models, RGB (fine-tuned on VisDrone) and thermal (fine-tuned on BIRDSAI), boost person recall from aerial and infrared perspectives. Overlapping person detections are merged via NMS deduplication: when two boxes overlap with IoU > 0.5, only the higher-confidence box is kept. Non-person SAR classes pass through from the COCO model. In SAR, a missed detection can cost a life, so we optimize for recall over precision with a 0.15 confidence threshold.
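The cross-model merge can be sketched as follows (box format and function names are illustrative, not CARN's actual code; non-person classes bypass this step entirely):

```python
# Boxes are (x1, y1, x2, y2, conf) tuples in pixel coordinates.

def iou(a, b):
    """Intersection-over-union of two xyxy boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def merge_person_boxes(coco_boxes, specialist_boxes, iou_thresh=0.5):
    """Greedy NMS-style dedup: when two person boxes overlap with
    IoU > iou_thresh, only the higher-confidence box survives."""
    merged = []
    for box in sorted(coco_boxes + specialist_boxes, key=lambda b: -b[4]):
        if all(iou(box, kept) <= iou_thresh for kept in merged):
            merged.append(box)
    return merged
```

Sorting by confidence first guarantees that whenever two boxes collide, the one already kept is the higher-confidence one, matching the rule stated above.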
Base64 JPEG from live drone feed, webcam, or uploaded video
80-class base model filtered to 9 SAR classes
Person, Car, Boat, Truck, Bus, Bicycle, Backpack, Handbag, Suitcase
YOLOv8s fine-tuned on VisDrone aerial imagery (42K annotations)
Person class only
YOLOv8s fine-tuned on BIRDSAI infrared data (34K annotations)
Person class only
Specialist person boxes merged with COCO person boxes. Overlapping detections (IoU > 0.5) keep the higher confidence score. Non-person classes pass through directly.
640px tiles with 20% overlap detect small/distant persons that full-frame inference misses
Persistent IDs across frames so the same person/vehicle is not re-counted
Operators toggle which SAR classes to display in real-time
Bounding boxes with class labels and confidence scores on the live feed
GPS-tagged detections on MapLibre with 4-tier confidence visualization
3-tier alerts (Critical/High/Medium) with push notifications and audio cues
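The tiling geometry behind the 640px, 20%-overlap slicing can be sketched like this (CARN uses SAHI in production; this standalone helper only illustrates where the tile origins land):

```python
def tile_origins(width, height, tile=640, overlap=0.2):
    """Top-left corners of overlapping tiles covering a frame.
    Stride = tile * (1 - overlap), i.e. 512 px for 640 px tiles at 20%
    overlap; a final tile is clamped to the frame edge if needed."""
    stride = int(tile * (1 - overlap))
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    if xs[-1] + tile < width:
        xs.append(width - tile)  # clamp last column to the right edge
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    if ys[-1] + tile < height:
        ys.append(height - tile)  # clamp last row to the bottom edge
    return [(x, y) for y in ys for x in xs]
```

For a 1920x1080 frame this yields 8 tiles (4 columns x 2 rows); each tile is run through inference at full resolution, which is what lets small, distant persons survive.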
Retrained on NVIDIA Tesla V100 (Tensorix.ai) with expanded VisDrone data. Precision +25% and recall +21% over v1 baseline. Dense aerial crowds remain challenging but the COCO base model compensates via tri-model fusion.
Exceeds all targets. BIRDSAI thermal aerial data matches SAR use case exactly — drone-mounted infrared over wilderness terrain.
Tianjin University
Aerial RGB images from drones over 14 Chinese cities at varying altitudes
LILA BC
Thermal infrared footage from conservation drones in African protected areas
Every detection is assigned a confidence tier that determines the alert severity. SAR doctrine demands recall over precision — we never discard a detection, even at low confidence.
Blocking modal alert with urgent alarm. Requires immediate confirm/reject from operator.
Persistent banner notification with chime. High priority for review.
Toast notification, auto-dismiss. Worth investigating if in priority search area.
Logged for post-mission review. Background noise level, but SAR doctrine says never discard.
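A minimal sketch of the tier assignment, assuming the Critical/High/Medium thresholds quoted elsewhere in this doc (>80%, 60-80%, 40-60%) and treating everything below as the logged-only tier:

```python
def confidence_tier(conf):
    """Map a detection confidence (0.0-1.0) to its alert tier.
    Nothing is ever discarded: below-Medium detections are still
    logged for post-mission review."""
    if conf > 0.80:
        return "critical"  # blocking modal with urgent alarm
    if conf > 0.60:
        return "high"      # persistent banner with chime
    if conf > 0.40:
        return "medium"    # toast notification, auto-dismiss
    return "low"           # logged only, post-mission review
```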
CARN Intelligence is an agentic AI command centre powered by Claude Sonnet 4.5 via the Anthropic SDK. A multi-turn tool-use loop (max 3 iterations) lets Claude orchestrate 7 tools — querying the live database, planning missions, generating flight paths, reviewing detections, and producing operational briefings. All of it happens through natural language, backed by real database operations rather than simulated responses.
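The shape of that multi-turn loop, sketched with a stubbed client and illustrative message structures (the real implementation drives Claude Sonnet 4.5 through the Anthropic SDK and real tool handlers):

```python
MAX_ITERATIONS = 3  # the loop cap mentioned above

def run_agent(client, user_query, tool_handlers):
    """Multi-turn tool-use loop: keep calling the model, executing any
    tools it requests, and feeding results back until it answers in
    plain text or the iteration cap is hit."""
    messages = [{"role": "user", "content": user_query}]
    response = None
    for _ in range(MAX_ITERATIONS):
        response = client.create(messages=messages)
        tool_uses = [b for b in response["content"] if b["type"] == "tool_use"]
        if not tool_uses:
            return response  # final text answer, no more tools requested
        messages.append({"role": "assistant", "content": response["content"]})
        results = [
            {"type": "tool_result", "tool_use_id": b["id"],
             "content": tool_handlers[b["name"]](**b["input"])}
            for b in tool_uses
        ]
        messages.append({"role": "user", "content": results})
    return response  # iteration cap reached
```

The cap keeps a single query from chaining tool calls indefinitely; three turns is enough for a plan-then-render sequence such as plan_mission followed by show_mission_on_map.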
| Tool | Operation |
|---|---|
| plan_mission | ISRID search area + boustrophedon flight path |
| update_mission_status | Validate status transition chain |
| show_cases_on_map | Query active cases with LKP coordinates |
| show_detections_on_map | Filter by confidence + mission |
| generate_briefing | Full operational sitrep with recommendations |
| confirm_detection | Human-in-loop detection review |
| show_mission_on_map | Render search area + flight path overlay |
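The boustrophedon ("lawnmower") flight path that plan_mission produces can be sketched in planar coordinates (illustrative geometry only; the real tool works in GPS coordinates over an ISRID-derived search area):

```python
def boustrophedon(x0, y0, x1, y1, lane_spacing):
    """Back-and-forth waypoints covering the rectangle (x0,y0)-(x1,y1).
    Each lane reverses direction so the drone never dead-heads back
    across ground it has already covered."""
    waypoints = []
    y, left_to_right = y0, True
    while y <= y1:
        row = [(x0, y), (x1, y)] if left_to_right else [(x1, y), (x0, y)]
        waypoints.extend(row)
        left_to_right = not left_to_right
        y += lane_spacing
    return waypoints
```

Lane spacing would be derived from altitude and camera field of view so that adjacent passes overlap slightly, mirroring the 20% tile overlap used in detection.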
The Intelligence API streams events in real-time as tools execute. The frontend consumes these via fetch().body.getReader() and renders each step as an animated row with status, description, and execution time.
| Event | Meaning |
|---|---|
| thinking | Claude is processing the query |
| tool_start | Tool execution begins |
| tool_result | Tool completed with summary and duration |
| response | Final message with text, mapData, actions |
| done | Stream complete, connection closed |
| error | Error with message, stream terminates |
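Each event type above maps to one framed message on the wire. A minimal sketch of the server-side framing, assuming standard SSE formatting (payload fields are illustrative):

```python
import json

def sse_event(event_type, payload):
    """Frame one server-sent event: named event line, JSON data line,
    blank-line terminator. The frontend's getReader() loop splits the
    stream on the double newline."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"
```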
Streams dual RGB + thermal video via RTMP to the CARN server. Enterprise (Matrice 30T) or consumer (Mavic Mini 3 Pro) hardware.
Receives RTMP on port 1935, transcodes to HLS for low-latency browser playback
Next.js 16 + React 19 dashboard with HLS stream and real-time frame extraction for inference
COCO base + RGB + thermal specialists running with SAHI slicing, FP16, NMS dedup, and ByteTrack tracking
Detection persistence with GPS coords, WAL mode
Real-time broadcast to all connected clients
PWA and mobile push notifications
Claude Sonnet 4.5 with 7 tool definitions for mission planning, status management, detection review, and operational briefings via SSE streaming
Critical (>80%), High (60-80%), Medium (40-60%) with blocking modals, banners, toasts, and audio cues
Operator confirms or rejects detections. Confirmed locations dispatched to field teams via push notifications.
End-to-end latency from frame capture to rescue team notification. Every second matters in SAR — the pipeline is optimized for speed at every stage.
For the Claude Code Hackathon (Feb 10-16, 2026), consumer FPV hardware was used to demonstrate the system. In production, CARN supports enterprise drones (DJI Matrice 30T, M300 RTK) with thermal imaging, RTMP streaming, and 40+ minute flight times.
The same AI detection pipeline works seamlessly with both consumer and enterprise hardware.
Consumer drone with FPV video transmission to goggles
Goggles output via USB-C to HDMI adapter, then HDMI capture card to laptop
Captures from capture card and creates virtual webcam source for CARN
Selects OBS Virtual Camera from device dropdown, extracts frames for inference
Runs tri-model fusion on live frames with real-time bounding box overlay